Historically, the financial services industry was built around a terminal-based business model. Industry participants now increasingly rely on bulk data feeds, giving them access to vast quantities of data. That process comes with numerous challenges, not least the time it takes to get a customer up and running with the feeds they require. The growth of data sharing promises efficiencies in both data transportation and ingestion.
To implement a new data feed or FTP (File Transfer Protocol) process, firms are required to do some or all of the following:
Ultimately, all of this takes time: precious time the customer could spend exploring the content in a trial scenario or generating value for their firm.
With a myriad of database technologies available, each with its own proprietary loading method, many data providers end up building a custom ETL process for every technology they want to support for their clients. Even between systems that allow relatively open access between a provider (such as FactSet) and its customers, geography compounds the problem: where the data is sourced and where the customer wants to consume it often differ. It therefore often makes sense to replicate data or ETL regionally to give those customers quick, easy access.
There are additional obstacles:
Fast forward to today, and data providers are expanding their delivery capabilities by transporting their content to clients via a data share. Conceivably, there will come a time when no one wants to manage ETL and hardware themselves for a third party’s content. Is data sharing the future of data enablement?
The ease of onboarding (and offboarding), the ability to quickly connect disparate content sets, and anywhere/anytime access to content that would once have been considered “niche” may prove too compelling for customers to ignore. Combine that with the ability to tell customers that (1) they can easily access data in any system that implements the open standard, and (2) they’re not locked into doing everything on a single platform or in a single location (government regulations notwithstanding), and we may be witnessing an industry sea change.
FactSet partner Databricks recently announced Delta Sharing, an open-source protocol that promises to make it easy and secure to share existing, live data in data lakes and lakehouses. It supports a wide range of clients, builds on existing, flexible data formats, provides strong security, auditing, and governance, and scales efficiently to massive datasets. FactSet believes Delta Sharing will make it easier for our clients to ingest our content, regardless of the platform or tools they are using. With an open standard adopted by major players in the market, a client request for any system that implements Delta Sharing becomes far simpler to fulfill. It will “just work.”
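To illustrate what that could look like in practice, here is a minimal sketch of consuming a shared dataset with the open-source delta-sharing Python client. The profile file contents, endpoint, token, and the share/schema/table names are hypothetical placeholders, not an actual FactSet share.

```python
import delta_sharing

# A Delta Sharing "profile" is a small JSON credentials file issued by the
# data provider. The endpoint and bearer token below are placeholders.
#
# example_profile.share:
# {
#   "shareCredentialsVersion": 1,
#   "endpoint": "https://sharing.example.com/delta-sharing/",
#   "bearerToken": "<token issued by the provider>"
# }
profile_path = "example_profile.share"

# Discover which shares, schemas, and tables this credential can see.
client = delta_sharing.SharingClient(profile_path)
for table in client.list_all_tables():
    print(table.share, table.schema, table.name)

# Load one shared table directly into a pandas DataFrame.
# The share, schema, and table names here are hypothetical.
table_url = f"{profile_path}#example_share.example_schema.prices"
df = delta_sharing.load_as_pandas(table_url)
print(df.head())
```

The point of the sketch is that the consumer never touches the provider's storage, ETL, or hardware: a credentials file plus an open client library is enough to list and load live tables from any system that implements the protocol.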