Data Managers: 4 Key Questions for Building a Connected Data Pipeline

By FactSet Insight  |  October 8, 2024

To drive business growth, data-management teams at financial institutions are working with substantially more data than in past years. The trend brings material benefits, challenges, and costs, which makes it a popular topic of discussion.

This article highlights the operational elements that data teams are sorting out. They need to simplify complexities and prioritize tradeoffs as they move from disconnected data silos toward a single connected data pipeline. Because data has been requested piecemeal by individuals over the years, disconnected silos are fairly prevalent across many organizations today.

For illustrative purposes, we also share a case-study perspective from a sovereign wealth fund that connected data from several vendors.

4 Key Questions for Data Managers to Answer

As managers know, the scope of data work is vast and deep. They are responsible for ingesting, quality-checking, mapping, transforming, enriching, delivering, and monitoring data across their firm’s data pipeline. Working with disparate datasets, each of which may use different identifiers, they aim to make the data work together and move it quickly and cost-effectively through the pipeline to the final users.

Every firm has a unique data environment, but the common denominator is this: A significant amount of time and money is spent connecting, manipulating, and transforming disparate datasets into the appropriate formats. As the amount of data grows, and as tracking the relevant securities and entities becomes more complex, data managers face four operational questions.

  • What is the most timely, efficient way to ingest data from internal and external sources, transfer technologies, and vendor processes?

  • What is the process to validate the accuracy of new data before integrating it into the firm?

  • How do you connect existing and new datasets at scale when each has unique identifiers and may require manual formatting before delivery to end-user systems and cloud-based platforms?

  • Who will request and manage support from various data providers?
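As a rough illustration of the second question, the sketch below shows one way a pre-integration validation step might look. It is a minimal example under stated assumptions, not a description of any vendor's process: the field names (`id_type`, `id_value`, `score`), the `validate_records` helper, and the placeholder records are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ValidationResult:
    accepted: list  # records that passed every check
    rejected: list  # (record, reason) pairs for follow-up


def validate_records(records, required_fields):
    """Split incoming records into accepted and rejected before integration.

    Hypothetical checks: every required field is present and non-empty,
    and no (id_type, id_value) pair appears more than once in the batch.
    """
    seen = set()
    accepted, rejected = [], []
    for rec in records:
        missing = [f for f in required_fields if rec.get(f) in (None, "")]
        if missing:
            rejected.append((rec, "missing fields: " + ", ".join(missing)))
            continue
        key = (rec.get("id_type"), rec.get("id_value"))
        if key in seen:
            rejected.append((rec, "duplicate identifier"))
            continue
        seen.add(key)
        accepted.append(rec)
    return ValidationResult(accepted, rejected)


# Placeholder records, made up for illustration only.
incoming = [
    {"id_type": "ISIN", "id_value": "XX0000000001", "score": 71.5},
    {"id_type": "ISIN", "id_value": "XX0000000001", "score": 70.0},
    {"id_type": "TICKER", "id_value": "", "score": 55.0},
]
result = validate_records(incoming, required_fields=("id_type", "id_value", "score"))
for rec, reason in result.rejected:
    print(reason)
```

Keeping the rejected records together with a reason, rather than silently dropping them, gives the team something concrete to raise with the data provider.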

The answers to those questions will help inform each firm’s data-management strategy. Costs, business priorities, and resources—such as which work to prioritize among highly skilled data scientists—all factor into decision making.

Generative AI also factors into the picture. As firms either build or buy platforms using AI, a single connected pipeline of accurate, unified data (a single source of truth) can make it easier for metadata to route user questions to the right data and return accurate answers. In the shift from scripted prompts to free-form prompts, a streamlined data pipeline is foundational.

Case Study

The following example illustrates the real-world challenge of managing data processes.

The data-management team at a large APAC sovereign wealth fund sought to understand the risk exposure of its portfolio through ESG data from several vendors.

However, since the market identifiers and levels of quality varied across the sources, the content couldn’t be easily connected. The fund spent a considerable amount of time developing its own custom logic in an attempt to connect the sources before offloading this labor-intensive task.

Brought in to assist with the project, we evaluated each record in the file against the relevant client-specific rule and completed the end-to-end mapping and concordance. The mappings were then transformed into a standardized format that the firm’s data-management team can consume efficiently.

Our entity and security master, which was already available through the firm’s enterprise data-management system (GoldenSource, also a FactSet partner), was another essential piece of the data connectivity process.
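To make the idea of a concordance more concrete, here is a minimal sketch of how records keyed on different identifiers might be mapped to a single master entity ID. It is an assumption-laden illustration and not the method used in the case study: the `build_concordance` and `map_vendor_records` helpers, the `entity_id` field, and all identifier values are hypothetical placeholders.

```python
def build_concordance(master_rows):
    """Index a (hypothetical) entity master by every identifier it carries."""
    index = {}
    for row in master_rows:
        for id_type, id_value in row["identifiers"].items():
            # Normalize case so lookups do not fail on formatting differences.
            index[(id_type, id_value.upper())] = row["entity_id"]
    return index


def map_vendor_records(vendor_records, index):
    """Attach a master entity_id to each vendor record; collect misses."""
    mapped, unmapped = [], []
    for rec in vendor_records:
        key = (rec["id_type"], rec["id_value"].strip().upper())
        entity_id = index.get(key)
        if entity_id is None:
            unmapped.append(rec)  # left for manual or rule-based review
        else:
            mapped.append({**rec, "entity_id": entity_id})
    return mapped, unmapped


# Placeholder master and vendor data, made up for illustration only.
master = [
    {"entity_id": "E001", "identifiers": {"ISIN": "XX0000000001", "TICKER": "abc"}},
    {"entity_id": "E002", "identifiers": {"ISIN": "XX0000000002"}},
]
vendor_records = [
    {"id_type": "ISIN", "id_value": "xx0000000001 ", "esg_score": 71.5},
    {"id_type": "TICKER", "id_value": "ABC", "esg_score": 68.0},
    {"id_type": "ISIN", "id_value": "XX0000000009", "esg_score": 40.0},
]

index = build_concordance(master)
mapped, unmapped = map_vendor_records(vendor_records, index)
print(len(mapped), "mapped;", len(unmapped), "unmapped")
```

In this toy version, two records that arrive under different identifier types resolve to the same entity, which is the property that lets portfolio and ESG data be combined. Real concordance work also has to handle identifier changes over time and conflicting vendor data, which this sketch ignores.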

Overall, this approach has saved time and provided the sovereign wealth fund with a regularly scheduled mapping file that serves as the single source of truth to efficiently combine its portfolio and ESG data.

FactSet Data as a Service 

If you’re considering a data-management solution for your firm, visit FactSet Data as a Service to learn how we can help you. Our solution combines the best of our technology and data-management capabilities to facilitate and support numerous workflows, trusted entity and security mastering, and seamless vendor consolidation. 

 

This blog post is for informational purposes only. The information contained in this blog post is not legal, tax, or investment advice. FactSet does not endorse or recommend any investments and assumes no liability for any consequence relating directly or indirectly to any action or inaction taken based on the information contained in this article.
