
Enterprise Architecture Centralization – Reality or Pipe Dream?

Data Science and AI

By Akash Nakarja  |  March 17, 2020

As we enter a new decade in financial services, we face an inevitable shift: thanks to a proliferation of digital capabilities, data has become critically important to the investment process. However, data alone does not suffice; firms need to derive business intelligence, or rather actionable insight, to gain a competitive advantage against a backdrop of squeezed margins and regulatory pressures. To assist with this challenge, data scientists and a new set of tools have emerged.

Consider the IT landscape. Today we are seeing less emphasis on the demand for esoteric business workflow solutions and more on using data mining techniques, including artificial intelligence (AI), machine learning (ML), and deep learning (DL), to derive actionable insights. This represents the new alpha-generating frontier. With today’s computational power, machines can process masses of data that would otherwise take humans years to arrange and correlate. This new technology, paired with a firm’s willingness to share siloed information across varied systems and stakeholders, will dictate how well the firm can reduce operational overheads and foster innovation.

The Importance of an Enterprise Architecture

Defining an enterprise architecture (EA) that enables communication between diverse systems and stakeholders is critical to the successful modern investment firm. Financial services firms continue to grow in sophistication, prompted by geographical expansion, diversification of investment strategies, required technology innovation, and regulatory obligations. This sophistication forces the enterprise architecture to evolve, which inherently creates a complex web of legacy and new third-party/proprietary platforms that require seamless interoperability. To achieve that interoperability, new communication techniques such as APIs and common data models are introduced.

It should be noted that defining a strategy around EA can be cumbersome and costly, and it typically requires significant buy-in from multiple business units and executives. However, enterprise architecture is not an IT exercise alone; it must also consider business processes, application systems, data modeling, data delivery, underlying technology, and all-important budgets. Several acknowledged enterprise architecture frameworks exist that institutions have employed (e.g., TOGAF, Zachman, FEAF, and Gartner), while other firms have devised their own strategies and models. By embracing a specific framework or other structures, organizations inherently give rise to centralized modeling concepts such as the canonical data model (CDM). CDMs can be used in conjunction with business process management (BPM) tools, service-oriented architectures (SOA), and enterprise service buses (ESB).

Together with CDMs, ESBs can help firms move away from legacy point-to-point integration patterns in which every system talks to every other system in a redundant and inefficient manner. The ESB ensures that applications need to speak only once to the bus; it is then the bus’s responsibility to deliver data to all recipient applications and platforms. The receiving process might do a simple calculation and re-insert the result onto the bus, display some data to a user, or even perform a complex extract, transform, load (ETL) process. Where there is an ESB, there is usually also a CDM, which abstracts custom or proprietary data formats to a canonical or common format that can be delivered through the ESB.
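
To make the send-once pattern concrete, here is a minimal sketch of a topic-based bus in Python. The bus class, topic names, and payload fields are illustrative assumptions, not a reference to any particular product; a real ESB adds persistence, delivery guarantees, security, and monitoring.

```python
from collections import defaultdict
from typing import Callable, Dict, List

class MiniBus:
    """Toy service bus: publishers send once, the bus fans the message out to every subscriber."""
    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # The publisher speaks once; delivery to all recipients is the bus's responsibility.
        for handler in self._subscribers[topic]:
            handler(message)

bus = MiniBus()

# A downstream system that simply displays the data.
bus.subscribe("trades.canonical", lambda msg: print("Blotter received:", msg))

# A receiving process that performs a simple calculation and re-inserts the result onto the bus.
def enrich_with_notional(msg: dict) -> None:
    bus.publish("trades.enriched", {**msg, "notional": msg["quantity"] * msg["price"]})

bus.subscribe("trades.canonical", enrich_with_notional)
bus.subscribe("trades.enriched", lambda msg: print("Risk system received:", msg))

# The publishing system sends once; the blotter, the enrichment step, and the risk system all receive.
bus.publish("trades.canonical", {"trade_id": "T1", "symbol": "ABC", "side": "BUY",
                                 "quantity": 100, "price": 25.0})
```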

ESBs give rise to many sought-after operational benefits, including centralized monitoring and alerting, abstraction from any specific interface, audit logging, and more. The use of interoperable data buses and microservices also enables software vendors who are embracing these concepts to move their application services to cloud technology. The topic of centralization itself, and the use of a CDM in particular, is much debated, so let’s explore this a bit further.

Creating Seamless Interoperability

Before we dive into the topic of centralization, some clarity on the definition of interoperability could be helpful since interoperability can occur on multiple levels. Broadly, we want to distinguish between server-to-server and desktop-based communications. With enterprise integration, we typically think about server-to-server based communications, which is where the back-end databases of the various systems must harmonize with one another. Often this communication will involve batch and event-driven processing and large datasets as the underlying platforms may serve hundreds if not thousands of end-users. Therefore, latency, bandwidth, processing power, and processing time are factors to consider.

A CDM is used to create seamless interoperability across an enterprise by standardizing the communication between connected systems through a common (canonical) language. Thus, if system A wants to send data to system B and vice-versa, A and B do not need to be concerned with each other’s “language.” System A converts its data to the canonical form, and system B receives data in the canonical form for conversion into its own format/schema. This way, multiple systems can disseminate their data by only having to convert back and forth between their own language and the common or canonical language. There are many advantages to this paradigm, including the following (a short sketch of the translation pattern appears after the list):

  • Ease of integration and interoperability
  • Reduction in dependencies between two or more systems
  • Ease of displacing or changing legacy systems without disrupting the data model or existing integration points
  • Cleaner workflow understanding across technical and non-technical stakeholders
  • Increased scalability versus point-to-point integration
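
As a rough illustration of the translation pattern, the sketch below assumes a hypothetical canonical trade record and two systems with their own native formats. Each system implements only one pair of adapters (to and from canonical), so adding an Nth system requires one new adapter pair rather than N-1 new point-to-point mappings. The field names are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class CanonicalTrade:
    """Hypothetical canonical ('common language') trade shared across the enterprise."""
    trade_id: str
    instrument: str
    side: str        # "BUY" or "SELL"
    quantity: float
    price: float

# System A (say, an OMS) speaks its own dialect and only knows how to map to canonical.
def oms_to_canonical(oms_row: dict) -> CanonicalTrade:
    return CanonicalTrade(
        trade_id=oms_row["ordId"],
        instrument=oms_row["ticker"],
        side="BUY" if oms_row["bs"] == "B" else "SELL",
        quantity=float(oms_row["qty"]),
        price=float(oms_row["px"]),
    )

# System B (say, an accounting platform) likewise translates only between canonical and itself.
def canonical_to_accounting(trade: CanonicalTrade) -> dict:
    return {
        "TRADE_REF": trade.trade_id,
        "SECURITY": trade.instrument,
        "DIRECTION": trade.side.title(),
        "UNITS": trade.quantity,
        "UNIT_COST": trade.price,
    }

# A publishes in its own format; B receives in its own format; neither knows the other's schema.
canonical = oms_to_canonical({"ordId": "T1", "ticker": "ABC", "bs": "B", "qty": 100, "px": 25.0})
print(canonical_to_accounting(canonical))
```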

The graphic below depicts some archetypal systems that today’s investment firms have embedded in their enterprises, each requiring seamless interaction with adjacent platforms. As outlined, each system has its own “language” (data format) and thus its own communication protocol (think SFTP, MQ, FIX, TCP, etc.) for disseminating and processing data. First, the ESB ensures efficient data delivery: each system sends once to the bus, which delivers to multiple recipients concurrently. Second, the adoption of a CDM fosters simplified data translation, consistency, and scalability. No system needs to worry about converting between its “language” and all others across the enterprise, even when new systems enter the ecosystem. For example, the CRM in this case only needs to manage its translation back and forth between XML and the chosen canonical format.

ESB and CDM in Practice

Sometimes the ESB will manage the ETL processing across interconnected systems depending on how enterprise integration has been implemented. Overall, this allows for system displacement and technology change/innovation without causing wholesale disruption to the workflows implemented across the ESB.
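
Continuing the CRM example, a bus-hosted ETL step that lifts a native XML payload into the canonical form might look something like the sketch below. The XML layout and canonical field names are assumptions made for illustration only.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical payload in the CRM's native XML "language".
CRM_XML = """<contactUpdate>
  <accountId>ACC-42</accountId>
  <name>Jane Doe</name>
  <email>jane.doe@example.com</email>
</contactUpdate>"""

def crm_xml_to_canonical(xml_payload: str) -> dict:
    """ETL step applied on (or before) the bus, so subscribers never see the CRM's XML."""
    root = ET.fromstring(xml_payload)
    return {
        "entity": "contact",
        "account_id": root.findtext("accountId"),
        "full_name": root.findtext("name"),
        "email": root.findtext("email"),
    }

# Downstream systems consume only the canonical form (rendered here as JSON).
print(json.dumps(crm_xml_to_canonical(CRM_XML), indent=2))
```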

Challenges to Consider

An easy trap that companies often fall into is using a specific business-oriented central system to define their CDM, such as an accounting, EDM, back-office, or data warehouse platform. Tying the model to a specific application can have downstream repercussions as standards and versions become obsolete or business changes drive replacement of that application. A better approach might be to abstract the model altogether from any one system or platform. A valid counterpoint, however, is that CDMs can sometimes be more theoretical than practical, and attempting to define a “one-size-fits-all” data model for an entire enterprise, as well as its business units and systems, is a flawed approach.

Furthermore, the potentially seismic effort to implement a CDM at large scale, including the required business and management buy-in, can often appear too great a hurdle when weighed against the perceived value. The reticence of business teams, who are likely to see less immediate transformation, also adds strength to the naysayers’ argument. There is also an internal debate in many corporations about whether such initiatives lie within the remit and control of business or IT budgets. Other questions arise as well, namely how to deal with the risk of exceptions from the model when the needs of a specific business process are not being addressed, or when the cost and effort of conforming to the canonical standard outweigh the business benefit to be reaped.

One approach to this problem is domain-driven design (DDD) with bounded contexts, as described by Eric Evans. By limiting the scope, models are confined to the context in which they reside, and CDM principles can still apply where contexts share overlapping common attributes. It is a fallacy to think that a canonical format can elegantly model all aspects of all business areas of a complex organization; this leads to a very distorted view of certain data objects. As a simple example, consider how a buy or sell transaction might vary in definition between asset classes. There is a reason that FIX has more user-defined “custom fields” than there are standard fields.
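
As a sketch of how bounded contexts might look in practice, consider a “transaction” modeled differently by an equities desk and a fixed-income desk while still sharing a small canonical core. The class and field names below are purely illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class CoreTransaction:
    """Attributes shared across bounded contexts: the overlap where CDM principles still apply."""
    transaction_id: str
    instrument_id: str
    side: str          # "BUY" or "SELL"
    quantity: float
    trade_date: date

@dataclass
class EquityTransaction(CoreTransaction):
    """Equities context: priced per share on an exchange."""
    price_per_share: float
    exchange: str

@dataclass
class BondTransaction(CoreTransaction):
    """Fixed-income context: quoted as a percentage of par, with accrued interest and settlement."""
    clean_price_pct_of_par: float
    accrued_interest: float
    settlement_date: date

# Each context keeps its own nuances; only CoreTransaction needs enterprise-wide agreement.
eq = EquityTransaction("T1", "ABC", "BUY", 100, date(2020, 3, 17), 25.0, "XNYS")
bd = BondTransaction("T2", "UST-10Y", "SELL", 1_000_000, date(2020, 3, 17), 98.5, 1234.56,
                     date(2020, 3, 19))
print(eq, bd, sep="\n")
```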

Deciding on formats and standards at the highest level can also be a challenge. There are benefits to using standards such as FIX and XML that are widely used in the industry, although the incorporation of FpML and JSON offers viable alternatives or bolt-on options. Just think about the adoption of ISO 20022 in financial services, the impact this may be having on any centralized CDMs, and why there is a valid argument for business-centric models. It is often argued that the opposite of the top-down approach would be more conducive to standardization: let business units adopt common attributes from an overarching, less restrictive CDM while maintaining their specific nuances for their own use cases, somewhat of a bottom-up hybrid approach.
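
To ground the point about industry standards, the sketch below maps a handful of standard FIX tags from a tag=value message into a hypothetical canonical order rendered as JSON. The canonical field names are assumptions, and real messages carry many more tags, including the custom fields mentioned above.

```python
import json

# Standard FIX tags used here: 8=BeginString, 35=MsgType, 49=SenderCompID, 56=TargetCompID,
# 55=Symbol, 54=Side (1=Buy, 2=Sell), 38=OrderQty, 44=Price.
# The real field delimiter is the SOH character (\x01); "|" is used below for readability.
RAW_FIX = "8=FIX.4.4|35=D|49=BUYSIDE|56=BROKER|55=ABC|54=1|38=100|44=25.00"

def fix_to_canonical(raw: str, sep: str = "|") -> dict:
    tags = dict(field.split("=", 1) for field in raw.split(sep) if field)
    return {
        "message_type": "NewOrderSingle" if tags.get("35") == "D" else tags.get("35"),
        "sender": tags.get("49"),
        "receiver": tags.get("56"),
        "instrument": tags.get("55"),
        "side": {"1": "BUY", "2": "SELL"}.get(tags.get("54")),
        "quantity": float(tags["38"]),
        "limit_price": float(tags["44"]),
    }

print(json.dumps(fix_to_canonical(RAW_FIX), indent=2))
```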

Conclusion

Seldom do we see overly centralized data models or architectures. The reality is more a blend of practices across investment desks, divisions, or geographies, where corporate data modeling principles play their role in shaping the business-specific definitions. This helps maintain a balance in enterprise architecture design between horizontal policies and vertical needs. Firms must continually revisit this challenge, as well-concorded, highly available data with appropriate business governance continues to be the fuel for investment outperformance and regulatory compliance.

Maintaining a seamless flow of quality data across an enterprise, feeding data-hungry machines, and defining critical business-intelligence processes will be a key differentiator for successful financial services companies. Data is the lifeblood of the modern enterprise—structured data, unstructured data, fundamental data, reference data, pricing data, alternative data, derived analytical data, and operational data. The deliberate definition and lineage tracking of how all this data is identified, collected, cleansed, stored, transformed, and distributed is paramount. An enterprise architecture needs to ensure that the pipeline from source data to authorized data is practically designed for scale and agility.
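
As one narrow illustration of lineage tracking, a pipeline might record, for each step, where the data came from, what was produced, and when. The structure below is a simplified assumption rather than a reference implementation; in practice these records would land in a governed metadata store.

```python
from datetime import datetime, timezone

LINEAGE_LOG = []  # stand-in for a governed metadata/lineage store

def run_step(step_name: str, source: str, transform, records: list) -> list:
    """Apply a transformation and record the lineage of its output."""
    output = [transform(r) for r in records]
    LINEAGE_LOG.append({
        "step": step_name,
        "source": source,
        "records_in": len(records),
        "records_out": len(output),
        "run_at": datetime.now(timezone.utc).isoformat(),
    })
    return output

def cleanse(record: dict) -> dict:
    return {"instrument_id": record["id"], "price": float(record["px"])}

raw_prices = [{"id": "ABC", "px": "25.0"}, {"id": "XYZ", "px": "101.5"}]
cleansed = run_step("cleanse_prices", source="vendor_feed_raw", transform=cleanse, records=raw_prices)
print(LINEAGE_LOG)
```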

The modern enterprise architecture is one of the key foundations of a digital transformation. Firms that are looking at their enterprise reporting, data governance, and investment performance will find that the choices made in their enterprise architecture strategy will set the stage for future innovation and operational efficiency.


Akash Nakarja

VP, Director, Client Solutions, Sales Engineering & Technology Services


The information contained in this article is not investment advice. FactSet does not endorse or recommend any investments and assumes no liability for any consequence relating directly or indirectly to any action or inaction taken based on the information contained in this article.