Co-authors of this article are Tom Abrams, Associate Director for Deep Sector Content at FactSet, and Nina Piper, Vice President, Director of Strategy and Product Development for the Cognitive Computing group at FactSet.
The financial community has been using AI for over a decade with varying degrees of success, powered by a string of important technological developments. For many in the market, however, AI is still a mystery, with new vocabulary like transformer architecture, bots, embeddings, large language models (LLMs), generative AI, prompt engineering, and vector databases. While some firms are diving into large language models to gain insights or efficiencies, others may follow more slowly until security and accuracy concerns are addressed.
In the following remarks, we disaggregate the new AI ecosystem into those with the AI models or bots and those with the data used in conjunction with models, either for training or to define an acceptable scope of results.
Leaders in the modeling world should have an advantage in processing with component software. Those controlling data should have an advantage as far as model inputs. Partnerships, perhaps, will leverage both strengths. While those with scale will likely be positioned well, smaller firms with specific data or a specific application may thrive without having to develop their own LLM processing capabilities.
Investment professionals need good ideas and timely, accurate explanations of what has been and is happening. Investment decision-making is a low signal-to-noise business, with vast amounts of numeric and textual data but only a few actionable items among them, compared to many applications of AI that have a higher signal-to-noise ratio (image recognition, automated vehicles, speech recognition).
Filtering out the noise without filtering out the nuance is the challenge in investment decision-making. Many experts believe AI can or will soon help with certain functions in the investment process, but also that AI may have perpetual difficulty in making unfettered investment decisions.
As AI adoption spreads, it will be useful to understand what kind of functions the technology could more easily replace rather than believing in the blanket statement that all jobs will be lost. We see validity in the concept that the technology will be a co-pilot rather than a pilot in many instances.
Software coders, for example, would still code, but the initial components might be AI-generated building blocks. Also, repeatable tasks related to text data—like extracting information from websites, news, and financial documents—could be expedited with AI models.
In fact, FactSet has been doing this for several years to gain efficiencies in content collection. The next wave of tools powered by Generative AI will include co-pilots that help draft reports or build models. Writers, researchers, and students could also leverage these tools.
Assessing and managing investment volatility and risks are not given as much emphasis in the media as pure performance, but risk management is a key component of the investment process. AI can already give some indications by analyzing historical data and relationships, tracking market trends for statistically similar past situations, and pointing to portfolio characteristics that could be problematic.
Other possible investment management uses are regulatory understanding and compliance via processing legal documents, case histories in legal proceedings, and automated fund reporting as well as optimized customer relationships. So far, the ability to ask the right questions of the technology seems critical, but a machine helping with research into a broader set of documents or data should be leverageable. FactSet, for example, offers entity detection, key-topic identification, and the ability to score positive and negative sentiment in transcripts.
It has been said that data is the new oil. This data can be text or numeric, though AI models to date have had better luck with text. The Internet has ironically been a 30-year effort at putting all information on the web for LLMs to access today.
Much of that information has been offered free or at prices that don’t reflect derivative value such as the training material for LLMs. Models built on free information may face cost issues in the future: copyright infringement claims and payments for derivative use may arise, and private information contracts and laws will be tightened up in coming years. Restrictions on the use of data could turn today’s low-cost LLM queries into a very different economic proposition.
In addition, just having data, particularly publicly available data, may no longer be sufficient for data providers. Investment managers want the stories around that data, and AI can help with that. Commodity data providers will have to gain advantage from concordance, access/discoverability, delivery methods, comprehensiveness, and organization.
Considering both data and models together, rather than a winner-take-all approach, there may be bots developed for specific applications such as financial analysis, trading, health care, HR, construction, or education. Each will be trained using different types of data and, in many cases, the data will be proprietary to certain models. Having good data will help with the result quality. However, to stay up-to-date, models will need periodic retraining or will need to be used in conjunction with high-quality, current information.
Math capability. Today’s large language models are not well-equipped to make predictions or offer financial guidance. One reason is that generative AI is great at understanding human words, but not good at math. The investment decision-making process is math intensive, relying on financial statements, ratios, volatility, normalizations (e.g., there are multiple ways to calculate averages), and first-, second-, and third-order rates of change.
To date, AI can return numeric values but cannot interpret or manipulate them in sophisticated ways. Quant managers have been fine tuning statistical performance and risk models for many years to find data anomalies. AI systems need to be able to process large amounts of numeric data to identify trends, make predictions, and respond to market events. It is not clear yet if generative AI solutions can add value to these activities.
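To make the point about normalizations and rates of change concrete, here is a minimal Python sketch. The revenue figures and the helper name `differences` are invented for illustration and do not come from any real dataset or FactSet product. It shows that three common "averages" of the same series disagree, and how first, second, and third differences are derived from one another.

```python
from statistics import mean, median, geometric_mean


def differences(values, order=1):
    """Return the order-th successive differences of a sequence."""
    result = list(values)
    for _ in range(order):
        result = [b - a for a, b in zip(result, result[1:])]
    return result


# Hypothetical quarterly revenue figures, invented for illustration only.
revenue = [100.0, 110.0, 125.0, 150.0, 190.0]

# Three "averages" of the same series give three different answers.
averages = {
    "arithmetic": mean(revenue),
    "median": median(revenue),
    "geometric": geometric_mean(revenue),
}

# First differences are the period-over-period change; second differences
# are the change in that change; third differences repeat the step again.
first = differences(revenue, 1)
second = differences(revenue, 2)
third = differences(revenue, 3)

print(averages)
print(first, second, third)
```

Which average or which order of change is "right" depends on the question being asked, which is part of why unsupervised numeric interpretation is hard.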
Multiple model coordination. Multiple AI models operating in parallel could conceptually accelerate and increase the number of datasets considered in the testing of data and the updating of relationships between data points. Textual data could also be brought into the mix.
That is, one bot could work on one dataset while another bot works on a second textual one while a third considers numeric tables. The output of each could be brought together by a fourth bot to analyze and compile the results.
In the future, this may reduce the need to coordinate vast numbers of datasets before processing, but it will require significant software coordination, for example, to handle exceptions and dataset changes.
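The coordination pattern described above can be sketched in a few lines of Python. Everything here is a hypothetical stand-in: the "bots" are plain functions rather than AI models, and the dataset names, the search term, and the sample inputs are assumptions made for illustration. The sketch runs the workers in parallel, lets a final step compile their output, and records a failed worker instead of stopping the whole run.

```python
from concurrent.futures import ThreadPoolExecutor


def count_mentions(docs, term="guidance"):
    """Stand-in text bot: count documents that mention a term."""
    return sum(term in d.lower() for d in docs)


def largest_move(series):
    """Stand-in numeric bot: largest absolute period-to-period change."""
    return max(abs(b - a) for a, b in zip(series, series[1:]))


def compile_results(findings, failed):
    """Stand-in fourth bot: merge worker output and list any failures."""
    return {"findings": findings, "failed": sorted(failed)}


def run_pipeline(jobs):
    findings, failed = {}, []
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        # Submit every job first so the workers run concurrently.
        futures = {name: pool.submit(bot, data) for name, (bot, data) in jobs.items()}
        for name, future in futures.items():
            try:
                findings[name] = future.result()
            except Exception:
                # One bad dataset should not halt the other workers.
                failed.append(name)
    return compile_results(findings, failed)


# Hypothetical inputs, invented for illustration only.
jobs = {
    "filings": (count_mentions, ["Management raised guidance.", "No change."]),
    "news": (count_mentions, ["Guidance cut at rival.", "Guidance reiterated."]),
    "prices": (largest_move, [10.0, 10.5, 9.0]),
    "empty_table": (largest_move, []),  # max() of nothing raises ValueError
}
summary = run_pipeline(jobs)
print(summary)
```

In a real system the exception handling and the compile step would be far more involved, which is the software coordination cost noted above.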
Data biases. Users will have to consider any biases in data collected, sometimes unintentional, such that the AI model biases the output as well. Even the timeframe selected can be unintentionally impactful if it is shorter than a specific cycle or includes a few dramatic historical incidents that throw off “normal” precedent relationships.
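A small Python sketch illustrates the timeframe effect. The indicator, its 8-period cycle, and the size of the shock are assumptions invented for this example. An average taken over a full cycle, over half a cycle, and over a full cycle containing one dramatic incident gives three different pictures of "normal."

```python
from statistics import mean

# Hypothetical indicator with an 8-period cycle, repeated three times.
cycle = [0, 1, 2, 1, 0, -1, -2, -1]
series = cycle * 3

# A window spanning one full cycle averages out the swings.
full_cycle_avg = mean(series[-8:])

# A window spanning only the second half of a cycle looks persistently negative.
half_cycle_avg = mean(series[-4:])

# The same full-cycle window with one dramatic incident in the last period.
shocked = series[:-1] + [-20]
shocked_avg = mean(shocked[-8:])

print(full_cycle_avg, half_cycle_avg, shocked_avg)
```

A model trained on either of the last two windows would inherit a skewed baseline without any error in the underlying data.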
Real time. Investment management is also frequently a real-time exercise, with data constantly changing on the margin and in somewhat random and uncorrelated sequence. AI models, having been optimized on past information, will constantly have to be aware of new information that may not fit the learned patterns in the corpus of information on which the model trained.
Investment and trading algorithms will need to efficiently understand multi-market complexity and liquidity on a real-time basis. Computing power will have to greatly increase to process real-time models, and with complexity comes more chances for garbage in, garbage out.
Is it the trend or the change that matters? Some managers are trend followers and will continue with a position until that trend changes. Others look to get in and out ahead of change. In the first instance, the investment nugget may be tracking the normal pattern, which AI will be better at initially. In the second, AI may be weaker in sensing change, exceptions, and the nuances of new data.
There has been a trend toward automation for many years, and LLMs are another tool in that evolution. There will still be a need for expertise. Processes that include LLMs need to be designed with enough subject matter expertise to ask the “right” questions of the system.
For example, when evaluating documents, frequency of mention may not necessarily be as important as the outlier comment that portends a change in direction. In addition, when considering document or article summaries, a lot of news is not newsworthy or is a repeat of a story on the wires.
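The difference between frequency of mention and the outlier comment can be sketched as follows. The watch list, the sample sentences, and the function names are hypothetical and chosen only for illustration. The sketch counts how many documents mention each watched term, then separately surfaces terms that appear in only one document.

```python
from collections import Counter

# Hypothetical watch list, invented for illustration only.
WATCH_TERMS = {"demand", "pricing", "impairment", "restructuring"}


def term_document_counts(documents, terms):
    """Count in how many documents each watched term appears."""
    counts = Counter()
    for doc in documents:
        words = {w.strip(".,;:").lower() for w in doc.split()}
        counts.update(terms & words)
    return counts


def outlier_terms(counts, max_docs=1):
    """Return watched terms seen in at most max_docs documents."""
    return sorted(t for t, n in counts.items() if n <= max_docs)


# Hypothetical document snippets, invented for illustration only.
documents = [
    "Demand was steady and pricing held.",
    "Pricing improved on firm demand.",
    "Demand remains solid; pricing is stable.",
    "Demand is fine, but we may record an impairment next quarter.",
]

counts = term_document_counts(documents, WATCH_TERMS)
most_common = counts.most_common(1)
rare = outlier_terms(counts)
print(most_common, rare)
```

A frequency ranking puts the routine topic on top, while the single mention that may portend a change only appears when rare terms are examined deliberately. Deciding which terms belong on the watch list is where subject matter expertise comes in.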
Generative AI processes will need continuous feedback, and operators need a thorough understanding of “right” and “wrong” answers. Because it is time consuming and expensive to retrain today’s LLMs, this will mean a focus on model prompts and the addition of human-in-the-loop processes, rather than updates to the models themselves.
Manager fiduciary duty, in part, means understanding the risks and biases in one’s process and being able to explain them to clients and sponsors. Black boxes that spit out investment guidance may help with performance, but performance alone may not be sufficient for asset managers.
As a result, processes built for financial professionals (including generative AI) will require fully defined guardrails, compliance alerts, methods to determine what is fact vs. derived data, source citations, and audit trails.
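One way to picture these requirements is a small Python sketch. It is an illustration under stated assumptions, not a description of any FactSet system: the class names `Answer` and `AuditTrail`, the `guardrail` rule (reject any answer without a source citation), and the sample strings are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Answer:
    text: str
    is_derived: bool      # False = reported fact, True = computed or generated
    sources: tuple = ()   # citations backing the answer


@dataclass
class AuditTrail:
    entries: list = field(default_factory=list)

    def record(self, prompt, answer, accepted, reason):
        """Append one timestamped entry per prompt/answer decision."""
        self.entries.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "prompt": prompt,
            "answer": answer.text,
            "derived": answer.is_derived,
            "sources": list(answer.sources),
            "accepted": accepted,
            "reason": reason,
        })


def guardrail(prompt, answer, trail):
    """Release an answer only if it cites a source; log the decision either way."""
    accepted = len(answer.sources) > 0
    reason = "ok" if accepted else "no source citation"
    trail.record(prompt, answer, accepted, reason)
    return accepted


# Hypothetical answers, invented for illustration only.
trail = AuditTrail()
cited = Answer("Revenue rose versus the prior quarter.", True, ("hypothetical filing",))
uncited = Answer("Margins will expand next year.", True)
released = [guardrail("Summarize the quarter", a, trail) for a in (cited, uncited)]
print(released, len(trail.entries))
```

The design choice worth noting is that rejected answers are logged as well as accepted ones, so the trail explains what the process refused to say and why.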
Caveat emptor on the use of AI today. For competitive reasons, many technologists want to charge unfettered and full speed ahead with generative AI. However, with recent stories of AI providers being sued over wrongful accusations, hallucinations, and simply wrong output, AI is under pressure in some quarters, including from regulatory authorities around the world.
Regulations will eventually arise to minimize the impact of nuisances like rogue algorithms or data introduced into bots to skew output. Other challenges such as where data resides, who has rights to data, and how to receive full value for data provided could each result in expensive security protocols.
Another issue is that many data providers will both want compensation for new uses of their data and security when sharing proprietary information with external models. Secure sharing should become common, however, in the same way that sensitive corporate information has been shared with enterprise software providers for years with controlled risks.
Putting these thoughts together, what might describe AI winners in the future? Strengths will be what is developed on a proprietary basis and not available to others. Moats will be having the right partners with you on the modeling or data side. We also see a need for subject matter experts who understand client processes, can develop useful prompts from the data, can organize (concord, link entities, translate) data, and can stand up the product for clients.
Many quantitative and machine learning models serve larger cap developed markets, so there will be opportunities for AI to reach currently underserved markets such as smaller, private, or global markets. For firms, a long history of proprietary internal research documents could be an interesting strength as input to a model.
We may also see a trend toward multiple leading providers in multiple sectors, including investment management. Fintech offerings will differ between providers based on the number and types (numeric, text) of datasets as well as how those datasets are processed.
Generative AI is yet another step in computing power that will sweep across all industries in time. Some functions lend themselves to AI more than others, and some may never see complete answers for some aspects of the investment-management business.
The business needs of investment management will require auditability and transparency for many key functions; will have to incorporate numeric datasets; will likely still require expertise to work with the technology; and will have to work within the bounds of regulations likely to evolve over the next few years, both in the US and abroad.
Tom Abrams is the Associate Director for deep sector content at FactSet. In this role, he is responsible for integrating additional energy data onto the FactSet workstation, including drilling, production, cost, regulatory, and price information. Previously, he spent over 30 years working at sell- and buy-side firms, most recently as the sell-side midstream analyst at Morgan Stanley. He also held positions at Columbia Management, Dreyfus, Credit Suisse First Boston, Oppenheimer, and Lord Abbett. Mr. Abrams earned an MBA from the Cornell Graduate School of Business and holds a BA in economics from Hamilton College. He is a CFA charterholder and holds certificates in ESG investing, sustainable investments, and real estate analysis.
Nina Piper is Vice President, Director of Strategy and Product Development for the Cognitive Computing group at FactSet. In this role, she is responsible for connecting business leaders across the organization with Machine Learning engineers to develop inventive solutions that create internal efficiencies and improve client workflows. Nina has worked in the financial-software industry for over twenty years and is an expert in financial news and document search functionality. She has a B.A. and M.A. in American History from the University of Texas.
This blog post is for informational purposes only. The information contained in this blog post is not legal, tax, or investment advice. FactSet does not endorse or recommend any investments and assumes no liability for any consequence relating directly or indirectly to any action or inaction taken based on the information contained in this article.