There is more unstructured data in existence today than ever before. According to Forbes, 90% of the world’s data was generated in the last two years alone, and more data means more brainpower is needed to process it. The human brain can only take in so much, but technology lets us extend that capacity by using computers as cognitive processing tools. Today, financial professionals are looking to artificial intelligence (AI) solutions to help them spend less time on data discovery and more time acting on the insights from data. One of the tools in the AI arsenal is named entity recognition (NER). NER is a natural language processing (NLP) technique, built on machine learning, that helps create structure from unstructured text documents by finding and extracting the entities within them.
NLP is a subfield of AI that seeks to help machines understand human language. Natural language has been studied for centuries, and while humans are great at using language to express ourselves and our intentions, we are not so great at formally understanding and describing the rules that govern it. This makes training a machine to interpret human language a difficult challenge, but the insights NLP can unlock from unstructured text make the challenge worthwhile. Luckily, you don’t have to build your own NER model for financial and corporate documents.
Sample Functionality of NER
Let’s start with a sample document/text:
“UBS hopes the flexibility will boost its attractiveness as an employer in the banking sector. It has not yet set a date for employees’ return to the office. Only UBS workers in roles that require them to be in the office, such as those in supervisory positions, or in trading and branch roles, will have less flexibility, the bank said. However, an internal analysis of the 72,000 UBS employees globally showed that around two-thirds are in roles that would allow them to combine working remotely and in the office. The Swiss bank’s approach stands in contrast to some of the major Wall Street banks. Goldman Sachs, for example, asked its employees in the U.S. and U.K. to come back into the office this month. JPMorgan Chase also told its U.S. workers that it was aiming to get half of its employees rotating through the office by July. JPMorgan CEO Jamie Dimon has said he believes that by ‘sometime in September, October it will look just like it did before.’ Morgan Stanley CEO James Gorman has also been outspoken on the matter. ‘If you can go into a restaurant in New York City, you can come into the office and we want you in the office,’ Gorman reportedly said.” (Source: CNBC)
Different NER services can extract different types of entities. The example below reflects an NER service that extracts companies, people, locations, dates, and numbers from text. In the sample output image below, items in orange are companies, items in blue are person names, items in red are locations, items in brown are dates, and items in green are numbers.
For financial professionals, the most useful NER services can both extract and map the entities to well-known identifiers, allowing users to connect companies, people, and places to any existing content set.
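To make "extract and map" concrete, here is a minimal sketch of the kind of structured output an NER step might produce for two sentences of the sample text. It does not use a trained model: a small hand-written lookup table stands in for one, and the labels and `DEMO-` identifiers are made up for illustration, not real identifiers from any content set.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entity:
    text: str                 # surface form as it appears in the document
    label: str                # entity type, e.g. COMPANY or LOCATION
    start: int                # character offset where the mention starts
    end: int                  # character offset just past the mention
    entity_id: Optional[str]  # linked identifier, if any (hypothetical here)


# Toy lookup table standing in for a trained NER model (assumption).
LOOKUP = {
    "Goldman Sachs": ("COMPANY", "DEMO-0001"),
    "JPMorgan Chase": ("COMPANY", "DEMO-0002"),
    "U.S.": ("LOCATION", "DEMO-0003"),
    "U.K.": ("LOCATION", "DEMO-0004"),
    "July": ("DATE", None),
}


def extract_entities(text):
    """Return every lookup-table mention in text, ordered by position."""
    found = []
    for name, (label, entity_id) in LOOKUP.items():
        for match in re.finditer(re.escape(name), text):
            found.append(
                Entity(match.group(), label, match.start(), match.end(), entity_id)
            )
    return sorted(found, key=lambda e: e.start)


sample = (
    "Goldman Sachs, for example, asked its employees in the U.S. and U.K. "
    "to come back into the office this month. JPMorgan Chase also told its "
    "U.S. workers that it was aiming to get half of its employees rotating "
    "through the office by July."
)

entities = extract_entities(sample)
for entity in entities:
    print(entity)
```

Each record carries the character offsets of the mention, so the entity can be highlighted in the original text, and an identifier that could be used to join against other data.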
Financial Use Cases of NER
Unstructured text contains abundant information; the challenge is to find what’s relevant. An estimated 80–90% of financial data is unstructured, and the ability to analyze and act on this data presents a huge untapped opportunity. Unlocking the potential of unstructured data begins with recognizing and tagging the entities within it. Let’s explore five main use cases for NER:
1. Extract Structure from Unstructured Text Data
- NER allows you to parse data from documents and increase the speed and scale of content collection by extracting data, such as numbers from earnings reports, and linking company names to other content databases. Companies that want to organize and analyze unstructured data can use NER to create a structured database. In the private markets and loans space, extracting data from large volumes of PDFs and websites is tedious, time-consuming, and prone to human error. Using NER to tag and classify the relevant data can accelerate the process of assessing profitability and credit risk. Media and news companies can also use NER to identify companies mentioned on their own websites and enrich those mentions with additional data.
- NER can also help you monitor social media trends by extracting entities from Twitter and Reddit forums. Social media has changed the way we interact and consume information. As we saw with the meme stock drama this year, individual investors on a social media platform are not to be dismissed. Reddit and Twitter can be a powerful force driving stock price movements, and NER can help extract the companies, people, and places mentioned on social media.
Whether you’re building a data lake or monitoring tweets for trending stocks, NER is a useful tool for extracting and tagging entities, allowing users to easily connect the extracted entities to other content sets and create structure from unstructured data.
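As a rough sketch of what "creating structure" can look like in practice, the snippet below flattens NER output into one row per document and entity, then serializes the rows as CSV so they could be loaded into a database or data lake. The `ner_output` dictionary is hand-written, hypothetical output rather than the result of a real NER call, and the column names are assumptions.

```python
import csv
import io

# Hypothetical NER output: document ID -> list of (entity text, entity type).
ner_output = {
    "doc-001": [("UBS", "COMPANY"), ("Goldman Sachs", "COMPANY")],
    "doc-002": [("JPMorgan Chase", "COMPANY"), ("Jamie Dimon", "PERSON")],
}

FIELDS = ["doc_id", "entity", "label"]


def to_rows(output):
    """Flatten per-document entity lists into one row per mention."""
    rows = []
    for doc_id, doc_entities in output.items():
        for text, label in doc_entities:
            rows.append({"doc_id": doc_id, "entity": text, "label": label})
    return rows


def to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = to_rows(ner_output)
csv_text = to_csv(rows)
print(csv_text)
```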
2. Enable Smarter Search
Personalization is key, and part of that personalization is enhancing a user’s search experience. NER can be used as an early step in developing efficient search algorithms. It can be run once on all documents to extract the entities each document mentions, and those entities can then be stored separately from the text. This effectively tags each document: the next time a user searches for a term, the term only needs to be matched against each document’s short list of entities rather than its full text, which leads to faster search execution.
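One common way to implement this kind of tag-based lookup is an inverted index that maps each entity to the documents mentioning it. The sketch below assumes the entities have already been extracted; the document IDs and entity lists are hypothetical, and matching is a simple case-insensitive exact match rather than anything a production search engine would use.

```python
from collections import defaultdict

# Hypothetical pre-extracted entities per document.
doc_entities = {
    "doc1": ["UBS", "Goldman Sachs"],
    "doc2": ["JPMorgan Chase", "Jamie Dimon"],
    "doc3": ["UBS", "Morgan Stanley"],
}


def build_entity_index(docs):
    """Map each lower-cased entity name to the set of documents mentioning it."""
    index = defaultdict(set)
    for doc_id, names in docs.items():
        for name in names:
            index[name.lower()].add(doc_id)
    return index


def search(index, query):
    """Return the sorted IDs of documents tagged with the queried entity."""
    return sorted(index.get(query.lower(), set()))


index = build_entity_index(doc_entities)
print(search(index, "ubs"))
print(search(index, "Tesla"))
```

The query touches only the index, never the document text, which is where the speedup described above comes from.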
3. Augment Research
Financial professionals find themselves inundated with more data than ever before, and NER can make online research more efficient. Investment ideas can come from anywhere, and NER can be used to identify all companies, people, drugs, and health conditions on a web page and link each one back to an identifier recognized by other financial tools, so users can pull additional information and find related companies. One example is exploring how an industry’s competitive landscape changes: running NER on transcripts spanning a long period can show which competitors are mentioned and how that set has shifted over time. NER can also be used to extract companies from internal research, streamlining the process of connecting ideas to content and data without an additional step to link commentary and notes to the relevant companies.
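The transcript example can be sketched as a simple set comparison between periods. The snippet assumes NER has already been run on each year's transcripts; the years and the placeholder company names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical NER output: (transcript year, companies mentioned).
transcript_mentions = [
    (2019, ["Company A", "Company B"]),
    (2020, ["Company A", "Company B", "Company C"]),
    (2021, ["Company A", "Company C", "Company D"]),
]


def companies_by_year(mentions):
    """Collect the set of companies mentioned in each year."""
    by_year = defaultdict(set)
    for year, companies in mentions:
        by_year[year].update(companies)
    return dict(by_year)


def landscape_changes(by_year):
    """For each year after the first, list who appeared and who disappeared."""
    changes = {}
    years = sorted(by_year)
    for previous, current in zip(years, years[1:]):
        changes[current] = {
            "entered": sorted(by_year[current] - by_year[previous]),
            "dropped": sorted(by_year[previous] - by_year[current]),
        }
    return changes


changes = landscape_changes(companies_by_year(transcript_mentions))
for year, change in changes.items():
    print(year, change)
```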
4. Power Recommendations
Another part of personalization involves recommendations. Today, recommendations often drive how we discover new content, products, and ideas. NER can aid recommendation algorithms by extracting the entities from each document and storing them in a relational database. Data science teams can then build tools that recommend other documents mentioning similar entities. This can power news, application, and workflow recommendations.
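A minimal sketch of this idea, using Python's built-in `sqlite3` module as the relational store: each (document, entity) pair is one row, and a self-join ranks other documents by how many entities they share with the one being read. The table name, column names, and sample documents are assumptions, and counting shared entities is only one simple similarity measure among many.

```python
import sqlite3

# Hypothetical pre-extracted entities per document.
doc_entities = {
    "a": ["UBS", "Goldman Sachs", "JPMorgan Chase"],
    "b": ["UBS", "Goldman Sachs"],
    "c": ["JPMorgan Chase"],
    "d": ["Tesla"],
}


def build_store(docs):
    """Load (doc_id, entity) pairs into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE doc_entity ("
        "doc_id TEXT, entity TEXT, PRIMARY KEY (doc_id, entity))"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO doc_entity VALUES (?, ?)",
        [(doc_id, name) for doc_id, names in docs.items() for name in names],
    )
    return conn


def recommend(conn, doc_id, limit=3):
    """Rank other documents by the number of entities shared with doc_id."""
    return conn.execute(
        """
        SELECT other.doc_id, COUNT(*) AS shared
        FROM doc_entity AS src
        JOIN doc_entity AS other
          ON other.entity = src.entity AND other.doc_id != src.doc_id
        WHERE src.doc_id = ?
        GROUP BY other.doc_id
        ORDER BY shared DESC, other.doc_id ASC
        LIMIT ?
        """,
        (doc_id, limit),
    ).fetchall()


conn = build_store(doc_entities)
print(recommend(conn, "a"))
```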
5. Leverage AI Support
Documenting and storing customer support data is critical for gathering feedback and improving products. NER can help by tagging the entities in each comment and using those tags to categorize it. Did Amazon get mentioned twice as much as Target? Is the mentioned company a tech company or a financial company? Once a request is accurately categorized, it can be assigned to the team best placed to handle it. Beyond routing, NER can also help power chatbots, enabling them to answer questions about recognized entities.
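The tagging and routing steps might look like the sketch below. It assumes each support request already carries the entities NER found in it; the sector mapping, team names, and sample requests are all hypothetical.

```python
from collections import Counter

# Hypothetical lookups: entity -> sector, and sector -> owning support team.
ENTITY_SECTOR = {
    "Amazon": "technology",
    "Target": "retail",
    "JPMorgan Chase": "financials",
}
SECTOR_TEAM = {
    "technology": "tech-coverage",
    "retail": "consumer-coverage",
    "financials": "financials-coverage",
}

# Hypothetical support requests with entities already extracted by NER.
requests = [
    {"id": 1, "entities": ["Amazon"]},
    {"id": 2, "entities": ["Target", "Amazon"]},
    {"id": 3, "entities": []},
]


def mention_counts(items):
    """Count how often each entity is mentioned across all requests."""
    counts = Counter()
    for item in items:
        counts.update(item["entities"])
    return counts


def route(item, default="general-support"):
    """Send a request to the team for its first recognized entity's sector."""
    for name in item["entities"]:
        sector = ENTITY_SECTOR.get(name)
        if sector is not None:
            return SECTOR_TEAM[sector]
    return default


counts = mention_counts(requests)
routes = {item["id"]: route(item) for item in requests}
print(counts)
print(routes)
```

Routing on the first recognized entity is a deliberately simple rule; a real system would likely weigh all entities in a request.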
The potential for NER and other NLP tools to help financial professionals is just beginning to surface.
Get Started with FactSet NER API
You can see NER in action using FactSet’s RESTful API. FactSet's NER—trained on business and financial documents—not only extracts entities, but also links these entities directly to known FactSet identifiers, allowing you to connect companies and people to existing content sets. Input your own text and begin extracting entities now.
Disclaimer: The information contained in this article is not investment advice. FactSet does not endorse or recommend any investments and assumes no liability for any consequence relating directly or indirectly to any action or inaction taken based on the information contained in this article.