Natural language processing (NLP) and machine learning (ML) are increasingly popular tools for adding value in the investment process. They give us the ability to analyze unstructured data such as news, corporate filings, social media, and other sources to derive meaningful content.
Recently, this same technology has been applied to a task that has historically occupied thousands of analysts for countless hours around the globe: analyzing corporate earnings calls. Drawing from FactSet’s dataset of over 1.7 million corporate documents, we apply and compare three popular NLP approaches, FinBERT, Loughran McDonald (LM), and Alexandria Technology, to identify any agreement among them and to compare their historical performance.
Extracting Facts, Figures, and Sentiment from Earnings Calls
Earnings calls have always been a source of integral information for investors. During these calls, companies discuss financial results, operations, advancements, and hardships that will steer the course and future of the company. This information is a key resource for analysts forecasting the ongoing and future financial performance of the company. Each call is split into two sections: the Management Discussion (MD), where executives present financial results and forecasts, followed by Questions and Answers (Q&A), where participants such as investors and analysts can ask questions about the results presented in the MD.
Using NLP and ML, we can analyze these calls in near real time. By parsing and scoring them for sentiment, we have an overall view of thousands of calls as well as a topic-by-topic (or sentence-by-sentence) detailed view of what was said, who said it, and what the sentiment was without ever having to dial in or read through the transcripts. Armed with the quantitative information derived from these calls, we can then explore various ways to incorporate and enhance our investment processes.
Previous research shows that applying NLP to earnings calls can add orthogonal alpha not explained by traditional risk and return factors. The question then becomes which NLP method we should use to analyze this data, and whether the choice makes a difference. To answer it, we take three NLP approaches and use each to score the sentiment of transcripts sourced from FactSet’s document distributor.
Machine Learning vs. FinBERT vs. Loughran McDonald
The first NLP model is the lexicon or “bag of words” approach of Loughran McDonald, which uses a dictionary of words and phrases labeled with sentiment. The second is FinBERT, a popular machine learning model for finance that is based on the Google BERT language model but trained specifically for financial contexts. Finally, we look at Alexandria’s machine learning approach, which is uniquely trained on analysts’ labeling of earnings calls. We apply each model to our sample universe, the S&P 500, for the period from 2010 to 2021 to gather sentiment scores at the individual sentence/topic level and the aggregate security level.
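To make the lexicon idea concrete, here is a minimal sketch of how a “bag of words” scorer can label a sentence by counting dictionary hits. The word lists, the `label_sentence` function, and the example sentences are hypothetical and for illustration only; they are not the Loughran McDonald dictionary or its scoring rules.

```python
import re

# Hypothetical toy lexicon for illustration only.
# These are NOT the actual Loughran McDonald word lists.
TOY_POSITIVE = {"strong", "improved", "growth", "gain"}
TOY_NEGATIVE = {"loss", "decline", "weak", "impairment"}


def label_sentence(sentence):
    """Label a sentence by comparing counts of positive and negative words."""
    words = re.findall(r"[a-z']+", sentence.lower())
    pos = sum(w in TOY_POSITIVE for w in words)
    neg = sum(w in TOY_NEGATIVE for w in words)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"


# Made-up sentences in the style of an earnings call.
sentences = [
    "Revenue growth was strong this quarter.",
    "We recorded an impairment and a decline in margins.",
    "The call will be archived on our website.",
]
labels = [label_sentence(s) for s in sentences]
print(labels)
```

A scorer like this ignores word order and context, which is one plausible reason a lexicon and a trained language model can disagree on the same sentence.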
At the topic level, we found very low correlations between Alexandria and both LM and FinBERT (0.14 and 0.17, respectively); the highest correlation was between LM and FinBERT at 0.38. Once aggregated to the security level, we found slightly higher correlations among the three, as summarized in the second table below. We observe that despite all three being NLP models built specifically for the financial domain, there is substantial disagreement among them, particularly at the topic and sentence level.
Section Classification Correlation
| | Alexandria | Loughran McDonald |
|---|---|---|
| Loughran McDonald | 0.14 | n/a |
| FinBERT | 0.17 | 0.38 |
Net Sentiment Correlation
| | Alexandria | Loughran McDonald |
|---|---|---|
| Loughran McDonald | 0.45 | n/a |
| FinBERT | 0.30 | 0.41 |
Source: Alexandria Technology
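For readers who want to run a similar agreement check on their own model outputs, the sketch below computes a pairwise correlation between two sets of sentence labels. It rests on assumptions: the article does not say which correlation measure was used, so Pearson is chosen here, and the encoding of labels as +1, 0, and -1 as well as the sample data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical labels for the same six sentences from two models:
# +1 = positive, 0 = neutral, -1 = negative (an assumed encoding).
model_a = [1, 0, -1, 1, 0, -1]
model_b = [1, 1, -1, 0, 0, 0]

r_ab = pearson(model_a, model_b)
print(round(r_ab, 2))
```

Running the same function over every pair of models produces a lower-triangular matrix like the two tables above.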
Applying a Long/Short Trading Strategy
From here, we were curious to see how each approach would perform under a simple long/short trading strategy that is long the most positive companies and short the most negative. We ran this analysis over the same 2010 to 2021 sample period, rebalancing monthly. Our net sentiment score for each security is calculated as follows:
Net Sentiment = Log ( (Positive Count + 1) / (Negative Count + 1) )
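As a quick illustration, the score can be sketched in a few lines, reading the formula as the log of the ratio (Positive Count + 1) / (Negative Count + 1). The function name and the sample counts are hypothetical, and the natural logarithm is an assumption, since the formula does not state a base.

```python
import math


def net_sentiment(positive_count, negative_count):
    """Log ratio of add-one-smoothed positive and negative counts.

    The natural log is an assumption; the source does not specify a base.
    """
    return math.log((positive_count + 1) / (negative_count + 1))


balanced = net_sentiment(10, 10)  # equal counts give exactly 0.0
upbeat = net_sentiment(30, 10)    # more positives than negatives: above 0
downbeat = net_sentiment(0, 5)    # zero positives still yields a finite score

print(balanced, upbeat > 0, downbeat < 0)
```

Adding 1 to each count keeps the score defined when a security has no positive or no negative statements in the window, and the log makes the score symmetric: swapping the two counts flips the sign.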
For the net sentiment, we use a six-month lookback window that captures two earnings periods for each security (two calls, each of which on average has over 200 data points). Both sections of the call, MD and Q&A, are analyzed separately and then combined in the net sentiment score. We break the data into quintiles, with quintile 1 containing the names we will be long and quintile 5 the names we will short (ignoring the approximately 300 names in the middle of the sample that are the most neutral according to each approach’s sentiment scores).
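The quintile sort described above can be sketched as follows. This is a simplified illustration, not the backtest used in the study: the tickers, scores, and returns are hypothetical, and equal weighting within each leg is an assumption, since the weighting scheme is not stated.

```python
def long_short_baskets(scores, n_buckets=5):
    """Rank names by score; return the top and bottom buckets."""
    ranked = sorted(scores, key=scores.get, reverse=True)  # most positive first
    bucket_size = len(ranked) // n_buckets
    if bucket_size == 0:
        raise ValueError("need at least n_buckets names")
    long_names = ranked[:bucket_size]    # quintile 1: most positive
    short_names = ranked[-bucket_size:]  # quintile 5: most negative
    return long_names, short_names


def spread_return(long_names, short_names, next_returns):
    """Equal-weighted long leg minus equal-weighted short leg (an assumption)."""
    long_leg = sum(next_returns[t] for t in long_names) / len(long_names)
    short_leg = sum(next_returns[t] for t in short_names) / len(short_names)
    return long_leg - short_leg


# Hypothetical net sentiment scores for ten made-up tickers.
scores = {
    "AAA": 1.2, "BBB": 0.9, "CCC": 0.5, "DDD": 0.3, "EEE": 0.1,
    "FFF": 0.0, "GGG": -0.2, "HHH": -0.4, "III": -0.8, "JJJ": -1.1,
}
# Hypothetical returns over the following month.
next_returns = {
    "AAA": 0.02, "BBB": 0.01, "CCC": 0.00, "DDD": 0.01, "EEE": -0.01,
    "FFF": 0.00, "GGG": 0.01, "HHH": -0.02, "III": -0.01, "JJJ": 0.00,
}

long_names, short_names = long_short_baskets(scores)
monthly_spread = spread_return(long_names, short_names, next_returns)
print(long_names, short_names)
```

With a 500-name universe, each quintile holds 100 names, which leaves the roughly 300 most neutral names out of the portfolio. Repeating the sort at each monthly rebalance gives the series of Q1-Q5 spread returns.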
Over the sample period, Alexandria outperformed both LM and FinBERT in every year except 2010 and 2011, when LM had the highest return. The Alexandria approach had positive performance every year apart from 2016, when all three strategies posted negative returns.
Long/Short Q1-Q5 Annual Returns
| Year | Loughran McDonald | Alexandria | FinBERT |
|---|---|---|---|
| 2010 | 6.75% | 3.60% | -1.94% |
| 2011 | 5.74% | 5.47% | 4.43% |
| 2012 | -5.29% | 11.26% | 0.01% |
| 2013 | 1.93% | 8.69% | 3.52% |
| 2014 | 1.02% | 16.15% | 1.51% |
| 2015 | 9.30% | 28.81% | 20.37% |
| 2016 | -6.53% | -4.04% | -13.82% |
| 2017 | 3.48% | 22.88% | 9.88% |
| 2018 | -0.32% | 14.24% | 4.17% |
| 2019 | 2.35% | 4.03% | -4.38% |
| 2020 | 4.57% | 10.69% | 3.78% |
| 2021 | -4.64% | 6.66% | -10.44% |
Source: Alexandria Technology
The strategy’s returns over the period differed starkly across the three approaches: Alexandria delivered cumulative performance of 221%, compared with 19.8% for LM and 16% for FinBERT.
Conclusion
NLP and ML give investors a way to turn vast amounts of unstructured content into data that can be analyzed and incorporated into various areas of the investment process. Leveraging FactSet data, we compared three popular approaches and found low correlation among them when scoring sentiment for earnings calls. A simulation on the S&P 500 from 2010 to 2021 shows that Alexandria’s NLP technology significantly outperformed the popular FinBERT and Loughran McDonald approaches. Adopting NLP approaches for earnings calls appears to add significant value, and we will further explore applications for its use in the financial domain.
Download a full copy of this research paper from the Alexandria Technology web site.
Disclaimer: This blog post has been written by a third-party contributor and does not necessarily reflect the opinion of FactSet. The information contained in this article is not investment advice. FactSet does not endorse or recommend any investments and assumes no liability for any consequence relating directly or indirectly to any action or inaction taken based on the information contained in this article.