
Innovations of Tomorrow: Analyzing Intercompany Research Partnerships for Investable Themes and Beneficiaries

Data Science and AI

By Hiroki Miyahara  |  December 13, 2022

With the growing interest in thematic investments, many investors are looking for ways to optimize their exposure to established megatrends (e.g., climate change). They’re also keeping an eye out for emerging new trends, disruptive technologies, and innovations that could shape markets in the future.

New technologies and products are often born out of the collaboration between companies operating in more established industries. For example, Pfizer and BioNTech co-developed the mRNA-based COVID-19 vaccine. Many companies also collaborate with other firms, government entities, and universities on research involving innovative technologies and other potential breakthroughs.

When viewed in aggregate (and through time), the subject areas cited by these research partners can provide unique insights into where these firms collectively see the greatest opportunities in the years to come. In this article, we demonstrate how business-to-business relationship metadata can reveal the innovations of tomorrow and the market participants working to bring them to life.

Relationship Keywords

We’ll get started with FactSet Supply Chain Relationships, a database that systematically catalogs 13 distinct types of company-to-company relationships, along with contextual keywords and relationship metadata, through time.

Let’s look at an example. Figure 1 illustrates a subset of research collaboration partners of Moderna and BioNTech as of December 2019. Each relationship has multiple keywords collected from source documents by FactSet analysts.

For example, you can find that Moderna had a research partnership with AstraZeneca to discover, develop, and commercialize mRNA-based medicines to treat cardiovascular and cardiometabolic diseases. If you look at the keywords of those relationships, you may notice that mRNA (or RNA) appears in multiple relationships. In fact, as mentioned earlier, both Moderna and BioNTech developed mRNA-based COVID-19 vaccines in 2020.

In this article, we will count the frequency of keyword appearance across all research collaboration partnerships and attempt to identify trending new themes. Here, we focus on all research collaboration partnerships with keywords as of the end of October from 2019 to 2022.

Figure 1: Research Collaboration Partners of Moderna and BioNTech as of 2019 with Relationship Keywords

The diagram shows four research partners of Moderna and BioNTech as of December 2019 (pre-pandemic). The partners and keywords shown are subsets of the total number available in the database.


Source: FactSet (data as of November 30, 2022)



Data and Data Cleansing

Because keywords are recorded as they appear in the source documents, we need to make some modifications to consolidate key trends that are worded differently. First, each word is lemmatized to standardize its form (e.g., cats, cat’s, and cats’ to “cat”). Next, commonly used abbreviations are consolidated (e.g., Artificial Intelligence to “AI,” Digital Transformation to “DX”). Keyword phrases are then tokenized into single words, where possible (e.g., “cat food” to “cat” and “food”). Finally, we remove words that are overly generic in the context of research collaboration partnerships (e.g., service, product, project) to home in on the distinct concepts we will be evaluating. From here on, we’ll refer to these keyword-based concepts as “themes.”
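To make these cleansing steps concrete, here is a minimal sketch in Python. It is not the pipeline used for this article: the lookup tables are hypothetical, and `naive_lemma` is a crude rule-based stand-in for a proper lemmatizer. Abbreviations are also applied before tokenization here, purely so that multi-word phrases can be matched.

```python
import re

# Hypothetical lookup tables for illustration; the article does not publish its lists.
ABBREVIATIONS = {
    "artificial intelligence": "ai",
    "digital transformation": "dx",
}
GENERIC_WORDS = {"service", "product", "project"}


def naive_lemma(word: str) -> str:
    """Crude stand-in for a real lemmatizer: drops possessives and simple plurals."""
    word = re.sub(r"'s?$", "", word)  # cat's -> cat, cats' -> cats
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"  # batteries -> battery
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]  # cats -> cat
    return word


def normalize_keyword(phrase: str) -> list[str]:
    """Turn one raw keyword phrase into a list of single-word themes."""
    # Lowercase and convert curly apostrophes to straight ones.
    text = phrase.lower().replace("\u2019", "'")
    # Consolidate known long forms into their abbreviations.
    for long_form, short_form in ABBREVIATIONS.items():
        text = text.replace(long_form, short_form)
    # Tokenize into single words, lemmatize, then drop generic words.
    tokens = re.findall(r"[a-z0-9']+", text)
    lemmas = [naive_lemma(token) for token in tokens]
    return [lemma for lemma in lemmas if lemma and lemma not in GENERIC_WORDS]


print(normalize_keyword("Artificial Intelligence services"))
print(normalize_keyword("Hydrogen fuel cells"))
```

A production version would likely swap `naive_lemma` for a dictionary-based lemmatizer, since simple suffix rules mishandle irregular words.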

By analyzing the occurrence of each theme on a year-over-year basis, we can identify emerging trends and the market participants involved. As a company may report multiple relationships with the same keyword, we count the unique number of reporting companies per theme each year and calculate the growth trend.
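The counting step can be sketched as follows. The records are made-up placeholders rather than FactSet data, and the growth formula (simple percentage change between two years) is our assumption about how such a trend might be computed.

```python
from collections import defaultdict

# Hypothetical records: (year, reporting_company, theme). Not real FactSet data.
records = [
    (2019, "Company A", "hydrogen"),
    (2019, "Company A", "hydrogen"),  # same company, same theme: counted once
    (2019, "Company B", "recycle"),
    (2022, "Company A", "hydrogen"),
    (2022, "Company B", "hydrogen"),
    (2022, "Company C", "hydrogen"),
    (2022, "Company B", "recycle"),
]


def unique_company_counts(rows):
    """Count distinct reporting companies for each (theme, year) pair."""
    companies = defaultdict(set)
    for year, company, theme in rows:
        companies[(theme, year)].add(company)
    return {key: len(names) for key, names in companies.items()}


def growth_rate(counts, theme, start_year, end_year):
    """Simple percentage change; returns None when the start count is zero."""
    start = counts.get((theme, start_year), 0)
    end = counts.get((theme, end_year), 0)
    if start == 0:
        return None
    return (end - start) / start


counts = unique_company_counts(records)
print(growth_rate(counts, "hydrogen", 2019, 2022))
```

Using a set per theme and year is what prevents a company with several hydrogen partnerships from being counted more than once.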

Popular Themes and Implications

Figure 2: Number of Unique Source Companies

The number of companies reporting a research collaboration under each theme, from October 2019 to October 2022, is shown below. The percentage shown on the label is the growth rate from 2019 to 2022.


Source: FactSet (data as of November 30, 2022)

Figure 2 shows the fastest-growing themes over the last three years. Since growth is expressed in percentage terms, themes mentioned by fewer than 30 reporting entities are excluded to avoid distortions. “Hydrogen” and “recycle” gained the highest number of research collaboration participants in the last three years, indicating a growing demand for advances in environmental technologies.
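The threshold matters because a theme that grows from 2 companies to 10 shows 400% growth while remaining marginal. A minimal sketch of the filter and ranking is below. The article does not state which year the 30-entity cutoff applies to, so this sketch assumes the latest year; the theme names and counts are placeholders.

```python
def fastest_growing(start_counts, end_counts, min_companies=30):
    """Rank themes by percentage growth, skipping thinly covered themes.

    Assumption: the minimum is applied to the end-year count.
    """
    ranked = []
    for theme, end in end_counts.items():
        start = start_counts.get(theme, 0)
        if end < min_companies or start == 0:
            continue  # too few companies, or growth is undefined
        ranked.append((theme, (end - start) / start))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Placeholder counts of unique reporting companies per theme.
start = {"alpha": 50, "beta": 40, "gamma": 2}
end = {"alpha": 150, "beta": 100, "gamma": 10}

for theme, growth in fastest_growing(start, end):
    print(f"{theme}: {growth:.0%}")
```

In this toy data, "gamma" has the highest raw growth rate but is dropped because only 10 companies mention it.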

By transforming the underlying (company-disclosed) keywords into consolidated themes, we were able to observe the emergence of these thematic concepts over time. Hydrogen and recycling are at the center of research engagements involving dozens of distinct use cases and subindustries. To unpack the diverse circumstances surrounding these thematically related partnerships, we’ll break out the complete set of keywords disclosed by the source companies involved in these research collaborations.

Due to the increasing demand for clean energy, a growing number of companies have started research collaborations related to “hydrogen.” As of 2022, more than 150 companies report a research collaboration partnership related to this theme. Figure 3 shows the common words mentioned with hydrogen. Some of the most popular terms speak to the core technology and outcome (e.g., fuel, cell, renewable, green, energy). Other peripheral terms highlight specific end-use cases (e.g., truck, engine, vehicle, mobility) and functions (e.g., production, station, storage, battery).

Figure 3: Common Themes Appearing with "Hydrogen"

Color and size represent how often each word appears alongside “hydrogen” in the relationship keywords.


Source: FactSet (data as of November 30, 2022)
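The frequencies behind a word cloud like this one can be sketched with a simple co-occurrence count. The relationships below are invented examples, and counting each word once per relationship is our assumption about how the weights might be derived.

```python
from collections import Counter


def cooccurring_themes(relationships, target):
    """Count how many relationships mention each theme together with `target`."""
    counts = Counter()
    for themes in relationships:
        unique = set(themes)  # count a word once per relationship
        if target in unique:
            counts.update(unique - {target})
    return counts


# Hypothetical theme lists, one per research partnership.
relationships = [
    ["hydrogen", "fuel", "cell"],
    ["hydrogen", "fuel", "truck"],
    ["recycle", "battery"],
    ["hydrogen", "storage", "fuel"],
]

print(cooccurring_themes(relationships, "hydrogen").most_common(3))
```

The same function applies to any other theme by changing the `target` argument.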

“Recycle” is another popular theme and a key component in achieving sustainable consumption and production. The word cloud in Figure 4 shows two major areas for recycling research: batteries and plastics. The former ties in with the broader theme of renewable energy (also mentioned in relation to hydrogen), while the latter is likely a reflection of the growing concern about plastic waste, as reported by the OECD.

Figure 4: Common Themes Appearing with "Recycle"

Color and size represent how often each word appears alongside “recycle” in the relationship keywords.


Source: FactSet (data as of November 30, 2022)

Research Participants across Different Industries

As we’ve seen, hydrogen has been one of the fastest-growing themes in the last three years. Accompanying keywords suggest the goals of these research projects are centered around the development of alternative energy or related products and technologies. One might think these partnerships are confined solely to Energy sector participants, but Figure 5, a network visualization of hydrogen research partnerships, suggests otherwise. The color of nodes and edges represents the companies’ RBICS Level 4 industry. In total, 491 companies from 91 distinct industries have participated in a research collaboration involving hydrogen as of October 2022.

As expected, many participants come from the Power Generation and Utilities industries (e.g., Power Generation and Support Products, Electric Utilities, and Wholesale Power) as well as upstream energy and materials sectors (e.g., Oil and Gas Exploration and Metal Mining). Perhaps more interesting are the downstream, fuel-dependent industries and products participating, such as Consumer Vehicle Manufacturing, Aerospace Equipment, and Transportation Equipment Manufacturing. Clearly, the variety of companies (nodes) and industries (colors) collaborating on hydrogen research underscores its perceived potential among market participants.

Figure 5: Network Visualization of Research Partnerships for "Hydrogen"

Dots represent companies, while lines indicate research partnerships. Colors denote RBICS L4 industries, though there are some repeats due to a limited number of color variations. Only companies that have more than 10 partners are shown.


Source: FactSet (data as of November 30, 2022)
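A partnership network of this kind can be represented as an undirected graph. The sketch below uses plain Python sets rather than a graph library; the company names and industry labels are placeholders, and the "more than 10 partners" filter mirrors the caption above.

```python
from collections import defaultdict


def build_partner_graph(edges):
    """Build an undirected adjacency map from (company, partner) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        if a != b:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def well_connected(graph, min_partners=10):
    """Companies with more than `min_partners` distinct partners."""
    return {c for c, partners in graph.items() if len(partners) > min_partners}


def summarize(graph, industry_of):
    """Return (number of companies, number of distinct industries)."""
    industries = {industry_of[company] for company in graph}
    return len(graph), len(industries)


# Hypothetical data: one hub with 11 partners, plus one partner-to-partner link.
edges = [("Hub", f"Partner{i}") for i in range(11)] + [("Partner0", "Partner1")]
graph = build_partner_graph(edges)
industry_of = {name: "Industry A" if name == "Hub" else "Industry B" for name in graph}

print(well_connected(graph))
print(summarize(graph, industry_of))
```

Storing partners in sets means a partnership reported by both sides, or reported twice, still counts as a single edge.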


Keywords of research collaboration partnerships provide an interesting insight into what the market is trying to accomplish. While the developments in question may be years from productization, the extracted themes are highly correlated with the market’s (often unrealized) needs.

We leveraged natural-language processing (NLP) techniques to derive core themes from research partnership phrases and analyzed the year-over-year trend. Our subsequent examination of the surrounding keywords, as well as the companies and industries involved, revealed additional context around the desired end-use cases and markets. Although we focused on research collaboration partnerships, future research could examine similar thematic dynamics occurring in and across other relationship categories (e.g., in-licensing, joint ventures, manufacturing, or distribution). Geographic breakdowns of key themes may also add an interesting dimension, as it was clear from our sample that Japanese firms are at the forefront of hydrogen research.

Many investors struggle to systematically identify today’s most impactful trends and optimize their portfolio exposure accordingly. When it comes to forecasting the investable trends of tomorrow and their potential beneficiaries, one might find themselves reaching for a crystal ball in lieu of any real, data-driven insights. While no one can predict the future, the business-to-business relationship graph may offer some valuable clues.

Terence Kempf, Director of Product Management for Content and Technology Solutions, also contributed to this article. 

This blog post is for informational purposes only. The information contained in this blog post is not legal, tax, or investment advice. FactSet does not endorse or recommend any investments and assumes no liability for any consequence relating directly or indirectly to any action or inaction taken based on the information contained in this article.


Hiroki Miyahara

Senior Product Manager, Japan

Mr. Hiroki Miyahara is a Senior Product Manager at FactSet, based in Tokyo, Japan. In this role, he covers the Asia Pacific region for FactSet proprietary content sets including supply chain, RBICS, GeoRev, shipping, and FactSet Data Management Solution. Mr. Miyahara joined FactSet in 2011 and previously held roles as an account executive and product developer. He earned an MSc in economics from the University of Essex.

