Generative AI, while powerful, comes with many risks. So, how can you leverage it in a smart way to minimize those risks?
This is far from a new question. Generative AI products’ ability to process incredible amounts of information and perform complex analysis can be intimidating, particularly from a security and compliance standpoint. Business heads find themselves wondering:
- Is our company data safe?
- Is it being used and distributed externally?
- Are people using it for work-related tasks?
While these are natural and understandable concerns with any newly prominent technology, they are especially relevant given today’s rampant risk of data breaches and misuse. When used correctly, however, generative AI can become a powerful tool for your employees to learn and grow. It can also be a force multiplier for daily work, enabling individuals to delegate the mundane and focus on interesting and high-value tasks—even pick up a few new skills along the way.
FactSet quickly and fully embraced generative AI as a learning tool for its employees. The company encourages exploration of all that generative AI has to offer and has built a tool that enables and empowers all employees to learn in a safe, secure environment.
In May, we held our 11th annual Hackathon, a 24-hour event at FactSet facilities around the world, open to all employees. Participants get involved in purely experimental projects of their choosing. They can operate either solo or within a team, and many projects go on to full production implementation. Peer voting and judging panels across multiple subject areas select projects that are “Best in Show,” “Most Innovative,” and “Best Project Leveraging Generative AI.” Additionally, a diverse panel of senior colleagues across the company select “Business Impact” awards for the projects offering the most potential value to FactSet’s business.
In preparation for this year’s event, we built our own UI wrapper around Large Language Models (initially OpenAI’s GPT-3.5 Turbo, to be specific). This provided an easy user interface for Hackathon attendees instead of requiring them to interact with the model programmatically. As mentioned, all employees, not just engineers, are encouraged to participate in the Hackathon, as great ideas can truly come from anywhere. And because we deployed the models using Azure services within our private tenant, we can keep company data safe and ensure that our prompts and responses are not used for future model training.
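For readers curious what a private-tenant deployment looks like in practice, here is a minimal sketch of calling a GPT-3.5 Turbo deployment hosted in an organization’s own Azure OpenAI resource. The endpoint, deployment name, and environment variable names are placeholders, not FactSet’s actual configuration.

```python
import os

from openai import AzureOpenAI  # openai Python SDK >= 1.0

# Placeholder configuration: the resource lives in the company's own Azure
# tenant, so prompts and responses stay within that boundary.
client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],  # e.g. https://<resource>.openai.azure.com
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="gpt-35-turbo",  # the name of the Azure *deployment*, not the public model id
    messages=[
        {"role": "system", "content": "You are a helpful assistant for employees."},
        {"role": "user", "content": "Summarize the key risks of using generative AI at work."},
    ],
)
print(response.choices[0].message.content)
```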
All of this, together with the significant contributions of Jonathan Gilchrist,1 a Principal Software Engineer at FactSet, is how chat.factset.io (shown below and referred to as “Chat” for the remainder of this article) came into existence back in May 2023.
What is Chat?
Chat, at its core, is a simple idea. It’s an LLM-agnostic UI wrapper—meaning the UI works with any conversational Large Language Model—that provides the same experience we had already become accustomed to while using the public version of OpenAI’s ChatGPT.
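To make “LLM-agnostic” concrete, here is a minimal sketch (not Chat’s actual code) of how a chat UI can be decoupled from any particular model: the front end depends only on a small interface, and each provider plugs in behind it. Names such as ChatBackend and AzureOpenAIBackend are illustrative.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Message:
    role: str      # "system", "user", or "assistant"
    content: str


class ChatBackend(Protocol):
    """Anything that can turn a conversation into the next reply."""

    def complete(self, messages: list[Message]) -> str: ...


class AzureOpenAIBackend:
    """Backend for a model deployed in a private Azure OpenAI resource."""

    def __init__(self, client, deployment: str):
        self._client = client          # e.g. an openai.AzureOpenAI instance
        self._deployment = deployment

    def complete(self, messages: list[Message]) -> str:
        response = self._client.chat.completions.create(
            model=self._deployment,
            messages=[{"role": m.role, "content": m.content} for m in messages],
        )
        return response.choices[0].message.content


class EchoBackend:
    """Trivial stand-in backend, useful for local UI testing."""

    def complete(self, messages: list[Message]) -> str:
        return f"(echo) {messages[-1].content}"


def chat_turn(backend: ChatBackend, history: list[Message], user_text: str) -> str:
    """One UI round trip: append the user's message, ask the backend, record the reply."""
    history.append(Message("user", user_text))
    reply = backend.complete(history)
    history.append(Message("assistant", reply))
    return reply
```

Because the UI depends only on the ChatBackend interface, swapping GPT-4 for Titan, Claude, or Llama 2 becomes a configuration change rather than a rewrite, which is what allows a single tool to offer several models side by side.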
It gives safe access to an environment in which FactSet employees can explore and utilize the immense power and potential behind LLMs. It builds an experiential learning environment to empower everyone to learn by doing. It accelerates FactSetters’ workflows and helps them capitalize on precious learning time. It also enables employees to master how to interact with these tools and benefit from the massive corpus of knowledge hidden behind the perfectly written prompt.
How Many are Using Chat?
Once we launched Chat, we were greeted with an outpouring of positive feedback. Against the backdrop of nearly daily headlines about the growing number of firms blocking ChatGPT access for employees, we instead opened our arms and made it easy and accessible for ours, in a safe and secure way.
Chat reached 1,000 users within the first 48 hours, and in the few short months since, it has grown to nearly 6,000 users (~50% of employees) and processed 400 million tokens at the time of writing this article.
You can also note the September drop in the number of sessions, conversations, and messages even as the user count continues to grow. Coupled with the data points that the average number of messages per session has dropped by 20% (from 5 to 4) and the average input prompt size has increased by almost 200% (from 43 tokens to 122 tokens), this suggests users are becoming more efficient and gaining the benefit of these tools at a shrinking cost. That further highlights the advantage of giving our employees exposure to generative AI early, with proper training around it.
Taking into consideration the notable improvements in task completion—our Chat community reports as much as 20% improvement on research time when working in an unfamiliar area—the ~$5,000 we have spent in total since launch on both Azure OpenAI costs and infrastructure costs is a worthy tradeoff.
We currently have four models deployed on the app (GPT-3.5 Turbo, GPT-4, GPT-4 32k, and AWS Titan) to deliver multi-model solutions rather than a one-size-fits-all approach. Additionally, we intend to deploy Claude, Llama 2, and Jurassic, as well as some specialty models, soon. This enables FactSetters to test the differences between models and find the best solution for their use case.
How are FactSetters Using Chat?
Let’s circle back to the concern about whether people are using the tools for work-related tasks or for something unrelated to their role. From the beginning, we communicated that these tools are not free for FactSet. Every prompt has a cost and, being believers in transparency, we put that cost right on the page for users to see.
Supplemented with training on how tokenization works, FactSetters learn through doing that certain activities are more expensive than others. And while we don’t discourage the costlier use cases, we empower users to make that decision for themselves after understanding the costs involved. A good metaphor here is learning to drive a car: you eventually figure out that flooring the pedal results in worse gas mileage.
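As a rough illustration of that kind of per-prompt cost transparency (the prices below are placeholder numbers, not FactSet’s actual rates), counting tokens with a library such as tiktoken might look like this:

```python
import tiktoken

# Placeholder per-1,000-token price for illustration only; real Azure OpenAI
# pricing varies by model, region, and input vs. output tokens.
PRICE_PER_1K_INPUT_TOKENS = 0.0015


def estimate_prompt_cost(prompt: str, model: str = "gpt-3.5-turbo") -> tuple[int, float]:
    """Return (token_count, estimated_input_cost_in_dollars) for a single prompt."""
    encoding = tiktoken.encoding_for_model(model)
    tokens = encoding.encode(prompt)
    return len(tokens), len(tokens) / 1000 * PRICE_PER_1K_INPUT_TOKENS


n_tokens, cost = estimate_prompt_cost("Rewrite this client email to be more concise.")
print(f"{n_tokens} input tokens, roughly ${cost:.5f} before any response is generated")
```

Showing a number like this next to every prompt is what lets users connect a longer prompt with a higher cost, without anyone policing their usage.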
As shown in the following chart, we see a wonderfully diverse array of job families across the company leveraging Chat as a daily co-pilot and learning tool across an even wider array of categories.
To gain insight into how FactSetters are using the tool, we simply ask them and leverage generative AI with clever prompt engineering to help identify themes. We pass prompts back into the system, drop the temperature to 0 (which reduces the model’s tendency to be “creative”), and ask it to categorize them into key themes such as Technical, Finance, Management, Communication, Research, Professional Skills, Leadership, Soft Skills, and Personal. At the time of writing, < 2% of the almost 300,000 input prompts are considered personal. We’ve classified all the rest as related to job functions, including code generation, communication/writing assistance, language translation, idea generation, and research, to name a few.
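A sketch of that categorization step might look like the following; the theme list comes from this article, while the client configuration, deployment name, and instruction wording are assumptions rather than the exact internal setup.

```python
import os

from openai import AzureOpenAI  # openai Python SDK >= 1.0

THEMES = [
    "Technical", "Finance", "Management", "Communication", "Research",
    "Professional Skills", "Leadership", "Soft Skills", "Personal",
]

# Placeholder private-tenant Azure OpenAI configuration.
client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)


def categorize_prompt(user_prompt: str, deployment: str = "gpt-35-turbo") -> str:
    """Bucket a stored prompt into exactly one theme."""
    response = client.chat.completions.create(
        model=deployment,
        temperature=0,  # minimize randomness so labels stay consistent across runs
        messages=[
            {
                "role": "system",
                "content": "Classify the user's message into exactly one of these themes: "
                + ", ".join(THEMES)
                + ". Reply with the theme name only.",
            },
            {"role": "user", "content": user_prompt},
        ],
    )
    return response.choices[0].message.content.strip()
```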
While many types of data are allowed in our private Chat, such as first-party data, internally developed code, and open-source code under certain permissive licenses, we do not allow FactSetters to use third-party data, sensitive HR or financial data, material non-public information, or client data without prior approval.
How is This Shifting the Learning Process?
Everyone learns in a different way. Some prefer structured learning in a classroom, some prefer reading physical books, some prefer asynchronous learning such as watching video recordings, and so on. In a corporate environment, it can prove difficult to capitalize on learning time. How do you quickly find content that is relevant, helpful, and high quality? When you find that precious hour to spend on professional development, you don’t want to waste it hunting for material.
This is why we believe Chat will revolutionize the way FactSetters learn. In fact, we already have thousands of people every month turning to Chat to learn and explore rather than going to other sources like Google.
For example, since we launched Chat, dark learning (i.e., using non-trackable sources such as Stack Overflow or Investopedia) has dropped 19%. This is a valid method of learning, but what people are learning on those platforms largely cannot be tracked. We are also seeing a decline in traffic to sites that Palo Alto Labs categorizes as “Training and Tools,” and we expect further decline as adoption and data coverage increase.
All this said, we’d like to take Chat a step further. Like many companies, we have vast amounts of internal learning material that, despite our best efforts, can become buried under a sea of SharePoint sites and a myriad of other systems.
To better elevate and source our internally developed material, we intend to develop systems that leverage RAG (Retrieval-Augmented Generation) and Fine Tuning (specifically, Parameter Efficient Fine Tuning). This initiative, as with Chat, will help redefine how FactSetters learn and research on the job.
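For a sense of what the retrieval half of RAG involves (a sketch under assumptions, not a description of the system we are building), the core loop is: embed internal documents once, find the ones closest to a question, and hand them to the model as grounding context. The embedding step itself, for example via an Azure OpenAI embeddings deployment, is omitted here.

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def retrieve(question_embedding: np.ndarray,
             doc_embeddings: list[np.ndarray],
             docs: list[str],
             k: int = 3) -> list[str]:
    """Return the k internal documents most similar to the question."""
    scores = [cosine_similarity(question_embedding, e) for e in doc_embeddings]
    top = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)[:k]
    return [docs[i] for i in top]


def build_rag_prompt(question: str, retrieved_docs: list[str]) -> str:
    """Ground the model's answer in retrieved internal material."""
    context = "\n\n".join(retrieved_docs)
    return (
        "Answer the question using only the internal material below. "
        "If the material does not cover it, say so.\n\n"
        f"Material:\n{context}\n\nQuestion: {question}"
    )
```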
We believe generative AI can help you increase productivity, enhance client and employee experiences, and accelerate business priorities over time. Hopefully, our approach to opening LLMs to FactSetters, and the insights we gained along the way, will help you consider an approach that works for your organization. If you’re already doing this, what are your key learnings?
1 Mr. Jonathan Gilchrist is a Principal Software Engineer at FactSet, based in London. In this role, he works on FactSet's Research Management Solutions (RMS) product suite and, more recently, on its internal LLM chat tool. Prior to FactSet, he held roles at IBM and BAE Systems. He earned an MEng in Computer Science from the University of York.
This blog post is for informational purposes only. The information contained in this blog post is not legal, tax, or investment advice. FactSet does not endorse or recommend any investments and assumes no liability for any consequence relating directly or indirectly to any action or inaction taken based on the information contained in this article.