In this final installment of our six-part series on overcoming generative AI challenges, we share observations and perspective on the legal and ethical landscape of this evolving technology.
Both the creators and users of Large Language Models can face legal and reputational consequences from LLM hallucinations. In one example, a model fabricated a sexual harassment accusation against a real professor and cited a seemingly credible but nonexistent article from the Washington Post.
In another case, OpenAI was sued for defamation, and in yet another, two lawyers were fined for including fake AI-generated citations in a legal brief. One of the lawyers stated he had assumed ChatGPT was “a super search engine.” As described earlier in our series, LLMs should not be treated as encyclopedias, search engines, or databases.
In another incident, an Australian mayor named Brian Hood threatened a defamation lawsuit against OpenAI unless it corrected false claims that he served time in prison for bribery. In reality, Mr. Hood had been a whistleblower in a bribery scandal. But because of the way LLMs work, ChatGPT incorrectly generated statements that he was the guilty party. More on that in the next section.
Although there is not a one-size-fits-all solution to prevent those types of outcomes, there are some options for technology makers and users to consider.
Large Language Model makers
Makers have perhaps the greatest liability exposure. Examples of LLM makers include OpenAI, Google, Meta, Anthropic, and Hugging Face. Makers can automatically check their models’ generated responses for instances of specific text they are concerned about and modify or eliminate that output before presenting it to users. As of the time of this writing, asking ChatGPT various questions about Brian Hood returns an error message: “I’m unable to produce a response.”
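To picture this kind of output filtering, the sketch below checks a generated response against a blocklist of flagged terms before it reaches the user. It is a minimal illustration of the general pattern under assumed inputs, not a description of how OpenAI or any other maker actually implements its safeguards, and the blocklist contents and refusal message are invented for the example.

```python
# Minimal sketch of post-generation output filtering. The blocklist and the
# refusal message are illustrative assumptions, not any vendor's actual system.

BLOCKED_TERMS = {"brian hood"}  # terms the maker has flagged for suppression


def filter_response(generated_text: str) -> str:
    """Return the model's output, or a refusal if it mentions a blocked term."""
    lowered = generated_text.lower()
    if any(term in lowered for term in BLOCKED_TERMS):
        return "I'm unable to produce a response."
    return generated_text


print(filter_response("Brian Hood served time in prison for bribery."))  # refused
print(filter_response("The market opened higher this morning."))         # passes through
```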
Today’s technology doesn’t reliably allow LLM makers like OpenAI to surgically extract undesirable learnings from an already-trained model, although research in this area is ongoing.
Technology firms
Companies developing solutions based on third-party LLMs can use Retrieval-Augmented Generation (RAG) to ground their answers in known facts, rather than relying on the LLM to generate the entire answer. The LLM can still be used to generate conversational wording for the response, but not for producing the factual answers themselves.
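A minimal sketch of that pattern appears below: retrieve relevant passages from a trusted store, then ask the LLM to word an answer using only that material. The tiny in-memory corpus and the call_llm placeholder are assumptions for illustration; a real system would use a proper retrieval index and the firm’s chosen model API.

```python
# Minimal Retrieval-Augmented Generation (RAG) sketch. The in-memory corpus and
# the call_llm() placeholder are illustrative assumptions, not a production design.

CORPUS = [
    "Acme Corp was founded in 1999.",           # vetted, trusted passages
    "Acme Corp's headquarters are in Oslo.",
]


def retrieve(question: str, top_k: int = 2) -> list[str]:
    """Naive keyword retrieval; a real system would use a vector or search index."""
    terms = set(question.lower().split())
    ranked = sorted(CORPUS, key=lambda p: -len(terms & set(p.lower().split())))
    return ranked[:top_k]


def call_llm(prompt: str) -> str:
    """Placeholder for the firm's chosen LLM API."""
    raise NotImplementedError("wire this to your model provider")


def answer_with_rag(question: str) -> str:
    context = "\n".join(retrieve(question))
    prompt = (
        "Answer the question using ONLY the context below. "
        "If the context does not contain the answer, say you do not know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return call_llm(prompt)
```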
Incorporating human review and disclosing AI-generated content are additional strategies that can reduce risk for these firms.
Individual users
Individuals must be aware of the risks of hallucinations and not expect factual answers from predictive text. If they do use LLMs for fact-based answers, they must aggressively fact-check them, especially in high-stakes situations. The following examples can help guide your approach.
As a general rule for individual users, generative AI is well-suited to:
Creative writing that’s unbound by factual constraints
Brainstorming and generating ideas
Proposing alternate wording for style or clarity
Jogging your memory when you can describe, but not name, a book or noteworthy figure
Another aspect of generative AI concerns the intellectual property of the training data LLMs have used. Meta, Microsoft, Anthropic, OpenAI, Stability AI, Midjourney, and others have been sued over their use of copyrighted materials in training their generative AI models. In addition, the 2023 actor and writer strikes in Hollywood included concerns that AI would potentially infringe on ownership of their images and written content.
Below is our perspective across use cases, including details on FactSet’s practices.
LLM makers
As noted above, LLM makers may need to filter results before presenting them to users. Depending on how courts rule, they may also need to go through an expensive process of building new models with training data sourced entirely from the public domain and permissively licensed sources.
Technology firms
These companies will need to carefully determine which LLMs can be safely used for their needs. Firms using LLMs to provide their clients or employees access to proprietary or enterprise content should integrate RAG techniques. This not only allows those individuals to converse with the firm’s data, but does so in a way that provides vetted company data, explainability, and the ability to limit access based on specific permissions. Several large technology firms (e.g., Microsoft, Anthropic, IBM, Amazon, Adobe, OpenAI, and others) have even promised to indemnify enterprise customers for legal claims arising from the use of their generative AI products.
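As a rough illustration of the permissions point, the sketch below filters documents by a user’s entitlements before relevance is even considered, so restricted content never reaches the LLM. The data structures, group names, and sample documents are assumptions invented for the example, not a description of FactSet’s implementation.

```python
# Hypothetical sketch of permission-aware retrieval: documents the user is not
# entitled to see are excluded before any content is passed to the LLM.
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    text: str
    allowed_groups: set[str]  # groups entitled to read this document


def retrieve_for_user(query: str, user_groups: set[str],
                      corpus: list[Document]) -> list[Document]:
    """Filter by entitlement first, then by a naive keyword relevance check."""
    entitled = [d for d in corpus if d.allowed_groups & user_groups]
    terms = query.lower().split()
    return [d for d in entitled if any(t in d.text.lower() for t in terms)]


docs = [
    Document("1", "Q3 revenue forecast for the equities desk", {"finance"}),
    Document("2", "Cafeteria menu for next week", {"all-staff"}),
]
print(retrieve_for_user("revenue forecast", {"all-staff"}, docs))  # [] - no access
```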
FactSet is a technology firm, and protecting sensitive data is of the utmost importance to us. We are committed to ensuring data privacy and security across all our solutions. At a summary level:
All queries entered by users into FactSet generative AI experiences are confidential and will not be used to train or fine-tune our models in an automated manner
Access to user queries and responses is governed and restricted
All models used by FactSet are private
We also set clear restrictions on the types of data FactSet employees can use with Large Language Models, and we provide an enterprise-safe model so that employee inputs and chatbot responses never leave our firm.
Individual users
Individuals will need to research any results they receive for potential copyright infringement before publishing them.
An ethical aspect of generative AI is the potential for biased responses. When ChatGPT first came out, its ethical guardrails were intended to prevent it from parroting any biased notions from its training data. For example, when asked what race and gender make a good scientist, it said those characteristics are irrelevant. But the system could be tricked by couching the same question as a request for a poem or a code function. In that instance, the underlying training data bias was revealed. However, OpenAI has continued to improve its guardrails, and that specific example can no longer be reproduced.
Early on, malicious actors and curious experimenters could instruct models to ignore their safeguards. That tricked generative AI systems into providing harmful answers, such as instructions for producing meth and napalm. AI companies quickly fixed that flaw. But researchers continue to find ways to make LLMs bypass their guardrails.
Addressing the ethical aspects of LLMs is challenging. As with legal concerns, reducing bias requires both technical and non-technical solutions involving cross-collaboration. AI firms can use various technical methods to lessen bias during model training. But these methods are not a magic fix: they require careful adjustment and can unintentionally introduce new biases. Moreover, humans famously disagree on exactly where the boundaries lie with respect to biased or unsafe content.
Having diverse teams working on LLMs can help ensure broad perspectives and reduce the chances of overlooking bias. Both internal and external regulation and oversight are needed, and governments globally have begun leveraging technology experts to advise on effective steps to move forward.
Of course, individual users must be aware of the potential for biased content and review generative AI output accordingly.
Misinformation is another ethical concern. Bad actors can spread deep fakes and other political misinformation more rapidly with generative AI. Content farms create high volumes of unreliable or fabricated news to generate programmatic ad revenue. Advertisers will need to take proactive measures to reduce the economic incentive for this kind of misinformation.
Even reputable news agencies have inadvertently published stories with false information. CNET did a full audit of 77 AI-written news stories after finding factual errors in one. It discovered that many more stories needed corrections and decided to pause use of its AI engine. Some of the problems identified were minor, but the audit also found plagiarism and substantial factual errors. CNET also took heat for not disclosing that the articles were AI-generated.
Another option to mitigate misinformation is to require human reviews of AI-generated content before publication and/or to disclose AI-produced content.
Diligence in providing sufficient context in AI prompts is also important. For example, generating text by providing an AI with human-vetted context and facts will produce more accurate results than asking it to write about a topic from scratch, though it is not foolproof. It is still critical to have humans review content before publication.
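As a simple illustration of that difference, compare an ungrounded prompt with one that supplies human-vetted facts for the model to rephrase. The company name and figures below are placeholders invented for the example, not real data.

```python
# Illustrative prompts only; "Acme Corp" and its figures are invented placeholders.

ungrounded_prompt = "Write a short article about Acme Corp's 2023 results."

vetted_facts = (
    "- 2023 revenue: $1.2B (placeholder figure)\n"
    "- Headcount at year end: 4,000 (placeholder figure)\n"
)

grounded_prompt = (
    "Using ONLY the facts below, write a short article about Acme Corp's 2023 "
    "results. Do not add any figures that are not listed.\n\n"
    f"Facts:\n{vetted_facts}"
)
```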
Technology and business leaders who know how to leverage AI while overcoming its risks will be well-positioned to accelerate their internal operations and elevate product usability and discoverability to a new level. A trusted partner like FactSet can help by providing solutions that address these risks.
Throughout this six-part series, we explored a range of considerations with generative AI and provided strategies for overcoming common challenges to unlock its full potential. Consider the following toolbox of strategies for safer and more effective use of generative AI:
Awareness: A solid understanding of how LLM technology works—and how it doesn’t—is key to using it properly. This understanding will help you focus on leveraging generative AI for its strengths as a language model and for creativity, as opposed to using it for functions it is not built for, such as math, fact retrieval, or research.
Model choice: Depending on your use case, different LLMs may be more appropriate or better tuned to your specific needs.
Providing instructions: You will get better answers by adding explicit instructions to your prompts and insisting on accuracy.
Providing examples: Enhancing your prompt to include examples is another prompt engineering technique that will usually improve your results (see the sketch after this list).
Providing context: LLMs will hallucinate less often when you provide more context for your prompt question.
Validating outputs: Checking an LLM’s responses for accuracy is especially important when asking an LLM to function beyond its core strength of language manipulation.
Retrieval-Augmented Generation: When building software products with LLMs, implementing Retrieval-Augmented Generation is an important technique to ground responses with content that is accurate, current, properly licensed, and explainable.
Enterprise safety: Always use an enterprise-safe model when your prompts include non-public data. If your only option is a public model, be sure to redact any private information from your prompts.
Disclosure: Content creators publishing LLM output should include human governance and disclose content that was created by an AI model.
Attention to detail: In the age of AI, audiences will need to be even more critical readers of the online content they consume.
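As an example of the prompt-engineering items above, the sketch below pairs explicit instructions with a couple of worked examples so the model sees the expected output format before the new input. The company descriptions are invented purely for illustration.

```python
# Hypothetical few-shot prompt: explicit instructions plus worked examples show
# the model the expected output format before it sees the new input.

few_shot_prompt = """Convert each company description into a one-line summary.
Keep each summary under ten words and do not add facts.

Description: A firm that builds charting tools for equity analysts.
Summary: Charting-tool vendor for equity analysts.

Description: A startup offering payroll software for small restaurants.
Summary: Payroll software for small restaurants.

Description: A company providing satellite imagery for crop monitoring.
Summary:"""
```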
AI Strategies Series links
AI Strategies Series: How LLMs Do—and Do Not—Work
AI Strategies Series: 7 Ways to Overcome Hallucinations
AI Strategies Series: Explainability
AI Strategies Series: Inconsistent and Outdated Responses
AI Strategies Series: Security and Data Privacy
This blog post is for informational purposes only. The information contained in this blog post is not legal, tax, or investment advice. FactSet does not endorse or recommend any investments and assumes no liability for any consequence relating directly or indirectly to any action or inaction taken based on the information contained in this article.