large language models (LLMs) like ChatGPT and GPT-4 have transformed how we interact with technology. From answering questions to helping write essays or code, These models excel at interpreting, generating, and mimicking human language. However, while they seem almost magical in what they can do, LLMs have several limitations that it's important to understand. Knowing these limitations can help you use them better and avoid common problems.
In this doc, we’ll walk you through the key challenges LLMs face, and how to work around them:
Before diving into the limitations, let's quickly recap what LLMs are. LLMs are AI models that are trained to understand and generate human-like text. They can answer questions, hold conversations, write content, and much more. They work by predicting what comes next in a sentence based on patterns they’ve learned from vast amounts of text data (like books, websites, and articles).
While LLMs are impressive, they aren't perfect. Let’s explore some of their main limitations:
One weird thing about LLMs is that when they don’t know the answer, they often won’t admit it. Instead, they’ll confidently make up something that sounds believable. This is called a "hallucination." For example, if you ask for a fact about a historical event that wasn’t in the data it was trained on, the LLM might invent details or events that never happened.
Even though LLMs can seem very smart, they often struggle with basic math. This is because they weren’t really designed to solve math problems. While LLMs are good at understanding and generating sentences, they’re not great at solving complex problems. For example, if you ask an LLM to solve a multi-step math problem or a puzzle, it might get confused and make mistakes along the way.
Each time you use an LLM, it starts with a blank slate—it doesn’t remember your previous conversations unless you remind it in the current session. This can be frustrating if you’re trying to have an ongoing discussion or work on a project over time.
LLMs are trained on data from the past. It means that if LLMs don’t have access to the internet or any way to look up information in real time, they don’t know anything that happened after their training data was collected. If you ask about recent events, they won’t be able to provide accurate answers.
LLMs learn from the text they’re trained on, and that text comes from the internet, a place that can contain biased, harmful, or prejudiced content. As a result, LLMs can sometimes reflect the same biases in their responses. For example, they might produce content that is sexist, racist, or otherwise problematic.
LLMs can be tricked or “hacked” by clever users who know how to manipulate prompts. This is called prompt hacking. For example, someone might be able to word a prompt in such a way that it gets the LLM to generate inappropriate or harmful content, even if the system is supposed to block such responses.
How to handle it: When using LLMs in public or for others to interact with, make sure there are filters and safety measures in place to prevent inappropriate use.
While LLMs have clear limitations, there are ways to mitigate their impact:
LLMs are incredibly powerful tools, but they’re far from perfect. Understanding their limitations—such as their tendency to make things up, and their struggles with bias and math—will help you use them more effectively. As AI continues to evolve, these issues will likely improve, but for now, it’s important to be aware of them and use LLMs responsibly.
LLMs can struggle with things like citing sources, showing bias, generating false information (hallucinations), performing math, and being manipulated (prompt hacking).
Knowing their limitations helps you use LLMs more effectively and avoid errors when working with them, especially in important tasks where accuracy matters.
Valeriia Kuka, Head of Content at Learn Prompting, is passionate about making AI and ML accessible. Valeriia previously grew a 60K+ follower AI-focused social media account, earning reposts from Stanford NLP, Amazon Research, Hugging Face, and AI researchers. She has also worked with AI/ML newsletters and global communities with 100K+ members and authored clear and concise explainers and historical articles.