What's in a Prompt?
When crafting prompts for large language models (LLMs), there are several factors to consider. The format of the exemplars and the labelspace both play crucial roles in the effectiveness of the prompt.
The Importance of Format
The format of the exemplars in a prompt is crucial. It shows the LLM how to structure its response. For instance, if the exemplars give their answers as words in all capital letters, the LLM will follow suit, even if the answers in the exemplars are incorrect.
Consider the following example:
What is 2+2?
FIFTY
What is 20+5?
FORTY-THREE
What is 12+9?
TWENTY-ONE
Despite the incorrect answers in the first two exemplars, the LLM formats its response to the final question (TWENTY-ONE) in all capital letters, matching the exemplars.
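A prompt like the one above can be assembled programmatically. The following is a minimal Python sketch, not a prescribed method: the helper name build_prompt and the list-of-pairs layout are assumptions made for illustration, and the call that would send the prompt to a model is left out.

```python
def build_prompt(exemplars, question):
    """Join (question, answer) exemplar pairs, then append the open question."""
    lines = []
    for exemplar_question, exemplar_answer in exemplars:
        lines.append(exemplar_question)
        lines.append(exemplar_answer)
    # The final question has no answer; the model is expected to supply it.
    lines.append(question)
    return "\n".join(lines)


# The answers are deliberately wrong; only their all-caps format matters here.
exemplars = [
    ("What is 2+2?", "FIFTY"),
    ("What is 20+5?", "FORTY-THREE"),
]
prompt = build_prompt(exemplars, "What is 12+9?")
print(prompt)
```

Keeping the exemplars as data makes it easy to change the answer format in one place and see how the model's output format changes with it.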
Ground Truth: Not as Important as You Might Think
Interestingly, the actual answers or 'ground truth' in the exemplars are not as important as one might think. Research shows that providing random labels in the exemplars (as seen in the above example) has little impact on performance. This means that the LLM can still generate a correct response even if the exemplars contain incorrect information.
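One way to check this claim on your own task is to build a second exemplar set whose labels are drawn at random and compare results against the correctly labeled set. The sketch below only constructs the randomized exemplars; the function name randomize_labels, the sample reviews, and the fixed seed are all hypothetical choices for illustration.

```python
import random


def randomize_labels(exemplars, labels, seed=0):
    """Return copies of the exemplars with each label replaced by a random one.

    `labels` is the list of allowed labels for the task. A fixed seed keeps
    the randomized set reproducible between runs.
    """
    rng = random.Random(seed)
    return [(text, rng.choice(labels)) for text, _ in exemplars]


# Hypothetical sentiment exemplars with their true labels.
gold = [
    ("The pasta was wonderful.", "positive"),
    ("Service was painfully slow.", "negative"),
    ("Best brunch in town.", "positive"),
    ("My soup arrived cold.", "negative"),
]
randomized = randomize_labels(gold, ["positive", "negative"])
```

The inputs are unchanged and every label still comes from the allowed set, so the format of the exemplars is preserved while their correctness is not.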
The Role of Labelspace
While the ground truth may not be crucial, the labelspace is. The labelspace refers to the list of possible labels for a given task. For example, in a classification task, the labelspace might include "positive" and "negative".
Providing random labels from the labelspace in the exemplars can help the LLM understand the labelspace better, leading to improved results. Furthermore, it's important to represent the distribution of the labels accurately in the exemplars. Instead of sampling uniformly from the labelspace, it's better to sample according to the true distribution of the labels. For example, if you have a dataset of restaurant reviews and 60% of them are positive, your prompt should contain a 3:2 ratio of positive to negative exemplars.
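The 3:2 ratio can be produced by allocating exemplar slots per label before sampling. The following Python sketch assumes a pool of labeled reviews and a known label distribution; the function name sample_exemplars, the pool contents, and the seed are hypothetical. Simple rounding is used, so for some distributions the per-label counts may not sum exactly to the requested total.

```python
import random
from collections import Counter


def sample_exemplars(pool, label_distribution, k, seed=0):
    """Pick about k exemplars so label counts follow label_distribution.

    pool: list of (text, label) pairs.
    label_distribution: dict mapping each label to its fraction of the data.
    """
    rng = random.Random(seed)
    chosen = []
    for label, fraction in label_distribution.items():
        count = round(fraction * k)  # e.g. 0.6 * 5 gives 3 slots
        candidates = [ex for ex in pool if ex[1] == label]
        # rng.sample raises ValueError if the pool has too few candidates.
        chosen.extend(rng.sample(candidates, count))
    rng.shuffle(chosen)  # avoid grouping all exemplars of one label together
    return chosen


# Hypothetical pool of restaurant reviews.
pool = [
    ("Great tacos.", "positive"),
    ("Lovely patio.", "positive"),
    ("Friendly staff.", "positive"),
    ("Fresh sushi.", "positive"),
    ("Overpriced.", "negative"),
    ("Cold fries.", "negative"),
    ("Rude host.", "negative"),
]
picked = sample_exemplars(pool, {"positive": 0.6, "negative": 0.4}, k=5)
label_counts = Counter(label for _, label in picked)
```

With five exemplars and a 60/40 split, this yields three positive and two negative exemplars, matching the ratio described above.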
Additional Tips
When creating prompts, using between 4 and 8 exemplars tends to yield good results. However, it can often be beneficial to include as many exemplars as possible.
In conclusion, understanding the importance of format, ground truth, and labelspace can greatly enhance the effectiveness of your prompts.
Sander Schulhoff
Sander Schulhoff is the CEO of HackAPrompt and Learn Prompting. He created the first Prompt Engineering guide on the internet, two months before ChatGPT was released, which has taught 3 million people how to prompt ChatGPT. He also partnered with OpenAI to run the first AI Red Teaming competition, HackAPrompt, which was 2x larger than the White House's subsequent AI Red Teaming competition. Today, HackAPrompt partners with the Frontier AI labs to produce research that makes their models more secure. Sander's background is in Natural Language Processing and deep reinforcement learning. He recently led the team behind The Prompt Report, the most comprehensive study of prompt engineering ever done. This 76-page survey, co-authored with OpenAI, Microsoft, Google, Princeton, Stanford, and other leading institutions, analyzed 1,500+ academic papers and covered 200+ prompting techniques.
