- Importance of Self-Calibration: Self-calibration allows LLMs to evaluate their own answers, reducing misinformation.
- Practical Implementation: You can use self-calibration by prompting the LLM to assess its own response for correctness.
- Effectiveness in Larger Models: Larger and more complex models perform better at self-calibration, improving accuracy.
If you ask a large language model (LLM) a question, in most cases it will generate an answer, whether correct or incorrect. This is undesirable because it can propagate misinformation to users. Air Canada recently had to compensate a customer after its chatbot gave incorrect information about bereavement fares. Such mistakes can cost companies significant money and tarnish their reputation.
Self-Calibration prompting¹ is a self-evaluation technique that asks the model to evaluate its output after generating it. Experiments show that, while challenging, models can evaluate whether their own answers are true or false.
Employing self-calibration is a two-step process. Let's see an example; a minimal code sketch follows the steps.
Step 1:
Ask the model, "Who is the first president of the United States?"
Step 2:
Ask the model to evaluate whether its proposed answer is true or false.
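Below is a minimal sketch of this two-step flow in Python. It assumes the OpenAI Python SDK and an illustrative model name (`gpt-4o-mini`); the exact prompt wording is our own choice, not prescribed by the paper.

```python
# Minimal sketch of two-step self-calibration.
# Assumes the OpenAI Python SDK (`pip install openai`) and that
# OPENAI_API_KEY is set in the environment; the model name is illustrative.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    """Send a single-turn prompt and return the model's text reply."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Step 1: get the initial answer.
question = "Who is the first president of the United States?"
answer = ask(question)

# Step 2: ask the model to evaluate its own answer.
evaluation = ask(
    f"Question: {question}\n"
    f"Proposed answer: {answer}\n"
    "Is the proposed answer true or false? Answer 'True' or 'False'."
)

print("Answer:", answer)
print("Self-evaluation:", evaluation)
```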
You can also use Few-Shot Prompting for the validation step (see the sketch after these steps):
Step 1: Get the initial response
Step 2: Validate the response
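Here is a sketch of the validation step with few-shot demonstrations prepended, reusing the `ask()` helper, `question`, and `answer` from the sketch above. The demonstration pairs are invented for illustration; in practice you would pick examples from your own domain.

```python
# Few-shot variant of the self-evaluation step. The two demonstration
# pairs below are invented for illustration.
few_shot_prompt = (
    "Question: What is the capital of France?\n"
    "Proposed answer: Paris\n"
    "Is the proposed answer true or false? True\n\n"
    "Question: How many legs does a spider have?\n"
    "Proposed answer: Six\n"
    "Is the proposed answer true or false? False\n\n"
    f"Question: {question}\n"
    f"Proposed answer: {answer}\n"
    "Is the proposed answer true or false?"
)

print("Self-evaluation:", ask(few_shot_prompt))
```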
Figure: Self-evaluation results for Lambada 52B¹

Figure: Larger models make fewer self-calibration errors¹
The study shows that LLMs are capable of evaluating their own responses with a simple prompt. This can be used to reduce false positive and false negative responses from the model, and it helps establish confidence in the model's answers.
1. Kadavath, S., et al. (2022). Language Models (Mostly) Know What They Know. https://arxiv.org/abs/2207.05221