🟢 Self-Calibration Prompting

Last updated on September 27, 2024

Bhuwan Bhatt

Takeaways

Importance of Self-Calibration: Self-Calibration allows LLMs to evaluate their own answers, reducing misinformation.
Practical Implementation: You can use Self-Calibration by prompting the LLM to assess its own response for correctness.
Effectiveness in Larger Models: Larger and more complex models perform better at Self-Calibration, improving accuracy.

What is Self-Calibration Prompting?

If you ask a Large Language Model (LLM) a question, it will almost always generate an answer, whether correct or incorrect. This is undesirable because it can pass incorrect information on to users. In a recent case, Air Canada had to compensate a customer after its chatbot provided incorrect information about bereavement fares. Such errors can cost companies significant money and tarnish their reputation.

Self-Calibration prompting [1] is a self-evaluation technique that asks the model to evaluate its own output after generating it. Experiments show that, although the task is challenging, models can often classify their own answers as true or false.

How to Use Self-Calibration Prompting?

Employing Self-Calibration is a two-step process:

1. Get the initial answer.
2. Ask the model whether the proposed answer is true or false.

Let's see an example.
Step 1:
Ask the model, "Who was the first president of the United States?"

Step 2:
Ask the model to self-evaluate if the proposed answer is true or false.
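
The two steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: `generate` is a hypothetical stand-in for whatever function sends a prompt to your LLM and returns its text, the wording of the evaluation prompt is an assumed true/false template, and `fake_generate` returns canned strings only so that the example runs without an API.

```python
from typing import Callable, Optional, Tuple


def build_eval_prompt(question: str, proposed_answer: str) -> str:
    """Build the Step 2 prompt asking the model to judge a proposed answer."""
    return (
        f"Question: {question}\n"
        f"Proposed Answer: {proposed_answer}\n"
        "Is the proposed answer:\n"
        "(A) True\n"
        "(B) False\n"
        "The proposed answer is:"
    )


def parse_verdict(reply: str) -> Optional[bool]:
    """Map the model's reply to True/False, or None if it is unclear."""
    text = reply.strip().lower().lstrip("(").strip()
    if text.startswith(("a)", "true")) or text == "a":
        return True
    if text.startswith(("b)", "false")) or text == "b":
        return False
    return None


def self_calibrate(
    question: str, generate: Callable[[str], str]
) -> Tuple[str, Optional[bool]]:
    """Step 1: get an answer. Step 2: ask the model to evaluate it."""
    answer = generate(question).strip()
    verdict = parse_verdict(generate(build_eval_prompt(question, answer)))
    return answer, verdict


def fake_generate(prompt: str) -> str:
    """Hypothetical model stub with canned replies (not real model output)."""
    if "Proposed Answer:" in prompt:
        return "(A) True"
    return "George Washington"


if __name__ == "__main__":
    question = "Who was the first president of the United States?"
    print(self_calibrate(question, fake_generate))
```

In practice you would replace `fake_generate` with a call to your own model, and decide how to handle a `None` verdict, for example by asking again or by treating the answer as unverified.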

You can also use Few-Shot Prompting:

Step 1: Get the initial response

Step 2: Validate the response
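
A few-shot version of the validation step could be built as in the sketch below. The demonstration examples and the prompt template are assumptions made for illustration and are not taken from the paper; the idea is only that a few already-judged question and answer pairs precede the pair you want the model to evaluate.

```python
from typing import List, Tuple

# Each example: (question, proposed answer, whether that answer is correct).
# These demonstrations are illustrative placeholders.
Example = Tuple[str, str, bool]

FEW_SHOT_EXAMPLES: List[Example] = [
    ("What is the capital of France?", "Paris", True),
    ("How many legs does a spider have?", "Six", False),
]


def format_block(question: str, answer: str, verdict: str = "") -> str:
    """Format one evaluation block; the verdict is left empty for the query."""
    block = (
        f"Question: {question}\n"
        f"Proposed Answer: {answer}\n"
        "Is the proposed answer:\n"
        "(A) True\n"
        "(B) False\n"
        "The proposed answer is:"
    )
    return f"{block} {verdict}" if verdict else block


def build_few_shot_eval_prompt(
    question: str, answer: str, examples: List[Example]
) -> str:
    """Prepend judged examples to the block the model should complete."""
    shots = [
        format_block(q, a, "(A) True" if correct else "(B) False")
        for q, a, correct in examples
    ]
    shots.append(format_block(question, answer))
    return "\n\n".join(shots)


if __name__ == "__main__":
    print(
        build_few_shot_eval_prompt(
            "Who was the first president of the United States?",
            "George Washington",
            FEW_SHOT_EXAMPLES,
        )
    )
```

Mixing true and false demonstrations, as here, keeps the examples from nudging the model toward always choosing one option.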

What Are Self-Calibration Prompting Results?

Models are generally able to self-evaluate their own samples. In most cases, they can correctly identify whether their predictions are correct or incorrect.

Figure: Self-evaluation results on Lambada with a 52B model [1]
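
To check how well self-evaluation works on your own data, you can compare the model's verdicts with ground-truth labels. The sketch below is one simple way to do that, not the paper's evaluation method; the function names are hypothetical and the records in the usage section are made-up placeholder values, not results from the paper.

```python
from typing import List, Tuple

# Each record: (model judged its answer as true, answer was actually correct)
Record = Tuple[bool, bool]


def self_eval_accuracy(records: List[Record]) -> float:
    """Fraction of answers where the model's verdict matches the truth."""
    if not records:
        raise ValueError("records must not be empty")
    agree = sum(1 for said_true, correct in records if said_true == correct)
    return agree / len(records)


def count_errors(records: List[Record]) -> Tuple[int, int]:
    """Return (wrong answers judged true, correct answers judged false)."""
    false_accepts = sum(1 for said, correct in records if said and not correct)
    false_rejects = sum(1 for said, correct in records if not said and correct)
    return false_accepts, false_rejects


if __name__ == "__main__":
    # Placeholder data for illustration only.
    records = [
        (True, True),
        (True, False),
        (False, False),
        (False, True),
        (True, True),
    ]
    print(self_eval_accuracy(records))
    print(count_errors(records))
```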

Larger models are better at Self-Calibration.

Figure: Larger models make fewer Self-Calibration errors [1]

Limitations of Self-Calibration

The authors focus on pre-trained language models and exclude fine-tuned models. Hence, the results may not carry over to fine-tuned models.

Conclusion

The study shows that LLMs can evaluate their own responses using a simple prompt. This can be used to reduce both the number of incorrect answers that are accepted and the number of correct answers that are rejected. It also helps establish how much confidence to place in the model's output.

Footnotes

[1] Saurav Kadavath et al. (2022). Language Models (Mostly) Know What They Know. https://arxiv.org/abs/2207.05221

Bhuwan Bhatt

Bhuwan Bhatt, a Machine Learning Engineer with over 5 years of industry experience, is passionate about solving complex challenges at the intersection of machine learning and Python programming. Bhuwan has contributed his expertise to leading companies, driving innovation in AI/ML projects. Beyond his professional endeavors, Bhuwan is deeply committed to sharing his knowledge and experiences with others in the field. He firmly believes in continuous improvement, striving to grow by 1% each day in both his technical skills and personal development.
