
🟦 Self-Refine Prompting

Last updated on September 27, 2024 by Bhuwan Bhatt
Takeaways
  • Refining through feedback: Self-Refine enhances LLM outputs by iteratively improving initial results based on model feedback.

  • Practical application: It is a simple three-step process: generate output, get feedback, and refine the answer, repeating until the output is satisfactory.

  • Performance boost: Self-Refine significantly improves performance on tasks such as code optimization and sentiment analysis, especially for larger models.

What is Self-Refine Prompting?

Large language models (LLMs) can solve a wide variety of tasks, but they often fall short on intricate requirements: tasks with multiple objectives or goals that are hard to define. In such cases, the initial output from the LLM may contain inaccuracies or flawed reasoning.

When given a problem, humans come up with an initial draft and then iteratively refine it based on self-provided feedback. For instance, when writing an email to a colleague, you may first write a direct request such as "Send me the data ASAP". This may be fine to send to a friend, but with a colleague you may feel the need to be more formal. Based on this self-provided feedback, you may rephrase the email as: "Hi Ashley, could you please send me the data at your earliest convenience?"

Inspired by humans' ability to refine a solution, Self-Refine prompting [1] aims to improve the initial outputs from LLMs through iterative feedback and refinement. It is a three-step approach:

  1. Initial output: Prompt the model to get the initial output.
  2. Feedback: Pass the prompt and the initial output back to the model to get feedback.
  3. Refinement: Pass the output and the feedback back to the model to get a refined output.

The process is iterative: steps 2 and 3 repeat until the model output meets the stopping criterion.
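
The loop described above can be sketched in Python as follows. This is a minimal sketch, not the paper's implementation: the function name `self_refine`, the `llm` callable, the prompt wording, the `STOP_PHRASE` marker, and the `max_iters` cap are all assumptions made for illustration. Here `llm` stands for any function that takes a prompt string and returns the model's text reply.

```python
from typing import Callable

# Hypothetical marker the model is asked to emit when no changes are needed.
STOP_PHRASE = "NO FURTHER IMPROVEMENTS"


def self_refine(task: str, llm: Callable[[str], str], max_iters: int = 3) -> str:
    """Generate an answer, then alternate feedback and refinement.

    `llm` is an assumed interface: any function mapping a prompt to a reply.
    """
    # Step 1: initial output.
    output = llm(task)

    for _ in range(max_iters):
        # Step 2: ask the same model to critique its own output.
        feedback = llm(
            f"Task: {task}\n\nOutput:\n{output}\n\n"
            "Give concrete feedback on how to improve this output. "
            f"If nothing needs to change, reply with: {STOP_PHRASE}"
        )
        # Stopping criterion: the model reports nothing left to improve.
        if STOP_PHRASE in feedback.upper():
            break

        # Step 3: ask the model to rewrite the output using the feedback.
        output = llm(
            f"Task: {task}\n\nPrevious output:\n{output}\n\n"
            f"Feedback:\n{feedback}\n\n"
            "Rewrite the output so that it addresses the feedback."
        )

    return output
```

The `max_iters` cap is a design choice added here so the loop always terminates, even if the model never emits the stop phrase.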

How to Use Self-Refine Prompting?

Step 1: Prompt the model to get the output.
Let's prompt the model to generate Python code to find the greatest number among three given numbers.
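
The model's original response is not reproduced here. As an illustration only, a first draft might look like the following. The function name `greatest_of_three` and the code itself are hypothetical, written to show a correct but verbose starting point that leaves room for feedback.

```python
def greatest_of_three(a, b, c):
    """Return the largest of three numbers (illustrative first draft)."""
    # Verbose on purpose: nested comparisons leave room for later feedback.
    if a >= b:
        if a >= c:
            return a
        else:
            return c
    else:
        if b >= c:
            return b
        else:
            return c


print(greatest_of_three(3, 9, 4))  # prints 9
```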

Step 2: Get feedback.
Send the output back to the same model and ask for feedback. Also ask the model to say so explicitly if the code cannot be improved any further; that reply serves as the stopping criterion.
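
One way to phrase the feedback request is sketched below. The helper names `build_feedback_prompt` and `is_finished`, the prompt wording, and the stop phrase are assumptions for illustration, not the exact prompt used in this article or in the paper.

```python
STOP_PHRASE = "NO FURTHER IMPROVEMENTS"  # hypothetical stop marker


def build_feedback_prompt(task: str, code: str) -> str:
    """Build a prompt asking the model to review its own code."""
    return (
        f"You wrote the following code for this task: {task}\n\n"
        f"{code}\n\n"
        "Suggest specific improvements to readability and efficiency. "
        f"If the code cannot be improved, reply only with: {STOP_PHRASE}"
    )


def is_finished(feedback: str) -> bool:
    """Return True when the model's feedback signals the stopping criterion."""
    return STOP_PHRASE in feedback.upper()
```

Using a fixed stop phrase keeps the check simple: the caller only needs a substring match on the model's reply to decide whether to run another round.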

Step 3: Implement feedback.
Prompt the LLM to use its feedback to improve the existing code.
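
As an illustration, suppose the feedback suggested replacing hand-written comparisons with Python's built-in `max`. A refined answer could then look like this. Both the feedback and the function name `greatest_of_three` are hypothetical, not the article's actual model output.

```python
def greatest_of_three(a, b, c):
    """Return the largest of three numbers using the built-in max()."""
    return max(a, b, c)


print(greatest_of_three(3, 9, 4))  # prints 9
```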

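Step 4: Ask for more feedback.
Send the refined code back to the model for another round of feedback.

In this example, the model reports that no further improvements are necessary, so we stop the iteration.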

What Are the Results of Self-Refine Prompting?

In the paper's evaluation [1], outputs produced with Self-Refine were preferred over those produced by the same model with conventional one-step generation across the evaluated tasks. Some notable results include:

  • GPT-4's performance increases by 8.7 units for code optimization when augmented with Self-Refine.
  • Self-Refine improves performance in code readability by at least 13.9 units.
  • Self-Refine improves performance in sentiment reversal tasks by at least 21.6 units.

Figure: Self-Refine results on various tasks using the latest models from OpenAI [1]

Limitations of Self-Refine Prompting

While Self-Refine shows notable performance gains across a variety of tasks, the approach has a few limitations:

  • The base model must be capable of following the instructions it is given, so weaker LMs may not benefit from this approach.
  • The results are based on experiments with English-language datasets, so performance in other languages may differ.
  • Bad actors can use the technique to steer the model into generating toxic or harmful text.

Conclusion

Self-Refine enables LLMs to iteratively refine their own output without the need for labeled data, additional training, or a separate language model. The technique is simple and can be used across a wide variety of tasks, including code optimization, code readability, math reasoning, and acronym generation.

Footnotes

  1. Madaan, A., et al. (2023). Self-Refine: Iterative Refinement with Self-Feedback. https://arxiv.org/abs/2303.17651

Copyright © 2024 Learn Prompting.