Refining through feedback: Self-Refine enhances LLM outputs by iteratively improving initial results based on model feedback.
Practical application: It is a simple three-step process: generate output, get feedback, and refine the answer, repeating until the output is satisfactory.
Performance boost: Self-Refine significantly improves performance on tasks such as code optimization and sentiment reversal, especially for larger models.
Large language models (LLMs) can solve a wide variety of tasks, yet they often fall short on intricate requirements: tasks with multiple competing objectives or hard-to-define goals. In such cases, the model's initial output may contain inaccuracies and flawed ideas.
When given a problem, humans produce an initial draft and then refine it iteratively based on self-provided feedback. For instance, when writing an email to a work colleague, you may first draft a direct request such as "Send me the data ASAP". That might be fine for a friend, but for a colleague you may feel the need to be more formal. Based on this self-provided feedback, you rephrase the email to: "Hi Ashley, could you please send me the data at your earliest convenience?"
Inspired by this human ability to refine a solution, Self-Refine prompting1 aims to improve the initial outputs of LLMs through iterative feedback and refinement. It is a three-step approach:

1. Prompt the model for an initial output.
2. Prompt the same model to provide feedback on that output.
3. Prompt the model to refine the output using its own feedback.

The process is iterative: steps 2 and 3 repeat until the model's output meets a stopping criterion. A minimal sketch of the loop follows, and the sections below then walk through each step on a concrete example.
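The loop below is a rough Python sketch of this process. The `llm` callable, the prompt wording, and the stop phrase are illustrative placeholders, not the paper's actual implementation.

```python
# Minimal sketch of the Self-Refine loop.
# `llm` is a hypothetical helper that sends a prompt to a chat model and
# returns its text response; prompts and stop phrase are illustrative.
def self_refine(task_prompt: str, llm, max_iterations: int = 4) -> str:
    output = llm(task_prompt)  # Step 1: initial generation
    for _ in range(max_iterations):
        feedback = llm(
            f"Here is an output for the task:\n{output}\n\n"
            "Give concrete feedback on how to improve it. If it cannot be "
            "improved further, reply only with 'NO FURTHER IMPROVEMENTS'."
        )  # Step 2: the same model critiques its own output
        if "NO FURTHER IMPROVEMENTS" in feedback:
            break  # stopping criterion met
        output = llm(
            f"Task:\n{task_prompt}\n\nCurrent output:\n{output}\n\n"
            f"Feedback:\n{feedback}\n\n"
            "Rewrite the output, applying the feedback."
        )  # Step 3: refine the output using the feedback
    return output
```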
Step 1: Prompt the model to get the output.
Let's prompt the model to generate Python code to find the greatest number among three given numbers.
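A first response from the model might look something like the snippet below (illustrative only; the exact output will vary from run to run).

```python
# A plausible initial attempt returned by the model.
def greatest_of_three(a, b, c):
    if a >= b and a >= c:
        return a
    elif b >= a and b >= c:
        return b
    else:
        return c

print(greatest_of_three(3, 7, 5))  # prints 7
```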
Step 2: Get feedback
Send the output back to the same model to get feedback. If the code cannot be improved any further, ask the model to say so; this is the stopping criterion.
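A feedback prompt for this step might look roughly like the following (the wording and stop phrase are illustrative):

```
Here is Python code that finds the greatest of three numbers:

<code from Step 1>

Give feedback on how this code can be improved. If the code cannot be
improved any further, say "The code cannot be improved further."
```

For the initial snippet above, the model might, for example, suggest replacing the chain of comparisons with Python's built-in max function.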
Step 3: Implement feedback
Prompt the LLM to use its feedback and improve the existing code.
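Continuing the hypothetical example, if the feedback suggested using the built-in max function, the refined output might be:

```python
# A possible refined version after applying the model's feedback.
def greatest_of_three(a, b, c):
    return max(a, b, c)

print(greatest_of_three(3, 7, 5))  # prints 7
```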
Step 4: Ask for more feedback.
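Sending the same feedback prompt from Step 2 with the refined code, the model might now simply return the agreed stop phrase, for example:

```
The code cannot be improved further.
```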
Since no more improvements are necessary, we stop the iteration.
Employing Self-Refine boosts model performance: across all evaluated tasks, its outputs are preferred over those produced by the same model with conventional one-step prompting. The results are summarized below:
Self-refine results on various tasks using the latest models from OpenAI1
While Self-Refine shows remarkable performance gains across a variety of tasks, there are a few limitations to this approach: it relies on the underlying model being capable enough to critique and improve its own output (which is why gains are largest for larger models), and each round of feedback and refinement adds extra inference calls and cost.
Self-Refine enables LLMs to iteratively refine their own output without the need for labeled data, additional training, or a separate language model. The technique is simple and can be applied across a wide variety of tasks, including code optimization, code readability improvement, math reasoning, and acronym generation.
Aman Madaan et al. (2023). Self-Refine: Iterative Refinement with Self-Feedback. https://arxiv.org/abs/2303.17651