Last updated on October 3, 2024
Technique | Institution | Date of Publication | Paper | Code |
---|---|---|---|---|
Active Prompting | The Hong Kong University of Science and Technology, University of Toronto, The University of Hong Kong, University of Illinois Urbana-Champaign | Feb 2023 | Active Prompting with Chain-of-Thought for Large Language Models | shizhediao/active-prompt |
Active Prompting (or Active-Prompt) is a technique for improving Chain-of-Thought (CoT) prompting performance by selectively human-annotating exemplars where the model shows the most uncertainty. This approach helps maximize the efficiency of human annotation efforts by focusing only on the most challenging questions for the model.
Active Prompting consists of four main steps: uncertainty estimation, selection, annotation, and inference.
Active Prompting saves significant human resources by reducing the need to annotate all training data. It outperforms other techniques such as Automatic Chain-of-Thought prompting, Random Chain-of-Thought prompting, and Self-Consistency on a range of reasoning tasks. Active Prompting research is the first to show the benefits of selective question annotation in CoT prompting for solving complex reasoning tasks.
Let's break down the Active Prompting process with an example. Assume you have a pool of $n$ unlabeled questions.
First, you prompt the model multiple times ($k$) for each unlabeled question, using either CoT prompting with a few human-written exemplars or Zero-Shot CoT prompting.
Let's say you choose the CoT option. You provide exemplars (Q1 and Q2), then ask your pool question (Q3). Repeat this process $k$ times for each question.
Q1: Josh and Anna were both born on August 17th, but in different years. To consolidate celebrations they also got married on August 17 when Josh turned 22. If today they're celebrating 30 years of marriage and their combined age is exactly 5 times what Josh's age was when they married, how old was Anna when they got married?
A1: Let's think step by step. To calculate how old Anna was when they got married, we have to know their combined age, Josh's age after 30 years, and Anna's age after 30 years from their marriage. Since their combined age is 5 times Josh's age when he got married, their combined age is 5 * 22 = 110 years. Josh must be 30 years older than his age when they got married, so he is 22 + 30 = 52 years old now. Therefore, Anna's current age will be 110 - 52 = 58 years. If they married 30 years ago, Anna must have been 58 - 30 = 28 years old when they married. The answer is 28.
Q2: John buys a chair. He then buys a table that is 3 times the price of the chair. Then, he buys a couch that is 5 times the price of the table. If John paid $380 for all these items, what is the price of the couch?
A2: Let's think step by step. To calculate the price of the couch, we need to know the price of the chair, the price of the table, and the relation between the chair, table, couch, and total money paid. Let x be the price of the chair, 3 * x be the price of the table, and 5 * (3 * x) = 15 * x be the price of the couch. The relationship between the chair, table, couch, and the total price paid is x + 3 * x + 15 * x = $380, which is 19 * x = 380, and x = 20. The price of the couch is 15 * x, which is 15 * 20 = $300. The answer is 300.
Q3: John has 5 apples, and he gives 2 to Mary. How many apples does John have left?
As a result, you get $k$ answers for each of your $n$ questions.
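The sampling step above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: `build_prompt`, `sample_answers`, and the `generate` callable (a stand-in for whatever LLM client you use, assumed to sample with a non-zero temperature so repeated calls can differ) are all hypothetical names introduced here.

```python
from typing import Callable, Dict, List, Tuple


def build_prompt(exemplars: List[Tuple[str, str]], question: str) -> str:
    """Join (question, answer) exemplars, then append the pool question."""
    blocks = [f"Q: {q}\nA: {a}" for q, a in exemplars]
    # Cue the model to reason step by step, as the exemplar answers do.
    blocks.append(f"Q: {question}\nA: Let's think step by step.")
    return "\n\n".join(blocks)


def sample_answers(
    pool: List[str],
    exemplars: List[Tuple[str, str]],
    generate: Callable[[str], str],  # hypothetical LLM call: prompt -> completion
    k: int,
) -> Dict[str, List[str]]:
    """Query the model k times per pool question and collect the completions."""
    samples: Dict[str, List[str]] = {}
    for question in pool:
        prompt = build_prompt(exemplars, question)
        samples[question] = [generate(prompt) for _ in range(k)]
    return samples
```

Calling `sample_answers(pool, exemplars, generate, k)` returns a dictionary with one entry per pool question, each holding $k$ raw completions that the uncertainty step can then compare.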
Next, you need to measure the uncertainty of the model for each question based on the $k$ answers it generates for a given question.
To do that, you select the uncertainty metric. An example metric could be disagreement:
You use the disagreement among the $k$ generated answers for a given question from the pool. Disagreement counts the unique answers among the predictions: if $h$ of the $k$ answers are distinct, the uncertainty is $u = h / k$.
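As a rough sketch of how disagreement could be computed, the snippet below takes the number of unique final answers divided by $k$, which is how the Active Prompting paper normalizes the metric. The helper names and the "The answer is ..." extraction pattern are assumptions made for this example; they match the exemplar format shown earlier but may need adjusting for other answer formats.

```python
import re
from typing import List, Optional


def extract_answer(completion: str) -> Optional[str]:
    """Pull the final number out of a 'The answer is N.' style completion."""
    match = re.search(r"The answer is\s*\$?(-?\d+(?:\.\d+)?)", completion)
    return match.group(1) if match else None


def disagreement(answers: List[Optional[str]]) -> float:
    """Unique answers divided by the number of samples (between 1/k and 1)."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    return len(set(answers)) / len(answers)


completions = [
    "John has 5 - 2 = 3 apples. The answer is 3.",
    "He gave away 2, so 2 remain. The answer is 2.",
    "5 - 2 = 3. The answer is 3.",
    "John and Mary have 7 in total. The answer is 7.",
    "The answer is 3.",
]
finals = [extract_answer(c) for c in completions]
score = disagreement(finals)  # 3 unique answers out of 5 samples
```

A question whose samples all agree scores $1/k$, the minimum, while a question with $k$ different answers scores 1. Completions with no parsable answer come back as `None` and count as one more distinct answer in this sketch.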
Then, you select the questions with the highest uncertainty based on the metric. For simplicity, let's review just one example question from the set of the most uncertain questions.
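Selection then amounts to ranking the pool by uncertainty and keeping the top few questions. The sketch below assumes you already have a mapping from each question to its uncertainty score; `select_most_uncertain` and the sample scores are hypothetical and only illustrate the ranking.

```python
from typing import Dict, List


def select_most_uncertain(uncertainty: Dict[str, float], n: int) -> List[str]:
    """Return the n questions with the highest uncertainty scores."""
    # sorted() is stable, so questions with equal scores keep their pool order.
    ranked = sorted(uncertainty, key=uncertainty.get, reverse=True)
    return ranked[:n]


scores = {
    "How many apples does John have left?": 0.6,
    "What is the price of the couch?": 0.2,
    "How old was Anna when they got married?": 1.0,
}
to_annotate = select_most_uncertain(scores, n=2)
```

Here `to_annotate` holds the two highest-scoring questions, which are the ones you would hand to a human annotator.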
Imagine you take disagreement as the uncertainty metric. You prompt the model $k$ times and find that the question below consistently yields different LLM outputs, meaning the model is uncertain about the answer:
John has 5 apples, and he gives 2 to Mary.
How many apples does John have left?
This is just one example; in practice, you would select several such questions.
You manually annotate the selected question to provide a clear, correct answer:
Q: John has 5 apples, and he gives 2 to Mary. How many apples does John have left?
A: John starts with 5 apples. He gives away 2 apples. Therefore, he has 5 - 2 = 3 apples left.
This annotated question becomes an exemplar for the model: at inference time, you place the annotated exemplars in front of each new question.
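To illustrate how the annotated question is reused, here is a minimal sketch that places annotated exemplars ahead of a new question to form the final prompt. The function name and the test question about pens are made up for this example; the exemplar is the annotated question from above.

```python
from typing import List, Tuple


def build_inference_prompt(
    annotated: List[Tuple[str, str]], test_question: str
) -> str:
    """Place human-annotated exemplars ahead of the question to be answered."""
    shots = [f"Q: {q}\nA: {a}" for q, a in annotated]
    shots.append(f"Q: {test_question}\nA:")
    return "\n\n".join(shots)


annotated_exemplars = [
    (
        "John has 5 apples, and he gives 2 to Mary. "
        "How many apples does John have left?",
        "John starts with 5 apples. He gives away 2 apples. "
        "Therefore, he has 5 - 2 = 3 apples left.",
    ),
]
prompt = build_inference_prompt(
    annotated_exemplars,
    "Sara has 7 pens and buys 4 more. How many pens does she have now?",
)
```

The resulting `prompt` ends with an open `A:` so the model continues with its own reasoning chain, modeled on the annotated exemplar.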
The code for Active Prompting is open-sourced and available for further research and implementation at shizhediao/active-prompt.
Active Prompting has demonstrated superior performance across several benchmarks, including arithmetic, commonsense, and symbolic reasoning tasks. It consistently outperforms traditional CoT and other baseline techniques, highlighting its effectiveness in enhancing LLM capabilities.
Despite its advantages, Active Prompting has some limitations:
Human Annotation Required: Some level of human involvement is needed to annotate the most uncertain questions.
Choosing the right uncertainty metric matters: The way we measure uncertainty can impact performance, so we need to pick the right one based on the task at hand.
Active Prompting improves how well Large Language Models solve complex reasoning problems. By focusing annotation on the questions the model is most uncertain about, it makes the annotation process efficient and targets the exemplars at the model's weakest areas.
Diao, S., Wang, P., Lin, Y., Pan, R., Liu, X., & Zhang, T. (2024). Active Prompting with Chain-of-Thought for Large Language Models. https://arxiv.org/abs/2302.12246