Zero-Shot Chain-of-Thought (CoT) prompting [1] improves the reasoning and accuracy of large language models (LLMs) by having them generate intermediate reasoning steps before predicting the final answer. While this improves performance on multi-step reasoning tasks, experiments show that Zero-Shot CoT has three pitfalls: calculation errors, missing-step errors, and semantic misunderstanding errors.
Figure: Categorization of errors made by Zero-Shot CoT [2]
Plan-and-Solve (PS) prompting [2] aims to fix the missing-step errors of Zero-Shot Chain-of-Thought (CoT) prompting. To do so, PS prompting replaces the phrase "Let's think step by step" used in Zero-Shot CoT with "Let's first understand the problem and devise a plan to solve the problem. Then, let's carry out the plan and solve the problem step by step". The example below shows the difference between Zero-Shot CoT and PS prompting.
Q: In a dance class of 20 students, 20% enrolled in contemporary dance, 25% of the remaining enrolled in jazz dance, and the rest enrolled in hip-hop dance. What percentage of the entire students enrolled in hip-hop dance?
A: Let's think step by step.
First, 20% of the class enrolled in contemporary dance, so that means that 80% of the class is left.
Next, 25% of the remaining 80% enrolled in jazz dance, so that means that 25% of the original class enrolled in jazz dance. That leaves 55% of the original class.
Finally, the rest of the class (55%) enrolled in hip-hop dance. So, 55% of the entire class enrolled in hip-hop dance.
Q: In a dance class of 20 students, 20% enrolled in contemporary dance, 25% of the remaining enrolled in jazz dance, and the rest enrolled in hip-hop dance. What percentage of the entire students enrolled in hip-hop dance?
A: Let's first understand the problem and devise a plan to solve the problem. Then, let's carry out the plan and solve the problem step by step.
Plan:
Step 1: Calculate the total number of students who enrolled in contemporary and jazz dance.
Step 2: Calculate the total number of students who enrolled in hip-hop dance.
Step 3: Calculate the percentage of students who enrolled in hip-hop dance.
Solution:
Step 1: 20% of 20 students is 4 students. 25% of the remaining 16 students is 4 students. So, a total of 8 students enrolled in contemporary and jazz dance.
Step 2: The remaining 12 students enrolled in hip-hop dance.
Step 3: The percentage of students who enrolled in hip-hop dance is 12/20 = 60%.
The example shows that, for complex problems, the Zero-Shot CoT prompt "Let's think step by step" can still produce incorrect reasoning: the model skips the step of converting 25% of the remaining students into a share of the whole class and arrives at 55% instead of 60%. Asking the LLM to first devise a plan and then carry it out, as PS prompting does, helps reduce such missing-step errors.
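To make the comparison concrete, the sketch below builds both prompts from the trigger sentences quoted above and recomputes the dance-class answer independently. The helper names `build_prompt` and `hip_hop_percentage` are hypothetical and introduced only for illustration; the "Q: ... A: ..." layout is assumed from the example.

```python
from fractions import Fraction

# Trigger sentences quoted in the text above.
ZERO_SHOT_COT_TRIGGER = "Let's think step by step."
PS_TRIGGER = (
    "Let's first understand the problem and devise a plan to solve the problem. "
    "Then, let's carry out the plan and solve the problem step by step."
)


def build_prompt(question: str, trigger: str) -> str:
    """Format a zero-shot prompt: the question, then the trigger as the answer prefix."""
    return f"Q: {question}\nA: {trigger}"


def hip_hop_percentage(total_students: int) -> Fraction:
    """Reference arithmetic for the dance-class problem, following the PS plan."""
    contemporary = Fraction(20, 100) * total_students   # 20% of the whole class
    remaining = total_students - contemporary           # students left after contemporary
    jazz = Fraction(25, 100) * remaining                # 25% of the remaining students
    hip_hop = total_students - contemporary - jazz      # everyone else
    return 100 * hip_hop / total_students               # share of the whole class, in percent


if __name__ == "__main__":
    question = "In a dance class of 20 students, ... enrolled in hip-hop dance?"
    print(build_prompt(question, ZERO_SHOT_COT_TRIGGER))
    print(build_prompt(question, PS_TRIGGER))
    print(hip_hop_percentage(20))  # 60, matching the PS solution
```

The only difference between the two prompts is the trigger string, which is why PS prompting can be swapped into an existing Zero-Shot CoT setup without other changes.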
It is important to note that the zero-shot prompt in PS prompting can be anything as long as it satisfies the following conditions:
Implementing PS prompting is a two-step process:
1. Feed the prompt to the LLM. This generates the reasoning steps and an answer.
2. Pass the initial prompt together with the generated reasoning back to the LLM to extract the final answer. To do this, append a phrase such as "Therefore, the answer (arabic numerals) is" to the end of the prompt.
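The two steps can be sketched as follows. This is a minimal illustration, not a reference implementation: `llm` is a hypothetical callable that maps a prompt string to a completion string, and `fake_llm` is a stand-in that returns canned text so the sketch runs without a model. Replace it with a call to whichever model client you use.

```python
from typing import Callable

PS_TRIGGER = (
    "Let's first understand the problem and devise a plan to solve the problem. "
    "Then, let's carry out the plan and solve the problem step by step."
)
ANSWER_TRIGGER = "Therefore, the answer (arabic numerals) is"


def plan_and_solve(question: str, llm: Callable[[str], str]) -> str:
    """Run the two-step PS pipeline with any prompt -> completion callable."""
    # Step 1: ask for a plan and its step-by-step execution.
    reasoning_prompt = f"Q: {question}\nA: {PS_TRIGGER}"
    reasoning = llm(reasoning_prompt)

    # Step 2: feed the prompt plus the generated reasoning back and
    # append the answer-extraction phrase.
    extraction_prompt = f"{reasoning_prompt}\n{reasoning}\n{ANSWER_TRIGGER}"
    return llm(extraction_prompt).strip()


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model; returns canned text only."""
    if prompt.endswith(ANSWER_TRIGGER):
        return " 60"
    return "Plan: ... Solution: ... 12/20 = 60%."


if __name__ == "__main__":
    print(plan_and_solve("What percentage enrolled in hip-hop dance?", fake_llm))
```

Keeping the model behind a plain callable means the same function works with any client, and the second call always sees the full reasoning from the first.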
While PS prompting helps reduce missing-step errors, PS+ prompting, an extension of PS prompting, also aims to reduce the calculation errors of Zero-Shot CoT and to improve the quality of the generated reasoning steps. The PS+ prompt extends the PS prompt with additional details that instruct the LLM to compute intermediate results and pay more attention to calculation and common sense.
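As a rough sketch of how the two prompts relate, the snippet below derives a PS+ style prompt from the PS trigger. The wording of `PS_PLUS_EXTRAS` is an assumption written for illustration from the description above; it is not the exact trigger sentence from the original work, and `build_ps_prompt` is a hypothetical helper.

```python
PS_TRIGGER = (
    "Let's first understand the problem and devise a plan to solve the problem. "
    "Then, let's carry out the plan and solve the problem step by step."
)

# Illustrative wording only (an assumption): extra instructions in the
# spirit of PS+, asking for intermediate results and careful calculation.
PS_PLUS_EXTRAS = (
    "While carrying out the plan, calculate intermediate results and "
    "pay attention to calculation and common sense."
)


def build_ps_prompt(question: str, plus: bool = False) -> str:
    """Build a PS prompt, or a PS+ style prompt when plus=True."""
    trigger = f"{PS_TRIGGER} {PS_PLUS_EXTRAS}" if plus else PS_TRIGGER
    return f"Q: {question}\nA: {trigger}"


if __name__ == "__main__":
    question = "Grace weighs 125 pounds. ... What are their combined weights in pounds?"
    print(build_ps_prompt(question))
    print(build_ps_prompt(question, plus=True))
```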
Consider the following math word problem: "Grace weighs 125 pounds. Alex weighs 2 pounds less than 4 times what Grace weighs. What are their combined weights in pounds?" Let's solve it using both PS and PS+ prompting.
First, let's try with PS prompting:
In this example, PS prompting fails to correctly evaluate the expression 125 + (4 * 125 - 2), resulting in an incorrect answer. Now, let's employ PS+ prompting.
The only difference between PS and PS+ prompting is that the PS+ prompt explicitly instructs the LLM to pay attention to calculations and intermediate results. As a result, PS+ prompting reduces both calculation errors and missing-step errors.
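For reference, the intermediate results that a PS+ style answer should surface for the Grace and Alex problem can be checked directly. The function names below are hypothetical and exist only for this check.

```python
def alex_weight(grace: int) -> int:
    """Alex weighs 2 pounds less than 4 times what Grace weighs."""
    return 4 * grace - 2


def combined_weight(grace: int) -> int:
    """Combined weight of Grace and Alex, in pounds."""
    return grace + alex_weight(grace)


if __name__ == "__main__":
    grace = 125
    print(alex_weight(grace))      # 498
    print(combined_weight(grace))  # 623
```

Writing out the intermediate value (Alex's weight) before the final sum mirrors what the PS+ prompt asks the model to do.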
There are two major limitations to the PS prompting approach:
Plan-and-Solve prompting techniques (PS and PS+) help reduce missing-step errors and calculation errors by using carefully crafted prompts that make the model think and plan before solving the problem. The approach has proven effective across several task types: math reasoning, common sense reasoning, and symbolic reasoning. However, it relies heavily on the trigger sentence in the prompt, and finding the best trigger sentence can be challenging because the search is manual.