🟦 Plan-and-Solve Prompting

Last updated on September 4, 2024 by Bhuwan Bhatt
Takeaways
  • Plan-and-Solve (PS) prompting addresses missing-step errors in zero-shot chain-of-thought (CoT) reasoning by introducing a planning stage before solving.
  • PS+ prompting extends PS prompting to also reduce calculation errors, improving accuracy on reasoning tasks.
  • How it works: PS prompting asks the model to devise a plan before solving, while PS+ additionally emphasizes careful calculation and intermediate results.
  • PS+ outperforms zero-shot CoT on symbolic reasoning, commonsense reasoning, and math reasoning datasets.
  • Limitations include sensitivity to prompt wording and its inability to fully resolve semantic misunderstanding errors.

What is Plan-and-Solve Prompting?

Zero-shot chain-of-thought (CoT) prompting [1] improves the reasoning accuracy of large language models (LLMs) by having them generate intermediate reasoning steps before predicting the final answer. Although this improves performance on multi-step reasoning tasks, the error analysis in the Plan-and-Solve paper [2] shows that zero-shot CoT has three pitfalls:

  • Calculation errors
  • Missing-step errors
  • Semantic misunderstanding errors

Figure: Categorization of errors made by zero-shot CoT [2]

Plan-and-Solve (PS) prompting [2] aims to fix the missing-step errors of zero-shot CoT. To do so, PS prompting replaces the phrase "Let's think step by step" used in zero-shot CoT with "Let's first understand the problem and devise a plan to solve the problem. Then, let's carry out the plan and solve the problem step by step." The example below shows the difference between zero-shot CoT and PS prompting.

  • Zero-shot-CoT prompting

Prompt:

Q: In a dance class of 20 students, 20% enrolled in contemporary dance, 25% of the remaining enrolled in jazz dance, and the rest enrolled in hip-hop dance. What percentage of the entire students enrolled in hip-hop dance?

A: Let's think step by step.

AI Output:

First, 20% of the class enrolled in contemporary dance, so that means that 80% of the class is left.

Next, 25% of the remaining 80% enrolled in jazz dance, so that means that 25% of the original class enrolled in jazz dance. That leaves 55% of the original class.

Finally, the rest of the class (55%) enrolled in hip-hop dance. So, 55% of the entire class enrolled in hip-hop dance.

  • Plan-and-Solve (PS) prompting

Prompt:

Q: In a dance class of 20 students, 20% enrolled in contemporary dance, 25% of the remaining enrolled in jazz dance, and the rest enrolled in hip-hop dance. What percentage of the entire students enrolled in hip-hop dance?

A: Let's first understand the problem and devise a plan to solve the problem. Then, let's carry out the plan and solve the problem step by step.

AI Output:

Plan:
Step 1: Calculate the total number of students who enrolled in contemporary and jazz dance.
Step 2: Calculate the total number of students who enrolled in hip-hop dance.
Step 3: Calculate the percentage of students who enrolled in hip-hop dance.

Solution:
Step 1: 20% of 20 students is 4 students. 25% of the remaining 16 students is 4 students. So, a total of 8 students enrolled in contemporary and jazz dance.
Step 2: The remaining 12 students enrolled in hip-hop dance.
Step 3: The percentage of students who enrolled in hip-hop dance is 12/20 = 60%.

The example shows that, for complex problems, the zero-shot CoT trigger "Let's think step by step" may still yield incorrect reasoning: the model applied 25% to the whole class instead of to the remaining 80%, skipping the step that converts one into the other. Asking the LLM to first devise a plan and then carry it out, as PS prompting does, can help minimize such missing-step errors.
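To make the skipped step concrete, the arithmetic of the dance-class problem can be checked directly. The short Python sketch below is only an illustration; the variable names are ours and are not part of either prompt.

```python
from fractions import Fraction

class_size = 20

# Correct reasoning: jazz is 25% of the students REMAINING after contemporary.
contemporary = Fraction(20, 100) * class_size      # 4 students
remaining = class_size - contemporary              # 16 students
jazz = Fraction(25, 100) * remaining               # 4 students
hip_hop = class_size - contemporary - jazz         # 12 students
hip_hop_percent = hip_hop / class_size * 100       # 60

# Flawed reasoning from the zero-shot CoT output: 25% is applied to the
# whole class, so the "remaining" step is effectively skipped.
flawed_percent = 100 - 20 - 25                     # 55

print(hip_hop_percent, flawed_percent)
```

The 5-point gap between the two results comes entirely from treating "25% of the remaining" as "25% of the class".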

Note that the trigger sentence used in PS prompting can be worded in any way, as long as it satisfies the following conditions:

  • It triggers the LLM to determine subtasks and accomplish the subtasks.
  • It guides the LLM to pay more attention to calculation and intermediate results to ensure a correct response.
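Because only the trigger sentence changes between zero-shot CoT and PS prompting, both can share one prompt template. The following is a minimal sketch; `build_prompt` and the constant names are hypothetical helpers introduced here for illustration, while the two trigger sentences are the ones quoted above.

```python
ZERO_SHOT_COT_TRIGGER = "Let's think step by step."
PS_TRIGGER = (
    "Let's first understand the problem and devise a plan to solve the problem. "
    "Then, let's carry out the plan and solve the problem step by step."
)


def build_prompt(question: str, trigger: str) -> str:
    """Format a question in the Q:/A: layout, ending with the trigger sentence."""
    return f"Q: {question}\nA: {trigger}"


question = (
    "In a dance class of 20 students, 20% enrolled in contemporary dance, "
    "25% of the remaining enrolled in jazz dance, and the rest enrolled in "
    "hip-hop dance. What percentage of the entire students enrolled in "
    "hip-hop dance?"
)

cot_prompt = build_prompt(question, ZERO_SHOT_COT_TRIGGER)
ps_prompt = build_prompt(question, PS_TRIGGER)
print(ps_prompt)
```

Keeping the trigger as a parameter makes it easy to try alternative wordings that meet the two conditions.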

How to Use Plan-and-Solve Prompting?

Implementing PS prompting is a two-step process:

  • Inference
  • Answer extraction

Step 1: Inference

In the first step, feed the prompt to the LLM; this generates the reasoning steps and an answer.

Step 2: Answer Extraction

Next, pass the initial prompt together with the generated reasoning back to the LLM to extract the final answer. To do so, append an answer-extraction phrase such as "Therefore, the answer (arabic numerals) is" to the end of this combined prompt.
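The two steps can be sketched as a small pipeline. This is an illustrative outline under stated assumptions, not a reference implementation: `plan_and_solve` is a hypothetical helper, `llm` stands for any function that maps a prompt string to a completion string, and `fake_llm` is a canned stand-in used only so the example runs without a real model.

```python
from typing import Callable, Tuple

ANSWER_TRIGGER = "Therefore, the answer (arabic numerals) is"


def plan_and_solve(
    question: str, trigger: str, llm: Callable[[str], str]
) -> Tuple[str, str]:
    """Run inference, then answer extraction; return (reasoning, answer)."""
    # Step 1: inference. The model produces its plan and reasoning.
    inference_prompt = f"Q: {question}\nA: {trigger}"
    reasoning = llm(inference_prompt)

    # Step 2: answer extraction. Re-send the prompt plus the reasoning,
    # ending with the extraction phrase so the model completes the answer.
    extraction_prompt = f"{inference_prompt}\n{reasoning}\n{ANSWER_TRIGGER}"
    answer = llm(extraction_prompt).strip()
    return reasoning, answer


def fake_llm(prompt: str) -> str:
    """Canned responses standing in for a real model call (assumption)."""
    if prompt.endswith(ANSWER_TRIGGER):
        return " 60"
    return "Plan: ... Solution: ... 12/20 = 60%."


reasoning, answer = plan_and_solve(
    "What percentage of the entire students enrolled in hip-hop dance?",
    "Let's first understand the problem and devise a plan to solve the problem. "
    "Then, let's carry out the plan and solve the problem step by step.",
    fake_llm,
)
print(answer)
```

In practice, `fake_llm` would be replaced by a call to whichever model client you use; the structure of the two prompts stays the same.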

What is PS+ Prompting?

While PS prompting helps minimize missing-step errors, PS+ prompting, an extension of PS prompting, aims to also reduce the calculation errors of zero-shot CoT and to improve the quality of the generated reasoning steps. The PS+ prompt extends the PS prompt with additional instructions telling the LLM to compute intermediate results and to pay more attention to calculation and common sense.

PS+ Prompting Example

Consider the math word problem "Grace weighs 125 pounds. Alex weighs 2 pounds less than 4 times what Grace weighs. What are their combined weights in pounds?" Let's solve it using both PS and PS+ prompting.
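For reference, the expected answer can be worked out directly from the problem statement. The snippet below is just a check of the arithmetic; the variable names are ours.

```python
grace = 125               # Grace's weight in pounds
alex = 4 * grace - 2      # 2 pounds less than 4 times Grace's weight
combined = grace + alex   # 125 + (4 * 125 - 2)

print(alex, combined)
```

Any model output for this problem can be compared against the combined weight computed here.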

First, let's try with PS prompting:

In this example, PS prompting fails to correctly evaluate the expression 125 + (4 * 125 - 2), resulting in an incorrect answer. Now, let's employ PS+ prompting.


The only difference between PS and PS+ prompting is that the PS+ prompt explicitly instructs the LLM to pay attention to calculation and intermediate steps. As a result, PS+ prompting reduces both calculation errors and missing-step errors.
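A PS+ prompt for the weight problem might be assembled as in the sketch below. The exact PS+ trigger wording is not quoted in this article, so the sentence used here is an assumption modeled on the trigger described in the Plan-and-Solve paper [2]; treat it as one plausible phrasing rather than the canonical one.

```python
# ASSUMPTION: this trigger wording is an approximation of a PS+ style prompt;
# it is not quoted from this article.
PS_PLUS_TRIGGER = (
    "Let's first understand the problem, extract relevant variables and their "
    "corresponding numerals, and devise a plan. Then, let's carry out the plan, "
    "calculate intermediate results (pay attention to calculation and common "
    "sense), solve the problem step by step, and show the answer."
)

question = (
    "Grace weighs 125 pounds. Alex weighs 2 pounds less than 4 times what "
    "Grace weighs. What are their combined weights in pounds?"
)

# Same Q:/A: layout as PS prompting; only the trigger sentence differs.
ps_plus_prompt = f"Q: {question}\nA: {PS_PLUS_TRIGGER}"
print(ps_plus_prompt)
```

The added clauses about extracting variables and calculating intermediate results are what distinguish this trigger from the plain PS one.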

What Are Plan-and-Solve Prompting Results?

  • PS+ prompting outperforms zero-shot CoT on symbolic reasoning datasets such as Last Letters, commonsense reasoning datasets (CSQA, StrategyQA), and math reasoning datasets.
  • PS+ prompting reduces calculation errors and missing-step errors compared to zero-shot CoT, while semantic misunderstanding errors remain largely unaddressed.

Limitations of Plan-and-Solve Prompting

There are two major limitations to the PS prompting approach:

  • Models like GPT-3 are sensitive to the exact wording of the prompt, so designing a prompt that guides LLMs to generate correct reasoning steps takes considerable effort.
  • PS prompting addresses calculation errors and missing-step errors but does not resolve semantic misunderstanding errors.

Conclusion

Plan-and-Solve prompting techniques (PS and PS+) help reduce missing-step errors and calculation errors by using carefully crafted prompts that make the model plan before solving the problem. The approach has proven effective on a range of tasks: math reasoning, commonsense reasoning, and symbolic reasoning. However, it relies heavily on the trigger sentence in the prompt, and because that sentence is written by hand, finding the best wording can be challenging.

Footnotes

  1. Jason Wei. (2022). Chain-of-Thought Prompting Elicits Reasoning in Large Language Models.

  2. Lei Wang. (2023). Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models.

Copyright © 2024 Learn Prompting.