Chain of Draft (CoD)
Large Language Models (LLMs) have revolutionized complex reasoning through Chain-of-Thought (CoT) prompting. While effective, CoT's verbose approach often results in high computational costs and increased latency. This limitation becomes particularly significant in real-world applications where efficiency is crucial.
Chain of Draft (CoD) addresses these challenges by introducing a more efficient approach to LLM reasoning. Inspired by human problem-solving patterns, where we typically jot down only essential information, CoD demonstrates that effective reasoning doesn't require lengthy explanations.
CoD transforms traditional LLM reasoning through three key principles:
- Minimalist expression: Instead of verbose explanations, CoD generates concise, information-dense outputs
- Token efficiency: Achieves comparable or better accuracy while using as little as 7.6% of the tokens compared to CoT
- Human-inspired design: Mirrors natural problem-solving patterns where only critical information is recorded
Implementing CoD
To leverage Chain of Draft effectively, use this streamlined prompt template:

Template
Think step by step, but only keep a minimum draft for each thinking step, with 5 words at most. Return the answer at the end of the response after a separator ####.
Guidelines:
- Limit each step to 5 words
- Focus on essential calculations/transformations
- Maintain logical progression
- Mark final answer with ####
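Below is a minimal sketch of how this template could be wired into a chat completion call with the OpenAI Python SDK. The `ask_with_cod` helper and the `gpt-4o-mini` model name are illustrative assumptions, not part of the original technique; any chat-capable model and client will do.

```python
# A minimal sketch of Chain-of-Draft prompting via the OpenAI Python SDK.
# The helper name and model are assumptions; adapt them to your setup.
from openai import OpenAI

COD_SYSTEM_PROMPT = (
    "Think step by step, but only keep a minimum draft for each thinking "
    "step, with 5 words at most. Return the answer at the end of the "
    "response after a separator ####."
)

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask_with_cod(question: str, model: str = "gpt-4o-mini") -> str:
    """Send a question to the model with the CoD system prompt applied."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": COD_SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content


print(ask_with_cod(
    "Jason had 20 lollipops. He gave Denny some lollipops. "
    "Now Jason has 12 lollipops. How many did he give to Denny?"
))
```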
Practical Example: Solving Problems with CoD
Let's examine how different prompting approaches handle a simple arithmetic problem:

Problem
Jason had 20 lollipops. He gave Denny some lollipops. Now Jason has 12 lollipops. How many lollipops did Jason give to Denny?
Three Approaches Compared
- Standard Prompting (Direct Answer)

AI Output
A: 8
- Chain-of-Thought Prompting (Detailed Reasoning)

AI Output
Let's think through this step by step:
- Initially, Jason had 20 lollipops.
- After giving some to Denny, Jason now has 12 lollipops.
- To find out how many lollipops Jason gave to Denny, we need to calculate the difference between the initial number of lollipops and the remaining number.
- We can set up a simple subtraction problem: Initial number - Remaining number = Lollipops given
- Putting in the numbers: 20 - 12 = Lollipops given to Denny
- Solving the subtraction: 20 - 12 = 8
Therefore, Jason gave 8 lollipops to Denny.
#### 8
- Chain-of-Draft Prompting (Minimal Essential Steps)

AI Output
20 - x = 12; x = 20 - 12 = 8
#### 8
This comparison clearly demonstrates how CoD maintains problem-solving clarity while dramatically reducing verbosity.
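One practical consequence of the `####` separator is that the final answer can be extracted programmatically, regardless of how short the draft is. Here is a small sketch (the `extract_cod_answer` name is hypothetical):

```python
def extract_cod_answer(output: str) -> str:
    """Return the final answer following the #### separator in a CoD response."""
    draft, sep, answer = output.partition("####")
    if not sep:
        # No separator found: fall back to the last non-empty line.
        return [line for line in output.strip().splitlines() if line.strip()][-1]
    return answer.strip()


print(extract_cod_answer("20 - x = 12; x = 20 - 12 = 8\n#### 8"))  # -> "8"
```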
Advantages and Limitations
Key Benefits
- Enhanced efficiency:
  - Significantly reduced response generation time
  - Lower computational resource requirements
  - Minimal token consumption (see the token-count sketch after this list)
- Practical advantages:
  - More cost-effective for production deployment
  - Faster user response times
  - Reduced infrastructure requirements
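To make the token savings concrete, you can count tokens in the CoT and CoD outputs from the earlier example. This sketch uses `tiktoken` with the `cl100k_base` encoding as an illustrative tokenizer; your model may tokenize differently, so treat the counts as rough estimates:

```python
# Rough token comparison between the CoT and CoD outputs shown earlier.
# tiktoken's cl100k_base encoding is an assumption; counts are estimates.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

cot_output = (
    "Let's think through this step by step: Initially, Jason had 20 lollipops. "
    "After giving some to Denny, Jason now has 12 lollipops. To find out how "
    "many lollipops Jason gave to Denny, we calculate the difference: "
    "20 - 12 = 8. Therefore, Jason gave 8 lollipops to Denny.\n#### 8"
)
cod_output = "20 - x = 12; x = 20 - 12 = 8\n#### 8"

for name, text in [("CoT", cot_output), ("CoD", cod_output)]:
    print(f"{name}: {len(enc.encode(text))} tokens")
```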
Current Limitations
While powerful, CoD has several important constraints:
- Model dependencies:
  - Requires larger models (>3B parameters) for optimal performance
  - Less effective in zero-shot scenarios
  - May need specific training data
- Application constraints:
  - Best suited for structured reasoning tasks
  - May require few-shot examples (sketched below)
  - Performance varies by task type
For optimal results, use CoD with larger language models and provide few-shot examples. The technique's effectiveness can vary significantly based on model size and task complexity.
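Here is a self-contained sketch of what a few-shot CoD request could look like. The worked example pair is hypothetical and exists only to show the model the terse draft style to imitate; the model name is likewise an assumption:

```python
# A sketch of few-shot Chain-of-Draft prompting. The example Q/A pair is
# hypothetical and demonstrates the draft style; the model name is assumed.
from openai import OpenAI

COD_SYSTEM_PROMPT = (
    "Think step by step, but only keep a minimum draft for each thinking "
    "step, with 5 words at most. Return the answer at the end of the "
    "response after a separator ####."
)

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model; use any chat-capable model
    messages=[
        {"role": "system", "content": COD_SYSTEM_PROMPT},
        # One worked example in the CoD style (add more for harder tasks).
        {"role": "user", "content": "A store had 45 apples and sold 18. How many remain?"},
        {"role": "assistant", "content": "45 - 18 = 27\n#### 27"},
        # The actual question to solve.
        {"role": "user", "content": (
            "Jason had 20 lollipops. He gave Denny some lollipops. Now Jason "
            "has 12 lollipops. How many lollipops did Jason give to Denny?"
        )},
    ],
)
print(response.choices[0].message.content)
```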
Conclusion
Chain of Draft represents a significant step forward in making LLM reasoning more practical and efficient. By maintaining accuracy while dramatically reducing computational overhead, CoD enables broader deployment of LLM capabilities in resource-constrained environments. As the field evolves, this technique's balance of efficiency and effectiveness positions it as a valuable tool for the future of AI applications.
Footnotes
1. Xu, S., Xie, W., Zhao, L., & He, P. (2025). Chain of Draft: Thinking Faster by Writing Less.