Since their inception, Large Language Models (LLMs) are increasingly becoming popular and getting deployed in a wide range of applications across many different industries. But, a common theme across LLM inference is that they are still confined to token-level, left-to-right decision-making processes. They still fall short in tasks requiring exploration or making assumptions about the future state, given the present state.
Tree of Thoughts (ToT) prompting is a framework for LLM inference that allows LLMs to make an informed decision by considering and self-evaluating multiple different reasoning paths that will likely lead to an optimal solution. ToT also empowers the model to backtrack when a path is unlikely to lead to a valid solution. ToT is similar to the best-first search algorithm in Computer Science.
ToT aims to mimic human's problem-solving approach. Research shows that, given a problem, humans search through a combinatorial problem space - a tree where the nodes represent partial solutions and branches represent operators that modify the nodes. They use heuristics to identify the next branch that guides them closer to the solution. The process continues till the problem concludes.
Let's look at how we can implement ToT using two distinct examples:
Game of 24 is a mathematical problem where the goal is to use given four numbers and four basic operators: +, -, /,*, to obtain 24. At each step, the propose prompt generates three possible solutions, and the value prompt evaluates each of the generated candidates and decides whether proceeding with the suggested generation is worthwhile. It will be clear once we look at the example below.
Problem statement: Using the numbers 4, 9, 10, and 13 and four basic operators +, -, /,*, generate an expression that evaluates to 24.
Evaluating 6, 9, 13
For each generated node using the propose prompt, the value prompt evaluates it. Then, for all the nodes that are likely to reach the solution, expand them using the propose prompt. Use Breadth First Search (BFS) to expand all nodes at one level before moving on to the nodes at the next level. The process is continued until only the number 24 is left in a node.
The creative writing task helps evaluate the creative thinking and planning abilities of the LLM.
Problem statement: Given four random sentences, generate a passage with four paragraphs that end in the input four sentences respectively.
Step 1:
First, ask the LLM to generate 5 different plans for the passage.
Plan 1
Plan 2
Step 2:Present all 5 plans to the LLM and ask it to choose the best one. A simple Zero-Shot voting prompt, "analyze choices below, then conclude which is most promising for the instruction," is used.
Repeat this step 5 times and choose the plan (say Plan 1) that gains the maximum votes.
Step 3
Use the chosen plan to generate the passage.
The image below illustrates the use of ToT for creative writing tasks.
ToT for creative writing
Method | Success |
---|---|
IO (best of 100) | 33% |
CoT (best of 100) | 49% |
ToT (ours) (b=5) | 74% |
ToT coherency score for creative writing task
Method | Success Rate(%) | ||
---|---|---|---|
Letter | Word | Game | |
IO | 38.7 | 14 | 0 |
CoT | 40.6 | 15.6 | 1 |
ToT | 78 | 60 | 20 |
Cost analysis of the Game of 24
Tree of Thought (ToT) is a practical framework for intellectually demanding tasks that require some planning and look-ahead. However, implementing ToT is demanding in terms of resources consumed and effort required. Consequently, it is wise to use it to solve only those tasks that cannot be solved using techniques like IO prompting and CoT prompting.
Bhuwan Bhatt, a Machine Learning Engineer with over 5 years of industry experience, is passionate about solving complex challenges at the intersection of machine learning and Python programming. Bhuwan has contributed his expertise to leading companies, driving innovation in AI/ML projects. Beyond his professional endeavors, Bhuwan is deeply committed to sharing his knowledge and experiences with others in the field. He firmly believes in continuous improvement, striving to grow by 1% each day in both his technical skills and personal development.