ReAct (reason, act) is a paradigm for enabling language models to solve complex tasks using natural language reasoning. It is designed for tasks in which the LLM is allowed to perform certain actions. For example, as in an MRKL system, an LLM may be able to interact with external APIs to retrieve information. When asked a question, the LLM could choose to perform an action to retrieve information, and then answer the question based on what it retrieved.

ReAct systems can be thought of as MRKL systems with the added ability to reason about the actions they can perform.

Examine the following image. The question in the top box is sourced from HotPotQA, a question answering dataset that requires complex reasoning. ReAct answers the question by first reasoning about it (Thought 1), and then performing an action (Act 1) that issues a search query. It then receives an observation (Obs 1), and continues with this thought, action, observation loop until it reaches a conclusion (Act 3).

ReAct System (Yao et al.)
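The thought, action, observation loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: `scripted_llm` is a hypothetical stand-in that replays canned model outputs, `toy_search` is a hypothetical lookup table in place of a real search API, and the `Search[...]` / `Finish[...]` action syntax and the toy question are assumptions made for the example.

```python
import re
from typing import Callable, Dict, Tuple

# Assumed action syntax for this sketch: "Act N: Name[argument]".
ACTION_RE = re.compile(r"Act \d+: (\w+)\[(.*?)\]")


def toy_search(query: str) -> str:
    """Hypothetical tool: a tiny lookup table standing in for a search API."""
    facts = {
        "Eiffel Tower": "The Eiffel Tower is in Paris.",
        "Paris": "Paris is the capital of France.",
    }
    return facts.get(query, "No results found.")


def scripted_llm(transcript: str) -> str:
    """Hypothetical model: replays a fixed script based on observations seen so far."""
    step = transcript.count("Obs ") + 1
    script = {
        1: "Thought 1: I need to find where the Eiffel Tower is.\n"
           "Act 1: Search[Eiffel Tower]",
        2: "Thought 2: It is in Paris, so I need the country Paris is in.\n"
           "Act 2: Search[Paris]",
        3: "Thought 3: Paris is the capital of France, so the answer is France.\n"
           "Act 3: Finish[France]",
    }
    return script[step]


def react_loop(
    question: str,
    llm: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> Tuple[str, str]:
    """Alternate model output (thought + action) with tool observations."""
    transcript = f"Question: {question}\n"
    for step in range(1, max_steps + 1):
        output = llm(transcript)          # the model sees the whole history
        transcript += output + "\n"
        match = ACTION_RE.search(output)
        if match is None:
            raise ValueError(f"No action found in model output: {output!r}")
        name, arg = match.groups()
        if name == "Finish":              # terminal action carries the answer
            return arg, transcript
        tool = tools.get(name)
        observation = tool(arg) if tool else f"Unknown action: {name}"
        transcript += f"Obs {step}: {observation}\n"
    raise RuntimeError("No answer within max_steps")


if __name__ == "__main__":
    answer, trace = react_loop(
        "In which country is the Eiffel Tower?",
        scripted_llm,
        {"Search": toy_search},
    )
    print(trace)
    print("Answer:", answer)
```

In a real system, `scripted_llm` would be replaced by a call to a language model prompted with a few example trajectories, and the transcript would be fed back to it at every step so that each new thought can build on earlier observations.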

Readers with knowledge of reinforcement learning may recognize this process as similar to the classic RL loop of state, action, reward, state, and so on. The ReAct paper provides some formalization of this.
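As a rough sketch of that formalization (the notation here is an approximation and may differ from the paper's): an agent at step t receives an observation o_t from the environment and chooses an action a_t based on everything it has seen so far. ReAct extends the set of allowed actions with free-form language, so a "thought" is an action that changes only the context and not the environment. Note that this setup is written in terms of observations and actions, without an explicit reward term.

```latex
% Sketch of the agent-environment setup; notation may differ from the paper's.
\begin{align*}
  c_t &= (o_1, a_1, \dots, o_{t-1}, a_{t-1}, o_t)
      && \text{context at step } t \\
  a_t &\sim \pi(a_t \mid c_t), \quad a_t \in \mathcal{A}
      && \text{the policy picks an action given the context} \\
  \hat{\mathcal{A}} &= \mathcal{A} \cup \mathcal{L}
      && \text{ReAct adds the language space } \mathcal{L} \\
  c_{t+1} &= (c_t, \hat{a}_t) \quad \text{for } \hat{a}_t \in \mathcal{L}
      && \text{a thought only extends the context}
\end{align*}
```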

Results

Google used the PaLM LLM in experiments with ReAct. Comparisons to standard prompting (question only), CoT, and other configurations show that ReAct's performance is promising for complex reasoning tasks. Google also ran experiments on the FEVER dataset, which covers fact extraction and verification.

ReAct Results (Yao et al.)
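To make the compared configurations concrete, the sketch below renders one toy trajectory in four prompt styles: standard (question and answer only), CoT (reasoning, then answer), act-only (actions and observations without thoughts), and ReAct (all three interleaved). The style names, the `Step` type, the `render` function, and the toy trajectory are all assumptions made for illustration; in particular, treating act-only as one of the "other configurations" is an assumption, and these are not the paper's actual prompts.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    thought: str
    action: str
    observation: str  # empty string when the action ends the episode


# Hypothetical toy trajectory used only for illustration.
EXAMPLE_QUESTION = "In which country is the Eiffel Tower?"
EXAMPLE_ANSWER = "France"
EXAMPLE_STEPS: List[Step] = [
    Step("I need to find where the Eiffel Tower is.",
         "Search[Eiffel Tower]",
         "The Eiffel Tower is in Paris."),
    Step("Paris is in France, so the answer is France.",
         "Finish[France]",
         ""),
]


def render(question: str, steps: List[Step], answer: str, style: str) -> str:
    """Render one example in the given prompt style."""
    lines = [f"Question: {question}"]
    if style == "standard":
        # Question and answer only.
        lines.append(f"Answer: {answer}")
    elif style == "cot":
        # Reasoning is kept, but no actions or observations.
        reasoning = " ".join(s.thought for s in steps)
        lines.append(f"Thought: {reasoning}")
        lines.append(f"Answer: {answer}")
    elif style in ("act", "react"):
        for i, s in enumerate(steps, start=1):
            if style == "react":
                lines.append(f"Thought {i}: {s.thought}")
            lines.append(f"Act {i}: {s.action}")
            if s.observation:
                lines.append(f"Obs {i}: {s.observation}")
    else:
        raise ValueError(f"Unknown style: {style}")
    return "\n".join(lines)


if __name__ == "__main__":
    for style in ("standard", "cot", "act", "react"):
        print(f"--- {style} ---")
        print(render(EXAMPLE_QUESTION, EXAMPLE_STEPS, EXAMPLE_ANSWER, style))
```

Deriving every style from the same trajectory keeps the comparison fair: the only thing that changes between configurations is which parts of the trace the model gets to see and produce.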

Footnotes

Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., & Cao, Y. (2022).
Yang, Z., Qi, P., Zhang, S., Bengio, Y., Cohen, W. W., Salakhutdinov, R., & Manning, C. D. (2018). HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering.
Chowdhery, A., Narang, S., Devlin, J., Bosma, M., Mishra, G., Roberts, A., Barham, P., Chung, H. W., Sutton, C., Gehrmann, S., Schuh, P., Shi, K., Tsvyashchenko, S., Maynez, J., Rao, A., Barnes, P., Tay, Y., Shazeer, N., Prabhakaran, V., … Fiedel, N. (2022). PaLM: Scaling Language Modeling with Pathways.
Thorne, J., Vlachos, A., Christodoulopoulos, C., & Mittal, A. (2018). FEVER: a large-scale dataset for Fact Extraction and VERification.

Sander Schulhoff

Sander Schulhoff is the Founder of Learn Prompting and an ML Researcher at the University of Maryland. He created the first open-source Prompt Engineering guide, reaching 3M+ people and teaching them to use tools like ChatGPT. Sander also led a team behind Prompt Report, the most comprehensive study of prompting ever done, co-authored with researchers from the University of Maryland, OpenAI, Microsoft, Google, Princeton, Stanford, and other leading institutions. This 76-page survey analyzed 1,500+ academic papers and covered 200+ prompting techniques.
