Corrective RAG
Corrective Retrieval-Augmented Generation (CRAG) is a novel approach designed to improve the robustness and accuracy of Retrieval-Augmented Generation (RAG) systems. CRAG enhances RAG by introducing a retrieval evaluator that assesses the quality of retrieved documents and applies corrective strategies when retrieval fails or returns inaccurate results.
Instead of blindly relying on retrieval, CRAG:
- Evaluates retrieved documents to determine their reliability.
- Triggers corrective actions to refine or replace incorrect retrievals.
- Leverages web search to supplement missing or inaccurate knowledge.
- Filters and refines retrieved content to improve knowledge integration.
How CRAG Differs from Standard RAG
Feature | Standard RAG | CRAG |
---|---|---|
Uses retrieved texts as-is | Yes | No |
Evaluates retrieval quality | No | Yes |
Handles incorrect retrievals | No | Yes |
Uses web search for correction | No | Yes |
Filters noisy retrievals | No | Yes |
Plug-and-play for any RAG model | Limited | Yes |
Why is CRAG Needed?
While RAG helps mitigate hallucinations in large language models (LLMs) by incorporating external knowledge, it is highly dependent on the quality of retrieved documents. When retrieval fails (i.e., retrieves incorrect or irrelevant documents), RAG can actually make LLM outputs worse by reinforcing errors.
CRAG solves this problem by:
- Evaluating retrieved documents for correctness.
- Applying different retrieval strategies based on confidence scores.
- Filtering and refining retrieved knowledge for better utilization.
How CRAG Works
CRAG introduces a retrieval evaluator that categorizes retrieved documents into three confidence levels and applies corresponding actions:
Confidence Level | Action |
---|---|
High confidence | Use and refine retrieval (Correct) |
Low confidence | Discard and replace with web search (Incorrect) |
Medium confidence | Blend retrieved and web search knowledge (Ambiguous) |
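As a rough illustration, the mapping from an evaluator score to one of these three actions can be sketched as follows. This is a minimal sketch assuming a relevance score in [0, 1]; the threshold values are hypothetical placeholders, not the values tuned in the paper.

```python
# Illustrative sketch of CRAG's three-way action selection.
# Assumes the retrieval evaluator returns a relevance score in [0, 1];
# the upper/lower thresholds are hypothetical placeholders.

def select_action(score: float, upper: float = 0.7, lower: float = 0.3) -> str:
    """Map a retrieval evaluator score to one of CRAG's corrective actions."""
    if score >= upper:
        return "correct"      # use and refine the retrieved documents
    if score <= lower:
        return "incorrect"    # discard retrieval and fall back to web search
    return "ambiguous"        # blend refined retrieval with web search results
```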
In more detail, CRAG's pipeline consists of the following steps:
- Retrieval Evaluation: A lightweight retrieval evaluator (based on T5) scores the quality of the retrieved documents. If the evaluator detects poor retrieval, CRAG modifies the retrieval process before generation.
- Corrective Actions: Based on the evaluator's confidence, CRAG applies one of three corrective actions:
  - (a) Correct: If retrieval is reliable, CRAG refines the retrieved documents using a decompose-then-recompose technique, removing irrelevant information while retaining key knowledge.
  - (b) Incorrect: If retrieval is unreliable, CRAG discards the faulty retrievals and triggers a web search to find more reliable information.
  - (c) Ambiguous: If retrieval confidence is unclear, CRAG combines refined retrieval with web search results for more diverse and robust knowledge integration.
- Knowledge Refinement: CRAG breaks retrieved documents into smaller "knowledge strips", filters out irrelevant strips while keeping the most critical insights, and thereby provides higher-quality input for the LLM (a minimal sketch of this step follows the list).
- Web Search for Knowledge Expansion: CRAG generates search queries with a query rewriter, extracts knowledge from authoritative sources (e.g., Wikipedia), and filters web search results to avoid misinformation (the end-to-end sketch below shows where this fallback fits).
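To make the decompose-then-recompose idea behind Knowledge Refinement concrete, here is a minimal sketch. The sentence-level splitting, the `score_strip` relevance function, and the 0.5 cut-off are simplifying assumptions standing in for the paper's strip segmentation and its T5-based evaluator.

```python
from typing import Callable, List

def refine_knowledge(documents: List[str],
                     query: str,
                     score_strip: Callable[[str, str], float],
                     keep_threshold: float = 0.5) -> str:
    """Decompose documents into knowledge strips, filter, and recompose."""
    strips: List[str] = []
    for doc in documents:
        # Decompose: treat each sentence as a "knowledge strip" (assumption;
        # the paper's strips can be finer- or coarser-grained).
        strips.extend(s.strip() for s in doc.split(".") if s.strip())

    # Filter: keep only strips judged relevant to the query.
    relevant = [s for s in strips if score_strip(query, s) >= keep_threshold]

    # Recompose: concatenate the surviving strips into refined knowledge.
    return ". ".join(relevant)
```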
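Putting the steps together, the overall flow might look like the sketch below. It reuses `select_action` and `refine_knowledge` from the sketches above; `evaluate`, `rewrite_query`, `web_search`, and `generate` are hypothetical placeholders for the retrieval evaluator, the query rewriter, the search backend, and the generator LLM.

```python
def crag_answer(query: str, retrieved_docs: list,
                evaluate, rewrite_query, web_search,
                score_strip, generate) -> str:
    """Hypothetical end-to-end CRAG flow; components are injected as callables."""
    action = select_action(evaluate(query, retrieved_docs))

    if action == "correct":
        knowledge = refine_knowledge(retrieved_docs, query, score_strip)
    elif action == "incorrect":
        knowledge = web_search(rewrite_query(query))
    else:  # ambiguous: combine internal and external knowledge
        internal = refine_knowledge(retrieved_docs, query, score_strip)
        external = web_search(rewrite_query(query))
        knowledge = internal + " " + external

    # The generator is used as-is: no fine-tuning of the LLM is required.
    return generate(query, knowledge)
```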
You can find the official code for CRAG on GitHub.
Main Benefits of CRAG
Feature | Benefits |
---|---|
Improves accuracy & robustness | Corrects misleading retrievals; reduces hallucinations in LLMs; improves fact-checking capabilities. |
Enhances retrieval & knowledge utilization | Filters noisy or irrelevant retrieved content; refines useful knowledge before generation. |
Expands knowledge beyond static corpora | Dynamically searches the web when needed; ensures fresh and updated information. |
Works with any RAG model | Plug-and-play integration with LLaMA, GPT-4, Alpaca, Self-RAG, etc.; no need to fine-tune the generator. |
Lightweight & efficient | Uses a small T5-based retrieval evaluator (0.77B parameters); minimal computational overhead (only 2-5% more FLOPs). |
Conclusion
CRAG significantly improves RAG-based LLMs by introducing a retrieval evaluator and corrective retrieval strategies. It ensures that LLMs generate factually accurate, concise, and reliable content, even when retrieval goes wrong.
Valeriia Kuka
Valeriia Kuka, Head of Content at Learn Prompting, is passionate about making AI and ML accessible. Valeriia previously grew a 60K+ follower AI-focused social media account, earning reposts from Stanford NLP, Amazon Research, Hugging Face, and AI researchers. She has also worked with AI/ML newsletters and global communities with 100K+ members and authored clear and concise explainers and historical articles.
Footnotes
- Yan, S.-Q., Gu, J.-C., Zhu, Y., & Ling, Z.-H. (2024). Corrective Retrieval Augmented Generation. https://arxiv.org/abs/2401.15884