R^2AG
R^2AG (Retrieval-to-Retrieval Augmented Generation) is a new framework that improves Retrieval-Augmented Generation (RAG) by reducing the semantic gap between retrievers and large language models (LLMs).
Traditional RAG models retrieve external documents and pass them to an LLM for response generation. However, retrievers and LLMs are trained differently:
- Retrievers focus on finding the most relevant documents.
- LLMs focus on understanding and generating language based on retrieved content.
This difference creates a semantic gap, where the LLM might misinterpret retrieved documents, leading to hallucinations or low-quality responses.
R^2AG solves this by incorporating retrieval information directly into the LLM's generation process, using a trainable R2-Former model and a retrieval-aware prompting strategy.
How R^2AG Works
Step 1: Retrieval Feature Extraction
The retriever fetches relevant documents based on a query.
R^2AG then extracts additional retrieval features from the retriever (a minimal sketch of how these might be computed follows the list):
- Relevance score: How relevant is the document to the query?
- Precedent similarity: How similar is the document to previously retrieved ones?
- Neighbor similarity: How similar is it to nearby documents in the ranking list?
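To make these features concrete, here is a minimal sketch of how they might be computed from dense retriever embeddings, assuming cosine similarity and simple averaging over preceding and adjacent documents; the function name and exact definitions are illustrative, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def extract_retrieval_features(query_emb: torch.Tensor, doc_embs: torch.Tensor) -> torch.Tensor:
    """Per-document retrieval features from dense embeddings.

    query_emb: (d,) query embedding.
    doc_embs:  (k, d) embeddings of the top-k documents, in ranked order.
    Returns a (k, 3) tensor of [relevance, precedent_sim, neighbor_sim].
    """
    q = F.normalize(query_emb, dim=-1)
    d = F.normalize(doc_embs, dim=-1)
    k = d.size(0)

    # Relevance: cosine similarity between each document and the query.
    relevance = d @ q                       # (k,)

    # Pairwise document similarities feed the two structural features.
    pairwise = d @ d.T                      # (k, k)

    # Precedent similarity: mean similarity to documents ranked above this one.
    precedent = torch.zeros(k)
    for i in range(1, k):
        precedent[i] = pairwise[i, :i].mean()

    # Neighbor similarity: mean similarity to adjacent documents in the ranking.
    neighbor = torch.zeros(k)
    for i in range(k):
        idx = [j for j in (i - 1, i + 1) if 0 <= j < k]
        if idx:
            neighbor[i] = pairwise[i, idx].mean()

    return torch.stack([relevance, precedent, neighbor], dim=-1)
```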
Step 2: Processing with R2-Former
- The R2-Former model (a lightweight Transformer) processes retrieval features to understand why each document was retrieved.
- It captures semantic relationships between the query and documents.
- It outputs retrieval-aware embeddings (sketched below).
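The actual R2-Former is trained jointly with the rest of the pipeline; the toy module below only illustrates the shape of the idea: a small Transformer encoder over per-document retrieval features, followed by a projection that aligns the output with the LLM's embedding space. All dimensions and layer counts here are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class R2FormerSketch(nn.Module):
    """Illustrative stand-in for the R2-Former: encodes per-document retrieval
    features and projects them into the LLM's embedding space."""

    def __init__(self, feat_dim: int = 3, hidden: int = 128,
                 llm_dim: int = 4096, layers: int = 2, heads: int = 4):
        super().__init__()
        self.in_proj = nn.Linear(feat_dim, hidden)
        block = nn.TransformerEncoderLayer(d_model=hidden, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(block, num_layers=layers)
        self.out_proj = nn.Linear(hidden, llm_dim)   # alignment into the LLM space

    def forward(self, retrieval_feats: torch.Tensor) -> torch.Tensor:
        # retrieval_feats: (batch, k, feat_dim) for k retrieved documents.
        h = self.encoder(self.in_proj(retrieval_feats))
        return self.out_proj(h)                      # (batch, k, llm_dim)
```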
Step 3: Retrieval-Aware Prompting
Instead of just prepending retrieved documents to the prompt, R^2AG integrates retrieval information directly into the LLM's input embeddings. Each document is paired with its retrieval-aware embedding, which serves as an "anchor" that guides the LLM's focus.
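A rough sketch of what this interleaving could look like with a Hugging Face causal LM: each retrieval-aware embedding is inserted as an extra "token" in front of its document's token embeddings, and the query is appended at the end. The placement and the absence of any prompt template are simplifying assumptions, not the paper's exact strategy.

```python
import torch

def build_retrieval_aware_inputs(llm, tokenizer, query: str, docs: list[str],
                                 anchors: torch.Tensor) -> torch.Tensor:
    """Interleave each document's token embeddings with its retrieval-aware
    'anchor' embedding, then append the query. anchors: (k, llm_dim)."""
    embed = llm.get_input_embeddings()
    pieces = []
    for doc, anchor in zip(docs, anchors):
        doc_ids = tokenizer(doc, return_tensors="pt").input_ids
        pieces.append(anchor.unsqueeze(0))           # (1, llm_dim) anchor before the doc
        pieces.append(embed(doc_ids)[0])             # (T_doc, llm_dim) document tokens
    query_ids = tokenizer(query, return_tensors="pt").input_ids
    pieces.append(embed(query_ids)[0])               # (T_query, llm_dim) query tokens
    return torch.cat(pieces, dim=0).unsqueeze(0)     # (1, T_total, llm_dim)
```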
Step 4: LLM Generation
The LLM generates a response using both the raw text and retrieval-aware embeddings. This reduces confusion and prevents hallucinations by helping the LLM focus on the most relevant documents.
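Continuing the sketches above, generation can then run directly on the combined embeddings, since Hugging Face's `generate` accepts `inputs_embeds` for decoder-only models. The model name, query, and document texts are placeholders, and the dummy features stand in for real retriever outputs.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model; any decoder-only Hugging Face LLM works the same way here.
name = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(name)
llm = AutoModelForCausalLM.from_pretrained(name)

query = "Which retrieval features does the framework use?"
docs = ["...text of the first retrieved document...",
        "...text of the second retrieved document..."]

# Dummy features stand in for the Step 1 sketch; the R2-Former and the
# embedding builder are the sketches from Steps 2 and 3.
feats = torch.rand(1, len(docs), 3)
anchors = R2FormerSketch(llm_dim=llm.config.hidden_size)(feats)[0]
inputs_embeds = build_retrieval_aware_inputs(llm, tokenizer, query, docs, anchors)

output_ids = llm.generate(inputs_embeds=inputs_embeds, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```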
Step 5: Joint Training (Optional)
- If computational resources allow, R^2AG can fine-tune both the R2-Former and the LLM together for even better performance.
- Otherwise, it works with frozen LLMs, making it a cost-effective upgrade to existing RAG systems (see the sketch after this list).
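As a rough illustration of the frozen-LLM option, only the R2-Former's parameters are handed to the optimizer while the LLM's weights stay frozen; this reuses the names from the earlier sketches and is not the paper's training code.

```python
import torch

# Frozen-LLM setup: train only the R2-Former (plus its alignment projection).
r2former = R2FormerSketch(llm_dim=llm.config.hidden_size)

for p in llm.parameters():
    p.requires_grad = False          # the LLM stays frozen, keeping costs low

optimizer = torch.optim.AdamW(r2former.parameters(), lr=1e-4)

# For joint training, unfreeze the LLM (or attach adapters such as LoRA) and
# add those parameters to the optimizer as well, at a higher compute cost.
```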
R^2AG's code is available on GitHub.
Results of R^2AG
- R^2AG outperforms other RAG methods across multiple tasks.
- Significant improvements in complex reasoning tasks like multi-hop QA.
- Works even when the retriever isn't perfect, helping LLMs focus on useful documents.
- Fine-tuned R^2AG beats all competitors, while the frozen-LLM variant still achieves strong results.
Conclusion
R^2AG is a major advancement in Retrieval-Augmented Generation, offering a smarter, more reliable way to integrate retrieval with generation.
Valeriia Kuka
Valeriia Kuka, Head of Content at Learn Prompting, is passionate about making AI and ML accessible. Valeriia previously grew a 60K+ follower AI-focused social media account, earning reposts from Stanford NLP, Amazon Research, Hugging Face, and AI researchers. She has also worked with AI/ML newsletters and global communities with 100K+ members and authored clear and concise explainers and historical articles.
Footnotes
- Ye, F., Li, S., Zhang, Y., & Chen, L. (2024). R^2AG: Incorporating Retrieval Information into Retrieval Augmented Generation. https://arxiv.org/abs/2406.13249