
What Is Retrieval-Augmented Generation (RAG)?

Definition

A technique in which an AI model retrieves relevant documents from a knowledge base or the web, then uses them as context to generate grounded, cited answers.

Why It Matters

RAG is what lets AI search engines cite real sources rather than hallucinating answers from training data alone. Understanding RAG clarifies why crawlable, factually clear, well-structured content becomes the "corpus" AI tools draw from β€” and why content that meets those criteria is cited disproportionately often.

How It Works

When a user asks a question, the system retrieves the most relevant documents (using semantic search or keyword search), injects them as context into the LLM prompt, and instructs the LLM to answer using those sources. The result is an answer grounded in retrieved content with citations pointing to the documents used.
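The retrieve-then-inject-then-answer loop above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the corpus, word-overlap scoring (a stand-in for real semantic search over embeddings), and prompt template are all invented for the example, and the final LLM call is omitted since it depends on the provider.

```python
# Minimal RAG sketch. Word-overlap retrieval stands in for semantic
# (embedding-based) search; the corpus and prompt template are illustrative.

CORPUS = {
    "doc1": "RAG grounds LLM answers in retrieved documents to reduce hallucination.",
    "doc2": "Semantic search ranks documents by embedding similarity to the query.",
    "doc3": "Bananas are rich in potassium and easy to pack for lunch.",
}

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query, return the top-k IDs."""
    q_words = set(query.lower().split())
    scored = sorted(
        CORPUS.items(),
        key=lambda item: len(q_words & set(item[1].lower().split())),
        reverse=True,
    )
    return [doc_id for doc_id, _ in scored[:k]]

def build_prompt(query: str, doc_ids: list[str]) -> str:
    """Inject the retrieved documents as numbered context and ask for citations."""
    context = "\n".join(f"[{i + 1}] {CORPUS[d]}" for i, d in enumerate(doc_ids))
    return (
        "Answer using ONLY the sources below, citing them as [n].\n\n"
        f"{context}\n\nQuestion: {query}\nAnswer:"
    )

query = "How does RAG reduce hallucination in LLM answers?"
top = retrieve(query)                  # most relevant documents first
prompt = build_prompt(query, top)      # this string would be sent to the LLM
```

A production system swaps the overlap score for vector similarity over embeddings (or a hybrid with keyword search such as BM25) and sends `prompt` to an LLM, but the retrieve β†’ inject β†’ instruct shape is the same.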

Real-World Example

Perplexity uses RAG β€” it searches the web in real time for your query, retrieves the top 10–15 relevant pages, and feeds them to an LLM as context. The LLM generates an answer grounded in those pages with inline citations, which is why Perplexity cites specific sources while a pure LLM might not.

Quick Facts

  • RAG dramatically reduces LLM hallucinations by grounding answers in real sources
  • Most modern AI search tools (Perplexity, ChatGPT Search, Google AI Overviews) use RAG
  • RAG-based answers cite their sources; pure-LLM answers cannot reliably do so
  • Internal company RAG systems let employees ask questions of proprietary documents
