What is Graph RAG?

Graph RAG (Retrieval-Augmented Generation) is an advanced form of RAG that leverages the structured relationships within a knowledge graph to provide more accurate, context-aware, and explainable responses from a large language model (LLM).


How It Differs from Standard RAG

Traditional RAG systems typically work by:

  1. Breaking down documents into small, flat chunks of text.
  2. Creating vector embeddings for these chunks.
  3. Using a vector database to find the chunks most semantically similar to a user’s query.
  4. Sending those chunks to the LLM to generate a response.

This method is effective, but it can struggle with complex queries that require connecting multiple pieces of information, or with relationships that are never stated explicitly within a single chunk of text.
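
To make the four steps concrete, here is a minimal sketch of a standard RAG pipeline. It uses the sentence-transformers library for embeddings, a plain NumPy similarity search in place of a real vector database, and a hypothetical `call_llm` helper standing in for whatever LLM client is used; the model name and chunk size are illustrative assumptions, not requirements.

```python
# Minimal standard-RAG sketch: chunk -> embed -> vector search -> generate.
# `call_llm` is a hypothetical stand-in for your LLM client of choice.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

def chunk(document: str, size: int = 500) -> list[str]:
    # 1. Break the document into small, flat chunks of text.
    return [document[i:i + size] for i in range(0, len(document), size)]

def build_index(chunks: list[str]) -> np.ndarray:
    # 2. Create unit-normalized vector embeddings for the chunks.
    return model.encode(chunks, normalize_embeddings=True)

def retrieve(query: str, chunks: list[str], index: np.ndarray, k: int = 3) -> list[str]:
    # 3. Find the chunks most semantically similar to the query.
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = index @ q                      # cosine similarity (vectors are normalized)
    top = np.argsort(scores)[::-1][:k]
    return [chunks[i] for i in top]

def answer(query: str, chunks: list[str], index: np.ndarray) -> str:
    # 4. Send the retrieved chunks to the LLM to generate a response.
    context = "\n\n".join(retrieve(query, chunks, index))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return call_llm(prompt)                 # hypothetical LLM client
```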

How Graph RAG Works

Graph RAG enhances this process by:

  1. Creating a Knowledge Graph: It first extracts entities (like people, places, or concepts) and their relationships from unstructured text to build a knowledge graph. This graph provides a structured, interconnected view of the data.
  2. Contextual Retrieval: Instead of just using vector search to find a few similar text chunks, the system uses the knowledge graph to perform more sophisticated retrieval. It can traverse the graph to find all related entities and information, even if they are spread across different source documents. This is especially powerful for “multi-hop” questions that require following a chain of connections (e.g., “What were the names of the movies directed by the same person who directed Inception?”); see the sketch after this list.
  3. Augmented Prompt: The retrieved graph data—including the entities, relationships, and associated details—is then used to build a richer, more contextual prompt for the LLM. The LLM can then reason over this structured information to generate a more comprehensive and accurate response.
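
For concreteness, here is a minimal sketch of these three steps built on the networkx library. It assumes entity and relationship extraction has already happened upstream (for example, by prompting an LLM to emit (subject, relation, object) triples), hard-codes a few illustrative triples about Inception, and again uses a hypothetical `call_llm` helper; the hop limit and seed entities are assumptions for the example, not fixed parts of Graph RAG.

```python
# Minimal Graph RAG sketch: build a knowledge graph from extracted triples,
# traverse it for multi-hop retrieval, then build an augmented prompt.
# `call_llm` is a hypothetical stand-in for your LLM client of choice.
import networkx as nx

# 1. Build a knowledge graph from extracted (subject, relation, object) triples.
triples = [
    ("Christopher Nolan", "directed", "Inception"),
    ("Christopher Nolan", "directed", "Interstellar"),
    ("Christopher Nolan", "directed", "Oppenheimer"),
    ("Inception", "released_in", "2010"),
]
kg = nx.MultiDiGraph()
for subj, rel, obj in triples:
    kg.add_edge(subj, obj, relation=rel)

def retrieve_subgraph(seed_entities: list[str], hops: int = 2) -> list[tuple[str, str, str]]:
    # 2. Traverse outward from the entities mentioned in the query, collecting
    #    every fact within `hops` steps (edges are followed in either direction).
    undirected = kg.to_undirected(as_view=True)
    nodes = set()
    for seed in seed_entities:
        if seed in kg:
            nodes |= set(nx.ego_graph(undirected, seed, radius=hops).nodes)
    return [(u, d["relation"], v) for u, v, d in kg.edges(data=True)
            if u in nodes and v in nodes]

def answer(query: str, seed_entities: list[str]) -> str:
    # 3. Serialize the retrieved facts into an augmented prompt for the LLM.
    facts = "\n".join(f"{s} {r} {o}" for s, r, o in retrieve_subgraph(seed_entities))
    prompt = f"Known facts:\n{facts}\n\nQuestion: {query}"
    return call_llm(prompt)                 # hypothetical LLM client

# e.g. answer("Which movies were directed by the person who directed Inception?",
#             seed_entities=["Inception"])
```

Because the retrieval step walks edges rather than comparing embeddings, a two-hop question like the Inception example is answered by following the film to its director and then to the director’s other films, even when those facts originate in different source documents.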

By leveraging the relational structure of a knowledge graph, Graph RAG overcomes the limitations of simple text chunks, leading to better answers and reducing the risk of “hallucinations” in LLM outputs.