RAG Pipelines - DeveloPassion

# RAG Pipelines

RAG pipelines are the data processing workflows that power [[Retrieval-Augmented Generation (RAG)]] systems. They handle the end-to-end flow from document ingestion to response generation.

## Pipeline Stages

### 1. Ingestion Pipeline

```
Documents → Load → Split → Embed → Store
```

- **Load**: Read documents (PDF, HTML, Markdown, etc.)
- **Split**: Chunk into manageable pieces (by tokens, sentences, or semantics)
- **Embed**: Convert chunks to vectors using embedding models
- **Store**: Index in a [[Vector Store]]

### 2. Query Pipeline

```
Query → Embed → Retrieve → Rerank → Generate
```

- **Embed**: Convert the user query to a vector
- **Retrieve**: Find similar chunks in the vector store
- **Rerank**: Score and filter results for relevance
- **Generate**: Pass context + query to the [[Large Language Models (LLMs)|LLM]]

## Pipeline Patterns

| Pattern | Description |
|---------|-------------|
| Naive RAG | Simple retrieve → generate |
| Advanced RAG | Query rewriting, hybrid search, reranking |
| Modular RAG | Component-based architecture enabling flexible composition. Swap retrievers, rerankers, and generators independently |
| Agentic RAG | LLM decides what/when to retrieve. The agent drives retrieval decisions rather than following a fixed pipeline |
| Graph-Enhanced RAG | Leverages knowledge graphs for structured relationships between entities, improving reasoning over connected facts |
| Corrective RAG / Self-RAG | Evaluate retrieval quality and retry if it is poor. The system critiques its own retrieval before generating |

Notable systems: FlashRAG, GraphRAG, LightRAG, HippoRAG, RAPTOR.
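The ingestion and query stages above can be sketched in a few lines of plain Python. This is a minimal illustration under loud assumptions, not how any particular framework implements them: every function name (`split`, `embed`, `cosine`, `ingest`, `retrieve`, `build_prompt`) is hypothetical, the "embedding" is a toy bag-of-words count standing in for a real embedding model, the "vector store" is an in-memory list, the rerank step is omitted, and generation stops at building the prompt that would be sent to an LLM.

```python
import math
import re
from collections import Counter


def split(text, chunk_size=12, overlap=4):
    """Split: fixed-size word windows; consecutive chunks share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Embed: toy bag-of-words vector (assumption: stands in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ingest(documents, chunk_size=12, overlap=4):
    """Ingestion: Load -> Split -> Embed -> Store (store = list of pairs)."""
    store = []
    for doc in documents:
        for chunk in split(doc, chunk_size, overlap):
            store.append((chunk, embed(chunk)))
    return store


def retrieve(store, query, top_k=2):
    """Query: embed the query, rank chunks by similarity, keep the top k."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, vec), chunk) for chunk, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for score, chunk in scored[:top_k] if score > 0]


def build_prompt(query, chunks):
    """Generate (first half): present retrieved context plus the query."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


docs = [
    "Vector stores index embeddings for similarity search.",
    "Rerankers score retrieved chunks for relevance.",
]
store = ingest(docs)
question = "How do vector stores index embeddings?"
print(build_prompt(question, retrieve(store, question)))
```

Keeping each stage as its own function mirrors the modular pattern in the table: the toy `embed` or the list-backed store could be swapped for a real embedding model or vector store without touching the other stages.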
## Key Considerations

- **Chunk size**: Balance context vs precision. Larger chunks carry more surrounding context; smaller chunks make retrieval more targeted
- **Overlap**: Share some text between adjacent chunks so important context is not split across a boundary
- **Embedding model**: Match to your domain
- **Top-k selection**: How many chunks to retrieve and pass to the model
- **Prompt engineering**: How to present retrieved context

## Frameworks

- **[[LangChain]]**: Comprehensive RAG building blocks
- **LlamaIndex**: Specialized for data indexing and RAG
- **Haystack**: End-to-end NLP pipelines

## References

- https://docs.langchain.com/
- https://docs.llamaindex.ai/

## Related

- [[Retrieval-Augmented Generation (RAG)]]
- [[Vector Store]]
- [[LangChain]]
- [[LangGraph]]
- [[Large Language Models (LLMs)]]
- [[AI Agents]]
- [[Embeddings]]
- [[Context Engineering]]