# Context Engineering

Context engineering is the discipline of designing and building dynamic systems that provide the right information and tools, in the right format, at the right time, to give an [[Large Language Models (LLMs)|LLM]] everything it needs to accomplish a task. It goes well beyond [[Prompt Engineering]]: where prompt engineering focuses on crafting the question, context engineering focuses on the entire information environment surrounding it.

The term gained traction in mid-2025, notably through Tobi Lütke (Shopify CEO) and Andrej Karpathy, and was formalized by practitioners like Simon Willison, Phil Schmid, and others. In July 2025, Mei et al. published the first academic survey of the field, analyzing over 1,400 research papers.

## Formal framing

Academically, context is modeled as a dynamic assembly of components:

**C = A(c_instr, c_know, c_tools, c_mem, c_state, c_query)**

Each component is a distinct type: instructions, knowledge, tool definitions, memory, world/agent state, and the user query. The goal of context engineering is to maximize output quality subject to the hard constraint that everything must fit within the context window (|C| ≤ L_max, where L_max is the maximum context length).

This makes it an optimization problem, not just a craft. You're maximizing the signal-to-noise ratio of what enters the model's attention mechanism, within a fixed token budget.

## Why it matters

The quality of LLM output depends heavily on what's in the context window. Four cases illustrate this:

| Context | Result |
|---|---|
| No context | Generic answers that could apply to anyone |
| Too little context | Vague, imprecise answers |
| Right context | Specific, accurate, useful answers |
| Too much context | Hallucinations, confusion, contradictions |

Context acts as a filter or funnel between the set of all possible answers and the one the model gives back. Too wide and the answer is generic. Too narrow or noisy and the model hallucinates. The goal is to hit the sweet spot.

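
One way to read the budget constraint from the formal framing (|C| ≤ L_max) is as a packing problem: rank the components and keep only what fits. The sketch below is a minimal illustration under stated assumptions, not a reference implementation. `Component`, `priority`, `estimate_tokens`, and `assemble` are hypothetical names, and the whitespace word count is a crude stand-in for a real tokenizer.

```python
from dataclasses import dataclass


@dataclass
class Component:
    kind: str      # e.g. "instr", "know", "tools", "mem", "state", "query"
    text: str
    priority: int  # hypothetical ranking: lower number = more important


def estimate_tokens(text: str) -> int:
    # Assumption: one token per whitespace-separated word (not a real tokenizer).
    return len(text.split())


def render(comp: Component) -> str:
    # Label each component so the assembled context stays structured.
    return f"[{comp.kind}]\n{comp.text}"


def assemble(components: list[Component], l_max: int) -> tuple[str, list[str]]:
    """Greedy stand-in for A(...): keep components by priority while |C| <= l_max."""
    kept: list[str] = []
    dropped: list[str] = []
    used = 0
    for comp in sorted(components, key=lambda c: c.priority):
        block = render(comp)
        cost = estimate_tokens(block)
        if used + cost <= l_max:
            kept.append(block)
            used += cost
        else:
            dropped.append(comp.kind)  # over budget: leave this component out
    return "\n\n".join(kept), dropped


components = [
    Component("instr", "Answer using only the provided documentation.", 0),
    Component("query", "How do I rotate an API key?", 1),
    Component("know", "Keys are rotated from the settings page under Security.", 2),
    Component("mem", "Earlier the user asked about billing and invoices in detail.", 3),
]

context, dropped = assemble(components, l_max=25)
print(dropped)  # ['mem']
```

A greedy pass by priority is only one possible policy. The point is that the budget is enforced in code, and that whatever gets dropped is reported instead of silently truncated.
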
![[DeveloPassion's Newsletter 197 - Context Engineering - context vs results.png]]

It's similar to the distinction between Programming and [[Vibe Coding]]. Both lead to code, but the quality is vastly different. In the same way, providing well-engineered context versus dumping everything produces vastly different AI outputs.

## Types of context

Context engineering is not just about the initial prompt. It's about the entire lifecycle of context across an interaction. See [[Types of Context for AI Agents]] for a detailed taxonomy of the six core types (instructions, examples, knowledge, memory, tools, tool results) plus additional dimensions often overlooked.

## Context engineering vs prompt engineering

| Dimension | Prompt Engineering | Context Engineering |
|---|---|---|
| **Focus** | The question/instruction | The entire information environment |
| **Scope** | Single prompt | Full interaction lifecycle |
| **Model** | Static text string | Dynamic, structured component assembly |
| **State** | Primarily stateless | Explicitly stateful (memory, world state) |
| **Scalability** | Brittleness increases with length | Manages complexity through modular composition |
| **Error analysis** | Manual inspection | Systematic debugging of component functions |
| **Mental model** | "Ask the right question" | "Build the right system to provide the right information" |

Prompt engineering is a subset of context engineering. A well-engineered prompt inside a poorly engineered context still produces poor results.

## Key principles

1. **Less is more**: only include what the model needs for the current task. Excess context degrades output
2. **Modularity**: break context into composable units that can be loaded/unloaded as needed
3. **Freshness**: stale information is worse than no information. Use live retrieval where possible
4. **Format matters**: how information is structured (headers, tables, examples) affects model attention
5. **Context hygiene**: actively manage context throughout a conversation. Clear, compact, or reset when context becomes bloated
6. **Tool-augmented context**: let the model pull information on demand rather than front-loading everything

## Design patterns

- **Receptionist pattern**: a routing layer that selects which agent/context to load based on the task
- **Lazy loading**: context components are loaded only when needed, not all upfront
- **Context windowing**: keeping recent/relevant context while aging out old information
- **Hierarchical context**: high-level context always present; detailed context loaded per task
- **RAG-augmented context**: retrieval systems that inject relevant documents dynamically

## Practical applications

- [[AI Agents]] that load different tools and instructions per task
- [[Claude Code]] with CLAUDE.md files, skills, and MCP servers providing task-specific context
- [[Agentic Knowledge Management (AKM)]]: AI assistants that navigate a knowledge base with role-specific contexts
- [[Context7]]: injecting up-to-date library documentation into coding prompts

## The understanding-generation asymmetry

A key research finding (Mei et al., 2025): LLMs are remarkably good at *understanding* complex contexts but significantly weaker at *generating* equally complex, long-form outputs. Models can digest and reason over millions of tokens of input, but their output quality degrades with length.

This asymmetry means context engineering should lean into what models do well: provide rich, well-structured input context and let the model synthesize rather than asking it to generate lengthy output from thin context. In practice: more context in, more concise output out.

See [[Context-Understanding-Generation Asymmetry]].

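
Returning to the design patterns listed earlier, the receptionist and lazy loading patterns can be sketched together with an always-present base context (the hierarchical pattern). This is an illustrative sketch only: the keyword table, the loader functions, and names such as `receptionist` and `build_context` are assumptions, and a real routing layer might use a classifier or an LLM call instead of keyword matching.

```python
from typing import Callable, Optional

BASE_CONTEXT = "You are a careful assistant."  # high-level context, always present

loads: list[str] = []  # records which detailed contexts were actually loaded


def load_coding() -> str:
    loads.append("coding")
    return "Repository conventions and coding guidelines."


def load_writing() -> str:
    loads.append("writing")
    return "Style guide and tone-of-voice notes."


# Hypothetical registry: task label -> loader that builds context only when called.
LOADERS: dict[str, Callable[[], str]] = {
    "coding": load_coding,
    "writing": load_writing,
}

# Hypothetical routing table; keyword matching is a deliberately naive choice.
KEYWORDS: dict[str, tuple[str, ...]] = {
    "coding": ("bug", "function", "refactor"),
    "writing": ("draft", "article", "rewrite"),
}

_cache: dict[str, str] = {}


def receptionist(task: str) -> Optional[str]:
    """Routing layer: pick which detailed context (if any) fits the task."""
    words = set(task.lower().split())
    for label, keywords in KEYWORDS.items():
        if words & set(keywords):
            return label
    return None


def build_context(task: str) -> str:
    parts = [BASE_CONTEXT]
    label = receptionist(task)
    if label is not None:
        if label not in _cache:  # lazy loading: load on first use only
            _cache[label] = LOADERS[label]()
        parts.append(_cache[label])
    parts.append(task)
    return "\n\n".join(parts)


print(build_context("please refactor this function"))
print(loads)  # ['coding']
```

Only the coding context is loaded here. The writing context is never built because no task asked for it, which is the behavior the lazy loading pattern describes.
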
## Context compression

When context exceeds the window budget, compression techniques reduce volume while preserving utility:

- **KV Cache management**: optimizing the key-value cache that stores attention state
- **Hierarchical memory**: tiered storage where detailed context is offloaded and summarized
- **Recurrent compression**: progressive summarization of older context
- **Selective attention**: focusing compute on the most relevant parts of the context

This is the flip side of "less is more": when you *can't* reduce context, you compress it.

## Connection to [[Systems thinking]]

Context engineering is fundamentally a systems design problem. The context window is a system with inputs (prompts, retrieved docs, tool outputs), processing (the model's attention mechanism), and outputs (the response). Designing it well requires thinking about [[Feedback Loop|feedback loops]], information flow, and complexity management, the same principles that apply to any well-designed system.

## References

- Mei, L. et al. (2025). "A Survey of Context Engineering for Large Language Models." arXiv:2507.13334v2
- Simon Willison: https://simonwillison.net/2025/Jun/27/context-engineering
- Phil Schmid: https://www.philschmid.de/context-engineering
- Drew Breunig: https://www.dbreunig.com/2025/06/22/how-contexts-fail-and-how-to-fix-them.html
- Drew Breunig: https://www.dbreunig.com/2025/06/26/how-to-fix-your-context.html
- https://github.com/Meirtz/Awesome-Context-Engineering

## Related

- [[Types of Context for AI Agents]]
- [[Context-Understanding-Generation Asymmetry]]
- [[Prompt Engineering]]
- [[Large Language Models (LLMs)]]
- [[AI Agents]]
- [[Retrieval-Augmented Generation (RAG)]]
- [[RAG Pipelines]]
- [[Model Context Protocol (MCP)]]
- [[AI Agent Skills]]
- [[Context7]]
- [[Headroom]]
- [[Claude Code]]
- [[Agentic Knowledge Management (AKM)]]
- [[Systems thinking]]
- [[Feedback Loop]]
- [[Embeddings]]
- [[Vector Store]]
- [[AI Context Rot]]
- [[Progressive Disclosure]]
- [[Prompt Lazy Loading AI Design Pattern (PLL)]]
- [[Receptionist AI Design Pattern]]
- [[Context Window]]
- [[Token Budget]]
- [[Context Bloat]]
- [[Context Drift]]
- [[Context Hygiene]]
- [[Context Compression]]
- [[AI Agent Orchestration]]
- [[Lazy Loading]]
- [[Separation of Concerns]]
- [[Knowledge Decay]]
- [[Context Anchoring]]
- [[AI Instruction Drift]]
- [[Harness Engineering]]
- [[Intent Engineering]]
- [[Agent System Engineering]]
- [[Personal Context Management (PCM)]]
- [[Team Context Management (TCM)]]
- [[Enterprise Context Management (ECM)]]
- [[AI Master Prompt]]
- [[Levels of AI use]]
- [[Levels of AI Context Management]]
- [[Vibe Coding]]
- [[Context Lifecycle]]
- [[Context Entropy]]
- [[Context Budget]]
- [[Context Layering]]
- [[Context Signal-to-Noise Ratio]]
- [[Context Provenance]]
- [[Context Inheritance]]
- [[Context Isolation]]
- [[Agentic Context Engineering]]
- [[Context Management Maturity Model]]
- [[AI Context Governance]]
- [[Context-as-Code]]
- [[Context Poisoning]]
- [[Context Distraction]]
- [[Context Confusion]]