# Harness Engineering

Harness Engineering is the discipline of designing constraints, tools, feedback loops, documentation, and verification systems that guide AI agents toward reliable, maintainable outputs. Where [[Context Engineering]] focuses on *what information reaches the model*, and [[Prompt Engineering]] focuses on *how you phrase the question*, harness engineering focuses on *how the agent runs*: the environment, guardrails, and feedback mechanisms that surround its execution.

The term was coined in late 2025 and formalized in early 2026, notably through OpenAI's internal experiments with [[OpenAI Codex|Codex]], where they built a production application with over 1 million lines of code written entirely by agents. The key insight: the agent wasn't the hard part. The harness was.

## Core components

According to the OpenAI model (via Birgitta Böckeler's analysis on Martin Fowler's site), a harness operates across three layers:

1. **Context Engineering**: the foundational layer. Continuous enhancement of knowledge bases embedded in codebases, supplemented by agent access to dynamic information sources. This is where [[Context Engineering]] sits *within* the harness.
2. **Architectural Constraints**: enforcement mechanisms combining LLM-based monitoring with deterministic custom linters and structural tests. These constrain the agent's solution space.
3. **Periodic Maintenance ("Garbage Collection")**: agents that regularly scan for documentation inconsistencies and architectural violations to combat entropy and code decay.

## Relationship to other disciplines

| Discipline | Focus | Core Question |
|---|---|---|
| [[Prompt Engineering]] | Wording and phrasing | "What should I say?" |
| [[Context Engineering]] | Information environment | "What information is relevant?" |
| **Harness Engineering** | Execution environment | "How should the agent run?" |
| [[Intent Engineering]] | Goals and outcomes | "What must be accomplished?" |
| [[Agent System Engineering]] | Multi-agent systems | "How do agents work together?" |

Harness engineering builds on prompt and context engineering but shifts to execution optimization. It addresses the core limitation of pure context engineering: agents still drift, accumulate entropy, and fail in long-running tasks without structured guardrails.

## Key principles

- **Constrain the solution space**: reliability requires limiting flexibility through standardized patterns, not unrestricted generation
- **Iterative refinement**: when agents struggle, gaps in documentation, guardrails, or tools become signals for improvement
- **Long-term maintainability**: emphasis on internal quality preservation over short-term velocity
- **Feedback loops**: closed-loop failure tracking and pattern clustering to systematically improve agent behavior
- **Hybrid verification**: combine deterministic tools (linters, type checkers) with AI-driven validation

## Practical examples

- [[Claude Code]] with CLAUDE.md files, skills, hooks, and [[Model Context Protocol (MCP)|MCP]] servers providing task-specific scaffolding
- [[AI Agent Harness]] implementations like Cursor, Cline, Aider
- CI/CD pipelines that validate agent-generated code before merge
- Architectural decision records that constrain agent design choices

## References

- https://openai.com/index/harness-engineering/
- https://martinfowler.com/articles/exploring-gen-ai/harness-engineering.html
- https://blog.langchain.com/the-anatomy-of-an-agent-harness/
- https://www.agent-engineering.dev/article/harness-engineering-in-2026-the-discipline-that-makes-ai-agents-production-ready

## Related

- [[Context Engineering]]
- [[Prompt Engineering]]
- [[Intent Engineering]]
- [[Agent System Engineering]]
- [[AI Agent Harness]]
- [[AI Agents]]
- [[AI Agent Skills]]
- [[AI Agent Orchestration]]
- [[Claude Code]]
- [[How coding agents work]]
- [[Agentic Engineering]]
- [[Feedback Loop]]
- [[Levels of AI use]]
- [[SOLID Principles]]
- [[Software Design Patterns for AI Skills and Agents]]
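
## Sketch: a deterministic architectural constraint

The "Architectural Constraints" layer mentions deterministic custom linters and structural tests. As a rough illustration only, the sketch below checks agent-generated Python source against a layering rule and returns messages an agent could read and act on. The layer names, the `ALLOWED_IMPORTS` table, and the `check_layering` function are all hypothetical, invented for this example; they are not taken from OpenAI's harness or any tool named in this note.

```python
import ast

# Hypothetical layering rule: each layer may import only from the layers listed.
ALLOWED_IMPORTS = {
    "ui": {"ui", "service"},
    "service": {"service", "domain"},
    "domain": {"domain"},
}


def check_layering(layer: str, source: str) -> list[str]:
    """Return one message per import that crosses a forbidden layer boundary."""
    allowed = ALLOWED_IMPORTS[layer]
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules = [node.module]
        else:
            continue
        for module in modules:
            top = module.split(".")[0]
            # Only police imports between known layers; other modules pass.
            if top in ALLOWED_IMPORTS and top not in allowed:
                violations.append(
                    f"line {node.lineno}: layer '{layer}' must not import "
                    f"'{module}'; allowed layers: {sorted(allowed)}"
                )
    return violations


# Example: a file in the 'domain' layer that reaches up into 'ui'.
agent_output = "import json\nfrom ui.widgets import Button\nimport domain.models\n"
for message in check_layering("domain", agent_output):
    print(message)
```

Because the check is deterministic, it can run in CI or as a pre-merge hook, and its messages can be fed back to the agent as a correction signal. That is one plausible way to pair the "constrain the solution space" and "feedback loops" principles, under the assumptions stated above.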