# Harness Engineering
Harness Engineering is the discipline of designing constraints, tools, feedback loops, documentation, and verification systems that guide AI agents toward reliable, maintainable outputs. Where [[Context Engineering]] focuses on *what information reaches the model* and [[Prompt Engineering]] on *how you phrase the question*, harness engineering focuses on *how the agent runs*: the environment, guardrails, and feedback mechanisms that surround its execution.
The term was coined in late 2025 and formalized in early 2026, notably through OpenAI's internal experiments with [[OpenAI Codex|Codex]], in which the team built a production application of over 1 million lines of code, written entirely by agents. The key insight: the agent wasn't the hard part. The harness was.
## Core components
In the OpenAI model, as summarized in Birgitta Böckeler's analysis on Martin Fowler's site, a harness operates across three layers:
1. **Context Engineering**: the foundational layer. Knowledge bases embedded in the codebase are continuously improved and supplemented by agent access to dynamic information sources. This is where [[Context Engineering]] sits *within* the harness.
2. **Architectural Constraints**: enforcement mechanisms that combine LLM-based monitoring with deterministic custom linters and structural tests. These constrain the agent's solution space.
3. **Periodic Maintenance ("Garbage Collection")**: agents that regularly scan for documentation inconsistencies and architectural violations to combat entropy and code decay.
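To make the second layer concrete, here is a minimal sketch of a deterministic structural test: a check that flags imports crossing a layer boundary. The layer names (`ui`, `domain`, `db`) and the `FORBIDDEN_IMPORTS` rule table are hypothetical, chosen for illustration only; they are not taken from the OpenAI write-up.

```python
import ast

# Hypothetical layering rule: code in the key package must not import
# from any of the listed top-level packages.
FORBIDDEN_IMPORTS = {
    "ui": {"db"},            # UI code should go through a service layer
    "domain": {"ui", "db"},  # domain logic stays free of I/O concerns
}


def find_violations(package: str, source: str) -> list[str]:
    """Return one message per absolute import in `source` that breaks the rule."""
    forbidden = FORBIDDEN_IMPORTS.get(package, set())
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            imported = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import; relative imports are skipped
            imported = [node.module]
        else:
            continue
        for name in imported:
            if name.split(".")[0] in forbidden:
                violations.append(
                    f"line {node.lineno}: '{package}' must not import '{name}'"
                )
    return violations
```

Run in a test suite or pre-merge check, a function like this gives the agent a precise, repeatable error message instead of relying on a reviewer to notice the boundary violation.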
## Relationship to other disciplines
| Discipline | Focus | Core Question |
|---|---|---|
| [[Prompt Engineering]] | Wording and phrasing | "What should I say?" |
| [[Context Engineering]] | Information environment | "What information is relevant?" |
| **Harness Engineering** | Execution environment | "How should the agent run?" |
| [[Intent Engineering]] | Goals and outcomes | "What must be accomplished?" |
| [[Agent System Engineering]] | Multi-agent systems | "How do agents work together?" |
Harness engineering builds on prompt and context engineering but shifts the focus to the execution environment. It addresses the core limitation of pure context engineering: without structured guardrails, agents still drift, accumulate entropy, and fail in long-running tasks.
## Key principles
- **Constrain the solution space**: reliability comes from limiting the agent to standardized patterns rather than allowing unrestricted generation
- **Iterative refinement**: when an agent struggles, treat it as a signal that documentation, guardrails, or tools have a gap to fix
- **Long-term maintainability**: prioritize preserving internal quality over short-term velocity
- **Feedback loops**: track failures in a closed loop and cluster recurring patterns to improve agent behavior systematically
- **Hybrid verification**: combine deterministic tools (linters, type checkers) with AI-driven validation
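The feedback-loop principle can be sketched as a small failure-clustering step. This is one possible approach, assuming failures arrive as plain-text messages; the normalization rules and the function names `failure_signature` and `cluster_failures` are illustrative, not part of any cited tool.

```python
import re
from collections import Counter


def failure_signature(message: str) -> str:
    """Normalize a failure message so similar failures share one signature."""
    sig = message.lower()
    sig = re.sub(r"(['\"]).*?\1", "<str>", sig)   # quoted names
    sig = re.sub(r"\S*[/\\]\S+", "<path>", sig)   # file paths
    sig = re.sub(r"\d+", "<n>", sig)              # line numbers, counts
    return re.sub(r"\s+", " ", sig).strip()


def cluster_failures(messages):
    """Group messages by signature, most frequent cluster first."""
    return Counter(failure_signature(m) for m in messages).most_common()


failures = [
    "src/ui/cart.py:14: 'ui' must not import 'db.models'",
    "src/ui/checkout.py:88: 'ui' must not import 'db.session'",
    "Test test_total failed after 3 retries",
]
for signature, count in cluster_failures(failures):
    print(count, signature)
```

A cluster that keeps growing points at a gap in the harness (a missing rule in the docs, an unclear linter message) rather than at a one-off agent mistake.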
## Practical examples
- [[Claude Code]] with CLAUDE.md files, skills, hooks, and [[Model Context Protocol (MCP)|MCP]] servers providing task-specific scaffolding
- [[AI Agent Harness]] implementations such as Cursor, Cline, and Aider
- CI/CD pipelines that validate agent-generated code before merge
- Architectural decision records that constrain agent design choices
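As a rough illustration of the CI/CD item above, a pre-merge gate can be as simple as running a list of deterministic checks and blocking the merge if any fail. The sketch below assumes each check is an external command; the `run_gate` helper and the check names are hypothetical.

```python
import subprocess


def run_gate(checks):
    """Run each (name, command) check in order; return the names that failed."""
    failed = []
    for name, command in checks:
        try:
            result = subprocess.run(command, capture_output=True, text=True)
            passed = result.returncode == 0
        except FileNotFoundError:
            passed = False  # a missing tool counts as a failed check
        if not passed:
            failed.append(name)
    return failed
```

A CI job might call `run_gate([("lint", ["ruff", "check", "."]), ("tests", ["pytest", "-q"])])` and exit non-zero when the returned list is non-empty; the specific tools are an assumption and would be whatever the project already uses.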
## References
- https://openai.com/index/harness-engineering/
- https://martinfowler.com/articles/exploring-gen-ai/harness-engineering.html
- https://blog.langchain.com/the-anatomy-of-an-agent-harness/
- https://www.agent-engineering.dev/article/harness-engineering-in-2026-the-discipline-that-makes-ai-agents-production-ready
## Related
- [[Context Engineering]]
- [[Prompt Engineering]]
- [[Intent Engineering]]
- [[Agent System Engineering]]
- [[AI Agent Harness]]
- [[AI Agents]]
- [[AI Agent Skills]]
- [[AI Agent Orchestration]]
- [[Claude Code]]
- [[How coding agents work]]
- [[Agentic Engineering]]
- [[Feedback Loop]]
- [[Levels of AI use]]
- [[SOLID Principles]]
- [[Software Design Patterns for AI Skills and Agents]]