# How Coding Agents Work
Coding agents are [[AI Agents]] specialized for software engineering tasks. They combine an [[Large Language Models (LLMs)|LLM]] with a coding harness that manages context, tools, and execution flow. The harness, not the model, is the primary differentiator between competing systems.
## Three-Layer Architecture
1. **Model family** (the engine): the [[Large Language Models (LLMs)|LLM]] or [[AI Reasoning Models|reasoning model]] that generates text and tool calls
2. **Agent loop** (observe, inspect, choose, act): the [[Agentic loops|agentic loop]] that drives autonomous behavior
3. **Runtime / harness** (the plumbing): the [[AI Agent Harness]] infrastructure that shapes behavior
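To make the middle layer concrete, here is a minimal sketch of an agent loop in Python. It is not taken from any real harness: `Step`, `run_agent_loop`, and `scripted_model` are hypothetical names, and the "model" is a scripted stand-in for an LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Step:
    """One model decision: call a tool, or finish with an answer."""
    tool: Optional[str] = None  # None means the model is done
    argument: str = ""
    answer: str = ""


def run_agent_loop(
    model: Callable[[List[str]], Step],
    tools: Dict[str, Callable[[str], str]],
    task: str,
    max_steps: int = 8,
) -> str:
    """Ask the model for a step, run the tool, feed the result back, repeat."""
    history = [f"task: {task}"]
    for _ in range(max_steps):
        step = model(history)  # choose
        if step.tool is None:
            return step.answer
        handler = tools.get(step.tool)
        if handler is None:
            observation = f"error: unknown tool {step.tool!r}"
        else:
            observation = handler(step.argument)  # act
        # The observation becomes part of what the model sees next turn.
        history.append(f"{step.tool}({step.argument}) -> {observation}")
    return "stopped: step budget exhausted"


def scripted_model(history: List[str]) -> Step:
    """Stand-in for an LLM: read one file, then finish."""
    if len(history) == 1:
        return Step(tool="read_file", argument="README.md")
    return Step(answer=f"done after {len(history) - 1} tool call(s)")


result = run_agent_loop(scripted_model, {"read_file": lambda path: "contents"}, "summarize")
```

The step budget is one simple way a harness can bound an otherwise open-ended loop.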
## Six Core Components
Based on Sebastian Raschka's analysis of systems like [[Claude Code]] and Codex CLI:
### 1. Live Repo Context
The harness collects workspace summaries, git branch/status, project documentation, and file layout to ground the agent in the current codebase. This is a form of [[Context Engineering]] applied to code.
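A minimal sketch of this kind of context collection, assuming a local checkout and an optional `git` binary on the path. The function name, the `DOC_NAMES` list, and the size limits are illustrative assumptions, not the behavior of any particular tool.

```python
import subprocess
from pathlib import Path

# Assumed set of documentation files worth surfacing; real harnesses differ.
DOC_NAMES = ("README.md", "AGENTS.md", "CONTRIBUTING.md")


def _git(root, *args):
    """Run a git command in root; return stdout, or None if git is unavailable or fails."""
    try:
        result = subprocess.run(
            ["git", "-C", str(root), *args],
            capture_output=True, text=True, timeout=5,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout.strip() if result.returncode == 0 else None


def collect_repo_context(root, max_files=50, max_doc_chars=500):
    """Gather branch, status, file layout, and doc excerpts for the prompt."""
    root = Path(root)
    files = sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and ".git" not in p.relative_to(root).parts
    )
    docs = {}
    for name in DOC_NAMES:
        path = root / name
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="replace")
            docs[name] = text[:max_doc_chars]  # excerpt only, to limit prompt size
    return {
        "branch": _git(root, "rev-parse", "--abbrev-ref", "HEAD"),
        "status": _git(root, "status", "--short"),
        "files": files[:max_files],
        "docs": docs,
    }
```

Calling `collect_repo_context(".")` returns a plain dictionary that can be rendered into the prompt; `branch` and `status` are `None` outside a git repository.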
### 2. Prompt Shape and Cache Reuse
Building a stable prompt prefix enables [[AI KV Cache]] reuse across turns, reducing latency and cost. Session state and short-term memory are integrated into the prompt structure, placed after the stable prefix so that per-turn changes do not invalidate it. This relates to [[Context Compression]] and [[AI Cost Management]].
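The ordering idea can be sketched as follows: content that rarely changes goes first, per-turn content goes last. Actual cache reuse happens on the provider or inference-server side; this sketch only shows prompt assembly, and all names in it (`build_prompt`, `prefix_fingerprint`, the sample strings) are hypothetical.

```python
import hashlib


def build_prompt(system_rules, tool_specs, repo_summary, transcript, user_message):
    """Return (stable_prefix, full_prompt) with volatile content appended last."""
    # Sorting tool specs keeps the prefix byte-identical regardless of input order.
    stable_prefix = "\n".join([system_rules, *sorted(tool_specs), repo_summary])
    volatile_suffix = "\n".join([*transcript, user_message])
    return stable_prefix, stable_prefix + "\n" + volatile_suffix


def prefix_fingerprint(prefix):
    """Short hash for checking whether the prefix changed between turns."""
    return hashlib.sha256(prefix.encode("utf-8")).hexdigest()[:12]


rules = "You are a coding agent."
repo = "repo: demo, branch: main"
prefix1, full1 = build_prompt(rules, ["tool:b", "tool:a"], repo, [], "fix the bug")
prefix2, full2 = build_prompt(
    rules, ["tool:a", "tool:b"], repo,
    ["user: fix the bug", "agent: done"], "add a test",
)
same_prefix = prefix_fingerprint(prefix1) == prefix_fingerprint(prefix2)
```

Here `same_prefix` is true even though the transcript grew and the tool list arrived in a different order, which is the property prefix caching depends on.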
### 3. Tool Access and Use
The agent emits structured [[AI Tool Use|tool calls]] validated by the harness. Permission gating, approval workflows, path containment checks, and bounded execution scope enforce [[AI Agent Permissions]] at the tool level. This implements the [[Least Privilege Principle]].
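As an illustration of permission gating and path containment, the sketch below checks a proposed tool call before it runs. The tool names, the three-way `allow` / `ask` / `deny` result, and the function names are assumptions made for the example.

```python
from pathlib import Path

# Hypothetical tool sets; a real harness defines its own.
READ_ONLY_TOOLS = {"read_file", "list_dir"}
NEEDS_APPROVAL = {"write_file", "delete_file"}


def is_inside(workspace, candidate):
    """True if candidate resolves to the workspace root or something beneath it."""
    root = Path(workspace).resolve()
    target = (root / candidate).resolve()  # collapses ".." and follows symlinks
    return target == root or root in target.parents


def check_tool_call(workspace, tool, path, approved=False):
    """Decide whether a tool call may run: 'allow', 'ask: ...', or 'deny: ...'."""
    if tool not in READ_ONLY_TOOLS | NEEDS_APPROVAL:
        return "deny: unknown tool"
    if not is_inside(workspace, path):
        return "deny: path escapes workspace"
    if tool in NEEDS_APPROVAL and not approved:
        return "ask: approval required"
    return "allow"
```

Resolving the path before comparing it is what defeats `../` traversal and absolute paths such as `/etc/passwd`; a plain string prefix check would not.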
### 4. Minimizing Context Bloat
Practical techniques to combat [[Context Bloat]] include clipping verbose tool output, deduplicating repeated file reads, summarizing transcripts, and applying recency-weighted compression, where older context is compressed more aggressively than recent context. Together these maintain a high [[Context Signal-to-Noise Ratio]].
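Three of these techniques are simple enough to sketch directly. The function names, the event tuple shape, and the character budgets below are illustrative assumptions; summarization is omitted because it needs a model call.

```python
def clip_output(text, max_chars=200):
    """Keep the head and tail of long output and mark what was dropped."""
    if len(text) <= max_chars:
        return text
    head = max_chars // 2
    tail = max_chars - head
    omitted = len(text) - max_chars
    return f"{text[:head]}\n[... {omitted} chars clipped ...]\n{text[len(text) - tail:]}"


def dedupe_file_reads(events):
    """events: list of (kind, key, payload). Keep only the latest read of each file."""
    last_read = {}
    for index, (kind, key, _payload) in enumerate(events):
        if kind == "read":
            last_read[key] = index
    compacted = []
    for index, (kind, key, payload) in enumerate(events):
        if kind == "read" and last_read[key] != index:
            payload = "[superseded by a later read]"
        compacted.append((kind, key, payload))
    return compacted


def compress_by_recency(messages, keep_recent=2, old_budget=40, recent_budget=400):
    """Give older messages a smaller character budget than recent ones."""
    cutoff = len(messages) - keep_recent
    return [
        clip_output(message, old_budget if index < cutoff else recent_budget)
        for index, message in enumerate(messages)
    ]
```

Keeping both the head and the tail of clipped output is a deliberate choice in this sketch: command output often puts the invocation at the top and the error at the bottom.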
### 5. Structured Session Memory
Three distinct layers of [[AI Agent Memory]]:
- **Working memory**: small, distilled, explicitly maintained state
- **Full transcript**: complete interaction history for session resumption
- **Durable state**: JSON-based persistence and event recording for recovery
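A minimal sketch of the three layers, assuming a single JSON snapshot file plus an append-only JSON Lines event log. The class name, method names, and file layout are hypothetical and chosen only to show how the layers can be kept separate.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class SessionMemory:
    working: dict = field(default_factory=dict)     # small, distilled state
    transcript: list = field(default_factory=list)  # complete history

    def note(self, key, value):
        """Explicitly update working memory."""
        self.working[key] = value

    def record(self, role, text):
        """Append one turn to the full transcript."""
        self.transcript.append({"role": role, "text": text})

    def save(self, path):
        """Durable state: write a JSON snapshot for later resumption."""
        snapshot = {"working": self.working, "transcript": self.transcript}
        Path(path).write_text(json.dumps(snapshot, indent=2), encoding="utf-8")

    @classmethod
    def load(cls, path):
        """Rebuild a session from a snapshot written by save()."""
        data = json.loads(Path(path).read_text(encoding="utf-8"))
        return cls(working=data["working"], transcript=data["transcript"])


def log_event(log_path, event):
    """Append one event as a JSON line, so a crash loses at most the last write."""
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(event) + "\n")
```

Only the working memory needs to appear in every prompt; the transcript and event log can stay on disk until a session is resumed or recovered.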
### 6. Delegation with Bounded Subagents
[[AI Subagents]] handle parallel subtasks with explicit boundaries: context inheritance rules, restriction scopes, recursion depth limits, and read-only modes. This is [[AI Agent Orchestration]] at the harness level.
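These boundaries can be modeled as a policy object that each child derives from its parent. The sketch below is one possible shape, with hypothetical names (`SubagentPolicy`, `spawn_child`) and an assumed default tool set; it is not the delegation mechanism of any specific product.

```python
from dataclasses import dataclass

WRITE_TOOLS = frozenset({"write_file"})  # assumed set of mutating tools


@dataclass(frozen=True)
class SubagentPolicy:
    depth: int = 0
    max_depth: int = 2
    read_only: bool = False
    allowed_tools: frozenset = frozenset({"read_file", "search", "write_file"})
    inherited_context: tuple = ()


def spawn_child(parent, task_context, read_only=True, inherit_last=2):
    """Derive a child policy that is never more permissive than its parent."""
    if parent.depth + 1 > parent.max_depth:
        raise RuntimeError("recursion depth limit reached")
    # A read-only parent cannot create a writable child.
    child_read_only = parent.read_only or read_only
    tools = parent.allowed_tools - WRITE_TOOLS if child_read_only else parent.allowed_tools
    # Context inheritance rule: pass down only the most recent items plus the task.
    start = max(0, len(parent.inherited_context) - inherit_last)
    inherited = parent.inherited_context[start:] + (task_context,)
    return SubagentPolicy(
        depth=parent.depth + 1,
        max_depth=parent.max_depth,
        read_only=child_read_only,
        allowed_tools=tools,
        inherited_context=inherited,
    )
```

Making the policy immutable and deriving restrictions monotonically means a subagent cannot regain a permission that an ancestor gave up.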
## Key Insight
Vanilla [[Large Language Models (LLMs)|LLM]] capabilities are increasingly homogeneous across providers. The quality of the [[AI Agent Harness]], specifically how well it handles [[Context Engineering]], determines practical performance differences. [[Harness Engineering]] is where the real differentiation happens.
## References
- https://magazine.sebastianraschka.com/p/components-of-a-coding-agent
## Related
- [[AI Agents]]
- [[AI Agent Harness]]
- [[Harness Engineering]]
- [[Context Engineering]]
- [[Agentic Engineering]]
- [[Agent System Engineering]]
- [[Claude Code]]
- [[AI Agent Memory]]
- [[AI Subagents]]
- [[AI Agent Permissions]]