How Coding Agents Work - DeveloPassion

# How Coding Agents Work

Coding agents are [[AI Agents]] specialized for software engineering tasks. They combine an [[Large Language Models (LLMs)|LLM]] with a coding harness that manages context, tools, and execution flow. The harness, not the model, is the primary differentiator between competing systems.

## Three-Layer Architecture

1. **Model family** (the engine): the [[Large Language Models (LLMs)|LLM]] or [[AI Reasoning Models|reasoning model]] that generates text and tool calls
2. **Agent loop** (observe, inspect, choose, act): the [[Agentic loops|agentic loop]] that drives autonomous behavior
3. **Runtime / harness** (the plumbing): the [[AI Agent Harness]] infrastructure that shapes behavior

## Six Core Components

Based on Sebastian Raschka's analysis of systems like [[Claude Code]] and Codex CLI:

### 1. Live Repo Context

The harness collects workspace summaries, git branch/status, project documentation, and file layout to ground the agent in the current codebase. This is a form of [[Context Engineering]] applied to code.

### 2. Prompt Shape and Cache Reuse

Building a stable prompt prefix enables [[AI KV Cache]] reuse across turns, reducing latency and cost. Session state and short-term memory are integrated into the prompt structure. Relates to [[Context Compression]] and [[AI Cost Management]].

### 3. Tool Access and Use

The agent emits structured [[AI Tool Use|tool calls]] validated by the harness. Permission gating, approval workflows, path containment checks, and bounded execution scope enforce [[AI Agent Permissions]] at the tool level. This implements the [[Least Privilege Principle]].

### 4. Minimizing Context Bloat

Practical techniques to combat [[Context Bloat]]: clipping verbose tool output, deduplicating repeated file reads, summarizing transcripts, and applying recency-weighted compression where older context is compressed more aggressively. Maintains a high [[Context Signal-to-Noise Ratio]].

### 5. Structured Session Memory

Three distinct layers of [[AI Agent Memory]]:

- **Working memory**: small, distilled, explicitly maintained state
- **Full transcript**: complete interaction history for session resumption
- **Durable state**: JSON-based persistence and event recording for recovery

### 6. Delegation with Bounded Subagents

[[AI Subagents]] handle parallel subtasks with explicit boundaries: context inheritance rules, restriction scopes, recursion depth limits, and read-only modes. This is [[AI Agent Orchestration]] at the harness level.

## Key Insight

Vanilla [[Large Language Models (LLMs)|LLM]] capabilities are increasingly homogeneous across providers. The [[AI Agent Harness]] quality, specifically how well it manages [[Context Engineering]], determines practical performance differences. [[Harness Engineering]] is where the real differentiation happens.

## References

- https://magazine.sebastianraschka.com/p/components-of-a-coding-agent

## Related

- [[AI Agents]]
- [[AI Agent Harness]]
- [[Harness Engineering]]
- [[Context Engineering]]
- [[Agentic Engineering]]
- [[Agent System Engineering]]
- [[Claude Code]]
- [[AI Agent Memory]]
- [[AI Subagents]]
- [[AI Agent Permissions]]
- [[Loop Engineering]]
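
To make two of the harness responsibilities above more concrete (validating tool calls with a path containment check, and clipping verbose tool output), here is a minimal Python sketch. It is an illustration under stated assumptions, not how Claude Code, Codex CLI, or any other real harness is implemented: the `ToolCall` type, the `ALLOWED_TOOLS` set, the `validate` and `clip` helpers, and the 200-character budget are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional

# Hypothetical settings, chosen only for this illustration.
ALLOWED_TOOLS = {"read_file"}   # a read-only tool set (least privilege)
MAX_OUTPUT_CHARS = 200          # budget for tool output kept in context


@dataclass
class ToolCall:
    """A structured tool call as the model might emit it (assumed shape)."""
    name: str
    path: str


def is_contained(workspace: Path, candidate: str) -> bool:
    """Return True if `candidate` resolves to a location inside `workspace`."""
    root = workspace.resolve()
    # Joining with an absolute candidate discards `root`, so absolute paths
    # outside the workspace are rejected by the check below as well.
    target = (root / candidate).resolve()
    return target == root or root in target.parents


def validate(call: ToolCall, workspace: Path) -> Optional[str]:
    """Return a rejection reason, or None if the call may run."""
    if call.name not in ALLOWED_TOOLS:
        return f"tool not permitted: {call.name}"
    if not is_contained(workspace, call.path):
        return f"path escapes workspace: {call.path}"
    return None


def clip(text: str, limit: int = MAX_OUTPUT_CHARS) -> str:
    """Keep at most `limit` characters and note how many were dropped."""
    if len(text) <= limit:
        return text
    omitted = len(text) - limit
    return text[:limit] + f"\n[... {omitted} characters clipped ...]"
```

In a sketch like this, the harness would call `validate` before executing any tool call and pass the tool's raw output through `clip` before appending it to the prompt. The model never touches the filesystem directly, which is what lets the harness enforce permissions and keep context small regardless of which model is plugged in.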