# AI Agents

AI Agents are autonomous software systems powered by [[Large Language Models (LLMs)]] that can perceive their environment, make decisions, and take actions to achieve goals. Unlike simple chatbots that respond to single prompts, agents operate in *loops*: observing, reasoning, acting, and "learning" from results.

The key distinction is autonomy: agents can break down complex tasks, use tools, and iterate until objectives are met without constant human intervention.

## Core Components

### Perception

How the agent understands its environment:

- Reading files and codebases
- Browsing the web
- Receiving user instructions
- Observing tool outputs and errors
- ...

### Reasoning

The "brain" that decides what to do:

- Goal decomposition (breaking large tasks into steps)
- Planning and strategy selection
- Error analysis and recovery
- Context management across interactions

### Action

Tools and capabilities the agent can use:

- File operations (read, write, edit)
- Code execution (shell commands, scripts)
- API calls and web requests
- Communication with users or other agents

### Memory

How agents maintain context:

- **Short-term**: Current conversation/task context
- **Long-term**: Persistent storage across sessions (e.g., [[Beads]])
- **Episodic**: Logs of past actions and outcomes

## Agent Architectures

### ReAct (Reasoning + Acting)

Interleaves reasoning traces with actions:

```
Thought: I need to find the bug in the authentication code
Action: Search for "authentication" in the codebase
Observation: Found 3 files...
Thought: The error is likely in auth.js based on the stack trace
Action: Read auth.js
...
```

### Plan-and-Execute

Creates a full plan before acting:

1. Analyze the task
2. Generate step-by-step plan
3. Execute each step
4. Verify results

### Reflexion

Agents that learn from mistakes:

- Attempt task
- Evaluate outcome
- Reflect on failures
- Retry with improved approach

## Agent Tools

Agents extend LLM capabilities through tools:

| Tool Type | Examples |
|-----------|----------|
| File System | Read, write, edit, search files |
| Code Execution | Run shell commands, scripts |
| Web | Fetch URLs, search, browse |
| APIs | Database queries, external services |
| Communication | Ask user questions, send notifications |

## Coding Agents

AI agents specialized for software development:

- **[[Claude Code]]**: Anthropic's agentic coding CLI
- **[[Cursor.com]]**: AI-first code editor
- **[[GitHub Copilot]] Workspace**: Task-based coding agent
- **Aider**: Terminal-based coding assistant
- **OpenHands** (formerly OpenDevin): Open-source autonomous software engineering agent

### Personal AI Assistants

- **[[Clawdbot]]**: Self-hosted assistant with messaging app integration

### Supporting Infrastructure

- **[[Beads]]**: Persistent task tracking for agents
- **[[Beads Viewer]]**: Visualize agent task graphs
- **[[Ralph TUI]]**: Orchestrate agent loops autonomously

## Agent Patterns

### Tool Use Loop

```
while task_not_complete:
    observe() → think() → select_tool() → act() → evaluate()
```

### Hierarchical Agents

- **Orchestrator**: High-level planning and delegation
- **Workers**: Specialized agents for specific tasks
- **Verifiers**: Check work quality

### Human-in-the-Loop

- Agent proposes actions
- Human approves or rejects
- Agent learns from feedback

## Challenges

- **Hallucination**: Agents may invent facts or capabilities
- **Context limits**: Long tasks exceed context windows
- **Error propagation**: Mistakes compound over iterations
- **Cost**: Extended agent runs consume many tokens
- **Safety**: Autonomous actions require guardrails

## Evaluation

How to measure agent performance:

- **Task completion rate**: Did it achieve the goal?
- **Efficiency**: Steps/tokens required
- **Error recovery**: How well it handles failures
- **Safety**: Did it avoid harmful actions?

## References

- https://en.wikipedia.org/wiki/Intelligent_agent
- ReAct paper: "ReAct: Synergizing Reasoning and Acting in Language Models"

## Related

- [[AI Agent Swarms]]
- [[Claude Code]]
- [[Claude Managed Agents]]
- [[Clawdbot]]
- [[Mastra AI]]
- [[Beads]]
- [[Beads Viewer]]
- [[Ralph TUI]]
- [[Large Language Models (LLMs)]]
- [[LangChain]]
- [[LangGraph]]
- [[Ralph Loop]]
- [[Ralph Wiggum Technique]]
- [[Retrieval-Augmented Generation (RAG)]]
- [[Context Engineering]]
- [[Types of Context for AI Agents]]
- [[Agentic Engineering]]
- [[How coding agents work]]
- [[AI Subagents]]
- [[Agentic TDD]]
- [[Code is cheap, quality is not]]
- [[AI Agents Web Browsing]]
- [[Browser Use]]
- [[Vercel Agent Browser]]
- [[Walden Yan]]: Cognition AI co-founder, writes about coding agents
- [[Romain Huet]]: OpenAI DX lead, AI agent demos
- [[Microsoft AI Agent Governance Toolkit]]: open-source governance plane (policy, identity, sandboxing, SRE, compliance)
- [[AI Governance]]
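
## Example: Tool Use Loop

The tool use loop listed under Agent Patterns can be illustrated with a small, self-contained sketch. Everything here is hypothetical: `Step`, `run_agent`, `scripted_policy`, the two toy tools, and the in-memory `FILES` dict are made up for illustration, and the "model" is a hard-coded function standing in for an LLM call. It is meant to show the shape of the loop under those assumptions, not how any particular framework implements it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Step:
    """One decision from the policy: call a tool, or finish with an answer."""
    tool: Optional[str] = None
    arg: str = ""
    answer: Optional[str] = None


# Hypothetical environment: an in-memory "codebase" instead of a real file system.
FILES = {"auth.js": "function login(user) { return user.token }"}


def search(query: str) -> str:
    """Toy search tool: return file names containing the query string."""
    hits = [name for name in FILES if query in name]
    return ", ".join(hits) or "no matches"


def read_file(name: str) -> str:
    """Toy read tool: return file contents, or an error string the agent can observe."""
    return FILES.get(name, f"error: {name} not found")


TOOLS: Dict[str, Callable[[str], str]] = {"search": search, "read_file": read_file}


def scripted_policy(goal: str, history: List[str]) -> Step:
    """Stand-in for the LLM: chooses the next step from the number of observations so far."""
    if len(history) == 0:
        return Step(tool="search", arg="auth")
    if len(history) == 1:
        return Step(tool="read_file", arg="auth.js")
    return Step(answer="inspected auth.js")


def run_agent(
    goal: str,
    policy: Callable[[str, List[str]], Step],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> Tuple[Optional[str], List[str]]:
    """Run think -> select tool -> act -> observe until an answer or the step budget runs out."""
    history: List[str] = []
    for _ in range(max_steps):
        step = policy(goal, history)  # think and select a tool
        if step.answer is not None:
            return step.answer, history  # task considered complete
        tool = tools.get(step.tool or "")
        if tool is None:
            # Unknown tools become observations rather than crashes, so the policy can recover.
            observation = f"error: unknown tool {step.tool!r}"
        else:
            observation = tool(step.arg)  # act
        history.append(f"{step.tool}({step.arg}) -> {observation}")  # observe
    return None, history  # budget exhausted without an answer


answer, trace = run_agent("find the bug in the authentication code", scripted_policy, TOOLS)
for line in trace:
    print(line)
print("answer:", answer)
```

Capping `max_steps` is one simple guard against the cost and error-propagation issues listed under Challenges: a policy that never produces an answer stops after a fixed number of tool calls instead of looping indefinitely.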