# Ephemeral Environments

An ephemeral environment is a short-lived, automatically provisioned, automatically torn-down stack used for a specific, bounded purpose: reviewing a pull request, running an integration test, or letting an AI agent execute commands without polluting a host. The defining property is **lifecycle bound to a task**, not to a team or a developer.

The pattern unifies what used to be three separate things:

1. **Preview environments** in the PR review flow: spin up a copy of the app on every PR, tear it down on merge
2. **Test sandboxes** that hold a real database/service for end-to-end runs
3. **Agent execution boxes**: a fresh machine for an AI agent to run untrusted or destructive commands on

## Why They Matter Now

The rise of [[AI Agents]] reshaped the conversation. When an autonomous agent can run shell commands, the question "where does it run?" stops being a deployment detail and becomes a safety primitive. Running on the developer's host risks deleting their files. Running in a long-lived shared box risks contamination across runs. An *ephemeral* environment (isolated, time-bounded, cost-bounded, observable) gives the right blast radius.

This explains the recent surge in tools: [[Crabbox]] (remote testbox for humans + agents on [[Hetzner]] / AWS Spot), [[Vercel Sandboxes]] (microVMs for AI agents and untrusted code), [[Claude Managed Agents Environments]] (Anthropic-hosted), [[Codex Cloud]] (OpenAI-hosted). They are all instances of the same pattern.
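
To make "lifecycle bound to a task" concrete, here is a minimal Python sketch of an environment that is time-bounded, cost-bounded, and addressable by ID, with teardown guaranteed when the task ends. Everything in it is hypothetical and for illustration only: `EphemeralEnv`, `ephemeral_env`, the limit parameters, and the flat hourly rate are assumptions, not the API of any tool named above. The environment is an in-memory stand-in, so no real container or VM is provisioned.

```python
from __future__ import annotations

import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Callable, Iterator


@dataclass
class EphemeralEnv:
    """Hypothetical in-memory stand-in for a container, microVM, or fresh VM."""

    task: str
    ttl_seconds: float
    idle_timeout_seconds: float
    spend_cap: float      # maximum spend for this environment (assumed currency units)
    hourly_rate: float    # assumed flat price per hour
    clock: Callable[[], float]
    env_id: str = field(default_factory=lambda: uuid.uuid4().hex[:8])
    created_at: float = 0.0
    last_used_at: float = 0.0
    torn_down: bool = False
    log: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        self.created_at = self.last_used_at = self.clock()

    def cost(self) -> float:
        """Spend so far, computed from elapsed time and the flat hourly rate."""
        return (self.clock() - self.created_at) / 3600 * self.hourly_rate

    def expiry_reason(self) -> str | None:
        """Return which bound has been hit, or None if the env may keep running."""
        now = self.clock()
        if now - self.created_at >= self.ttl_seconds:
            return "ttl"
        if now - self.last_used_at >= self.idle_timeout_seconds:
            return "idle"
        if self.cost() >= self.spend_cap:
            return "spend_cap"
        return None

    def run(self, command: str) -> None:
        """Record a command against this environment's ID (nothing is executed)."""
        reason = "torn_down" if self.torn_down else self.expiry_reason()
        if reason is not None:
            raise RuntimeError(f"env {self.env_id} unavailable: {reason}")
        self.last_used_at = self.clock()
        self.log.append(f"[{self.env_id}] {command}")

    def teardown(self) -> None:
        self.torn_down = True


@contextmanager
def ephemeral_env(
    task: str,
    *,
    clock: Callable[[], float] = time.monotonic,
    ttl_seconds: float = 900.0,
    idle_timeout_seconds: float = 300.0,
    spend_cap: float = 1.0,
    hourly_rate: float = 0.5,
) -> Iterator[EphemeralEnv]:
    """Provision on demand; always tear down when the task ends, even on error."""
    env = EphemeralEnv(task, ttl_seconds, idle_timeout_seconds,
                       spend_cap, hourly_rate, clock)
    try:
        yield env
    finally:
        env.teardown()


if __name__ == "__main__":
    with ephemeral_env("agent-task") as env:
        env.run("pytest -q")
        print(env.log)
    print("torn down:", env.torn_down)
```

The context manager is the design choice that matters here: teardown sits in a `finally` block, so the environment cannot outlive its task even when the task raises. A real implementation would also need an external reaper for the case where the controlling process itself dies.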

## Defining Properties

- **Provisioned on demand**: spun up by a trigger (PR opened, agent task started, test job dispatched), not always-on
- **Bounded lifetime**: TTL, idle timeout, or end-of-task hook
- **Bounded cost**: explicit spend caps per user / org / provider
- **Bounded blast radius**: container, microVM, or fresh VM, never the developer's host
- **Reproducible setup**: declarative config (Dev Containers, GitHub Actions setup steps, image manifests)
- **Observable**: per-environment logs, traces, and metrics, addressable by ID

## Common Failure Modes

- **No teardown discipline**: "ephemeral" environments that never actually get torn down; this is just a fleet, billed by the hour
- **State leaks**: shared databases or caches that survive teardown defeat isolation
- **Unbounded cost**: missing per-org spend caps means one runaway agent burns the credit
- **Poor observability**: when something fails, you can't tell which environment ran which step
- **Reset-by-recreation antipattern**: tearing down and re-provisioning instead of cleaning up state inside an existing env, when the latter is faster and good enough

## Ephemeral Environments vs Adjacent Concepts

- **vs [[Cloud Development Environment (CDE)]]**: a CDE is per-developer and long-lived; an ephemeral environment is per-task and short-lived
- **vs [[GitHub Actions|CI runners]]**: CI runners are technically ephemeral but exist solely to run a workflow; ephemeral environments often serve interactive use (preview URLs, agent shells, manual review)
- **vs [[DevBox]] / Nix shells**: those are ephemeral *processes* on a long-lived machine; ephemeral environments are the *whole machine*

## References

- Ephemeral Environments primer (community): https://ephemeralenvironments.io/
- "Sandboxes for agents" framing: see [[Vercel Sandboxes]] note for context

## Related

- [[Crabbox]]
- [[Vercel Sandboxes]]
- [[Claude Managed Agents Environments]]
- [[Codex Cloud]]
- [[Cloud Development Environment (CDE)]]
- [[GitHub Codespaces]]
- [[GitHub Actions]]
- [[AI Agents]]
- [[AI Agent Harness]]