# Ephemeral Environments
An ephemeral environment is a short-lived, automatically provisioned, automatically torn-down stack used for a specific, bounded purpose: reviewing a pull request, running an integration test, or letting an AI agent execute commands without polluting a host. The defining property is **lifecycle bound to a task**, not to a team or a developer.
The pattern unifies what used to be three separate things:
1. **Preview environments** in the PR review flow: spin up a copy of the app on every PR, tear it down on merge
2. **Test sandboxes** that hold a real database/service for end-to-end runs
3. **Agent execution boxes**: a fresh machine for an AI agent to run untrusted or destructive commands on
## Why They Matter Now
The rise of [[AI Agents]] reshaped the conversation. When an autonomous agent can run shell commands, the question "where does it run?" stops being a deployment detail and becomes a safety primitive. Running on the developer's host risks deleting their files. Running in a long-lived shared box risks contamination across runs. An *ephemeral* environment (isolated, time-bounded, cost-bounded, observable) gives the right blast radius.
This explains the recent surge in tools: [[Crabbox]] (remote testbox for humans and agents on [[Hetzner]] / AWS Spot), [[Vercel Sandboxes]] (microVMs for AI agents and untrusted code), [[Claude Managed Agents Environments]] (Anthropic-hosted), and [[Codex Cloud]] (OpenAI-hosted). They are all instances of the same pattern.
## Defining Properties
- **Provisioned on demand**: spun up by a trigger (PR opened, agent task started, test job dispatched), not always-on
- **Bounded lifetime**: TTL, idle timeout, or end-of-task hook
- **Bounded cost**: explicit spend caps per user / org / provider
- **Bounded blast radius**: container, microVM, or fresh VM, never the developer's host
- **Reproducible setup**: declarative config (Dev Containers, GitHub Actions setup steps, image manifests)
- **Observable**: per-environment logs, traces, and metrics, addressable by ID
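
The bounded-lifetime, bounded-cost, and bounded-blast-radius properties above can be written down as data plus one check. The following is only a minimal sketch under stated assumptions: `EnvSpec`, `teardown_reason`, and every field name are hypothetical and do not correspond to the API of any tool mentioned in this note.

```python
from dataclasses import dataclass, field
from typing import Optional
import uuid


@dataclass
class EnvSpec:
    """Hypothetical declarative description of one ephemeral environment."""
    trigger: str            # what provisioned it, e.g. "pr_opened" or "agent_task"
    isolation: str          # "container", "microvm", or "vm"; the host is not allowed
    ttl_s: float            # hard lifetime limit, in seconds
    idle_timeout_s: float   # tear down after this long without activity
    spend_cap: float        # maximum spend allowed for this environment
    # Short ID so logs, traces, and metrics can be addressed per environment.
    env_id: str = field(default_factory=lambda: uuid.uuid4().hex[:8])

    def __post_init__(self) -> None:
        if self.isolation not in {"container", "microvm", "vm"}:
            raise ValueError(f"unsupported isolation: {self.isolation!r}")


def teardown_reason(spec: EnvSpec, created_at: float, last_active_at: float,
                    spent: float, now: float) -> Optional[str]:
    """Return why the environment must be torn down, or None if it may keep running."""
    if now - created_at >= spec.ttl_s:
        return "ttl_expired"
    if now - last_active_at >= spec.idle_timeout_s:
        return "idle_timeout"
    if spent >= spec.spend_cap:
        return "spend_cap_reached"
    return None
```

In this sketch a controller would call `teardown_reason` periodically with the current timestamp and destroy the environment as soon as it returns anything other than `None`. Passing timestamps in as arguments, rather than reading the clock inside the function, keeps the check easy to test.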
## Common Failure Modes
- **No teardown discipline**: "ephemeral" environments that never actually get torn down are just a fleet, billed by the hour
- **State leaks**: shared databases or caches that survive teardown defeat isolation
- **Unbounded cost**: without per-org spend caps, a single runaway agent can burn through the entire credit balance
- **Poor observability**: when something fails, you can't tell which environment ran which step
- **Reset-by-recreation antipattern**: tearing down and re-provisioning instead of cleaning up state inside an existing env, when the latter is faster and good enough
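
The first failure mode, missing teardown discipline, is usually addressed in two layers: guarantee teardown in the code path that provisions, and periodically sweep for anything that slipped through. A hedged Python sketch of both follows; `ephemeral`, `find_leaked`, and the `provision`/`destroy` callables are hypothetical stand-ins for whatever a real provider SDK offers.

```python
from contextlib import contextmanager
from typing import Callable, Dict, Iterator, List


@contextmanager
def ephemeral(provision: Callable[[], str],
              destroy: Callable[[str], None]) -> Iterator[str]:
    """Provision an environment, yield its ID, and always destroy it afterwards."""
    env_id = provision()
    try:
        yield env_id
    finally:
        # Runs whether the task succeeded or raised.
        destroy(env_id)


def find_leaked(expiry_by_id: Dict[str, float], now: float) -> List[str]:
    """Return IDs still in the inventory although their deadline has passed."""
    return sorted(env_id for env_id, expires_at in expiry_by_id.items()
                  if expires_at <= now)
```

A task would run inside `with ephemeral(provision, destroy) as env_id:`, while a scheduled job would pass the provider's inventory to `find_leaked` and destroy whatever it returns. The `finally` block cannot help if the controlling process itself is killed, which is why the sweep is still needed as a second layer.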
## Ephemeral Environments vs Adjacent Concepts
- **vs [[Cloud Development Environment (CDE)]]**: a CDE is per-developer and long-lived; an ephemeral environment is per-task and short-lived
- **vs [[GitHub Actions|CI runners]]**: CI runners are technically ephemeral but exist solely to run a workflow; ephemeral environments often serve interactive use (preview URLs, agent shells, manual review)
- **vs [[DevBox]] / Nix shells**: those are ephemeral *processes* on a long-lived machine; ephemeral environments are the *whole machine*
## References
- Ephemeral Environments primer (community): https://ephemeralenvironments.io/
- "Sandboxes for agents" framing: see the [[Vercel Sandboxes]] note for context
## Related
- [[Crabbox]]
- [[Vercel Sandboxes]]
- [[Claude Managed Agents Environments]]
- [[Codex Cloud]]
- [[Cloud Development Environment (CDE)]]
- [[GitHub Codespaces]]
- [[GitHub Actions]]
- [[AI Agents]]
- [[AI Agent Harness]]