# Crabbox

Crabbox is an open-source remote testbox system from [[OpenClaw]] that lets developers and AI agents lease short-lived Linux machines on shared cloud capacity to run tests and commands. It keeps the local developer story unchanged (edit, save, run) while offloading the actual compute to owned infrastructure.

> Remote testbox for humans and agents. The lobster way 🦞

The CLI is a [[Go]] binary that syncs your dirty checkout to a leased machine via [[Remote Sync (Rsync)|rsync]], then streams command output back. Credentials never live on the developer's machine: a [[Cloudflare Workers|Cloudflare Worker]] with a [[Cloudflare Durable Objects|Durable Object]] brokers provider access, enforces TTLs, applies monthly spend caps, and tracks usage per user, org, and provider.

## Why It Exists

The dominant pattern for "more compute than my laptop" is either CI (slow feedback loop, no interactivity) or full cloud development environments like Codespaces or Gitpod (heavy, opinionated, hosted elsewhere). Crabbox sits between the two: ephemeral machines you control, leased on-demand, with the local edit/save/run loop intact. It is purpose-built to be safe to hand to AI coding agents that want to spawn parallel test runs without burning down your laptop or your wallet.
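The broker-side guardrails described above (TTL enforcement and monthly spend caps) can be pictured as a small admission check that runs before a lease starts. The following is a minimal sketch, not Crabbox's actual broker code: `LeaseRequest`, `Policy`, `Admit`, `worstCaseCents`, and every limit value are hypothetical names and numbers chosen for illustration, and it is written in Go for readability even though the real broker runs as a Cloudflare Worker.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// LeaseRequest is a hypothetical request shape, not Crabbox's wire format.
type LeaseRequest struct {
	User        string
	TTL         time.Duration // how long the machine may live
	HourlyCents int64         // assumed price of the requested machine
}

// Policy holds illustrative limits; the names and values are assumptions.
type Policy struct {
	MaxTTL          time.Duration
	MonthlyCapCents int64
}

// examplePolicy allows leases of up to two hours and 1000 cents per month.
var examplePolicy = Policy{MaxTTL: 2 * time.Hour, MonthlyCapCents: 1000}

var (
	ErrBadTTL  = errors.New("TTL must be positive and within the maximum")
	ErrOverCap = errors.New("lease would exceed the monthly spend cap")
)

// worstCaseCents estimates the cost if the machine lives for its full TTL,
// rounding up to the next whole cent.
func worstCaseCents(r LeaseRequest) int64 {
	secs := int64(r.TTL / time.Second)
	return (r.HourlyCents*secs + 3599) / 3600
}

// Admit decides whether a lease may start, given what has already been
// spent this month. It rejects before any machine is created.
func Admit(p Policy, spentCents int64, r LeaseRequest) error {
	if r.TTL <= 0 || r.TTL > p.MaxTTL {
		return ErrBadTTL
	}
	if spentCents+worstCaseCents(r) > p.MonthlyCapCents {
		return ErrOverCap
	}
	return nil
}

func main() {
	req := LeaseRequest{User: "alice", TTL: time.Hour, HourlyCents: 120}
	fmt.Println(Admit(examplePolicy, 800, req)) // fits under the cap
	fmt.Println(Admit(examplePolicy, 900, req)) // would exceed the cap
}
```

Charging the worst case up front (full TTL at the hourly price) is one conservative design choice: a lease that is admitted can never push the month over the cap, even if nobody stops the machine early.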
## Key Features

- **Local workflow preservation**: [[Remote Sync (Rsync)|rsync]] syncs the dirty checkout to a remote machine, with no commit/push required
- **Brokered credentials**: a [[Cloudflare Workers|Cloudflare Worker]] holds provider credentials instead of distributing them to every CLI
- **Cost controls**: TTL-bounded machines, monthly spend caps, per-user / per-org / per-provider usage tracking
- **Warm machine reuse**: `crabbox warmup` keeps boxes hot for repeated runs via `--id`
- **Multi-provider**: [[Hetzner]] and AWS Spot, with provider fallback
- **[[GitHub Actions]] integration**: reuses repository Actions setup steps for workspace hydration
- **Interactive access**: [[Secure Shell (SSH)|SSH]], [[Virtual Network Computing (VNC)|VNC]] desktop on Linux, macOS, and Windows runners
- **Native [[OpenClaw]] plugin**: exposes `crabbox_run`, `crabbox_warmup`, `crabbox_status`, `crabbox_list`, `crabbox_stop` as agent tools

## Technology Stack

- **CLI**: [[Go]] binary
- **Broker**: [[Cloudflare Workers|Cloudflare Worker]] with a [[Cloudflare Durable Objects|Durable Object]]
- **Runners**: vanilla [[Ubuntu]] boxes (managed Windows and macOS via AWS)
- **Transport**: HTTPS for CLI ↔ broker, [[Secure Shell (SSH)|SSH]] + [[Remote Sync (Rsync)|rsync]] for CLI ↔ runner

## Quick Start

```bash
brew install openclaw/tap/crabbox
crabbox login
crabbox run -- pnpm test
```

The `brew` tap is provided by [[OpenClaw]] (via [[Homebrew]] on macOS and Linux). Once authenticated, `crabbox run -- <command>` syncs the working tree, leases a runner, runs the command, and streams output back.

## How It Compares

- **vs CI ([[GitHub Actions]], etc.)**: interactive, no commit required, sub-minute warm-machine reuse. Crabbox actually *uses* repository Actions setup steps to hydrate the workspace, so the environment matches CI without going through CI.
- **vs [[Cloud Development Environment (CDE)|cloud dev environments]] ([[GitHub Codespaces|Codespaces]], Gitpod)**: developer keeps their local editor and toolchain; only execution moves remote. No port-forwarded IDE, no "cloud workstation" mental model.
- **vs local declarative envs ([[DevBox]], devcontainers)**: those keep compute on the laptop; Crabbox moves it to your cloud while keeping the editor local.
- **vs DIY scripts on a personal cloud VM**: TTLs, spend caps, multi-provider fallback, brokered credentials, and warm-pool management are built in.
- **vs [[Codex Cloud]] / [[Claude Managed Agents Environments]]**: those are managed agent runtimes hosted by the model vendor. Crabbox is BYO-cloud: you run it on your own [[Hetzner]] / AWS accounts and your own broker.

Crabbox sits inside the broader [[Ephemeral Environments]] pattern: short-lived, isolated, cost- and time-bounded compute, this time aimed specifically at the human-or-agent test loop.

## Where It Fits in the AI Agent Stack

Crabbox is not an [[AI Agent Harness]]; it is *infrastructure that harnesses use*. An [[OpenClaw]] sub-agent (or any [[AI Agent Skills|skilled]] harness) can call `crabbox_run` to execute a destructive or expensive command on an ephemeral box instead of on the user's host. The broker enforces the spending and TTL boundaries the harness's own permission system cannot.

This is the "give the agent its own machines, not your machine" pattern, with the safety rails (cost caps, time bounds, credential isolation) baked into the tool rather than left to the agent's good judgment.
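For a harness that does not use the native plugin, one way to apply this pattern is to wrap the CLI and route commands through `crabbox run -- <command>` instead of executing them on the host. The sketch below is an assumption-laden illustration: `buildRunArgs` and `runRemote` are hypothetical helper names, the placement of `--id` before `--` is assumed rather than taken from the Crabbox documentation, and the local timeout is an extra safeguard on top of the broker's TTL, not a replacement for it.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os"
	"os/exec"
	"time"
)

// buildRunArgs assembles the arguments for `crabbox run -- <command>`.
// A non-empty boxID adds `--id` to target a warm box; placing the flag
// before `--` is an assumption, not confirmed CLI syntax.
func buildRunArgs(boxID string, command []string) ([]string, error) {
	if len(command) == 0 {
		return nil, errors.New("no command given")
	}
	args := []string{"run"}
	if boxID != "" {
		args = append(args, "--id", boxID)
	}
	args = append(args, "--")
	return append(args, command...), nil
}

// runRemote shells out to the crabbox CLI and streams its output.
// The process is killed if it outlives the local timeout.
func runRemote(ctx context.Context, timeout time.Duration, boxID string, command []string) error {
	args, err := buildRunArgs(boxID, command)
	if err != nil {
		return err
	}
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	cmd := exec.CommandContext(ctx, "crabbox", args...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd.Run()
}

func main() {
	// Only print the command line here, so the sketch runs without
	// crabbox installed or a broker login.
	args, err := buildRunArgs("", []string{"pnpm", "test"})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("crabbox", args)

	// To execute for real (requires `crabbox login` first):
	// _ = runRemote(context.Background(), 10*time.Minute, "", []string{"pnpm", "test"})
}
```

Passing the command as a slice rather than a single shell string avoids quoting problems when an agent generates arguments, and keeps the `--` separator as the only boundary between Crabbox flags and the remote command.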
## References

- Website: https://crabbox.sh/
- GitHub: https://github.com/openclaw/crabbox
- Documentation: https://openclaw.github.io/crabbox/

## Related

- [[OpenClaw]]
- [[Peter Steinberger]]
- [[AI Agents]]
- [[AI Agent Harness]]
- [[AI Agent Skills]]
- [[Agent Client Protocol (ACP)]]
- [[Claude Code]]
- [[Codex Cloud]]
- [[Claude Managed Agents Environments]]
- [[Ephemeral Environments]]
- [[Cloud Development Environment (CDE)]]
- [[GitHub Codespaces]]
- [[DevBox]]
- [[GitHub Actions]]
- [[Hetzner]]
- [[Cloudflare]]
- [[Cloudflare Workers]]
- [[Cloudflare Durable Objects]]
- [[Go]]
- [[Remote Sync (Rsync)]]
- [[Secure Shell (SSH)]]
- [[Virtual Network Computing (VNC)]]
- [[Ubuntu]]
- [[Homebrew]]