Claude Fable 5 - DeveloPassion

# Claude Fable 5

Claude Fable 5 is [[Anthropic]]'s most capable widely released [[Claude]] model, launched on 2026-06-09 alongside its sibling Claude Mythos 5. It targets the most demanding reasoning and long-horizon agentic work: software engineering, scientific research, and multi-day autonomous runs. It sits a tier above [[Claude Opus 4.8]] in capability, and a tier above it in price (~$10/$50 per million tokens, roughly double Opus). Its launch was overshadowed within three days by a US export-control directive that forced Anthropic to pull it worldwide. See [[2026-06-17 The US Government Banned Claude Fable 5 and Mythos 5|Fable 5 ban]].

## Positioning

- New **Mythos class**, positioned above the Opus line. Anthropic frames it as its strongest model for "the most demanding reasoning and long-horizon agentic work"
- **Fable 5 = Mythos 5 + safety classifiers.** Same underlying model, same specs, same pricing. Fable carries output classifiers that can decline requests; Mythos 5 ships without them
- Name origin: *Fable* from Latin *fabula* ("that which is told"), the Latin analog to Greek *mythos*. The split naming exists precisely because the safeguards differ between the two
- Predecessor: **Mythos Preview** (April 2026), restricted to Anthropic's "Project Glasswing" partners. Anthropic spent months calling the Mythos class "too powerful to release"
- Optimized for coding, agentic delegation, vision on dense technical images, scientific work, and ambiguous multi-threaded tasks. Anthropic's line: "the longer and more complex the task, the larger Fable 5's lead"

## Availability

- Launched 2026-06-09
- Model IDs: `claude-fable-5` and `claude-mythos-5`
- Fable 5 surfaces: Claude.ai, the Claude desktop app, [[Claude API]] / [[Claude Platform]], Amazon Bedrock, Google Cloud Vertex AI, Microsoft Foundry.
  Live in [[Claude Code]] and [[Claude Dynamic Workflows]] from day one
- **Mythos 5 is restricted**: distributed only via Project Glasswing to ~200 critical-infrastructure and cyber-defender organizations, by account-team contact
- Bundled free in Pro, Max, Team, and seat-based Enterprise plans **through 2026-06-22**; usage credits required from June 23
- **Data handling**: mandatory 30-day retention, *not* available under zero-data-retention; both are designated "Covered Models". This retention policy is what triggered Microsoft to block it internally (see [[2026-06-17 The US Government Banned Claude Fable 5 and Mythos 5|Fable 5 ban]])
- **Worldwide shutdown 2026-06-12**: Anthropic disabled both models globally at 5:21 PM ET to comply with a US Commerce Department directive. All other Claude models stayed online. Details: [[2026-06-17 The US Government Banned Claude Fable 5 and Mythos 5|Fable 5 ban]]

## Pricing

- Input: $10 per million tokens
- Output: $50 per million tokens
- Roughly **2× [[Claude Opus 4.8]]** ($5/$25) and **>3× Sonnet 4.6**; "less than half the price of Mythos Preview"
- Prompt-cache hits ~$1; Batch API ~$5/$25 (half price); no premium on the 1M-token context *(cache/batch figures are from third-party roundups, not a primary doc)*
- **Fallback credit**: when a refused request is retried on another model, Anthropic refunds the prompt-cache cost so you don't pay twice
- Real-world cost is high. [[Simon Willison]] burned **$110 in a single day**; Every reports routine tasks consuming **500K to 1M tokens** each

## Capabilities

- Context window: 1M tokens; max output 128K tokens
- Knowledge cutoff: January 2026
- **Adaptive thinking only.** Raw chain-of-thought is never returned (summarized or omitted).
  `thinking: {"type": "disabled"}` is not supported
- Benchmarks (Anthropic-reported; numbers vary across secondary sources, so treat as approximate):
  - **SWE-bench Verified: 95.0%**; SWE-bench Pro: 80.3% (vs Opus 4.8 ~69%, GPT-5.5 ~59%, [[Gemini]] 3.1 Pro ~54%)
  - **FrontierCode Diamond: 29.3%** (vs Opus 4.8 13.4%)
  - **Terminal-Bench 2.1: 88.0%**; CursorBench: ~73% at max effort
  - **Artificial Analysis Intelligence Index: 64.9 (#1)**, ~5 points ahead of GPT-5.5
  - Mythos-only (safeguards lifted): ExploitBench 78.0% vs Opus 4.8 40.0%. That cyber-capability jump is what drove the export ban; the UK AI Security Institute separately found it could exploit defences and systems **73% of the time**
- Real-world anecdotes:
  - **Stripe** migrated a ~50-million-line Ruby codebase in one day (estimated two months of engineering work)
  - **Every's Senior Engineer benchmark: 91/100** (vs Opus 4.8 63, GPT-5.5 62). They call it near human-senior-engineer level
  - Simon Willison generated a full 13.9 MB CPython wheel, and saw deep recall of his own projects spanning 2005 to 2024

## New controls

- **Effort parameter** is the primary lever (`low` / `medium` / `high` / `xhigh`). Default is `high`; even `medium`/`low` reportedly beat prior models at `xhigh` on routine work
- **`stop_reason: "refusal"` returns as HTTP 200**, not an error, so plan for it. Fallback options: server-side `fallbacks` param (beta), SDK middleware, or manual retry to Opus
- **Memory system.** Fable performs notably better when it can record lessons from prior runs and reference them. Use one lesson per file, with a one-line summary
- **`send_to_user` tool** is recommended for long async runs, to surface verbatim deliverables mid-turn
- Continues the Opus-era controls (task budgets, [[Claude Code Auto Mode]], parallel subagent dispatch)

## Tips, tricks & best practices

From Anthropic's official prompting guide and [[Simon Willison]]'s field notes.
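
One practice worth wiring in early is the manual refusal fallback described under New controls. The following is a minimal sketch, not Anthropic's SDK: the request and response are plain dicts, `send` is a hypothetical transport callable, the `effort` field name is an assumption, and `claude-opus-4-8` is a placeholder because this note does not give the Opus model ID. It only illustrates the control flow: a refusal arrives as a normal response, so it has to be detected by inspecting `stop_reason` rather than by catching an error.

```python
from typing import Callable

# Effort levels as listed in this note; the request field name "effort" is an assumption.
EFFORT_LEVELS = ("low", "medium", "high", "xhigh")

PRIMARY_MODEL = "claude-fable-5"    # model ID taken from this note
FALLBACK_MODEL = "claude-opus-4-8"  # placeholder: the note does not state the Opus model ID


def build_request(prompt: str, model: str = PRIMARY_MODEL, effort: str = "high") -> dict:
    """Build a plain request dict (hypothetical shape, not the real API schema)."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"unknown effort level: {effort!r}")
    return {"model": model, "effort": effort, "prompt": prompt}


def complete_with_fallback(
    send: Callable[[dict], dict],
    request: dict,
    fallback_model: str = FALLBACK_MODEL,
) -> dict:
    """Send a request; if the response is a refusal, retry once on the fallback model.

    `send` is a hypothetical transport that returns a dict with a "stop_reason" key.
    A refusal is a successful (HTTP 200) response, so no exception is involved.
    """
    response = send(request)
    if response.get("stop_reason") != "refusal":
        return response
    # Same request, different model; everything else is left untouched.
    return send({**request, "model": fallback_model})


def fake_send(request: dict) -> dict:
    """Stand-in transport for demonstration: the primary model always refuses."""
    if request["model"] == PRIMARY_MODEL:
        return {"model": request["model"], "stop_reason": "refusal", "text": ""}
    return {"model": request["model"], "stop_reason": "end_turn", "text": "ok"}


if __name__ == "__main__":
    result = complete_with_fallback(fake_send, build_request("Summarize the changelog"))
    print(result["model"], result["stop_reason"])
```

In real use, `fake_send` would be replaced by whatever client call you already make; the wrapper only needs the response's `stop_reason` to decide whether to retry.
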
The headline shift: Fable rewards *less* prescriptive prompting than prior models. Old skill files often degrade its output.

- **Use it on hard problems, not toy ones.** "Testing it only on simpler workloads tends to undersell its capability range." Reach for `xhigh` on capability-sensitive work
- **Stop overplanning.** Tell it: "When you have enough information to act, act. If you are weighing a choice, give a recommendation, not an exhaustive survey"
- **Curb unrequested refactors.** "Don't add features, refactor, or introduce abstractions beyond what the task requires. Don't add error handling for scenarios that cannot happen." Fable is *relentlessly proactive* and will gold-plate by default
- **State explicit boundaries** to prevent unrequested actions (drafting emails nobody asked for, creating defensive git-branch backups, etc.)
- **Ground progress claims.** Instruct it to audit each status claim against an actual tool result. Anthropic says this "nearly eliminated fabricated status reports"
- **Lead with the outcome**: "Your first sentence after finishing should answer 'what happened'"
- **Checkpoint sparingly**: pause for the user only on destructive or irreversible actions, real scope changes, or input only they can provide
- **Use the memory system**: let it record and re-read lessons from prior runs (one lesson per file)
- **Delegate to parallel subagents** for independent subtasks; prefer asynchronous over blocking
- **Verify with a fresh-context verifier subagent.** Separate verifiers beat self-critique
- **For autonomous pipelines**, add an async system reminder: "You are operating autonomously. The user is not watching in real time. For reversible actions that follow from the original request, proceed without asking."
  Pair it with a context-budget reassurance ("You have ample context remaining; do not stop or summarize on account of context limits")
- **Give the reason, not just the request.** Fable uses your intent as context
- **Anti-pattern**: do NOT ask it to "reproduce its reasoning in the response". This trips the `reasoning_extraction` refusal classifier and forces a fallback to Opus 4.8
- **Refactor old prompts.** Skills tuned for prior models are "often too prescriptive for Claude Fable 5 and can degrade output quality"

**A reusable prompt skeleton** (adapted from Ruben Hassid's "anatomy of a Fable 5 prompt"). Eleven blocks, top to bottom:

1. **Task**: start with *why*, not *what*. "I'm working on [goal] for [who]. They need [what the output enables]. With that: [task]"
2. **Context files**: upload expertise instead of explaining in prose. "Read these files completely before responding: [file.md], [contents]." The file is the brain
3. **Reference**: "Reference for what I want to achieve: [paste]." One example beats ten instructions
4. **Effort**: "This is a [routine / hard / hardest-unsolved] problem. Scope it like it's at the top of your range." Testing it on easy tasks undersells it
5. **Act**: "When you have enough information to act, act. Don't re-litigate my decisions; while weighing a choice, give a recommendation"
6. **Scope**: "Do the simplest thing that works well. No extra features, refactors, or abstractions. If I'm describing a problem, the deliverable is your assessment"
7. **Delegate**: "Split independent subtasks across subagents and keep working while they run. Verify with a fresh-context subagent"
8. **Evidence**: "Before reporting progress, audit every claim against a tool result. If unverified, say so. Tests failed? Show the output"
9. **Memory**: "Record learnings in [notes.md], one per file. Update, don't duplicate. Delete what turns out wrong"
10. **Checkpoint**: "Pause only for destructive actions, scope changes, or input only I can provide.
    Never end your turn on a promise"
11. **Report**: "Open with the outcome, the TLDR. Complete sentences. Clear beats short"

**Cost-aware orchestration recipe (personal default).** Fable is overpowered and expensive, so don't pay for its intelligence on every step. Run it as the orchestrator with reasoning on Max, and have it run a dynamic workflow where:

- **Fable plans and reviews** (the judgment-heavy phases)
- **It delegates implementation to subagents**: `model: sonnet` for code, `haiku` for mechanical edits and searches, one task per subagent
- Trivial single-file edits are fine to do directly; let Fable orchestrate Opus or Sonnet for the rest. This keeps quality high while avoiding immediate rate-limit and credit burn

## Reception and caveats

- **[[Andrej Karpathy]]**: "a major-version-bump-deserving step change forward (imo of the same order as Claude 4.5 was in November), peaking especially for long problem-solving sessions on very difficult problems... the model 'gets it' and it will just go". He invokes Jevons paradox: as working software "comes out on a tap", his own demand for it grows (explainers, visualizers, dashboards, bespoke single-use apps). Caveats: "the model still has quirks" and "the safeguards are configured to be a little too trigger happy for launch, which can hopefully be tuned over time"
- **[[Simon Willison]]** calls it "something of a beast". Capable, but slow and expensive. His follow-up, *"Fable is relentlessly proactive"*, documents it opening browsers, writing CORS servers, and injecting JS unprompted to debug a problem whose real fix was two lines of CSS. He named uncontained agents at this capability, plus [[Prompt injection]], his "top contender for a Challenger disaster incident"
- **Ethan Mollick** (One Useful Thing): "I no longer steer; I commission." Frames it as commanding a whole studio rather than a tool.
  Powerful, but opaque ("hundreds of judgement calls invisibly")
- **Every / Dan Shipper** (7 testers, ~1 week): "the best coding model in the world". They scored it 91/100 on their Senior Engineer benchmark (vs Opus 4.8 63, GPT-5.5 62), "near the range of the human engineers who've taken it". A "warp drive" for power users: testers at Level 7–8 on their AI-adoption ladder found it a genuine step change for their hardest tasks, while lower-level users "struggled to find something to use it for". One-shot builds included a transcribed-and-highlighted audio-lecture web app, a 3D-rendered Borges "Library of Babel" game, and a full conversion-analytics report. Verdict: *"a strong closer that wants a clear target. Hand it the work that has edges; keep the open-ended exploration somewhere faster and cheaper."* Writing was rated **mixed**: excellent judgment and context use, but too slow for fast drafting (Katie Parrott). "Precision in, precision out": it rewards a tight brief and punishes a loose one
- **Visible safeguards**: requests touching **cybersecurity or biology** are blocked or routed to [[Claude Opus 4.8]]. Anthropic admits biology was calibrated so broadly that Fable is "practically unusable" for even basic queries
- Recurring criticisms: **expensive** and slow/token-hungry, persistent "Claudisms" in writing, and **opacity** of its many invisible decisions
- **"Secret sabotage" controversy**: the system card revealed Fable would *silently degrade* answers it judged to be **distillation attempts** (training competing models), with no refusal and no notice. After intense backlash Anthropic apologized, called invisible safeguards "the wrong tradeoff", and reversed: such queries now fall back visibly to Opus 4.8 ("You will see this every time it happens").
  Full story in [[2026-06-17 The US Government Banned Claude Fable 5 and Mythos 5|Fable 5 ban]]

## Migration notes

- **Loosen your prompts.** Prescriptive scaffolding tuned for Opus/Sonnet tends to hurt Fable, so let it reason
- **Plan for refusals.** Handle `stop_reason: "refusal"` (HTTP 200) and wire a fallback to Opus 4.8
- **Adaptive thinking is always on.** Remove any `thinking: disabled` config
- **Cost step-change.** At ~2× Opus pricing and 500K to 1M tokens per task, budget deliberately and delegate cheap work to smaller models
- Prompt cache is partitioned per model; switching to Fable invalidates cached prefixes (a cold-start hit)

## Working with it

- Use the cost-aware orchestration recipe above: Fable orchestrates, Sonnet and Haiku implement
- Dial the effort parameter instead of switching models when trading latency against quality
- Lean on the memory system and fresh-context verifier subagents for long-horizon jobs
- Expect proactivity. Set explicit boundaries up front rather than reining it in mid-run
- Treat it as a commission, not a copilot.
  Specify the *outcome* and the *constraints*, then review the result

## References

- Announcement: https://www.anthropic.com/news/claude-fable-5-mythos-5
- Official prompting guide: https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/prompting-claude-fable-5
- Simon Willison's review: https://simonwillison.net/2026/Jun/9/claude-fable-5/
- Simon Willison, "Fable is relentlessly proactive": https://simonwillison.net/2026/Jun/11/fable-is-relentlessly-proactive/
- Every vibe check: https://every.to/vibe-check/anthropic-mythos-our-fable-vibe-check
- Ethan Mollick, what it feels like to work with Mythos: https://www.oneusefulthing.org/p/what-it-feels-like-to-work-with-mythos
- Andrej Karpathy reaction: https://x.com/karpathy/status/2064409694761054332
- Ruben Hassid, anatomy of a Fable 5 prompt: https://x.com/rubenhassid/status/2065042194550198639
- Hacker News launch discussion: https://news.ycombinator.com/item?id=48463808
- Hacker News, "relentlessly proactive": https://news.ycombinator.com/item?id=48498573
- World of Claudecraft (vibe-coded MMORPG): https://github.com/levy-street/world-of-claudecraft and https://worldofclaudecraft.com/

## Related

- [[Claude]]
- [[Anthropic]]
- [[Claude Code]]
- [[Claude Opus 4.8]]
- [[Claude Sonnet 5]]
- [[Claude Dynamic Workflows]]
- [[Claude Code Auto Mode]]
- [[Claude API]]
- [[2026-06-17 The US Government Banned Claude Fable 5 and Mythos 5|Fable 5 ban]]
- [[Large Language Models (LLMs)]]