# Gemini 3.5 Flash Gemini 3.5 Flash is a minor release in the [[Gemini 3]] family from [[Google DeepMind]] (May 2026), pitched by Google as its "strongest agentic and coding model yet." Two things make it worth a note. It is the first Flash-tier model that beats the previous generation's Pro. And it is the release where "Flash" stopped meaning "cheap." ## What stands out - **A Flash model that beats the prior Pro.** Google reports it outperforming Gemini 3.1 Pro on hard coding and agentic benchmarks. The tier names (Pro / Flash / Nano) no longer track the capability ladder cleanly. - **Speed is the actual product.** Roughly 4× faster output-token generation than other frontier models. On long-horizon agentic work — multi-step tool use, document reasoning, code transformation — that throughput compounds into real wall-clock savings. - **Tuned for agents, not trivia.** Its strong numbers sit on agent-shaped evals (tool use, terminal work), not raw-knowledge ones. ## Benchmarks (Google-reported, May 2026) - Terminal-Bench 2.1: 76.2% - MCP Atlas: 83.6% - GDPval-AA: 1656 Elo - CharXiv Reasoning (multimodal): 84.2% These are Google's own figures, so treat them directionally until independent replication. The shape is consistent though: it indexes hard on agentic and tool-use tasks. ## The pricing story This is the real story of the release. API pricing jumped from $0.30 / $2.50 per million tokens (Gemini 2.5 Flash) to $1.50 / $9.00 (3.5 Flash) — 5× on input, ~3.6× on output. [[Simon Willison]] clocked it as roughly a tripling versus the immediately prior Flash preview, and noted that all three major labs are now "probing the price tolerance of their API customers." It is one data point in a wider trend: [[AI API prices are rising]] across the closed-weight frontier. The number that should change how you read the tiering: on [[Artificial Analysis]]'s test suite, the "high" configuration of 3.5 Flash cost ~$1,551 to run — more than Google's own Gemini 3.1 Pro Preview (~$892). A Flash model can now cost more to run than a Pro model. The word "Flash" is doing marketing work, not pricing work. ## The verbosity trap HN practitioners describe a consistent behaviour: the model is very fast and decent for its tier, but it papers over reasoning gaps by *doing more* — adding code, elaborating, generating extra output — instead of fixing the core problem. That matters for cost. The per-token price is already up; a model that emits more tokens per task inflates the real dollar cost a second time. This is the same token-economy caveat flagged in the [[DeepSeek v4]] note, pointing the other way. There, cheap-per-token verbosity erodes the discount; here, expensive-per-token verbosity compounds the premium. The headline per-token number is never the figure to decide on — [[AI Cost Management]] is a per-workload question, not a per-model one. ## Positioning - Already in production at Shopify, Salesforce, and Macquarie Bank for forecasting, customer onboarding, and enterprise task automation - Strongest where speed × agentic competence beats raw reasoning depth: high-volume tool-use loops, batch document work, [[Vibe Coding|vibe-coding]]-style iteration - Competes with [[Claude Opus 4.7]], [[GPT-5.4]], and the cheap open-weight frontier ([[DeepSeek v4]], [[Kimi K2.6]]) — and it is the open-weight models that reset the price floor 3.5 Flash is now pushing against from the wrong side ## Caveats - Benchmark figures are Google-published; there is no full model card at this version granularity - The "4× faster" and cost claims come from launch posts and third-party tests; your workload's token mix will dominate the real number - Some developers are explicitly declining the upgrade and re-architecting to call Gemini only for web-grounding requests — the economics do not survive every use case ## References - Google announcement: https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/ - Simon Willison's writeup: https://simonwillison.net/2026/May/19/gemini-35-flash/ - Hacker News discussion: https://news.ycombinator.com/item?id=48196570 ## Related - [[Gemini 3]] - [[Gemini]] - [[Google DeepMind]] - [[Gemini 3.5 Pro]] - [[Gemini 3.1 Flash Live]] - [[Gemini Spark]] - [[Large Language Models (LLMs)]] - [[AI Agents]] - [[AI Cost Management]] - [[AI API prices are rising]] - [[Artificial Analysis]] - [[Simon Willison]] - [[DeepSeek v4]] - [[Kimi K2.6]] - [[Claude Opus 4.7]] - [[GPT-5.4]] - [[GPT-5.5]] - [[AI Frontier Model]] - [[Context Window]] - [[Vibe Coding]]