# Gemini 3.5 Flash
Gemini 3.5 Flash is a minor release in the [[Gemini 3]] family from [[Google DeepMind]] (May 2026), pitched by Google as its "strongest agentic and coding model yet." Two things make it worth a note. It is the first Flash-tier model that beats the previous generation's Pro. And it is the release where "Flash" stopped meaning "cheap."
## What stands out
- **A Flash model that beats the prior Pro.** Google reports it outperforming Gemini 3.1 Pro on hard coding and agentic benchmarks. The tier names (Pro / Flash / Nano) no longer track the capability ladder cleanly.
- **Speed is the actual product.** Roughly 4× faster output-token generation than other frontier models. On long-horizon agentic work — multi-step tool use, document reasoning, code transformation — that throughput compounds into real wall-clock savings.
- **Tuned for agents, not trivia.** Its strong numbers sit on agent-shaped evals (tool use, terminal work), not raw-knowledge ones.
## Benchmarks (Google-reported, May 2026)
- Terminal-Bench 2.1: 76.2%
- MCP Atlas: 83.6%
- GDPval-AA: 1656 Elo
- CharXiv Reasoning (multimodal): 84.2%
These are Google's own figures, so treat them directionally until independent replication. The shape is consistent though: it indexes hard on agentic and tool-use tasks.
## The pricing story
This is the real story of the release. API pricing jumped from $0.30 / $2.50 per million tokens (Gemini 2.5 Flash) to $1.50 / $9.00 (3.5 Flash) — 5× on input, ~3.6× on output. [[Simon Willison]] clocked it as roughly a tripling versus the immediately prior Flash preview, and noted that all three major labs are now "probing the price tolerance of their API customers." It is one data point in a wider trend: [[AI API prices are rising]] across the closed-weight frontier.
The number that should change how you read the tiering: on [[Artificial Analysis]]'s test suite, the "high" configuration of 3.5 Flash cost ~$1,551 to run — more than Google's own Gemini 3.1 Pro Preview (~$892). A Flash model can now cost more to run than a Pro model. The word "Flash" is doing marketing work, not pricing work.
## The verbosity trap
HN practitioners describe a consistent behaviour: the model is very fast and decent for its tier, but it papers over reasoning gaps by *doing more* — adding code, elaborating, generating extra output — instead of fixing the core problem. That matters for cost. The per-token price is already up; a model that emits more tokens per task inflates the real dollar cost a second time.
This is the same token-economy caveat flagged in the [[DeepSeek v4]] note, pointing the other way. There, cheap-per-token verbosity erodes the discount; here, expensive-per-token verbosity compounds the premium. The headline per-token number is never the figure to decide on — [[AI Cost Management]] is a per-workload question, not a per-model one.
## Positioning
- Already in production at Shopify, Salesforce, and Macquarie Bank for forecasting, customer onboarding, and enterprise task automation
- Strongest where speed × agentic competence beats raw reasoning depth: high-volume tool-use loops, batch document work, [[Vibe Coding|vibe-coding]]-style iteration
- Competes with [[Claude Opus 4.7]], [[GPT-5.4]], and the cheap open-weight frontier ([[DeepSeek v4]], [[Kimi K2.6]]) — and it is the open-weight models that reset the price floor 3.5 Flash is now pushing against from the wrong side
## Caveats
- Benchmark figures are Google-published; there is no full model card at this version granularity
- The "4× faster" and cost claims come from launch posts and third-party tests; your workload's token mix will dominate the real number
- Some developers are explicitly declining the upgrade and re-architecting to call Gemini only for web-grounding requests — the economics do not survive every use case
## References
- Google announcement: https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/
- Simon Willison's writeup: https://simonwillison.net/2026/May/19/gemini-35-flash/
- Hacker News discussion: https://news.ycombinator.com/item?id=48196570
## Related
- [[Gemini 3]]
- [[Gemini]]
- [[Google DeepMind]]
- [[Gemini 3.5 Pro]]
- [[Gemini 3.1 Flash Live]]
- [[Gemini Spark]]
- [[Large Language Models (LLMs)]]
- [[AI Agents]]
- [[AI Cost Management]]
- [[AI API prices are rising]]
- [[Artificial Analysis]]
- [[Simon Willison]]
- [[DeepSeek v4]]
- [[Kimi K2.6]]
- [[Claude Opus 4.7]]
- [[GPT-5.4]]
- [[GPT-5.5]]
- [[AI Frontier Model]]
- [[Context Window]]
- [[Vibe Coding]]