Gemini 3.5 Flash
Gemini 3.5 Flash
Google's current flagship for agentic and coding tasks. Released May 19, 2026 at Google I/O. Launched directly to general availability — no preview suffix. Outperforms Gemini 3.1 Pro on agentic benchmarks while coming in ~40% cheaper per token than that Pro model. Positioned as a full agentic model at efficiency pricing, not a capability-cutdown.
Gemini 3.5 Pro is slated to follow.
Benchmarks
| Benchmark | Score |
|---|---|
| Terminal-Bench 2.1 | 76.2% |
| MCP Atlas | 83.6% |
| Toolathlon | 56.5% |
| Finance Agent v2 | 57.9% |
| CharXiv Reasoning (multimodal) | 84.2% |
| GDPval-AA | 1656 Elo |
All scores beat Gemini 3.1 Pro. Terminal-Bench 2.1 is +5.9 points over 3.1 Pro (70.3%). MCP Atlas tests mcp-based tool coordination; 83.6% leads the published leaderboard at launch. Finance Agent v2 improvement (+14.9 points) is the largest delta across benchmarks.
Pricing
- Input: $1.50 per million tokens
- Output: $9.00 per million tokens
- Cached input: $0.15 per million tokens
Roughly 40% below Gemini 3.1 Pro ($2.00/$12.00). 3× higher than 3.0 Flash Preview, but with significantly higher capability (outperforms last generation's Pro model).
Context Window
1M tokens (1,048,576). Max output: 64K tokens.
Speed
4× faster output tokens per second than comparable frontier models. Tasks often complete at under half the cost of slower alternatives at the same capability tier.
Strengths
- Agentic tool use: Leads MCP Atlas (83.6%) and Toolathlon (56.5%), which directly measure mcp and multi-tool orchestration capability.
- Speed/cost ratio: 4× throughput advantage makes it practical for high-volume agentic pipelines where latency compounds across subagent calls.
- Multimodal reasoning: 84.2% on CharXiv Reasoning; strong for document and chart understanding.
- Finance/domain agents: Largest benchmark delta over prior Pro model in Finance Agent v2.
Weaknesses
- Priced 3× higher than prior Flash-tier models — cost increase may be material for very high-volume deployments that relied on Flash cheapness.
- Gemini 3.5 Pro not yet available; if top-tier reasoning is the priority, no 3.5-generation Pro option yet.
- Context window shared at 1M tokens with several competitors; no differentiation there.
Use Cases
Best for: agentic-workflows with heavy tool use, mcp-based pipelines, multimodal document analysis, coding agents where speed and cost matter, Google-ecosystem deployments.
Not ideal for: tasks requiring the absolute ceiling of reasoning (wait for Gemini 3.5 Pro), or cases where the previous Flash-tier pricing ($0.10–0.50/MTok) was a hard constraint.
Availability
- Gemini API via Google AI Studio (general availability)
- Vertex AI
- Google Antigravity (agent-first development platform)
- Gemini app (globally)
- AI Mode in Google Search
- Gemini Enterprise Agent Platform
API model ID: gemini-3.5-flash
Related
gemini-2-5-pro · mcp · evals · agentic-workflows