Gemini 3.5 Flash

2 min · model, google

Google's current flagship for agentic and coding tasks. Released May 19, 2026 at Google I/O. It launched directly to general availability, with no preview suffix. Outperforms Gemini 3.1 Pro on agentic benchmarks while coming in about 25% cheaper per token at the listed prices ($1.50/$9.00 versus $2.00/$12.00 per million input/output tokens). Positioned as a full agentic model at efficiency pricing, not a cut-down version of a larger model.

Gemini 3.5 Pro is slated to follow.

Benchmarks

Benchmark	Score
Terminal-Bench 2.1	76.2%
MCP Atlas	83.6%
Toolathlon	56.5%
Finance Agent v2	57.9%
CharXiv Reasoning (multimodal)	84.2%
GDPval-AA	1656 Elo

All scores beat Gemini 3.1 Pro. Terminal-Bench 2.1 is +5.9 points over 3.1 Pro (70.3%). MCP Atlas tests MCP-based tool coordination; 83.6% led the published leaderboard at launch. The Finance Agent v2 gain (+14.9 points) is the largest delta among these benchmarks.

Pricing

Input: $1.50 per million tokens
Output: $9.00 per million tokens
Cached input: $0.15 per million tokens

At these list prices, 25% below Gemini 3.1 Pro ($2.00/$12.00 per million input/output tokens). 3× the price of 3.0 Flash Preview, but with significantly higher capability: it outperforms the last generation's Pro model.
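To make the rates concrete, here is a minimal sketch of a per-request cost estimate using the three listed prices. The function name, the example token counts, and the billing assumption (cached tokens are a subset of input tokens and are billed at the cached rate instead of the full input rate) are illustrative, not taken from Google's documentation.

```python
# Listed Gemini 3.5 Flash rates, in USD per million tokens (from the Pricing section).
INPUT_PER_MTOK = 1.50
OUTPUT_PER_MTOK = 9.00
CACHED_INPUT_PER_MTOK = 0.15


def estimate_cost_usd(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the cost of one request in USD.

    Assumption: `cached_tokens` is the part of `input_tokens` served from
    cache and billed at the cached rate instead of the full input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    return (
        fresh_tokens * INPUT_PER_MTOK
        + cached_tokens * CACHED_INPUT_PER_MTOK
        + output_tokens * OUTPUT_PER_MTOK
    ) / 1_000_000


# Hypothetical agent step: 200K input tokens (150K of them cached), 4K output tokens.
print(f"${estimate_cost_usd(200_000, 4_000, cached_tokens=150_000):.4f}")  # prints $0.1335
```

In this hypothetical step the cached portion costs a tenth of what it would at the full input rate, which is why cache hit rate matters for long-context agent loops.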

Context Window

1M tokens (1,048,576). Max output: 64K tokens.
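A small pre-flight check against these limits might look like the sketch below. It assumes "64K" means 65,536 tokens and treats the input and output limits as independent; whether output tokens also count against the 1M window is not stated here. The function name is hypothetical.

```python
CONTEXT_WINDOW_TOKENS = 1_048_576  # 1M tokens, as stated above
MAX_OUTPUT_TOKENS = 65_536         # assumption: "64K" read as 64 * 1024


def check_request(input_tokens: int, requested_output_tokens: int) -> list[str]:
    """Return a list of limit violations; an empty list means the request fits."""
    problems = []
    if input_tokens > CONTEXT_WINDOW_TOKENS:
        over = input_tokens - CONTEXT_WINDOW_TOKENS
        problems.append(f"input exceeds the context window by {over} tokens")
    if requested_output_tokens > MAX_OUTPUT_TOKENS:
        over = requested_output_tokens - MAX_OUTPUT_TOKENS
        problems.append(f"requested output exceeds the max output by {over} tokens")
    return problems


print(check_request(900_000, 8_000))      # fits both limits
print(check_request(1_200_000, 70_000))   # violates both limits
```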

Speed

About 4× the output tokens per second of comparable frontier models. Tasks often complete at under half the cost of slower alternatives at the same capability tier.

Strengths

Agentic tool use: Leads MCP Atlas (83.6%) and Toolathlon (56.5%), which directly measure MCP and multi-tool orchestration capability.
Speed/cost ratio: 4× throughput advantage makes it practical for high-volume agentic pipelines where latency compounds across subagent calls.
Multimodal reasoning: 84.2% on CharXiv Reasoning; strong for document and chart understanding.
Finance/domain agents: Largest benchmark delta over prior Pro model in Finance Agent v2.

Weaknesses

Priced 3× higher than prior Flash-tier models; the increase may be material for very high-volume deployments that relied on Flash being cheap.
Gemini 3.5 Pro is not yet available, so there is no 3.5-generation Pro option when top-tier reasoning is the priority.
The 1M-token context window matches several competitors, so it is not a differentiator.

Use Cases

Best for: agentic workflows with heavy tool use, MCP-based pipelines, multimodal document analysis, coding agents where speed and cost matter, Google-ecosystem deployments.

Not ideal for: tasks requiring the absolute ceiling of reasoning (wait for Gemini 3.5 Pro), or cases where the previous Flash-tier pricing ($0.10–0.50/MTok) was a hard constraint.

Availability

Gemini API via Google AI Studio (general availability)
Vertex AI
Google Antigravity (agent-first development platform)
Gemini app (globally)
AI Mode in Google Search
Gemini Enterprise Agent Platform

API model ID: gemini-3.5-flash
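As a rough sketch of how the model ID would be used, the snippet below builds a `generateContent` request against the Gemini API's REST interface with only the standard library. It assumes this model is served from the same `v1beta` endpoint and request/response shape as earlier Gemini models; the `GEMINI_API_KEY` environment variable name and the helper names are illustrative.

```python
import json
import os
import urllib.request

MODEL_ID = "gemini-3.5-flash"  # model ID from this page
# Assumption: the model is served from the Gemini API's v1beta endpoint.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a generateContent request for MODEL_ID."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url=f"{BASE_URL}/models/{MODEL_ID}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def generate(prompt: str) -> str:
    """Send the request and return the text of the first candidate."""
    # GEMINI_API_KEY is an assumed environment variable name.
    request = build_request(prompt, os.environ["GEMINI_API_KEY"])
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload["candidates"][0]["content"]["parts"][0]["text"]
```

Calling `generate("Summarize this changelog in one sentence.")` with a valid key set would perform the network call; `build_request` alone is useful for inspecting the URL and payload without sending anything.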

Related

gemini-2-5-pro · mcp · evals · agentic-workflows
