Gemini 3.5 Flash

Google's current flagship for agentic and coding tasks. Released May 19, 2026 at Google I/O and launched directly to general availability, with no preview suffix. Outperforms Gemini 3.1 Pro on agentic benchmarks while costing 25% less per token at list price ($1.50/$9.00 versus $2.00/$12.00 per million input/output tokens). Positioned as a full agentic model at efficiency pricing, not a cut-down variant.

Gemini 3.5 Pro is slated to follow.

Benchmarks

Benchmark                         Score
Terminal-Bench 2.1                76.2%
MCP Atlas                         83.6%
Toolathlon                        56.5%
Finance Agent v2                  57.9%
CharXiv Reasoning (multimodal)    84.2%
GDPval-AA                         1656 Elo

All scores beat Gemini 3.1 Pro. Terminal-Bench 2.1 is up 5.9 points over 3.1 Pro (70.3%). MCP Atlas tests MCP-based tool coordination; 83.6% leads the published leaderboard at launch. The Finance Agent v2 improvement (+14.9 points) is the largest delta across these benchmarks.
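The deltas quoted above can be sanity-checked with a few lines of arithmetic. The sketch below only uses the figures given in this note; the helper name `implied_baseline` is hypothetical, and the Finance Agent v2 baseline it returns is inferred from the stated delta rather than taken from a published score.

```python
# Scores and deltas exactly as quoted in this note (percent / percentage points).
flash_35 = {"Terminal-Bench 2.1": 76.2, "Finance Agent v2": 57.9}
delta_vs_31_pro = {"Terminal-Bench 2.1": 5.9, "Finance Agent v2": 14.9}


def implied_baseline(name: str) -> float:
    """Gemini 3.1 Pro score implied by 'new score minus delta', to one decimal."""
    return round(flash_35[name] - delta_vs_31_pro[name], 1)


if __name__ == "__main__":
    for name in flash_35:
        # The Terminal-Bench value should match the 70.3% quoted in the text.
        print(f"{name}: implied 3.1 Pro score = {implied_baseline(name)}%")
```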

Pricing

  • Input: $1.50 per million tokens
  • Output: $9.00 per million tokens
  • Cached input: $0.15 per million tokens

At list price this is 25% below Gemini 3.1 Pro ($2.00/$12.00 per million input/output tokens). It is 3× the price of 3.0 Flash Preview, but with significantly higher capability: it outperforms the previous generation's Pro model.
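To see what these rates mean for a single request, a minimal cost sketch follows. The function name `estimate_cost` and the token counts are illustrative assumptions, and the sketch assumes cached tokens are billed at the cached rate in place of the normal input rate; real billing may include tiers or storage charges not listed in this note.

```python
# Per-million-token list prices quoted in this note (USD).
FLASH_35 = {"input": 1.50, "output": 9.00, "cached_input": 0.15}
PRO_31 = {"input": 2.00, "output": 12.00}  # no cached rate is given in this note


def estimate_cost(prices, input_tokens, output_tokens, cached_tokens=0):
    """Return the USD cost of one request.

    cached_tokens is the portion of input_tokens assumed to be billed at the
    cached rate. If a price table has no cached rate, the normal input rate
    is used for those tokens.
    """
    fresh_tokens = input_tokens - cached_tokens
    cost = fresh_tokens * prices["input"]
    cost += cached_tokens * prices.get("cached_input", prices["input"])
    cost += output_tokens * prices["output"]
    return cost / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 1M input tokens, 100K output tokens.
    print(f"3.5 Flash, no cache:   ${estimate_cost(FLASH_35, 1_000_000, 100_000):.2f}")
    print(f"3.5 Flash, 800K cached: ${estimate_cost(FLASH_35, 1_000_000, 100_000, 800_000):.2f}")
    print(f"3.1 Pro, no cache:     ${estimate_cost(PRO_31, 1_000_000, 100_000):.2f}")
```

Because output tokens cost six times as much as input tokens at these rates, output-heavy agent loops tend to be dominated by the output term, while long shared prompts benefit most from the cached rate.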

Context Window

1M tokens (1,048,576). Max output: 64K tokens.

Speed

Output throughput (tokens per second) is 4× that of comparable frontier models. Tasks often complete at under half the cost of slower alternatives at the same capability tier.
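A rough way to see why throughput matters for agent pipelines is to add up generation time across sequential calls. The sketch below is a back-of-envelope model only: the baseline of 50 tokens per second, the call count, and the tokens per call are hypothetical numbers chosen for illustration, and network and prompt-processing time are ignored.

```python
def pipeline_decode_seconds(calls: int, output_tokens_per_call: int,
                            tokens_per_second: float) -> float:
    """Total generation time for sequential calls (decode time only)."""
    return calls * output_tokens_per_call / tokens_per_second


BASELINE_TPS = 50.0          # hypothetical baseline, not a measured figure
FAST_TPS = 4 * BASELINE_TPS  # applies the 4x throughput claim from this note

# Hypothetical pipeline: 20 sequential subagent calls, 800 output tokens each.
slow = pipeline_decode_seconds(20, 800, BASELINE_TPS)
fast = pipeline_decode_seconds(20, 800, FAST_TPS)

if __name__ == "__main__":
    print(f"baseline: {slow:.0f} s, 4x throughput: {fast:.0f} s")
```

The saving scales linearly with the number of sequential calls, which is why the gap is most visible in deep agent chains rather than single-shot requests.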

Strengths

  • Agentic tool use: Leads MCP Atlas (83.6%) and Toolathlon (56.5%), which directly measure MCP and multi-tool orchestration capability.
  • Speed/cost ratio: 4× throughput advantage makes it practical for high-volume agentic pipelines where latency compounds across subagent calls.
  • Multimodal reasoning: 84.2% on CharXiv Reasoning; strong for document and chart understanding.
  • Finance/domain agents: Finance Agent v2 shows the largest benchmark delta over the prior Pro model (+14.9 points).

Weaknesses

  • Priced 3× higher than prior Flash-tier models; the increase may be material for very high-volume deployments that relied on Flash-tier pricing.
  • Gemini 3.5 Pro is not yet available, so there is no 3.5-generation option when top-tier reasoning is the priority.
  • The 1M-token context window matches several competitors, so it is not a differentiator.

Use Cases

Best for: agentic workflows with heavy tool use, MCP-based pipelines, multimodal document analysis, coding agents where speed and cost matter, Google-ecosystem deployments.

Not ideal for: tasks requiring the absolute ceiling of reasoning (wait for Gemini 3.5 Pro), or cases where the previous Flash-tier pricing ($0.10–0.50/MTok) was a hard constraint.

Availability

  • Gemini API via Google AI Studio (general availability)
  • Vertex AI
  • Google Antigravity (agent-first development platform)
  • Gemini app (globally)
  • AI Mode in Google Search
  • Gemini Enterprise Agent Platform

API model ID: gemini-3.5-flash
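As a rough illustration of where the model ID goes, the sketch below builds (but does not send) a REST request using only the Python standard library. It assumes the model is served from the same `v1beta` `generateContent` route and `x-goog-api-key` header used for earlier Gemini models; the helper name `build_request` and the prompt are hypothetical.

```python
import json
import urllib.request

MODEL_ID = "gemini-3.5-flash"  # model ID as listed in this note
# Assumption: same REST route as earlier Gemini API models.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta/models"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build a generateContent POST request for MODEL_ID without sending it."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url=f"{BASE_URL}/{MODEL_ID}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


if __name__ == "__main__":
    req = build_request("Summarize this changelog in two sentences.", "YOUR_API_KEY")
    print(req.get_method(), req.full_url)
    # Sending requires a real key: urllib.request.urlopen(req)
```

Keeping the model ID in a single constant makes it easy to swap in a different model later without touching the request-building code.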

Related

gemini-2-5-pro · mcp · evals · agentic-workflows

Sources