GPT-5.5
OpenAI's flagship agentic model. Released April 23, 2026. First fully retrained base model since GPT-4.5. Positioned as a single agentic system: takes long sequences of actions, uses tools, browses the web, writes and runs code, and checks its own work without handoff between specialist models. Strongest gains in agentic coding, computer use, knowledge work, and early scientific research.
A "Pro" tier variant (GPT-5.5 Pro) ships alongside the standard model with significantly higher capability, at 6× the standard per-token price for both input and output.
Benchmarks
| Benchmark | Score |
|---|---|
| Terminal-Bench 2.0 | 82.7% |
| SWE-Bench Pro | 58.6% |
| FrontierMath Tier 1–3 | 52.4% |
| FrontierMath Tier 4 | 39.6% |
The Terminal-Bench 2.0 score of 82.7% is 7.6 percentage points ahead of GPT-5.4 (75.1%) and leads the leaderboard at launch. The FrontierMath Tier 4 score (39.6%) is described as nearly double GPT-5.4's score on that tier. SWE-Bench Pro measures real GitHub issue resolution in a single end-to-end pass.
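The Terminal-Bench gap is stated in percentage points, which is not the same as a relative improvement. A minimal sketch of the arithmetic, using only the two scores quoted above (the function names are illustrative, not from any benchmark tooling):

```python
def point_gap(new: float, old: float) -> float:
    """Absolute difference between two scores, in percentage points."""
    return round(new - old, 1)


def relative_gain(new: float, old: float) -> float:
    """Improvement relative to the old score, in percent."""
    return round((new - old) / old * 100, 1)


# Terminal-Bench 2.0: GPT-5.5 (82.7%) vs GPT-5.4 (75.1%)
print(point_gap(82.7, 75.1))      # 7.6 percentage points
print(relative_gain(82.7, 75.1))  # 10.1 percent relative to GPT-5.4
```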
Pricing
| Tier | Input ($/MTok) | Output ($/MTok) |
|---|---|---|
| GPT-5.5 (standard) | $5.00 | $30.00 |
| GPT-5.5 Pro | $30.00 | $180.00 |
| GPT-5.5 Batch (50% off standard) | $2.50 | $15.00 |
| Cached input | $0.50 | — |
Prompts exceeding 272K input tokens are priced at 2× input and 1.5× output for the full session.
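The pricing rules above can be sketched as a small cost estimator. This is an illustration under stated assumptions, not an official calculator: the rates and the 272K threshold come from the table and note above, while the names (`Rates`, `estimate_cost`) are hypothetical. It treats the "full session" as a single request, assumes the long-prompt multipliers apply the same way to every tier, and ignores cached-input pricing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """USD per million tokens, copied from the pricing table."""
    input_per_mtok: float
    output_per_mtok: float


STANDARD = Rates(5.00, 30.00)
PRO = Rates(30.00, 180.00)
BATCH = Rates(2.50, 15.00)

LONG_PROMPT_THRESHOLD = 272_000  # input tokens
LONG_INPUT_MULTIPLIER = 2.0      # applied to all input tokens past the threshold rule
LONG_OUTPUT_MULTIPLIER = 1.5     # applied to all output tokens in the same request


def estimate_cost(rates: Rates, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request.

    Assumption: a prompt that exceeds the threshold is billed at the
    higher multipliers for all of its tokens, not just the excess.
    """
    in_mult = out_mult = 1.0
    if input_tokens > LONG_PROMPT_THRESHOLD:
        in_mult, out_mult = LONG_INPUT_MULTIPLIER, LONG_OUTPUT_MULTIPLIER
    total = (
        input_tokens * rates.input_per_mtok * in_mult
        + output_tokens * rates.output_per_mtok * out_mult
    )
    return total / 1_000_000


if __name__ == "__main__":
    # 100K input + 10K output on standard: $0.50 + $0.30
    print(f"{estimate_cost(STANDARD, 100_000, 10_000):.2f}")  # 0.80
    # 300K input crosses the threshold: $3.00 + $0.45
    print(f"{estimate_cost(STANDARD, 300_000, 10_000):.2f}")  # 3.45
```

Under these assumptions, crossing the threshold roughly doubles the bill for input-heavy requests, so it may be worth trimming or chunking prompts that sit just above 272K input tokens.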
Context Window
1.1M tokens on GPT-5.5 Pro. The standard GPT-5.5 context window is in the same range, but only the Pro variant has an explicitly stated 1.1M limit.
Latency
Same per-token latency as GPT-5.4. Speed did not regress with capability gains.
ChatGPT UI Tiers
Within the ChatGPT interface, three response tiers map to different underlying models:
- Instant — backed by GPT-5.3
- Thinking — backed by GPT-5.4
- Pro — backed by GPT-5.5 Pro
Strengths
- Agentic coding: Top Terminal-Bench 2.0 score at launch; handles multi-step software tasks end-to-end without intervention.
- Math and reasoning: Large jump on FrontierMath Tier 4 relative to prior generation.
- Tool use and computer use: Designed as a single system that integrates browsing, code execution, and action-taking without handoff between specialist models.
- Ecosystem: Available in Codex for automated software development workflows; deep integration across ChatGPT, API, and enterprise tooling.
Weaknesses
- High cost: $30/MTok output for standard puts it above most frontier models, and Pro at $180/MTok output is expensive at scale.
- Does not publicly expose its chain of thought in the way extended thinking does on Anthropic models.
- The context window advantage applies primarily to the Pro tier.
Use Cases
Best for: autonomous coding agents (Codex), knowledge-work automation, multi-step research tasks, computer use workflows where OpenAI ecosystem integration matters.
Not ideal for: cost-sensitive high-volume pipelines (batch pricing helps but output cost is high), tasks where explicit CoT reasoning control is needed.
Availability
- ChatGPT: Plus, Pro, Business, Enterprise (standard and Pro tiers vary by plan)
- API: GPT-5.5 and GPT-5.5 Pro generally available
- Codex
Related
gpt-5-4 · evals · agentic-workflows