GPT-5.4
OpenAI's current flagship API model. The "GPT-5.4" identifier maps to a specific deployment in the GPT-5 series. The ChatGPT UI exposes three named modes (Instant, Thinking, and Pro) that map to different underlying models in the 5.x family: Instant to GPT-5.3, Thinking to GPT-5.4, and Pro to GPT-5.4 / GPT-5.5 Pro. On the API, GPT-5.4 is the direct model identifier.
Three Tiers (ChatGPT UI)
Instant: fast and lower cost. Standard completions without extended reasoning. Maps to GPT-5.3. Comparable to a Sonnet-tier model for routine tasks.
Thinking: activates chain-of-thought reasoning. Maps to GPT-5.4. Competes on hard coding, math, and reasoning benchmarks. Positioned similarly to extended-thinking in Claude.
Pro: maximum capability, but slower and more expensive. Maps to GPT-5.4 / GPT-5.5 Pro. Plays the role in OpenAI's stack that claude-opus-4-6 plays in Anthropic's: the tier to reach for when the Thinking tier falls short.
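The "use Pro when Thinking misses" pattern can be sketched as a simple escalation loop. This is a minimal illustration under stated assumptions, not an OpenAI API: `run_tier` and `accept` are hypothetical callables supplied by the caller, and the tier names are just the ChatGPT UI labels listed above.

```python
from typing import Callable, Optional, Sequence, Tuple

# Tier labels taken from the list above, ordered cheapest first.
# They are plain strings here, not real API model identifiers.
DEFAULT_TIERS: Tuple[str, ...] = ("Thinking", "Pro")


def escalate(
    prompt: str,
    run_tier: Callable[[str, str], str],   # hypothetical: (tier, prompt) -> answer
    accept: Callable[[str], bool],         # hypothetical: is this answer good enough?
    tiers: Sequence[str] = DEFAULT_TIERS,
) -> Tuple[Optional[str], Optional[str]]:
    """Try each tier in order and return the first (tier, answer) that passes `accept`.

    Returns (None, None) if no tier produces an acceptable answer.
    """
    for tier in tiers:
        answer = run_tier(tier, prompt)
        if accept(answer):
            return tier, answer
    return None, None
```

In practice `accept` might run a test suite or a schema check; the point of the sketch is only that the more expensive tier is invoked after the cheaper one has been tried and rejected.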
Benchmarks
OpenAI publishes internal evals rather than third-party benchmarks for the 5.x series. In the Thinking tier, GPT-5.4 is competitive with Claude Sonnet 4.6 on SWE-bench-class tasks. Relative to Claude, it is stronger on structured reasoning and weaker on computer-use-style tasks.
Pricing (GPT-5.4 API)
- Input: $2.50 per million tokens
- Output: $15 per million tokens
- Context window: 1M tokens
Check the OpenAI API pricing page for the latest rates, as pricing in this series changes frequently.
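To make the rates concrete, the per-request arithmetic can be sketched as follows. The constants are copied from the list above and may already be out of date; `estimate_cost_usd` is a hypothetical helper name, not part of any OpenAI SDK.

```python
# Rates copied from the pricing list above (USD per million tokens).
# Assumption: these are still current; verify against the pricing page.
INPUT_USD_PER_MTOK = 2.50
OUTPUT_USD_PER_MTOK = 15.00
TOKENS_PER_MTOK = 1_000_000


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its input and output token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / TOKENS_PER_MTOK
```

For example, a request with 200,000 input tokens and 10,000 output tokens comes to $0.50 + $0.15 = $0.65 at these rates. Output tokens cost six times as much as input tokens, so long generations dominate the bill faster than long prompts do.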
Strengths
- Strong structured reasoning in Thinking/Pro modes
- Best tooling ecosystem — most third-party integrations target OpenAI first
- codex-cli runs natively on GPT-5.4 with sandboxed execution
- Excellent at following complex JSON schemas and structured output formats
Weaknesses
- computer-use capability lags Claude's implementation
- Less transparent about benchmark methodology than competitors
- Vendor lock-in risk — OpenAI's API terms and pricing shift frequently
Use Cases
Best for: integrations with existing OpenAI-based tooling, structured output tasks, agentic coding via codex-cli, teams already on the OpenAI platform.
Consider alternatives: claude-sonnet-4-6 for computer use and autonomous coding, gemini-2-5-pro for very long contexts, deepseek-r1 for math-heavy work.
Related
codex-cli · extended-thinking · evals · agentic-workflows