GPT-5.4

2 min · model, openai

GPT-5.4 is OpenAI's current flagship API model. The "GPT-5.4" identifier maps to a specific deployment in the GPT-5 series. The ChatGPT UI exposes three named modes (Instant, Thinking, and Pro), each backed by a different model in the 5.x family: Instant by GPT-5.3, Thinking by GPT-5.4, and Pro by GPT-5.4 / GPT-5.5 Pro. On the API, GPT-5.4 is the direct model identifier.

Three Tiers (ChatGPT UI)

Instant: Fast, lower cost. Standard completions without extended reasoning. Maps to GPT-5.3. Comparable to Claude's Sonnet tier for routine tasks.

Thinking: Activates chain-of-thought reasoning. Maps to GPT-5.4. Competes on hard coding, math, and reasoning benchmarks. Positioned similarly to extended thinking in Claude.

Pro: Maximum capability. Slower and more expensive. Maps to GPT-5.4 / GPT-5.5 Pro. Fills the role in OpenAI's stack that claude-opus-4-6 fills in Anthropic's: the fallback when the Thinking tier falls short.
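
The tier-to-model mapping above can be written down as a small lookup table. This is only an illustrative sketch: the names `CHATGPT_TIERS` and `models_for_tier` are hypothetical, and the model strings are labels taken from this page, not verified API identifiers.

```python
# Hypothetical lookup table built from the tier descriptions on this page.
# The strings are labels from this note, not verified API model identifiers.
CHATGPT_TIERS = {
    "instant": ["GPT-5.3"],
    "thinking": ["GPT-5.4"],
    "pro": ["GPT-5.4", "GPT-5.5 Pro"],
}


def models_for_tier(tier: str) -> list[str]:
    """Return the model labels behind a ChatGPT UI tier (case-insensitive)."""
    key = tier.strip().lower()
    if key not in CHATGPT_TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    # Return a copy so callers cannot mutate the shared table.
    return list(CHATGPT_TIERS[key])
```

Keeping the mapping in one place makes it easy to update when the tiers are re-pointed at newer models.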

Benchmarks

OpenAI publishes internal evals rather than third-party benchmarks for the 5.x series. In the Thinking tier, GPT-5.4 is competitive with Claude Sonnet 4.6 on SWE-bench-class tasks. It is stronger on structured reasoning and weaker on computer-use-style tasks.

Pricing (GPT-5.4 API)

Input: $2.50 per million tokens
Output: $15 per million tokens
Context window: 1M tokens

Check the OpenAI API pricing page for the latest rates, as pricing in this series changes frequently.
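
As a rough worked example of the arithmetic, the sketch below estimates the cost of a single request from the rates listed above. The function name `estimate_cost_usd` is hypothetical, and the rates are copied from this page, so they may be out of date.

```python
from decimal import Decimal

# Rates copied from this page (USD per million tokens); assumed, may be stale.
INPUT_USD_PER_MTOK = Decimal("2.50")
OUTPUT_USD_PER_MTOK = Decimal("15")
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request at the rates above."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_USD_PER_MTOK
        + Decimal(output_tokens) * OUTPUT_USD_PER_MTOK
    )
    return total / TOKENS_PER_MTOK
```

At these rates, a request with 200,000 input tokens and 10,000 output tokens costs $0.50 + $0.15 = $0.65, and filling the full 1M-token context with input alone costs $2.50. `Decimal` is used instead of floats to avoid rounding surprises when summing many small charges.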

Strengths

Strong structured reasoning in Thinking/Pro modes
Best tooling ecosystem: most third-party integrations target OpenAI first
codex-cli runs natively on GPT-5.4 with sandboxed execution
Excellent at following complex JSON schemas and structured output formats
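
To make the structured-output point concrete, here is a minimal stdlib-only sketch of how a caller might check that a model reply parses as JSON and carries the expected fields. No OpenAI API call is shown. The field names in `REQUIRED_FIELDS` and the helper `parse_structured_reply` are hypothetical and chosen only for illustration.

```python
import json

# Hypothetical schema for illustration: field name -> required Python type.
REQUIRED_FIELDS = {"title": str, "priority": int, "tags": list}


def parse_structured_reply(raw: str) -> dict:
    """Parse a JSON reply and check the required fields and their types."""
    # json.loads raises json.JSONDecodeError (a ValueError subclass) on bad JSON.
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    for name, expected in REQUIRED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        if not isinstance(data[name], expected):
            raise ValueError(f"field {name!r} should be {expected.__name__}")
    return data
```

Even with a model that follows schemas well, a check like this at the boundary keeps malformed replies from propagating into downstream code.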

Weaknesses

Computer-use capability lags Claude's implementation
Less transparent about benchmark methodology than competitors
Vendor lock-in risk: OpenAI's API terms and pricing shift frequently

Use Cases

Best for: integrations with existing OpenAI-based tooling, structured output tasks, agentic coding via codex-cli, teams already on the OpenAI platform.

Consider alternatives: claude-sonnet-4-6 for computer use and autonomous coding, gemini-2-5-pro for very long contexts, deepseek-r1 for math-heavy work.

Related

codex-cli · extended-thinking · evals · agentic-workflows
