How does GPT-5.4 mini compare with other LLMs?

AskClash compares GPT-5.4 mini against nearby AI models using public benchmark scores, pricing, context window, and access details.

What benchmarks are tracked for GPT-5.4 mini?

The page shows cached public benchmark cells such as HLE, GPQA, SWE-bench, SWE-Pro, Terminal-Bench, MCP Atlas, MMMU-Pro, ARC-AGI-2, Tau2, and related model scores when available.

LLM Leaderboard · Proprietary

GPT-5.4 mini benchmarks, pricing, and LLM comparison.

Compare GPT-5.4 mini vs GPT, Claude, Gemini, DeepSeek, open-weight, and frontier AI models using public benchmark scores, token pricing, context window, and access details.


Rank #17. AskClash overall score: 49.9

$0.75 / $4.50: input and output token price, when published. Context: 400K.
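As a rough illustration of how the two listed prices combine into a per-request cost, here is a minimal Python sketch. It assumes the prices are US dollars per 1 million tokens, which the page does not state; the function name `estimate_cost` and the token counts are hypothetical.

```python
# Assumption: the listed prices are USD per 1,000,000 tokens.
# The page gives "$0.75 / $4.50" without stating the unit.
INPUT_PRICE_PER_M = 0.75   # input token price from the page
OUTPUT_PRICE_PER_M = 4.50  # output token price from the page
TOKENS_PER_UNIT = 1_000_000


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # Example: a 200,000-token prompt with a 10,000-token reply.
    print(f"${estimate_cost(200_000, 10_000):.3f}")
```

Under that unit assumption, output tokens cost six times as much as input tokens, so long replies can outweigh large prompts in the total.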

API/OAuth: billing and access path cached for this model row.

GPT-5.4 mini benchmark snapshot

AskClash combines public LLM benchmark cells into a weighted percentile score and penalizes missing coverage so narrow rows do not dominate better-measured models.
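The exact AskClash formula is not given on this page. The Python sketch below shows one plausible way to build a weighted percentile score with a coverage penalty. Everything in it is an assumption for illustration: the function names, the weights, the toy leaderboard, and the choice to multiply by the covered share of weight.

```python
from bisect import bisect_right


def percentile_rank(value: float, population: list[float]) -> float:
    """Percent of scores in `population` that are <= `value` (0 to 100)."""
    ordered = sorted(population)
    return 100.0 * bisect_right(ordered, value) / len(ordered)


def overall_score(
    model_scores: dict[str, float],
    leaderboard: dict[str, list[float]],
    weights: dict[str, float],
) -> float:
    """Weighted mean percentile, scaled down by missing benchmark coverage.

    Hypothetical scheme, not the published AskClash method:
    - each benchmark the model has a cell for contributes its percentile rank,
      weighted by `weights`;
    - the result is multiplied by the share of total weight the model covers,
      so a model measured on few benchmarks cannot score highly.
    """
    total_weight = sum(weights.values())
    covered_weight = 0.0
    weighted_sum = 0.0
    for bench, weight in weights.items():
        if bench not in model_scores:
            continue  # missing cell: no contribution, lowers coverage
        rank = percentile_rank(model_scores[bench], leaderboard[bench])
        weighted_sum += weight * rank
        covered_weight += weight
    if covered_weight == 0:
        return 0.0
    coverage = covered_weight / total_weight
    return (weighted_sum / covered_weight) * coverage


if __name__ == "__main__":
    # Toy data, not real leaderboard values.
    board = {"A": [10, 20, 30, 40], "B": [50, 60, 70, 80]}
    w = {"A": 1.0, "B": 1.0}
    print(overall_score({"A": 40, "B": 80}, board, w))  # measured on both
    print(overall_score({"A": 40}, board, w))           # measured on one
```

In this sketch a model that tops a single benchmark but lacks the other is capped at half the score of a model that tops both, which is the kind of effect the coverage penalty describes.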

Overall: 49.9

Benchmark cells: 11

Context: 400K

Creator: OpenAI

GPT-5.4 mini public benchmark scores

Cached benchmark values can include HLE, GPQA, SWE-bench, SWE-Pro, SWE-Atlas, Terminal-Bench, MCP Atlas, MMMU-Pro, ARC-AGI-2, Tau2, and model-specific coding or agent scores.

HLE: 41.5
GPQA: 88.0
MATH-500: 97.4
SWE-Pro: 54.4
Terminal-Bench: 60.0
OSWorld: 72.1
MCP Atlas: 57.7
Finance Agent: 45.4
MMMU-Pro: 78.0
Tau2: 93.4

GPT-5.4 mini vs other AI models

Use these comparison links to evaluate GPT-5.4 mini against nearby LLMs by benchmark score, price, context window, and provider.

GPT-5.4 mini vs GPT-5.5 xHigh
GPT-5.4 mini vs GPT-5.5
GPT-5.4 mini vs Claude Opus 4.7 (Adaptive)
GPT-5.4 mini vs Gemini 3.5 Flash
GPT-5.4 mini vs GPT-5.4
GPT-5.4 mini vs Claude Mythos Preview
GPT-5.4 mini vs Claude Opus 4.7
GPT-5.4 mini vs Qwen3.7 Max

Related AI and tech coverage

Cached AskClash article matches that can provide release, provider, benchmark, pricing, or market context around this model.

GPT 5.4 is a big step for Codex

On evaluating and understanding the frontier of agents, and why I still turn to Claude.

datasette-llm 0.1a4

Simon Willison’s Weblog, 31st March 2026. Release: datasette-llm 0.1a4, an LLM integration plugin for other plugins to depend on. Ability to configure different API keys for models based on their purpose - for example, set it up so enrichments always use gpt-5.4-mini with an API key dedicated to that purpose. #4 I released llm-echo 0.3 to provide an API key testing ut

Trusted access for the next era of cyber defense

Simon Willison’s Weblog, 14th April 2026, Link Blog. OpenAI's answer to Claude Mythos appears to be a new model called GPT-5.4-Cyber: In preparation for increasingly more capable models from OpenAI over the next few months, we are fine-tuning our mo

Quoting Romain Huet

Simon Willison’s Weblog, 25th April 2026. A quote from Romain Huet, confirming OpenAI won't release a GPT-5.5-Codex model: Since GPT-5.4, we’ve unified Codex and the main model into a single system, so there’s no separate coding line anymore. GPT-5.5 takes this further, with strong gains in agentic coding, computer use, and any task on a computer.

Last cached leaderboard date: May 22, 2026. This model page is generated from the AskClash LLM Leaderboard cache and linked from the live leaderboard.