LLM Leaderboard · Proprietary

GPT-5.3 Codex benchmarks, pricing, and LLM comparison.

Compare GPT-5.3 Codex vs GPT, Claude, Gemini, DeepSeek, open-weight, and frontier AI models using public benchmark scores, token pricing, context window, and access details.

Rank: #11 (AskClash overall score: 68.6)
Price: $1.75 / $14.0 (input and output token price, when published). Context: 400K.
Access: API/OAuth (billing and access path cached for this model row).
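As a rough illustration of how the listed token prices turn into a per-request cost, here is a minimal Python sketch. The page does not state the pricing unit, so the per-1M-token unit is an assumption, as are reading 400K as 400,000 tokens and counting input plus output against the context window. The names `ModelRow` and `estimate_cost` are hypothetical and are not part of any AskClash or OpenAI API.

```python
from dataclasses import dataclass

# Assumption: prices are quoted in USD per 1,000,000 tokens (the page gives no unit).
TOKENS_PER_PRICE_UNIT = 1_000_000


@dataclass(frozen=True)
class ModelRow:
    """Hypothetical container for one leaderboard row."""
    name: str
    input_price: float    # USD per price unit of input tokens
    output_price: float   # USD per price unit of output tokens
    context_tokens: int   # assumption: "400K" means 400,000 tokens


GPT_5_3_CODEX = ModelRow("GPT-5.3 Codex", 1.75, 14.0, 400_000)


def estimate_cost(row: ModelRow, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request under the assumptions above."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Assumption: input and output tokens share the same context window.
    if input_tokens + output_tokens > row.context_tokens:
        raise ValueError("request does not fit in the context window")
    total = input_tokens * row.input_price + output_tokens * row.output_price
    return total / TOKENS_PER_PRICE_UNIT


# 100K input tokens and 10K output tokens.
cost = estimate_cost(GPT_5_3_CODEX, 100_000, 10_000)
print(f"{cost:.3f}")  # prints 0.315
```

Under these assumptions, output tokens cost eight times as much as input tokens, so generation-heavy workloads are where most of the bill comes from.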

GPT-5.3 Codex benchmark snapshot

AskClash combines public LLM benchmark cells into a weighted percentile score and penalizes missing coverage so narrow rows do not dominate better-measured models.
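The page does not publish the exact formula, so the following Python sketch shows only one plausible way a weighted percentile score with a coverage penalty could work. The weights, the percentile definition, the multiplicative penalty, and the names `percentile_rank` and `overall_score` are all assumptions made for illustration, and the benchmark names and numbers in the example are invented.

```python
def percentile_rank(value: float, population: list[float]) -> float:
    """Percent of scores in `population` that are at or below `value`."""
    if not population:
        raise ValueError("population must not be empty")
    at_or_below = sum(1 for v in population if v <= value)
    return 100.0 * at_or_below / len(population)


def overall_score(
    percentiles: dict[str, float],
    weights: dict[str, float],
    penalty_exponent: float = 1.0,
) -> float:
    """Weighted mean of available percentile cells, scaled down by coverage.

    percentiles: benchmark -> percentile, only for cells the model has.
    weights: benchmark -> weight, for every benchmark the leaderboard tracks.
    penalty_exponent: assumed knob; larger values punish gaps harder.
    """
    covered = {b: p for b, p in percentiles.items() if b in weights}
    if not covered:
        return 0.0
    covered_weight = sum(weights[b] for b in covered)
    total_weight = sum(weights.values())
    weighted_mean = sum(weights[b] * p for b, p in covered.items()) / covered_weight
    coverage = covered_weight / total_weight  # 1.0 when every cell is present
    return weighted_mean * coverage ** penalty_exponent


# Hypothetical weights and percentiles, not AskClash data.
weights = {"A": 2.0, "B": 1.0, "C": 1.0}
narrow = overall_score({"A": 90.0, "B": 80.0}, weights)             # missing C
broad = overall_score({"A": 70.0, "B": 70.0, "C": 70.0}, weights)   # full coverage
print(round(narrow, 1), round(broad, 1))  # prints 65.0 70.0
```

In this sketch the narrow row has a higher average over the cells it reports, yet it ranks below the fully measured row once the coverage factor is applied, which is the behavior the description above aims for.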

Overall: 68.6
Benchmark cells: 10
Context: 400K
Creator: OpenAI

GPT-5.3 Codex public benchmark scores

Cached benchmark values can include HLE, GPQA, SWE-bench, SWE-Pro, SWE-Atlas, Terminal-Bench, MCP Atlas, MMMU-Pro, ARC-AGI-2, Tau2, and model-specific coding or agent scores.

HLE: 39.9
GPQA: 91.5
MATH-500: 99.0
IFEval: 75.4
SWE-bench: 85.0
SWE-Pro: 56.8
Terminal-Bench: 77.3
OSWorld: 64.7
MMMU-Pro: 78.5
Tau2: 86.0

GPT-5.3 Codex vs other AI models

Use these comparison links to evaluate GPT-5.3 Codex against nearby LLMs by benchmark score, price, context window, and provider.

Related AI and tech coverage

Cached AskClash article matches that can provide release, provider, benchmark, pricing, or market context around this model.

Quoting OpenAI Codex base_instructions

Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user's query.

Tracking the history of the now-deceased OpenAI Microsoft AGI clause - 27th April 2026

Last cached leaderboard date: May 25, 2026. This model page is generated from the AskClash LLM Leaderboard cache and linked from the live leaderboard.