HLE
52.2 score
Compare GPT-5.5 vs GPT, Claude, Gemini, DeepSeek, open-weight, and frontier AI models using public benchmark scores, token pricing, context window, and access details.
AskClash combines public LLM benchmark cells into a weighted percentile score and penalizes missing coverage so narrow rows do not dominate better-measured models.
Cached benchmark values can include HLE, GPQA, SWE-bench, SWE-Pro, SWE-Atlas, Terminal-Bench, MCP Atlas, MMMU-Pro, ARC-AGI-2, Tau2, and model-specific coding or agent scores.
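The scoring idea described above (a weighted percentile score that penalizes missing coverage) can be sketched roughly as follows. This is only an illustration under stated assumptions, not AskClash's actual formula: the function names `percentile_rank` and `composite_score`, the weights, the toy score populations, and the choice to scale by the covered share of total weight are all hypothetical.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(value, population):
    """Percentile of `value` within `population`, using mid-rank for ties."""
    ordered = sorted(population)
    below = bisect_left(ordered, value)
    equal = bisect_right(ordered, value) - below
    return 100.0 * (below + 0.5 * equal) / len(ordered)


def composite_score(model_cells, population_cells, weights):
    """Weighted mean percentile over reported benchmarks, scaled by coverage.

    Assumption: coverage is the share of total benchmark weight that the
    model actually reports, so a model with one strong cell cannot outrank
    a model measured on every benchmark.
    """
    weighted_sum = 0.0
    covered_weight = 0.0
    for bench, weight in weights.items():
        value = model_cells.get(bench)
        if value is None:
            continue  # missing cell: contributes nothing and lowers coverage
        weighted_sum += weight * percentile_rank(value, population_cells[bench])
        covered_weight += weight
    if covered_weight == 0:
        return 0.0
    coverage = covered_weight / sum(weights.values())
    return (weighted_sum / covered_weight) * coverage


# Hypothetical toy data, not real leaderboard values.
weights = {"HLE": 2.0, "GPQA": 1.0}
population = {
    "HLE": [10.0, 20.0, 30.0, 40.0],
    "GPQA": [50.0, 60.0, 70.0, 80.0],
}
broad_model = {"HLE": 30.0, "GPQA": 70.0}   # reports both benchmarks
narrow_model = {"GPQA": 80.0}               # reports only one benchmark

broad_score = composite_score(broad_model, population, weights)
narrow_score = composite_score(narrow_model, population, weights)
print(round(broad_score, 2), round(narrow_score, 2))
```

In this toy setup the narrow model tops its single benchmark but still scores lower overall, because it covers only one third of the total weight. A real implementation might use a softer penalty or a minimum-coverage threshold instead.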
52.2 score
93.6 score
58.6 score
84.1 score
78.7 score
75.3 score
51.8 score
83.2 score
85.0 score
98.0 score
Use these comparison links to evaluate GPT-5.5 against nearby LLMs by benchmark score, price, context window, and provider.
These cached AskClash article matches can provide release, provider, benchmark, pricing, or market context for this model.
OpenAI locks GPT-5.5-Cyber behind velvet rope • The Register
Our evaluation of OpenAI's GPT-5.5 cyber capabilities (Simon Willison's Weblog, Link Blog, 30th April 2026). The UK's AI Security Institute previously evaluated Claude Mythos; now they've evaluated GPT-5.5 for finding security vulnerabilities and found it to be comparable to Mythos, but unlike Mythos it's generally available right now.

Latent Space

Last cached leaderboard date: May 22, 2026. This model page is generated from the AskClash LLM Leaderboard cache and linked from the live leaderboard.