HLE: 53.0 score
Compare Claude Opus 4.6 with GPT, other Claude, Gemini, DeepSeek, open-weight, and frontier AI models using public benchmark scores, token pricing, context window, and access details.
AskClash combines public LLM benchmark cells into a weighted percentile score and penalizes missing coverage so narrow rows do not dominate better-measured models.
Cached benchmark values can include HLE, GPQA, SWE-bench, SWE-Pro, SWE-Atlas, Terminal-Bench, MCP Atlas, MMMU-Pro, ARC-AGI-2, Tau2, and model-specific coding or agent scores.
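The page does not publish the exact formula behind the weighted percentile score, so the following is only a minimal sketch of one way such a score could be computed. It assumes a midrank percentile per benchmark, a weighted mean over the benchmarks a model actually reports, and a coverage penalty equal to the share of total weight that is covered. The function names, the toy benchmarks "A" and "B", and all numbers are hypothetical and are not AskClash data.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(value, population):
    """Midrank percentile of `value` within `population` (0 to 100)."""
    ordered = sorted(population)
    below = bisect_left(ordered, value)         # scores strictly below
    at_or_below = bisect_right(ordered, value)  # scores at or below
    return 100.0 * (below + at_or_below) / (2 * len(ordered))


def composite_score(model_scores, all_scores, weights):
    """Weighted percentile score with a missing-coverage penalty.

    model_scores: {benchmark: score} for one model; unreported cells omitted.
    all_scores:   {benchmark: [scores from every model reporting it]}.
    weights:      {benchmark: weight}. Assumed values, not published ones.
    """
    total_weight = sum(weights.values())
    covered_weight = 0.0
    weighted_sum = 0.0
    for bench, weight in weights.items():
        if bench not in model_scores:
            continue  # missing cell: adds nothing and lowers coverage
        pct = percentile_rank(model_scores[bench], all_scores[bench])
        weighted_sum += weight * pct
        covered_weight += weight
    if covered_weight == 0:
        return 0.0
    mean_pct = weighted_sum / covered_weight
    coverage = covered_weight / total_weight  # assumed penalty: linear in coverage
    return mean_pct * coverage


# Toy data: two hypothetical benchmarks with equal weight.
all_scores = {"A": [10, 20, 30, 40], "B": [1, 2, 3, 4]}
weights = {"A": 1.0, "B": 1.0}

full = composite_score({"A": 40, "B": 4}, all_scores, weights)  # 87.5
narrow = composite_score({"A": 40}, all_scores, weights)        # 43.75
```

In this sketch a model that tops one benchmark but reports nothing else scores half as much as a model that tops both, which is the behavior the description attributes to the coverage penalty. A real implementation might use a softer penalty or a minimum-coverage threshold instead.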
53.0 score
91.3 score
98.0 score
80.8 score
11.8 score
70.2 score
72.7 score
77.3 score
68.8 score
84.8 score
Use these comparison links to evaluate Claude Opus 4.6 against nearby LLMs by benchmark score, price, context window, and provider.
Cached AskClash article matches that can provide release, provider, benchmark, pricing, or market context around this model.

Latent Space
Putnam 2025 Problems in Rocq using Opus 4.6 and Rocq-MCP
Source: arXiv Logic / Formal Methods
URL: https://arxiv.org/abs/2603.20405

Nathan Lambert - Interconnects

Latent Space
Last cached leaderboard date: May 22, 2026. This model page is generated from the AskClash LLM Leaderboard cache and linked from the live leaderboard.