LLM Leaderboard · Open Weight

GLM-5.1 benchmarks, pricing, and LLM comparison.

Compare GLM-5.1 vs GPT, Claude, Gemini, DeepSeek, open-weight, and frontier AI models using public benchmark scores, token pricing, context window, and access details.

Rank #20. AskClash overall score: 48.7.
$1.40 / $4.40. Input and output token price, when published. Context: 203K.
API. Billing and access path cached for this model row.
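As a rough illustration of what the listed prices mean for a single request, the sketch below estimates cost from token counts. It assumes the $1.40 / $4.40 figures are quoted per one million input and output tokens respectively, which is a common convention but is not stated on this page; the function name and the example token counts are hypothetical.

```python
# Assumption: prices are USD per 1,000,000 tokens (not confirmed by the page).
INPUT_PRICE_PER_M = 1.40   # listed input token price
OUTPUT_PRICE_PER_M = 4.40  # listed output token price


def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_price_per_m: float = INPUT_PRICE_PER_M,
                     output_price_per_m: float = OUTPUT_PRICE_PER_M) -> float:
    """Estimate the cost of one request from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 100,000 prompt tokens and 10,000 completion tokens.
    cost = request_cost_usd(100_000, 10_000)
    print(f"Estimated cost: ${cost:.3f}")
```

Under that per-million assumption, output tokens cost a little over three times as much as input tokens, so long completions tend to dominate the bill more quickly than long prompts.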

GLM-5.1 benchmark snapshot

AskClash combines public LLM benchmark cells into a weighted percentile score and penalizes missing coverage so narrow rows do not dominate better-measured models.
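The exact AskClash formula is not given on this page, so the following is only a minimal sketch of one way a coverage-penalized weighted percentile score could work. It assumes each benchmark score is converted to a percentile rank within the field, the percentiles are averaged by weight, and the result is scaled by the share of total weight the model actually covers. All names, weights, and numbers are hypothetical.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(value, population):
    """Percentile of `value` within `population`, using mid-rank for ties."""
    ordered = sorted(population)
    below = bisect_left(ordered, value)
    equal = bisect_right(ordered, value) - below
    return 100.0 * (below + 0.5 * equal) / len(ordered)


def overall_score(model_scores, field_scores, weights):
    """Weighted mean percentile, scaled down by missing benchmark coverage.

    Assumption: the penalty is a simple multiplication by the fraction of
    total weight the model has scores for. The real leaderboard may differ.
    """
    total_weight = sum(weights.values())
    covered_weight = 0.0
    weighted_sum = 0.0
    for bench, weight in weights.items():
        if bench not in model_scores:
            continue  # missing cell: contributes nothing and lowers coverage
        weighted_sum += weight * percentile_rank(model_scores[bench], field_scores[bench])
        covered_weight += weight
    if covered_weight == 0:
        return 0.0
    mean_percentile = weighted_sum / covered_weight
    coverage = covered_weight / total_weight
    return mean_percentile * coverage


if __name__ == "__main__":
    # Hypothetical two-benchmark field with four models each.
    weights = {"bench_a": 1.0, "bench_b": 1.0}
    field = {"bench_a": [10, 20, 30, 40], "bench_b": [1, 2, 3, 4]}
    narrow = {"bench_a": 40}                # top score, but only one cell
    broad = {"bench_a": 30, "bench_b": 3}   # good scores on both cells
    print(overall_score(narrow, field, weights))
    print(overall_score(broad, field, weights))
```

In this toy setup the narrowly measured model ends up below the broadly measured one despite holding the top score on its single benchmark, which is the behavior the description above aims for.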

Overall: 48.7
Benchmark cells: 9
Context: 203K
Creator: Z.AI

GLM-5.1 public benchmark scores

Cached benchmark values can include HLE, GPQA, SWE-bench, SWE-Pro, SWE-Atlas, Terminal-Bench, MCP Atlas, MMMU-Pro, ARC-AGI-2, Tau2, and model-specific coding or agent scores.

HLE: 52.3
GPQA: 86.2
MATH-500: 97.4
SWE-Pro: 19.8
Terminal-Bench: 65.1
MCP Atlas: 71.8
Tau2: 97.7

GLM-5.1 vs other AI models

Use these comparison links to evaluate GLM-5.1 against nearby LLMs by benchmark score, price, context window, and provider.

Related AI and tech coverage

Cached AskClash article matches that can provide release, provider, benchmark, pricing, or market context around this model.

SGLang — RadixAttention Inference Server v0.5.11 Release Notes

- **CUDA 13 + Torch 2.11**: Default CUDA version moves to 13.0 across SGLang, sgl-kernel, and Docker images, and PyTorch is upgraded from 2.9 to 2.11, modernizing the build matrix and unlocking newer kernels: #21247, #24162, #24183, #23593 ([tracking issue #21498](https://github.com/sgl-project/sglang/issues/21498))
- **Day-0 / New Model Support**: Gemma 4, GLM-5.1, Qwen3.6, MiMo-V2.5 / V2.5-Pro, Ling-2.6-Flash, Mistral Medium 3.5, and Kimi-K2.6, with cookbook recipes for tuned deployment comm

Last cached leaderboard date: May 22, 2026. This model page is generated from the AskClash LLM Leaderboard cache and linked from the live leaderboard.