DeepSeek-R1 (May 2025)

by DeepSeek · Released 2024-01-01

Avg score: 48.5
Input Price: N/A
Output Price: N/A
Context Window: N/A
Type: text

Tested on 11 benchmarks, with an average score of 48.5%. Top scores: MATH level 5 (96.6%), Fiction.LiveBench (75.0%), Aider polyglot (71.4%).

Benchmark Scores

| Benchmark | Category | Score (%) |
| --- | --- | --- |
| MATH level 5 | math | 96.6 |
| Fiction.LiveBench | knowledge | 75.0 |
| Aider polyglot | coding | 71.4 |
| GPQA diamond | knowledge | 68.4 |
| OTIS Mock AIME 2024-2025 | math | 66.4 |
| WeirdML | coding | 41.6 |
| DeepResearch Bench | knowledge | 35.1 |
| SimpleBench | reasoning | 29.0 |
| SimpleQA Verified | knowledge | 27.4 |
| ARC-AGI | reasoning | 21.2 |
| ARC-AGI-2 | reasoning | 1.1 |
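
The page does not say how the 48.5 average is computed. As a rough check, the sketch below assumes it is an unweighted mean of the eleven listed scores, rounded to one decimal place, and also picks out the top three benchmarks. The `summarize` helper is hypothetical and exists only for this illustration.

```python
# Scores (percent) copied from the benchmark list above.
scores = {
    "MATH level 5": 96.6,
    "Fiction.LiveBench": 75.0,
    "Aider polyglot": 71.4,
    "GPQA diamond": 68.4,
    "OTIS Mock AIME 2024-2025": 66.4,
    "WeirdML": 41.6,
    "DeepResearch Bench": 35.1,
    "SimpleBench": 29.0,
    "SimpleQA Verified": 27.4,
    "ARC-AGI": 21.2,
    "ARC-AGI-2": 1.1,
}


def summarize(scores, top_n=3):
    """Return (benchmark count, unweighted mean to 1 decimal, top-n names).

    Assumption: every benchmark carries equal weight in the average.
    """
    mean = round(sum(scores.values()) / len(scores), 1)
    top = sorted(scores, key=scores.get, reverse=True)[:top_n]
    return len(scores), mean, top


count, mean, top = summarize(scores)
print(f"{count} benchmarks, mean {mean}%, top: {', '.join(top)}")
```

Under that equal-weight assumption the result agrees with the summary figures quoted on the page; a category-weighted average would generally give a different number.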

Similar Models