
Llama 3.1-405B

Open Source

by Meta · Released 2024-07-23

Avg score: 49.3
Input price: N/A
Output price: N/A
Context window: N/A
Type: text

Tested on 15 benchmarks with 49.3% average. Top scores: ARC AI2 (93.7%), HellaSwag (85.6%), TriviaQA (82.7%).

Benchmark Scores

Benchmark	Category	Score
ARC AI2	knowledge	93.7
HellaSwag	knowledge	85.6
TriviaQA	knowledge	82.7
MMLU	knowledge	79.3
Winogrande	knowledge	78.4
BBH	reasoning	77.2
PIQA	knowledge	71.8
MATH level 5	math	49.8
GPQA diamond	knowledge	34.5
OpenBookQA	knowledge	32.3
WeirdML	coding	21.4
OTIS Mock AIME 2024-2025	math	9.6
SimpleBench	reasoning	7.6
Cybench	coding	7.5
The Agent Company	agentic	7.4
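
The reported 49.3 average looks consistent with a plain unweighted mean of the 15 scores in the table. The sketch below checks that arithmetic and recovers the three top scores. The unweighted mean is an assumption (the page does not say how the average is aggregated), and the helper names `mean_score` and `top_n` are hypothetical, chosen only for illustration.

```python
# Scores copied from the Benchmark Scores table.
scores = {
    "ARC AI2": 93.7,
    "HellaSwag": 85.6,
    "TriviaQA": 82.7,
    "MMLU": 79.3,
    "Winogrande": 78.4,
    "BBH": 77.2,
    "PIQA": 71.8,
    "MATH level 5": 49.8,
    "GPQA diamond": 34.5,
    "OpenBookQA": 32.3,
    "WeirdML": 21.4,
    "OTIS Mock AIME 2024-2025": 9.6,
    "SimpleBench": 7.6,
    "Cybench": 7.5,
    "The Agent Company": 7.4,
}


def mean_score(values):
    """Unweighted arithmetic mean (assumed aggregation, not confirmed by the page)."""
    values = list(values)
    return sum(values) / len(values)


def top_n(table, n=3):
    """Return the n highest-scoring (benchmark, score) pairs, best first."""
    return sorted(table.items(), key=lambda item: item[1], reverse=True)[:n]


average = round(mean_score(scores.values()), 1)
print(f"{len(scores)} benchmarks, average {average}")
for name, score in top_n(scores):
    print(f"{name}: {score}")
```

Under that assumption the rounded mean matches the headline figure, and the top three entries match the ones named in the summary line.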

Similar Models

Gemini 2.0 Flash (Google DeepMind)

DeepSeek-R1 (May 2025)

Meta Llama 3.1 Timeline

Llama 3.1 70B Instruct

$0.40/M input · 131K context · 7 benchmarks

Llama 3.1 8B Instruct

$0.02/M input (-0.38) · 16K context (-115K) · 8 benchmarks

N/A input · N/A context · 15 benchmarks