Explain Like I'm 5

This benchmark measures the ability to explain complex topics simply and concisely for a five-year-old.

May 14, 2026 · 4 tasks · 110 models · $0.2307 · karllorey · Public

Results (Preliminary)

49 of 110 models on the leaderboard so far. More join with each arena vote.

1. Ministral 3 14B 2512 (Mistral): 100% score
2. Claude Sonnet 4.6 (Anthropic): 100% score
3. Claude Haiku 4.5 (Anthropic): 100% score
4. GLM 5.1 (Z.ai): 100% score
5. GPT-5.5 Pro (OpenAI): 100% score

Model Comparison

Compare performance across models and prompts.

Model                  Provider (all on OpenRouter)  Duration  Cost     Score
Ministral 3 14B 2512   Mistral                       987ms     $0.0001  100%
Claude Sonnet 4.6      Anthropic                     3.3s      $0.0015  100%
Claude Haiku 4.5       Anthropic                     2.4s      $0.0018  100%
GLM 5.1                Z.ai                          18.1s     $0.0104  100%
GPT-5.5 Pro            OpenAI                        12.4s     $0.0750  100%
Mistral Large 3 2512   Mistral                       2.2s      $0.0003  98%
Claude 3 Haiku         Anthropic                     1.4s      $0.0001  96%
Claude Opus 4.6        Anthropic                     5.4s      $0.0135  96%
Qwen3.6 Max Preview    Qwen                          27.6s     $0.0130  94%
Kimi K2 Thinking       MoonshotAI                    7.9s      $0.0035  94%

Value Analysis

Find models with the best balance of quality, cost, and speed.

[Chart legend: Best value frontier; Best value; Size = duration]

Highlighted models offer the best score at their price point. Larger dots take longer to produce a result.
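
One plausible reading of "best score at their price point" is a Pareto frontier over cost and score: a model is highlighted if no other model is both at least as cheap and at least as good. The page does not say how the highlighting is actually computed, so the sketch below is an assumption, and `Result` and `best_value_frontier` are hypothetical names. It uses only the cost and score figures of the ten models listed under Model Comparison, not the full leaderboard.

```python
from typing import NamedTuple


class Result(NamedTuple):
    model: str
    cost_usd: float  # cost as listed under Model Comparison
    score_pct: int   # score in percent


# Cost and score figures copied from the Model Comparison list.
RESULTS = [
    Result("Ministral 3 14B 2512", 0.0001, 100),
    Result("Claude Sonnet 4.6", 0.0015, 100),
    Result("Claude Haiku 4.5", 0.0018, 100),
    Result("GLM 5.1", 0.0104, 100),
    Result("GPT-5.5 Pro", 0.0750, 100),
    Result("Mistral Large 3 2512", 0.0003, 98),
    Result("Claude 3 Haiku", 0.0001, 96),
    Result("Claude Opus 4.6", 0.0135, 96),
    Result("Qwen3.6 Max Preview", 0.0130, 94),
    Result("Kimi K2 Thinking", 0.0035, 94),
]


def best_value_frontier(results):
    """Return the models that are not beaten on both cost and score.

    Assumed rule (not confirmed by the page): walk the models from cheapest
    to most expensive and keep one only if it scores strictly higher than
    every cheaper model kept so far. Exact ties on cost and score keep only
    the first model encountered.
    """
    frontier = []
    best_score = -1
    # Sort by cost ascending; at equal cost, put the higher score first.
    for r in sorted(results, key=lambda r: (r.cost_usd, -r.score_pct)):
        if r.score_pct > best_score:
            frontier.append(r)
            best_score = r.score_pct
    return frontier


frontier = best_value_frontier(RESULTS)
for r in frontier:
    print(f"{r.model}: {r.score_pct}% at ${r.cost_usd:.4f}")
```

On these ten rows the frontier collapses to a single model, because the cheapest entry already reaches 100%. The chart on the page covers more models and may apply a different rule, so its highlighted set can differ.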

Token Usage

Average tokens used per model across all prompts.

DeepSeek V3.2 Speciale (OpenRouter): 1,509 avg (89 in / 1,420 out)
Gemini 2.5 Pro (OpenRouter): 1,502 avg (89 in / 1,413 out)
Qwen3.6 Flash (OpenRouter): 1,409 avg (102 in / 1,308 out)
GLM 4.7 (OpenRouter): 1,384 avg (93 in / 1,291 out)
Hy3 preview (OpenRouter): 1,356 avg (99 in / 1,257 out)