Money Boy Cultural Literacy Test

This benchmark evaluates knowledge of the Austrian rapper Money Boy, including his biography, lyrics, and pop culture influence. It tests the model's ability to identify specific biographical facts and complete iconic German cloud rap verses.

Jan 6, 2026 | 7 tasks | 110 models | $2.0850 | karllorey | Link only

Results (Preliminary)

108 of 110 models on the leaderboard so far. More join with each arena vote.

Rank  Model                   Provider  Score
1     Gemini 3 Flash Preview  Google    100%
2     Gemini 3.1 Pro Preview  Google    90%
3     Gemini 3.5 Flash        Google    89%
4     GPT-5.5 Pro             OpenAI    74%
5     GPT-5.3 Chat            OpenAI    73%

Model Comparison

Compare performance across models and prompts.

Model                   Provider    Via         Duration  Cost     Score
Gemini 3 Flash Preview  Google      OpenRouter  1.4s      $0.0014  100%
Gemini 3.1 Pro Preview  Google      OpenRouter  9.1s      $0.0687  90%
Gemini 3.5 Flash        Google      OpenRouter  2.1s      $0.0052  89%
GPT-5.5 Pro             OpenAI      OpenRouter  26.8s     $0.8640  74%
GPT-5.3 Chat            OpenAI      OpenRouter  2.8s      $0.0135  73%
Gemini 3.1 Flash Lite   Google      OpenRouter  823ms     $0.0005  72%
GPT-5.5                 OpenAI      OpenRouter  4.5s      $0.0313  71%
GPT-5.2                 OpenAI      OpenRouter  5.0s      $0.0175  67%
Kimi K2.6               MoonshotAI  OpenRouter  19.3s     $0.0193  66%
Claude Sonnet 4.5       Anthropic   OpenRouter  3.9s      $0.0124  66%

Value Analysis

Find models with the best balance of quality, cost, and speed.

[Chart legend: best value frontier; best value; dot size = duration]

Highlighted models offer the best score at their price point. Larger dots take longer to produce a result.
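
One plausible reading of "best score at their price point" is a Pareto frontier over cost and score: a model is kept only if no cheaper (or equally priced) model scores at least as well. The sketch below applies that reading to the ten models listed under Model Comparison. The names `Result`, `RESULTS`, and `best_value_frontier` are hypothetical, and the page's actual highlighting rule is not stated, so treat this as an illustration rather than the site's method.

```python
from typing import NamedTuple


class Result(NamedTuple):
    model: str
    cost_usd: float   # cost as listed under Model Comparison
    score_pct: int    # score as listed under Model Comparison


# Only the ten models shown in the comparison list, not all 110.
RESULTS = [
    Result("Gemini 3 Flash Preview", 0.0014, 100),
    Result("Gemini 3.1 Pro Preview", 0.0687, 90),
    Result("Gemini 3.5 Flash", 0.0052, 89),
    Result("GPT-5.5 Pro", 0.8640, 74),
    Result("GPT-5.3 Chat", 0.0135, 73),
    Result("Gemini 3.1 Flash Lite", 0.0005, 72),
    Result("GPT-5.5", 0.0313, 71),
    Result("GPT-5.2", 0.0175, 67),
    Result("Kimi K2.6", 0.0193, 66),
    Result("Claude Sonnet 4.5", 0.0124, 66),
]


def best_value_frontier(results):
    """Return models not beaten on score by any cheaper or equally priced model.

    Assumed definition: walk the models from cheapest to most expensive
    and keep each one that improves on the best score seen so far.
    """
    frontier = []
    best_score = float("-inf")
    # Sort by cost ascending; on a cost tie, the higher score comes first.
    for r in sorted(results, key=lambda r: (r.cost_usd, -r.score_pct)):
        if r.score_pct > best_score:
            frontier.append(r)
            best_score = r.score_pct
    return frontier


if __name__ == "__main__":
    for r in best_value_frontier(RESULTS):
        print(f"{r.model}: ${r.cost_usd:.4f}, {r.score_pct}%")
```

Under this definition and on this subset alone, only Gemini 3.1 Flash Lite and Gemini 3 Flash Preview remain, because every other listed model costs more than Gemini 3 Flash Preview while scoring lower. The full leaderboard of 110 models could yield a different frontier.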

Token Usage

Average tokens used per model across all prompts.

Model            Via         Avg tokens  In  Out
Qwen3.5-35B-A3B  OpenRouter  5,139       96  5,043
Qwen3.5-9B       OpenRouter  3,442       95  3,347
Qwen3.5-27B      OpenRouter  3,119       96  3,023
Qwen3.6 Plus     OpenRouter  2,075       94  1,981
Hy3 preview      OpenRouter  2,056       94  1,962