Money Boy Cultural Literacy Test

Running

This benchmark evaluates knowledge of the Austrian rapper Money Boy, including his biography, lyrics, and pop-culture influence. It tests a model's ability to identify specific biographical facts and complete iconic German cloud rap verses.

Jan 6, 2026

7 tasks

110 models

user_c636b9d7

Link only

Results (Preliminary)

99 of 110 models scored automatically so far. Arena votes unlock the rest and refine the ranking.

Model	Duration	Cost	Score
Gemini 3 Flash Preview by Google	1.4s	$0.0014	100%
Gemini 3.1 Pro Preview by Google	9.1s	$0.0687	90%
Gemini 3.5 Flash by Google	2.1s	$0.0052	84%
GPT-5.5 Pro by OpenAI	26.8s	$0.8640	74%
GPT-5.3 Chat by OpenAI	2.8s	$0.0135	74%

Model Comparison

Compare performance across models and prompts.

Model	Duration	Cost	Score
Gemini 3 Flash Preview by Google on OpenRouter	1.4s	$0.0014	100%
Gemini 3.1 Pro Preview by Google on OpenRouter	9.1s	$0.0687	90%
Gemini 3.5 Flash by Google on OpenRouter	2.1s	$0.0052	84%
GPT-5.5 Pro by OpenAI on OpenRouter	26.8s	$0.8640	74%
GPT-5.3 Chat by OpenAI on OpenRouter	2.8s	$0.0135	74%
GPT-5.5 by OpenAI on OpenRouter	4.5s	$0.0313	73%
Gemini 3.1 Flash Lite by Google on OpenRouter	823ms	$0.0005	72%
Claude Sonnet 4.5 by Anthropic on OpenRouter	3.9s	$0.0124	67%
GPT-5.2 by OpenAI on OpenRouter	5.0s	$0.0175	67%
Gemini 2.5 Pro by Google on OpenRouter	10.5s	$0.0632	63%

Value Analysis

Find models with the best balance of quality, cost, and speed.

[Chart: best value frontier. Highlighted dots = best value; dot size = duration.]

Highlighted models offer the best score at their price point. Larger dots take longer to produce a result.
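
One way to read "best score at their price point" is as a Pareto frontier over cost and score: a model is kept only if no cheaper (or equally priced) model scores at least as well. The page does not state the exact rule behind the chart, so the sketch below is an assumption, and the names `Result` and `value_frontier` are hypothetical. It uses only the ten rows listed in the comparison table, not all 110 models, so the real chart may highlight different models.

```python
from typing import NamedTuple


class Result(NamedTuple):
    model: str
    cost_usd: float
    score_pct: float


# The ten rows from the comparison table above (cost and score only).
RESULTS = [
    Result("Gemini 3 Flash Preview", 0.0014, 100),
    Result("Gemini 3.1 Pro Preview", 0.0687, 90),
    Result("Gemini 3.5 Flash", 0.0052, 84),
    Result("GPT-5.5 Pro", 0.8640, 74),
    Result("GPT-5.3 Chat", 0.0135, 74),
    Result("GPT-5.5", 0.0313, 73),
    Result("Gemini 3.1 Flash Lite", 0.0005, 72),
    Result("Claude Sonnet 4.5", 0.0124, 67),
    Result("GPT-5.2", 0.0175, 67),
    Result("Gemini 2.5 Pro", 0.0632, 63),
]


def value_frontier(results: list[Result]) -> list[Result]:
    """Return models not beaten on score by any cheaper-or-equal model.

    Assumed rule (not taken from the page): walk the models from
    cheapest to most expensive and keep each one that strictly
    improves on the best score seen so far.
    """
    frontier: list[Result] = []
    best_score = float("-inf")
    # Sort by cost; on a cost tie, consider the higher score first.
    for r in sorted(results, key=lambda r: (r.cost_usd, -r.score_pct)):
        if r.score_pct > best_score:
            frontier.append(r)
            best_score = r.score_pct
    return frontier


if __name__ == "__main__":
    for r in value_frontier(RESULTS):
        print(f"{r.model}\t${r.cost_usd:.4f}\t{r.score_pct}%")
```

Under this rule, only Gemini 3.1 Flash Lite and Gemini 3 Flash Preview remain among the ten listed models, because every other listed model costs more than Gemini 3 Flash Preview without scoring higher. Duration is ignored here; the chart encodes it as dot size rather than as a frontier axis.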

Token Usage

Average tokens used per model across all prompts.

Model	Avg tokens	Input	Output
Qwen3.5-35B-A3B on OpenRouter	5,139	96	5,043
Qwen3.5-9B on OpenRouter	3,442	95	3,347
Qwen3.5-27B on OpenRouter	3,119	96	3,023
Qwen3.6 Plus on OpenRouter	2,075	94	1,981
Hy3 preview on OpenRouter	2,056	94	1,962