Tests a model's knowledge of the history, geography, transportation, and culture of the German city of Karlsruhe.
29 of 110 models are on the leaderboard so far. More join with each arena vote.
Expand each prompt to see per-model responses and reasoning.
Compare performance across models and prompts.
Find models with the best balance of quality, cost, and speed.
Average tokens used per model across all prompts.