This benchmark evaluates an LLM's knowledge of German and Central European geography, focusing on relative positioning, proximity, and spatial orientation between cities and landmarks.
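As a rough illustration of the kind of spatial facts such questions depend on, the sketch below computes great-circle distance and compass direction between a few cities. It is not the benchmark's scoring method. The coordinates are approximate city-centre values, and the helper names (`haversine_km`, `initial_bearing_deg`, `compass_point`) are hypothetical, chosen only for this example.

```python
import math

# Approximate city-centre coordinates (latitude, longitude) in degrees.
# These values are assumptions for illustration, not benchmark data.
CITIES = {
    "Berlin": (52.520, 13.405),
    "Munich": (48.137, 11.575),
    "Hamburg": (53.551, 9.994),
    "Prague": (50.075, 14.437),
}

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a: str, b: str) -> float:
    """Great-circle distance between two cities in kilometres."""
    lat1, lon1 = map(math.radians, CITIES[a])
    lat2, lon2 = map(math.radians, CITIES[b])
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def initial_bearing_deg(a: str, b: str) -> float:
    """Initial bearing from city a to city b, in degrees clockwise from north."""
    lat1, lon1 = map(math.radians, CITIES[a])
    lat2, lon2 = map(math.radians, CITIES[b])
    dlon = lon2 - lon1
    y = math.sin(dlon) * math.cos(lat2)
    x = (math.cos(lat1) * math.sin(lat2)
         - math.sin(lat1) * math.cos(lat2) * math.cos(dlon))
    return (math.degrees(math.atan2(y, x)) + 360.0) % 360.0


def compass_point(bearing: float) -> str:
    """Map a bearing to the nearest of eight compass points."""
    points = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]
    return points[int((bearing + 22.5) // 45) % 8]


if __name__ == "__main__":
    for target in ("Munich", "Hamburg", "Prague"):
        distance = haversine_km("Berlin", target)
        direction = compass_point(initial_bearing_deg("Berlin", target))
        print(f"Berlin -> {target}: {distance:.0f} km, {direction}")
```

A question such as "Which is closer to Berlin, Hamburg or Munich?" then reduces to comparing two `haversine_km` values, and "In which direction does Munich lie from Berlin?" to a `compass_point` lookup.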
So far, 29 of 110 models appear on the leaderboard. More are added with each arena vote.
Expand each prompt to see per-model responses and reasoning.
Compare performance across models and prompts.
Find models with the best balance of quality, cost, and speed.
Average tokens used per model across all prompts.