This benchmark evaluates an LLM's knowledge of German and Central European geography, focusing on relative positioning, proximity, and spatial orientation between cities and landmarks.
So far, 27 of 110 models have been scored automatically. Arena votes unlock the remaining models and refine the ranking.
Expand each prompt to see per-model responses and reasoning.
Compare performance across models and prompts.
Find models with the best balance of quality, cost, and speed.
Average tokens used per model across all prompts.
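As a rough illustration of how such a per-model average could be computed, here is a minimal sketch. It assumes each response is available as a `(model, prompt_id, tokens)` record; that record layout, the function name `average_tokens_per_model`, and the sample values are hypothetical and not taken from the benchmark itself.

```python
from collections import defaultdict


def average_tokens_per_model(records):
    """Return the mean token count per model.

    `records` is an iterable of (model, prompt_id, tokens) tuples.
    This layout is an assumption made for illustration only.
    """
    totals = defaultdict(int)  # sum of tokens per model
    counts = defaultdict(int)  # number of responses per model
    for model, _prompt_id, tokens in records:
        totals[model] += tokens
        counts[model] += 1
    # Divide each model's total by the number of responses it produced.
    return {model: totals[model] / counts[model] for model in totals}


# Hypothetical sample data, not real benchmark results.
sample = [
    ("model-a", "p1", 120),
    ("model-a", "p2", 180),
    ("model-b", "p1", 300),
]
averages = average_tokens_per_model(sample)
```

Note that this averages over the prompts a model actually answered, so a model with missing responses is averaged over fewer prompts than the others.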