This benchmark evaluates an LLM's knowledge of German and Central European geography, focusing on relative positioning, proximity, and spatial orientation between cities and landmarks.
Each test is one prompt sent to every model in the benchmark.
Running 11 tests across 110 models yields 2420 arena votes for reliable rankings.
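
As a rough illustration of "one prompt sent to every model", the sketch below pairs each test with each model and counts the resulting runs. The names `build_runs`, `test_N`, and `model_N` are placeholders invented for this example and are not the benchmark's real identifiers. It counts prompt-response runs only; how arena votes are derived from those runs is not specified here, so it is not modeled.

```python
from itertools import product


def build_runs(tests, models):
    """Pair every test prompt with every model (one run per pair)."""
    return list(product(tests, models))


# Placeholder identifiers (assumptions), sized to match the figures above.
tests = [f"test_{i}" for i in range(1, 12)]      # 11 tests
models = [f"model_{i}" for i in range(1, 111)]   # 110 models

runs = build_runs(tests, models)
print(len(runs))  # 1210 prompt-response runs
```

Each element of `runs` is a `(test, model)` tuple, so a harness could iterate over it to send the prompt and store the response for later comparison.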