Categorization Bench

This benchmark measures a model's ability to suggest relevant category names for a set of related concepts or items.

May 16, 2026
10 tests
110 models
$2.1665
karllorey
Public

Tests

Each test is one prompt sent to every model in the benchmark.

10 tests × 110 models = 1100 responses, with 2200 arena votes for reliable rankings.
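
As a rough check on the counts above, here is a minimal sketch that assumes each (test, model) pair yields exactly one response. The function name `total_responses` is hypothetical and not part of the benchmark. How arena votes are derived from responses is not stated on this page, so the sketch counts responses only.

```python
def total_responses(num_tests: int, num_models: int) -> int:
    """Each test is one prompt sent to every model, so the number of
    responses is the product of the two counts (assumption: exactly one
    response per test/model pair)."""
    return num_tests * num_models


# Figures taken from this page: 10 tests, 110 models.
responses = total_responses(10, 110)
print(responses)  # 1100
```

With the figures on this page, the product is 1100 responses. The reported 2200 arena votes is a separate count.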