Untitled Benchmark

May 26, 2026
50 tasks
41 models
$0.0092
user_c636b9d7
Public

Results (Preliminary)

5 of 41 models are on the leaderboard so far. More are added with each arena vote.

Prompt Details

Expand each prompt to see per-model responses and reasoning.

Model Comparison

Compare performance across models and prompts.

Value Analysis

Find models with the best balance of quality, cost, and speed.

Chart legend: Best value frontier; Best value; Size = duration

Highlighted models offer the best score at their price point. Larger dots take longer to produce a result.
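
One way to read "best score at their price point" is as a cost/score frontier: a model is highlighted if no model that costs the same or less scores at least as well. The sketch below illustrates that reading only. The `ModelResult` type, the `best_value_frontier` function, and all model names and numbers are hypothetical and are not taken from this benchmark; the page's actual selection rule may differ.

```python
from typing import NamedTuple


class ModelResult(NamedTuple):
    name: str
    cost: float   # hypothetical cost per run, in dollars
    score: float  # hypothetical quality score, higher is better


def best_value_frontier(results):
    """Return the models that no equally cheap or cheaper model matches on score."""
    frontier = []
    best_score = float("-inf")
    # Walk from cheapest to most expensive; at equal cost, try the higher score first.
    for r in sorted(results, key=lambda r: (r.cost, -r.score)):
        # Keep a model only if it strictly improves on every cheaper option.
        if r.score > best_score:
            frontier.append(r)
            best_score = r.score
    return frontier


# Made-up data for illustration only.
results = [
    ModelResult("model-a", 0.002, 0.61),
    ModelResult("model-b", 0.004, 0.58),  # costs more than model-a, scores lower
    ModelResult("model-c", 0.009, 0.74),
    ModelResult("model-d", 0.015, 0.74),  # same score as model-c at a higher cost
    ModelResult("model-e", 0.020, 0.81),
]

frontier_names = [r.name for r in best_value_frontier(results)]
print(frontier_names)  # ['model-a', 'model-c', 'model-e']
```

Duration is left out of this sketch because the chart encodes it only as dot size, not as a criterion for highlighting.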

Token Usage

Average tokens used per model across all prompts.
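
As a rough illustration of the averaging described here, the sketch below groups token counts by model and takes the mean over prompts. The record layout, the `average_tokens_per_model` function, and the numbers are assumptions made for the example, not data from this benchmark.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (model, prompt_id, tokens_used) records.
usage = [
    ("model-a", 1, 120),
    ("model-a", 2, 180),
    ("model-b", 1, 300),
    ("model-b", 2, 500),
]


def average_tokens_per_model(records):
    """Average the token counts of each model over all of its prompts."""
    tokens_by_model = defaultdict(list)
    for model, _prompt_id, tokens in records:
        tokens_by_model[model].append(tokens)
    return {model: mean(tokens) for model, tokens in tokens_by_model.items()}


averages = average_tokens_per_model(usage)
print(averages)  # {'model-a': 150, 'model-b': 400}
```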