Evalry

Small

Cheapest models for quick tests

19 models · Performance Tier

Highlights

Best Score

Google: Gemini 3 Flash Preview

89.5%

Best Value

OpenAI: gpt-oss-20b

72% at $0.02 / 1M

Fastest

Mistral: Ministral 3 3B 2512

196 tok/s

Cheapest

OpenAI: gpt-oss-20b

$0.02 / 1M

Performance Analysis

Based on 19 of 19 models with benchmark data

[Chart: Price vs Quality. Top-left = best value.]

[Chart: Speed vs Quality. Top-right = ideal.]

All Models

19 of 19

#    Model                            Provider    Score  Context  Price / 1M
1.   Amazon: Nova Lite 1.0            Amazon      45%    300K     $0.06
2.   Amazon: Nova Micro 1.0           Amazon      43%    128K     $0.04
3.   Anthropic: Claude 3 Haiku        Anthropic   62%    200K     $0.25
4.   Anthropic: Claude 3.5 Haiku      Anthropic   72%    200K     $0.80
5.   Anthropic: Claude Haiku 4.5      Anthropic   79%    200K     $1.00
6.   Auto Router                      OpenRouter  83%    2.0M     $4.83
7.   Google: Gemini 2.0 Flash         Google      75%    1.0M     $0.10
8.   Google: Gemini 2.5 Flash         Google      80%    1.0M     $0.30
9.   Google: Gemini 2.5 Flash Lite    Google      76%    1.0M     $0.10
10.  Google: Gemini 3 Flash Preview   Google      90%    1.0M     $0.50
11.  Meta: Llama 3.2 1B Instruct      Meta        30%    60K      $0.03
12.  Meta: Llama 3.2 3B Instruct      Meta        37%    131K     $0.02
13.  Meta: Llama 4 Scout              Meta        65%    328K     $0.08
14.  Mistral: Ministral 3 14B 2512    Mistral     69%    262K     $0.20
15.  Mistral: Ministral 3 3B 2512     Mistral     54%    131K     $0.10
16.  Mistral: Ministral 3 8B 2512     Mistral     66%    262K     $0.15
17.  OpenAI: GPT-5 Nano               OpenAI      76%    400K     $0.05
18.  OpenAI: gpt-oss-120b             OpenAI      69%    131K     $0.04
19.  OpenAI: gpt-oss-20b              OpenAI      72%    131K     $0.02
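
The Price vs Quality comparison can be roughly reproduced from the scores and prices listed above. The following is a minimal sketch, not Evalry's own method: the page does not say how "Best Value" is computed, so score divided by price is an assumed metric, and the names `Model`, `best_value`, and `pareto_frontier` are hypothetical. Scores are the rounded percentages from the listing.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    score: float  # benchmark score in percent, as listed (rounded)
    price: float  # listed price in USD per 1M


# Data copied from the model listing on this page.
MODELS = [
    Model("Amazon: Nova Lite 1.0", 45, 0.06),
    Model("Amazon: Nova Micro 1.0", 43, 0.04),
    Model("Anthropic: Claude 3 Haiku", 62, 0.25),
    Model("Anthropic: Claude 3.5 Haiku", 72, 0.80),
    Model("Anthropic: Claude Haiku 4.5", 79, 1.00),
    Model("Auto Router", 83, 4.83),
    Model("Google: Gemini 2.0 Flash", 75, 0.10),
    Model("Google: Gemini 2.5 Flash", 80, 0.30),
    Model("Google: Gemini 2.5 Flash Lite", 76, 0.10),
    Model("Google: Gemini 3 Flash Preview", 90, 0.50),
    Model("Meta: Llama 3.2 1B Instruct", 30, 0.03),
    Model("Meta: Llama 3.2 3B Instruct", 37, 0.02),
    Model("Meta: Llama 4 Scout", 65, 0.08),
    Model("Mistral: Ministral 3 14B 2512", 69, 0.20),
    Model("Mistral: Ministral 3 3B 2512", 54, 0.10),
    Model("Mistral: Ministral 3 8B 2512", 66, 0.15),
    Model("OpenAI: GPT-5 Nano", 76, 0.05),
    Model("OpenAI: gpt-oss-120b", 69, 0.04),
    Model("OpenAI: gpt-oss-20b", 72, 0.02),
]


def best_value(models):
    """Return the model with the highest score per dollar (assumed metric)."""
    return max(models, key=lambda m: m.score / m.price)


def pareto_frontier(models):
    """Return models that no cheaper-or-equally-priced model matches or beats.

    Walks the models from cheapest to most expensive and keeps one only if
    its score is strictly higher than every score seen so far.
    """
    frontier = []
    best_score = float("-inf")
    for m in sorted(models, key=lambda m: (m.price, -m.score)):
        if m.score > best_score:
            frontier.append(m)
            best_score = m.score
    return frontier


if __name__ == "__main__":
    print("Best value:", best_value(MODELS).name)
    for m in pareto_frontier(MODELS):
        print(f"{m.name}: {m.score}% at ${m.price:.2f}")
```

Under these assumptions the frontier corresponds to the "top-left" edge of the Price vs Quality chart: each model on it is the highest-scoring option at or below its price.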
Compare LLM responses across models with automated evaluation.


© 2026 apistemic GmbH. All rights reserved.