Qwen: Qwen2.5-VL 7B Instruct

by Qwen

Qwen2.5 VL 7B is a multimodal LLM from the Qwen Team with the following key enhancements:

- SoTA understanding of images of various resolution & ratio: Qwen2.5-VL achieves state-of-the-art performance on visual understanding benchmarks, including MathVista, DocVQA, RealWorldQA, MTVQA, etc.
- Understanding videos of 20min+: Qwen2.5-VL can understand videos over 20 minutes for high-quality video-based question answering, dialog, content creation, etc.
- Agent that can operate your mobiles, robots, etc.: with the abilities of complex reasoning and decision making, Qwen2.5-VL can be integrated with devices like mobile phones, robots, etc., for automatic operation based on visual environment and text instructions.
- Multilingual Support: to serve global users, besides English and Chinese, Qwen2.5-VL now supports the understanding of texts in different languages inside images, including most European languages, Japanese, Korean, Arabic, Vietnamese, etc.

For more details, see this [blog post](https://qwenlm.github.io/blog/qwen2-vl/) and [GitHub repo](https://github.com/QwenLM/Qwen2-VL). Usage of this model is subject to [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).

Avg Score: 25.5% (20 answers)

Avg Latency: 14.8s (9 runs)

Pricing: Free input, Free output (per 1M tokens)

Context: 33K tokens

Alternatives

Models with similar or better quality but different tradeoffs

Same Quality, Cheaper

Models with similar or better performance at a lower cost per token.

Model	Cost
Meta: Llama 3.2 3B Instruct	-76%
Meta: Llama 3 8B Instruct	-74%
Mistral: Ministral 3B	-74%
Sao10K: Llama 3 8B Lunaris	-73%
Google: Gemma 3n 4B	-70%

Same Quality, Faster

Models with similar or better performance but lower latency.

Model	Latency
Mistral: Ministral 3B	-83%
Mistral: Devstral Small 1.1	-81%
Mistral: Mistral 7B Instruct	-78%
OpenAI: GPT-3.5 Turbo 16k	-76%
Morph: Morph V3 Fast	-76%

Same Cost, Better

Models at a similar price point with higher benchmark scores.

Model	Score
Qwen: Qwen-Turbo	+30%
Z.AI: GLM 4 32B	+29%
Mistral: Mistral Small 3	+19%
Google: Gemma 3n 4B	+18%
Amazon: Nova Micro 1.0	+11%

Other Models from Qwen

Compare performance with other models from the same creator

Model	Score	Latency	Cost/1M
Qwen: Qwen3 Coder 480B A35B	100.0%	1.1s	Free
Qwen: Qwen3 VL 235B A22B Instruct	88.6%	48.0s	$0.70
Qwen: Qwen3 Max	87.9%	31.7s	$3.60
Qwen: Qwen-Plus	85.8%	22.0s	$0.80
Qwen: Qwen3 235B A22B Thinking 2507	84.6%	248.4s	$0.35
Qwen: Qwen3 Coder Plus	84.2%	20.5s	$3.00
Qwen: Qwen3 Coder 480B A35B	83.5%	19.2s	$0.58
Qwen: Qwen3 Coder 480B A35B	81.3%	6.2s	$1.01
Qwen: Qwen3 Next 80B A3B Thinking	80.8%	28.8s	$0.68
Qwen: Qwen3 VL 235B A22B Thinking	80.4%	112.0s	$1.98
Qwen: Qwen3 VL 32B Instruct	78.8%	26.3s	$1.00
Qwen: Qwen Plus 0728	78.5%	19.6s	$0.80
Qwen: Qwen3 235B A22B	74.6%	78.3s	$0.40
Qwen: Qwen3 30B A3B Instruct 2507	73.1%	30.8s	$0.20
Qwen: Qwen3 Next 80B A3B Instruct	71.4%	27.9s	$0.59
Qwen: Qwen3 VL 8B Thinking	70.8%	79.4s	$1.14
Qwen: Qwen-Turbo	70.7%	18.6s	$0.13
Qwen: Qwen Plus 0728	70.0%	49.2s	$2.20
Qwen: Qwen3 235B A22B Instruct 2507	69.6%	25.6s	$0.27
Qwen: Qwen3 Coder Flash	69.2%	14.1s	$0.90
Qwen: Qwen3 30B A3B Thinking 2507	65.8%	45.6s	$0.20
Qwen: Qwen-Max	65.8%	12.7s	$4.00
Qwen: Qwen2.5 VL 72B Instruct	65.0%	22.1s	$0.38
Qwen: Qwen3 Coder 30B A3B Instruct	63.6%	30.8s	$0.17
Qwen: Qwen3 VL 30B A3B Thinking	63.1%	83.6s	$0.60
Qwen2.5 72B Instruct	62.9%	22.2s	$0.26
Qwen: Qwen3 8B	62.5%	111.5s	$0.15
Qwen: Qwen3 14B	61.5%	70.4s	$0.14
Qwen: Qwen VL Max	61.3%	36.4s	$2.00
Qwen: Qwen3 32B	59.6%	121.7s	$0.16
Qwen: QwQ 32B	59.2%	143.6s	$0.28
Qwen: Qwen VL Plus	57.9%	10.4s	$0.42
Qwen: Qwen3 30B A3B	57.7%	149.8s	$0.14
Qwen: Qwen3 VL 30B A3B Instruct	57.5%	38.3s	$0.38
Qwen: Qwen3 VL 8B Instruct	54.6%	133.7s	$0.29
Qwen2.5 Coder 32B Instruct	52.7%	19.7s	$0.07
Qwen: Qwen2.5 VL 32B Instruct	47.3%	39.1s	$0.14
Qwen: Qwen2.5 7B Instruct	30.8%	12.6s	$0.07
Qwen: Qwen2.5 Coder 7B Instruct	6.7%	6.4s	$0.06
Qwen: Qwen3 Next 80B A3B Instruct	—	—	Free
Qwen: Qwen3 4B (free)	—	—	Free

Benchmark Performance

How this model performs across different benchmarks

No benchmark data available

Price vs Performance

Compare cost efficiency across all models

[Chart omitted. Legend: current model (baseline); other models (relative score). The Y-axis shows the score difference from shared benchmarks; the X-axis uses a log scale.]

Score Over Time

[Chart omitted: performance trends across all benchmark runs.]

Benchmark Activity

[Chart omitted: number of benchmark runs over time.]

Quickstart

Get started with this model using OpenRouter

import { OpenRouter } from "@openrouter/sdk";

const openrouter = new OpenRouter({
  apiKey: "<OPENROUTER_API_KEY>"
});

const completion = await openrouter.chat.completions.create({
  model: "qwen/qwen-2.5-vl-7b-instruct:free",
  messages: [
    {
      role: "user",
      content: "Hello!"
    }
  ]
});

console.log(completion.choices[0].message.content);

Get your API key at openrouter.ai/keys
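
Since this is a vision-language model, a text-only greeting does not exercise its image understanding. The sketch below shows one way an image question might be sent, assuming OpenRouter's OpenAI-compatible chat completions endpoint and its `image_url` content-part format. The helper names `buildImageQuestion` and `askAboutImage` and the example image URL are hypothetical and used only for illustration. Whether the free variant accepts a given image, and at what size, is not stated on this page.

```typescript
// Content parts in the OpenAI-compatible format: plain text or an image by URL.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface ChatRequest {
  model: string;
  messages: { role: "user"; content: ContentPart[] }[];
}

// Hypothetical helper: builds a single-turn request pairing a question with an image.
function buildImageQuestion(imageUrl: string, question: string): ChatRequest {
  return {
    model: "qwen/qwen-2.5-vl-7b-instruct:free",
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: question },
          { type: "image_url", image_url: { url: imageUrl } },
        ],
      },
    ],
  };
}

// Hypothetical helper: posts the request and returns the model's text reply.
async function askAboutImage(
  apiKey: string,
  imageUrl: string,
  question: string
): Promise<string> {
  const response = await fetch("https://openrouter.ai/api/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(buildImageQuestion(imageUrl, question)),
  });
  if (!response.ok) {
    throw new Error(`OpenRouter request failed with status ${response.status}`);
  }
  // Assumed response shape, mirroring the quickstart's completion.choices[0].message.content.
  const data = (await response.json()) as {
    choices: { message: { content: string } }[];
  };
  return data.choices[0].message.content;
}
```

A call such as `await askAboutImage("<OPENROUTER_API_KEY>", "https://example.com/chart.png", "What does this chart show?")` would return the reply text. Keeping the payload builder separate from the network call makes the request shape easy to inspect without spending any requests.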