GLM 4.7 Flash

As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and efficiency. It is further optimized for agentic coding use cases, strengthening coding capabilities, long-horizon task planning,...

by Z.ai

Overview

Quick stats across all benchmark runs.

Score: 37% (8 benchmarks)

Avg Latency: 30.0s (25 requests)

Pricing: $0.06 in / $0.40 out per 1M tokens

Context: 200K tokens
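
The per-token prices listed under Pricing can be turned into a per-request cost by scaling each token count by its rate. The following is a minimal sketch, assuming billing is linear in token count with no minimum charge or caching discount; `estimateCostUsd` is a hypothetical helper, not part of any SDK.

```typescript
// Prices in USD per 1M tokens, taken from the Pricing figures above.
const INPUT_USD_PER_MILLION = 0.06;
const OUTPUT_USD_PER_MILLION = 0.40;

// Hypothetical helper: estimates the cost of one request from its token counts.
// Assumes purely linear pricing (an assumption, not confirmed by the page).
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1e6) * INPUT_USD_PER_MILLION +
    (outputTokens / 1e6) * OUTPUT_USD_PER_MILLION
  );
}

// Example: 10,000 prompt tokens and 2,000 completion tokens.
console.log(estimateCostUsd(10000, 2000).toFixed(4)); // prints "0.0014"
```

Under these assumptions, output tokens dominate the bill for generation-heavy workloads, since the output rate is roughly 6.7 times the input rate.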

Alternatives

Models with similar or better quality but different tradeoffs

Same Quality, Cheaper

Models with similar or better performance at a lower cost per token.

Model	Cost
Llama 4 Scout	-59%
gpt-oss-20b	-53%
Ministral 3 14B 2512	-49%
Gemini 2.5 Flash Lite	-49%
Llama 3.3 70B Instruct	-46%

Same Quality, Faster

Models with similar or better performance but lower latency.

Model	Latency
Gemini 2.5 Flash Lite	-62%
Gemini 2.5 Flash	-60%
Llama 4 Maverick	-60%
Llama 4 Scout	-60%
Ministral 3 14B 2512	-54%
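
To read the percentages in these tables as absolute figures, one option is to apply them to this model's own numbers. The sketch below assumes each percentage is a relative change against this model's overall 30.0s average latency; the page does not state the exact baseline, so treat the result as a rough estimate. `applyRelativeChange` is a hypothetical helper.

```typescript
// Hypothetical helper: applies a relative change (e.g. -62 for "-62%")
// to a baseline value.
function applyRelativeChange(baseline: number, percentChange: number): number {
  return baseline * (1 + percentChange / 100);
}

// Avg Latency from the Overview, used here as the assumed baseline.
const baselineLatencySeconds = 30.0;

// Gemini 2.5 Flash Lite is listed at -62% latency.
console.log(applyRelativeChange(baselineLatencySeconds, -62).toFixed(1)); // prints "11.4"
```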

Same Cost, Better

Models at a similar price point with higher benchmark scores.

Model	Score
DeepSeek V3.2	+8%
Llama 4 Scout	+7%
gpt-oss-20b	+7%
Mistral Large 3 2512	+3%
Llama 4 Maverick	+0%

Benchmark Performance

How this model performs across different benchmarks

Benchmark	Score	Rank
Categorization Bench	37%	34 / 52
Money Boy Cultural Literacy Test	28%	76 / 99

Price vs Performance

Compare cost efficiency across all models

Chart: the current model is the baseline and other models are plotted by relative score. The Y-axis shows the score difference on shared benchmarks. The X-axis uses a log scale.

Score Over Time

Performance trends across all benchmark runs

Benchmark Activity

Number of benchmark runs over time

Quickstart


Get started with this model using OpenRouter

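// Top-level await below requires an ES module context.
import { OpenRouter } from "@openrouter/sdk";

// Replace the placeholder with your own OpenRouter API key.
const openrouter = new OpenRouter({
  apiKey: "<OPENROUTER_API_KEY>"
});

// Send a single user message to GLM 4.7 Flash.
const completion = await openrouter.chat.completions.create({
  model: "z-ai/glm-4.7-flash",
  messages: [
    {
      role: "user",
      content: "Hello!"
    }
  ]
});

// Print the text of the first returned choice.
console.log(completion.choices[0].message.content);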

Get your API key at openrouter.ai/keys
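
For readers who would rather not depend on the SDK, the same request can be made against OpenRouter's OpenAI-compatible HTTP endpoint. This is a minimal sketch, assuming a runtime with a global `fetch` (for example Node 18 or later); `buildChatRequest` and `sendChat` are hypothetical helper names introduced for illustration.

```typescript
// Endpoint of OpenRouter's OpenAI-compatible chat completions API.
const OPENROUTER_CHAT_URL = "https://openrouter.ai/api/v1/chat/completions";

interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: builds the HTTP request without sending it,
// so the payload can be inspected or logged first.
function buildChatRequest(apiKey: string, prompt: string): ChatRequest {
  return {
    url: OPENROUTER_CHAT_URL,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model: "z-ai/glm-4.7-flash",
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

// Hypothetical helper: sends the request and returns the first choice's text.
async function sendChat(apiKey: string, prompt: string): Promise<string> {
  const { url, init } = buildChatRequest(apiKey, prompt);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`OpenRouter request failed with status ${response.status}`);
  }
  const data = (await response.json()) as {
    choices: { message: { content: string } }[];
  };
  return data.choices[0].message.content;
}
```

Calling `sendChat(apiKey, "Hello!")` with a valid key should resolve to the model's reply. Keeping request construction separate from sending makes the payload easy to check without network access.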

Other Models from Z.ai

Compare performance with other models from the same creator

Model	Latency	Cost/1M	Score
GLM 4.7	53.4s	$1.07	52%
GLM 5.1	20.1s	$2.03	46%
GLM 4.6V	43.5s	$0.60	—
GLM 4 32B	18.7s	$0.10	—
GLM 4.5V	51.3s	$1.20	—
GLM 4.5	48.4s	$1.40	—
GLM 4.6	36.3s	$1.08	—
GLM 4.5 Air	28.4s	$0.49	—
GLM 5V Turbo	—	$2.60	—
GLM 5 Turbo	15.2s	$2.60	—
GLM 5	7.8s	$1.26	—