DeepSeek: DeepSeek V3.1 Terminus

by DeepSeek

DeepSeek-V3.1 Terminus is an update to [DeepSeek V3.1](/deepseek/deepseek-chat-v3.1) that keeps the model's original capabilities while addressing issues reported by users, including language consistency and agent capabilities, and further optimizes performance in coding and search agents. It is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes. It extends the DeepSeek-V3 base with a two-phase long-context training process, reaching up to 128K tokens, and uses FP8 microscaling for efficient inference. Users can control the reasoning behaviour with the `reasoning` `enabled` boolean. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-default-config).

The model improves tool use, code generation, and reasoning efficiency, achieving performance comparable to DeepSeek-R1 on difficult benchmarks while responding more quickly. It supports structured tool calling, code agents, and search agents, making it suitable for research, coding, and agentic workflows.
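The `reasoning` `enabled` boolean described above could be set roughly as follows. This is a minimal sketch, not an official client: it assumes the OpenRouter chat completions HTTP endpoint at `https://openrouter.ai/api/v1/chat/completions` and a runtime that provides a global `fetch`. The helper names `buildChatRequest` and `sendChat` are hypothetical, and the model slug is the one used in the Quickstart on this page.

```typescript
// Assumption: endpoint URL of the OpenRouter chat completions API.
const OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions";
// Model slug as it appears in this page's Quickstart.
const MODEL = "deepseek/deepseek-v3.1-terminus:exacto";

interface ChatRequest {
  model: string;
  messages: { role: "system" | "user" | "assistant"; content: string }[];
  reasoning: { enabled: boolean };
}

// Hypothetical helper: builds the request body with thinking mode on or off.
function buildChatRequest(prompt: string, thinking: boolean): ChatRequest {
  return {
    model: MODEL,
    messages: [{ role: "user", content: prompt }],
    reasoning: { enabled: thinking },
  };
}

// Hypothetical helper: posts the body and returns the parsed JSON response.
async function sendChat(apiKey: string, body: ChatRequest): Promise<unknown> {
  const response = await fetch(OPENROUTER_URL, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}
```

Calling `sendChat(apiKey, buildChatRequest("Hello!", false))` would request a non-thinking response under these assumptions; passing `true` would enable thinking mode.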

Avg Score: 86.4% (21 answers)

Avg Latency: 29.2s (9 runs)

Pricing: $0.21 input, $0.79 output per 1M tokens

Context: 164K tokens
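As a rough illustration of how the per-1M-token pricing above translates into a per-request cost, the arithmetic can be sketched as below. The function name `estimateCostUsd` is hypothetical, and the rates are copied from this page, so they may not match current provider pricing.

```typescript
// Rates taken from the pricing listed on this page (USD per 1M tokens).
const INPUT_USD_PER_MILLION = 0.21;
const OUTPUT_USD_PER_MILLION = 0.79;

// Hypothetical helper: estimates the cost of one request in USD.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  if (inputTokens < 0 || outputTokens < 0) {
    throw new RangeError("Token counts must be non-negative");
  }
  const inputCost = (inputTokens * INPUT_USD_PER_MILLION) / 1e6;
  const outputCost = (outputTokens * OUTPUT_USD_PER_MILLION) / 1e6;
  return inputCost + outputCost;
}
```

For example, a request with 10,000 input tokens and 2,000 output tokens would cost roughly $0.0037 at these rates, with the output side contributing about 43% of the total despite being a fifth of the input size.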

Alternatives

Models with similar or better quality but different tradeoffs

Same Quality, Cheaper

Models with similar or better performance at a lower cost per token.

Model	Cost
OpenAI: gpt-oss-20b	-62%
Google: Gemini 2.0 Flash Lite	-49%
Google: Gemini 2.0 Flash	-34%
DeepSeek: DeepSeek V3.2 Exp	-31%
Google: Gemini 2.5 Flash Lite Preview 09-2025	-30%

Same Quality, Faster

Models with similar or better performance but lower latency.

Model	Latency
Mistral: Codestral 2508	-87%
Google: Gemini 2.5 Flash Lite Preview 09-2025	-84%
Google: Gemini 2.5 Flash Preview 09-2025	-77%
OpenAI: GPT-5 Chat	-76%
Perplexity: Sonar Pro	-75%

Same Cost, Better

Models at a similar price point with higher benchmark scores.

Model	Score
Google: Gemini 2.5 Flash Lite	+4%
DeepSeek: DeepSeek V3 0324	+4%
Google: Gemini 2.5 Flash Lite Preview 09-2025	+3%
DeepSeek: DeepSeek V3.2 Exp	+1%
xAI: Grok 4 Fast	+1%

Other Models from DeepSeek

Compare performance with other models from the same creator

Model	Score	Latency	Cost/1M
DeepSeek: R1	96.2%	109.7s	$1.60
DeepSeek: DeepSeek V3.2 Exp	89.6%	96.8s	$0.27
DeepSeek: DeepSeek V3	88.5%	21.2s	$0.75
DeepSeek: DeepSeek V3.1	85.8%	60.0s	$0.45
DeepSeek: R1 0528	85.0%	36.9s	Free
DeepSeek: R1 0528	81.5%	120.7s	$1.07
DeepSeek: DeepSeek V3.2 Speciale	80.7%	152.3s	$0.34
DeepSeek: DeepSeek V3.2	79.9%	37.8s	$0.32
DeepSeek: DeepSeek V3 0324	78.7%	15.3s	$0.53
DeepSeek: R1 Distill Llama 70B	49.2%	33.8s	$0.07
DeepSeek: R1 Distill Qwen 32B	40.8%	53.5s	$0.29
DeepSeek: DeepSeek R1 0528 Qwen3 8B	—	—	$0.07
DeepSeek: DeepSeek Prover V2	—	—	$1.34
DeepSeek: R1 Distill Qwen 14B	—	—	$0.15

Benchmark Performance

How this model performs across different benchmarks

No benchmark data available

Run benchmarks with this model to see performance breakdown

Price vs Performance

Compare cost efficiency across all models

[Chart: the current model is shown as the baseline and other models are plotted by relative score. The y-axis shows the score difference on shared benchmarks; the x-axis uses a log scale.]

Score Over Time

Performance trends across all benchmark runs

Benchmark Activity

Number of benchmark runs over time

Quickstart

Get started with this model using OpenRouter

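import { OpenRouter } from "@openrouter/sdk";

// Create a client authenticated with your OpenRouter API key.
const openrouter = new OpenRouter({
  apiKey: "<OPENROUTER_API_KEY>"
});

// Send a single user message to DeepSeek V3.1 Terminus.
const completion = await openrouter.chat.completions.create({
  model: "deepseek/deepseek-v3.1-terminus:exacto",
  messages: [
    {
      role: "user",
      content: "Hello!"
    }
  ]
});

// Print the text of the first returned choice.
console.log(completion.choices[0].message.content);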

Get your API key at openrouter.ai/keys