Nous: Hermes 4 70B

by Nous

Hermes 4 70B is a hybrid reasoning model from Nous Research, built on Meta-Llama-3.1-70B. It offers the same hybrid mode as the larger 405B release: the model can either respond directly or generate explicit <think>...</think> reasoning traces before answering. Users control this behaviour with the `reasoning` `enabled` boolean. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-default-config). This 70B variant is trained on an expanded post-training corpus (~60B tokens) that emphasizes verified reasoning data, which improves mathematics, coding, STEM, logic, and structured outputs while maintaining general assistant performance. It supports JSON mode, schema adherence, function calling, and tool use, and is designed for greater steerability with reduced refusal rates.
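
As a rough illustration of the `reasoning` `enabled` boolean described above, the sketch below builds a chat completion request body with reasoning switched on or off and posts it to OpenRouter's HTTP endpoint using `fetch`. The helper names `buildHermesRequest` and `askHermes` and the `HermesRequest` type are hypothetical, introduced only for this example; the response is returned as parsed JSON without assuming any particular fields.

```typescript
// Illustrative sketch. buildHermesRequest, askHermes, ChatMessage and
// HermesRequest are hypothetical names, not part of any SDK.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface HermesRequest {
  model: string;
  messages: ChatMessage[];
  reasoning: { enabled: boolean };
}

// Build the request body; `think` toggles the hybrid reasoning mode.
function buildHermesRequest(prompt: string, think: boolean): HermesRequest {
  return {
    model: "nousresearch/hermes-4-70b",
    messages: [{ role: "user", content: prompt }],
    reasoning: { enabled: think },
  };
}

// Send the request over plain HTTP and return the parsed JSON response.
async function askHermes(
  prompt: string,
  think: boolean,
  apiKey: string
): Promise<unknown> {
  const response = await fetch("https://openrouter.ai/api/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(buildHermesRequest(prompt, think)),
  });
  if (!response.ok) {
    throw new Error(`OpenRouter request failed with status ${response.status}`);
  }
  return response.json();
}
```

Calling `askHermes("What is 17 * 23?", true, apiKey)` would request a reasoning trace before the answer, while passing `false` asks for a direct reply. Keeping the body builder separate from the network call makes the toggle easy to inspect without sending a request.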

Avg Score: 68.1% (13 answers)

Avg Latency: 7.0s (9 runs)

Pricing: $0.11 input, $0.38 output, per 1M tokens

Context: 131K tokens
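
To make the per-token pricing above concrete, here is a minimal sketch that estimates the cost of a single request from its token counts, assuming the listed rates ($0.11 per 1M input tokens, $0.38 per 1M output tokens) apply uniformly. The function name `estimateCostUsd` and the constants are hypothetical, and actual billing may differ.

```typescript
// Illustrative sketch. estimateCostUsd and these constants are hypothetical
// names; the rates are copied from the pricing listed on this page.
const INPUT_USD_PER_MILLION = 0.11;
const OUTPUT_USD_PER_MILLION = 0.38;
const TOKENS_PER_MILLION = 1_000_000;

// Estimate the USD cost of one request from its input and output token counts.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  if (inputTokens < 0 || outputTokens < 0) {
    throw new RangeError("Token counts must be non-negative");
  }
  return (
    (inputTokens * INPUT_USD_PER_MILLION +
      outputTokens * OUTPUT_USD_PER_MILLION) /
    TOKENS_PER_MILLION
  );
}
```

For example, a request with 10,000 input tokens and 2,000 output tokens works out to roughly $0.0019 under these assumptions.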

Other Models from Nous

Compare performance with other models from the same creator

Model	Score	Latency	Cost/1M
Nous: Hermes 3 405B Instruct	100.0%	1.9s	Free
Nous: Hermes 4 405B	50.4%	17.1s	$2.00
Nous: Hermes 3 405B Instruct	46.5%	22.4s	$1.00
Nous: Hermes 3 70B Instruct	27.7%	165.0s	$0.30
NousResearch: Hermes 2 Pro - Llama-3 8B	15.7%	12.8s	$0.14
Nous: DeepHermes 3 Mistral 24B Preview	—	—	$0.06

Price vs Performance

Compare cost efficiency across all models

Legend: current model (baseline), other models (relative score). The Y-axis shows the score difference on shared benchmarks; the X-axis uses a log scale.

Score Over Time

Performance trends across all benchmark runs

Benchmark Activity

Number of benchmark runs over time

Quickstart

Get started with this model using OpenRouter

import { OpenRouter } from "@openrouter/sdk";

const openrouter = new OpenRouter({
  apiKey: "<OPENROUTER_API_KEY>"
});

const completion = await openrouter.chat.completions.create({
  model: "nousresearch/hermes-4-70b",
  messages: [
    {
      role: "user",
      content: "Hello!"
    }
  ]
});

console.log(completion.choices[0].message.content);

Get your API key at openrouter.ai/keys