NVIDIA: Llama 3.1 Nemotron 70B Instruct

by NVIDIA

NVIDIA's Llama 3.1 Nemotron 70B Instruct is a language model tuned to generate precise, helpful responses. It builds on the [Llama 3.1 70B](/models/meta-llama/llama-3.1-70b-instruct) architecture and uses Reinforcement Learning from Human Feedback (RLHF), and it scores well on automatic alignment benchmarks. The model is intended for applications that need accurate, helpful answers to varied user queries across many domains. Use of this model is subject to [Meta's Acceptable Use Policy](https://www.llama.com/llama3/use-policy/).

Avg Score: 53.8% (12 answers)

Avg Latency: 16.7s (8 runs)

Pricing: $1.20 input, $1.20 output, per 1M tokens

Context: 131K tokens
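The listed pricing can be turned into a rough per-request cost estimate. The following is a minimal sketch that assumes the $1.20 per 1M token rates shown above apply linearly to both input and output tokens; the constant and function names are hypothetical and not part of any SDK.

```typescript
// Prices copied from the listing above, in USD per 1M tokens.
// These are assumptions for illustration; check current pricing before relying on them.
const INPUT_PRICE_PER_MILLION_USD = 1.2;
const OUTPUT_PRICE_PER_MILLION_USD = 1.2;

// Hypothetical helper: estimates the cost of one request from its token counts.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  if (inputTokens < 0 || outputTokens < 0) {
    throw new RangeError("Token counts must be non-negative");
  }
  const inputCost = (inputTokens / 1_000_000) * INPUT_PRICE_PER_MILLION_USD;
  const outputCost = (outputTokens / 1_000_000) * OUTPUT_PRICE_PER_MILLION_USD;
  return inputCost + outputCost;
}
```

Under these assumptions, a request with 2,000 input tokens and 500 output tokens comes to roughly $0.003.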

Other Models from NVIDIA

Compare performance with other models from the same creator

Model	Score	Latency	Cost/1M
NVIDIA: Llama 3.3 Nemotron Super 49B V1.5	77.1%	31.1s	$0.25
NVIDIA: Nemotron 3 Nano 30B A3B	72.1%	9.6s	Free
NVIDIA: Llama 3.1 Nemotron Ultra 253B v1	70.0%	61.9s	$1.20
NVIDIA: Nemotron Nano 9B V2	58.5%	30.1s	$0.10
NVIDIA: Nemotron Nano 12B 2 VL	49.6%	34.2s	$0.40
NVIDIA: Nemotron Nano 12B 2 VL	45.0%	106.8s	Free
NVIDIA: Nemotron Nano 9B V2	41.3%	55.4s	Free
NVIDIA: Nemotron 3 Nano 30B A3B	—	—	$0.13

Price vs Performance

[Chart: cost efficiency across all models. The current model is the baseline and other models are plotted by relative score. Y-axis shows score difference from shared benchmarks. X-axis uses log scale.]

Quickstart

Get started with this model using OpenRouter

import { OpenRouter } from "@openrouter/sdk";

const openrouter = new OpenRouter({
  apiKey: "<OPENROUTER_API_KEY>"
});

const completion = await openrouter.chat.completions.create({
  model: "nvidia/llama-3.1-nemotron-70b-instruct",
  messages: [
    {
      role: "user",
      content: "Hello!"
    }
  ]
});

console.log(completion.choices[0].message.content);

Get your API key at openrouter.ai/keys
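Where the SDK is not an option, the same request can be sketched against OpenRouter's HTTP chat completions endpoint. This is a minimal sketch, assuming a runtime with a global `fetch` (for example Node 18+ or a browser) and the standard OpenAI-style response shape; the helper names `buildChatRequest` and `askNemotron` and the response interface are hypothetical and introduced only for illustration.

```typescript
// OpenRouter's OpenAI-compatible chat completions endpoint.
const OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions";
const MODEL_ID = "nvidia/llama-3.1-nemotron-70b-instruct";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Assumed (simplified) response shape: only the fields read below.
interface ChatCompletionResponse {
  choices: { message: { content: string } }[];
}

// Hypothetical helper: builds the fetch() arguments without sending anything.
function buildChatRequest(apiKey: string, messages: ChatMessage[]): ChatRequest {
  return {
    url: OPENROUTER_URL,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model: MODEL_ID, messages }),
    },
  };
}

// Hypothetical helper: sends a single user prompt and returns the reply text.
async function askNemotron(apiKey: string, prompt: string): Promise<string> {
  const { url, init } = buildChatRequest(apiKey, [{ role: "user", content: prompt }]);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`OpenRouter request failed with status ${response.status}`);
  }
  const data = (await response.json()) as ChatCompletionResponse;
  return data.choices[0].message.content;
}
```

Calling `await askNemotron(apiKey, "Hello!")` with a key from openrouter.ai/keys should mirror the SDK example above. Keeping request construction in its own function makes the payload easy to inspect or test without a network call.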