Meta: Llama 3.3 70B Instruct

by Meta

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction-tuned generative model with 70B parameters (text in, text out). The Llama 3.3 instruction-tuned, text-only model is optimized for multilingual dialogue use cases and outperforms many available open-source and closed chat models on common industry benchmarks. Supported languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. [Model Card](https://github.com/meta-llama/llama-models/blob/main/models/llama3_3/MODEL_CARD.md)

Avg Score: 71.7% (240 answers)

Avg Latency: 3.5s (15 runs)

Pricing: Free input / Free output (per 1M tokens)

Context: 131K tokens

Alternatives

Models with similar or better quality but different tradeoffs

Same Quality, Cheaper

Models with similar or better performance at a lower cost per token.

Meta: Llama 4 Scout (27% lower cost)

Same Quality, Faster

Models with similar or better performance but lower latency.

Same Cost, Better

Models at a similar price point with higher benchmark scores.

Other Models from Meta

Compare performance with other models from the same creator

Benchmark Performance

How this model performs across different benchmarks

Price vs Performance

Compare cost efficiency across all models

(Chart: the current model plotted against other models; the x-axis uses a log scale to better visualize the price range.)

Score Over Time

Performance trends across all benchmark runs

Benchmark Activity

Number of benchmark runs over time

Quickstart

Get started with this model using OpenRouter

```typescript
import { OpenRouter } from "@openrouter/sdk";

const openrouter = new OpenRouter({
  apiKey: "<OPENROUTER_API_KEY>"
});

const completion = await openrouter.chat.completions.create({
  model: "meta-llama/llama-3.3-70b-instruct:free",
  messages: [
    {
      role: "user",
      content: "Hello!"
    }
  ]
});

console.log(completion.choices[0].message.content);
```

Get your API key at openrouter.ai/keys
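
If you prefer not to use the SDK, the same request can be made over plain HTTP. The sketch below assumes OpenRouter's OpenAI-compatible chat completions endpoint at https://openrouter.ai/api/v1/chat/completions and reads the API key from an environment variable; adjust these details to your setup.

```typescript
// Minimal sketch: the same chat completion request over plain HTTP,
// assuming OpenRouter's OpenAI-compatible REST endpoint.
const response = await fetch("https://openrouter.ai/api/v1/chat/completions", {
  method: "POST",
  headers: {
    // Assumes the key is exported as OPENROUTER_API_KEY in the environment.
    Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "meta-llama/llama-3.3-70b-instruct:free",
    messages: [{ role: "user", content: "Hello!" }],
  }),
});

const data = await response.json();
// The response should follow the OpenAI chat completions shape,
// so the reply text sits at choices[0].message.content.
console.log(data.choices[0].message.content);
```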