Llama 4 Maverick is Meta's quality-focused mixture-of-experts (MoE) model, with 17B active parameters (400B total, 128 experts). It targets quality-critical tasks, with benchmark scores competitive with GPT-4o and Gemini 2.5 Pro.
Context window: 1.0M tokens
Providers: 8 available
Cheapest: Anyscale, $0.50 per 1M tokens (input + output combined)
Available on: Amazon Bedrock, Anyscale, Azure AI, Deep Infra, Fireworks, Hugging Face Inference, Replicate, Together AI
Sorted by total cost (input + output per 1M tokens).
| Provider | Pricing (per 1M) | Rate Limits | Regions | Health | Latency |
|---|---|---|---|---|---|
| Anyscale | In: $0.25, Out: $0.25 | 600 RPM / 1.0M TPM | us-east-1, us-west-2 | Healthy | 0ms |
| | In: $0.27, Out: $0.27 | 600 RPM / 1.0M TPM | us-east-1, us-west-2 | Healthy | 0ms |
| | In: $0.30, Out: $0.30 | 300 RPM / 500K TPM | us-east-1, eu-west-1 | Unhealthy | 0ms |
| | In: $0.37, Out: $0.37 | 200 RPM / 600K TPM | us-east-1, us-west-2, eu-west-1 | Healthy | 0ms |
| | In: $0.20, Out: $0.60 | 600 RPM / 1.0M TPM | us-east-1, eu-west-1 | Healthy | 0ms |
| | In: $0.22, Out: $0.88 | 600 RPM / 1.0M TPM | us-east-1, us-west-2 | Healthy | 0ms |
| | In: $0.30, Out: $0.95 | 300 RPM / 500K TPM | us-east-1, us-west-2 | Healthy | 0ms |
| | In: $0.34, Out: $0.99 | 100 RPM / 400K TPM | us-east-1, us-west-2 | Healthy | 0ms |
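The table ranks offers by the sum of input and output price, which matches real spend only when a workload uses equal input and output volumes. The following is a minimal sketch of estimating cost for a specific token mix, using the prices listed in the table. The names `Offer`, `totalPerM`, `estimateCost`, and `cheapestFor` are hypothetical helpers introduced for illustration, not part of any provider SDK, and the rows are identified only by their prices.

```typescript
// Prices are USD per 1M tokens, copied from the pricing table.
interface Offer {
  inputPerM: number;
  outputPerM: number;
  healthy: boolean;
}

const offers: Offer[] = [
  { inputPerM: 0.25, outputPerM: 0.25, healthy: true },
  { inputPerM: 0.27, outputPerM: 0.27, healthy: true },
  { inputPerM: 0.30, outputPerM: 0.30, healthy: false },
  { inputPerM: 0.37, outputPerM: 0.37, healthy: true },
  { inputPerM: 0.20, outputPerM: 0.60, healthy: true },
  { inputPerM: 0.22, outputPerM: 0.88, healthy: true },
  { inputPerM: 0.30, outputPerM: 0.95, healthy: true },
  { inputPerM: 0.34, outputPerM: 0.99, healthy: true },
];

// The table's sort key: input price plus output price.
function totalPerM(o: Offer): number {
  return o.inputPerM + o.outputPerM;
}

// Estimated USD cost of a workload with the given token counts.
function estimateCost(o: Offer, inputTokens: number, outputTokens: number): number {
  return (inputTokens / 1e6) * o.inputPerM + (outputTokens / 1e6) * o.outputPerM;
}

// Cheapest healthy offer for a specific input/output mix.
function cheapestFor(inputTokens: number, outputTokens: number): Offer {
  const healthy = offers.filter((o) => o.healthy);
  return healthy.reduce((best, o) =>
    estimateCost(o, inputTokens, outputTokens) <
    estimateCost(best, inputTokens, outputTokens)
      ? o
      : best
  );
}
```

With these prices, the $0.20 / $0.60 row becomes cheaper than the $0.25 / $0.25 row once input tokens exceed seven times output tokens, so an input-heavy workload such as long-document summarization may rank the offers differently from the table.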
Use this model via Anyscale with an OpenAI-compatible SDK.
```typescript
import OpenAI from "openai";

// Point the OpenAI SDK at Anyscale's OpenAI-compatible endpoint.
const client = new OpenAI({
  baseURL: "https://api.anyscale.com/v1",
  apiKey: process.env.ANYSCALE_API_KEY, // read the key from the environment
});

// Top-level await requires an ES module context.
const response = await client.chat.completions.create({
  model: "meta-llama/Llama-4-Maverick-17B-128E-Instruct",
  messages: [
    { role: "user", content: "Hello!" }
  ],
});

console.log(response.choices[0].message.content);
```
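When sending many requests, the RPM and TPM limits from the pricing table determine how fast a client can go. The sketch below shows one way to pace calls on the client side. It assumes the limits apply per minute and that an average token count per request (prompt plus completion) is known. The page does not say how providers enforce these limits, and `RateLimit`, `effectiveRpm`, `minIntervalMs`, and `runPaced` are hypothetical helpers, not SDK functions.

```typescript
// Rate limits as listed in the table: requests and tokens per minute.
interface RateLimit {
  rpm: number;
  tpm: number;
}

// Requests per minute that fit under both limits, given an assumed
// average token count (prompt + completion) per request.
function effectiveRpm(limit: RateLimit, avgTokensPerRequest: number): number {
  const tokenBound = Math.floor(limit.tpm / avgTokensPerRequest);
  return Math.min(limit.rpm, tokenBound);
}

// Minimum spacing between request starts, in milliseconds.
function minIntervalMs(requestsPerMinute: number): number {
  return Math.ceil(60000 / requestsPerMinute);
}

// Runs tasks sequentially, starting each one no sooner than
// intervalMs after the previous one started.
async function runPaced<T>(
  tasks: Array<() => Promise<T>>,
  intervalMs: number
): Promise<T[]> {
  const results: T[] = [];
  let lastStart = 0;
  for (const task of tasks) {
    const wait = lastStart + intervalMs - Date.now();
    if (lastStart > 0 && wait > 0) {
      await new Promise<void>((resolve) => setTimeout(resolve, wait));
    }
    lastStart = Date.now();
    results.push(await task());
  }
  return results;
}
```

For a 600 RPM / 1.0M TPM offer and roughly 4,000 tokens per request, the token limit is the binding one: `effectiveRpm` gives 250 requests per minute and `minIntervalMs(250)` gives a 240 ms spacing. Each task passed to `runPaced` could be a closure around a `client.chat.completions.create(...)` call.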