Meta's efficient open-weight model with 8 billion parameters from the Llama 3.1 family. Optimized for instruction following with strong performance on general tasks, coding, and multilingual benchmarks. Ideal for cost-effective deployment and edge inference scenarios.
Context window: 131K tokens
Providers: 2 available (Modal, Scaleway)
Cheapest: Modal, $0.40/1M tokens
Sorted by total cost (input + output per 1M tokens).
| Provider | Pricing (per 1M) | Rate Limits | Regions | Health | Latency |
|---|---|---|---|---|---|
| Modal | In: $0.20 / Out: $0.20 | 120 RPM / 400K TPM | us-east-1, us-west-2 | Healthy | 0ms |
| Scaleway | In: $0.30 / Out: $0.30 | 60 RPM / 200K TPM | eu-west-1 | Healthy | 0ms |
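
The ranking rule can be reproduced with a few lines of JavaScript. This is a minimal sketch, not the listing's own implementation: the prices are copied from the table, the $0.20 row is assumed to be Modal (consistent with the stated cheapest total of $0.40/1M tokens), and the names `totalCostPerMillion` and `estimateRequestCost` are hypothetical helpers introduced for illustration.

```javascript
// Per-1M-token prices copied from the provider table (row-to-provider mapping is assumed).
const providers = [
  { name: "Scaleway", inputPerMillion: 0.30, outputPerMillion: 0.30 },
  { name: "Modal", inputPerMillion: 0.20, outputPerMillion: 0.20 },
];

// Total cost as the listing defines it: input price plus output price per 1M tokens.
function totalCostPerMillion(p) {
  return p.inputPerMillion + p.outputPerMillion;
}

// Copy before sorting so the original array is left untouched.
const ranked = [...providers].sort(
  (a, b) => totalCostPerMillion(a) - totalCostPerMillion(b)
);

// Hypothetical helper: estimated USD cost of one request from its token counts.
function estimateRequestCost(p, inputTokens, outputTokens) {
  return (
    (inputTokens * p.inputPerMillion + outputTokens * p.outputPerMillion) /
    1_000_000
  );
}

for (const p of ranked) {
  console.log(`${p.name}: $${totalCostPerMillion(p).toFixed(2)} per 1M tokens`);
}
// prints "Modal: $0.40 per 1M tokens"
// prints "Scaleway: $0.60 per 1M tokens"
```

Because real workloads rarely have equal input and output volumes, `estimateRequestCost` may give a more useful comparison than the combined per-1M figure when the two prices differ.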
Use this model via Modal with an OpenAI-compatible SDK.
```javascript
import OpenAI from "openai";

// Point the OpenAI client at Modal's OpenAI-compatible endpoint.
const client = new OpenAI({
  baseURL: "https://api.modal.com/v1",
  apiKey: process.env.MODAL_API_KEY, // read the key from the environment
});

// Send a single-turn chat request to Llama 3.1 8B Instruct.
const response = await client.chat.completions.create({
  model: "meta-llama/Llama-3.1-8B-Instruct",
  messages: [
    { role: "user", content: "Hello!" }
  ],
});

console.log(response.choices[0].message.content);
```
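
When sending many requests, the listed rate limits translate into simple pacing numbers. The sketch below uses the limits shown for the cheapest provider (120 RPM / 400K TPM); the helper names `minIntervalMs` and `tokenBudgetPerRequest` are hypothetical, and it assumes the TPM limit counts prompt and completion tokens together, which the listing does not state.

```javascript
// Limits copied from the cheapest provider's row in the table (120 RPM / 400K TPM).
const modalLimits = { requestsPerMinute: 120, tokensPerMinute: 400_000 };

// Minimum spacing between requests that keeps a steady stream under the RPM cap.
function minIntervalMs(limits) {
  return 60_000 / limits.requestsPerMinute;
}

// Average tokens each request may use if requests are sent at the full RPM rate
// without exceeding the TPM cap (assumes prompt + completion both count).
function tokenBudgetPerRequest(limits) {
  return Math.floor(limits.tokensPerMinute / limits.requestsPerMinute);
}

console.log(minIntervalMs(modalLimits));         // 500
console.log(tokenBudgetPerRequest(modalLimits)); // 3333
```

Under these assumptions, spacing calls at least 500 ms apart and keeping each request near 3,333 tokens on average should stay within both limits; larger requests would need proportionally wider spacing.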