Meta · Llama 3 family

Llama 3.1 8B Instruct

other · Released Jul 2024

An open-weight, 8-billion-parameter model from Meta's Llama 3.1 family, tuned for instruction following. It targets general tasks, coding, and multilingual use, and its small size makes it a fit for cost-sensitive deployment and edge inference.

Capabilities

chat, completion, function-calling, code-generation

Modalities

text, code

Context Window

131K tokens
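
The context window is shared between the prompt and the generated completion, so a long prompt leaves less room for output. A minimal sketch of that budget check follows. It assumes "131K" stands for 131,072 tokens (128 * 1024), which this page does not state exactly, and `maxCompletionTokens` is a hypothetical helper, not part of any SDK. A provider may also enforce a lower cap than the model's own limit.

```typescript
// Assumption: "131K" is read as 131,072 tokens (128 * 1024).
// The page only gives the rounded figure.
const CONTEXT_WINDOW_TOKENS = 131072;

// Hypothetical helper: completion tokens still available once the
// prompt has been counted against the context window.
function maxCompletionTokens(
  promptTokens: number,
  contextWindow: number = CONTEXT_WINDOW_TOKENS,
): number {
  if (promptTokens < 0) {
    throw new RangeError("promptTokens must be non-negative");
  }
  // Clamp at zero: a prompt larger than the window leaves no room at all.
  return Math.max(0, contextWindow - promptTokens);
}

console.log(maxCompletionTokens(100000)); // 31072
console.log(maxCompletionTokens(200000)); // 0
```

The result can be used as an upper bound when choosing a `max_tokens` value for a request.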

Providers

Available from 2 providers: Modal, Scaleway

Cheapest: Modal at $0.40/1M tokens (input + output combined)

Providers (2)

Sorted by total cost (input + output per 1M tokens).

Provider	Pricing (per 1M)	Rate Limits	Regions	Health	Latency
Modal	In: $0.20, Out: $0.20	120 RPM / 400K TPM	us-east-1, us-west-2	Healthy	0ms
Scaleway	In: $0.30, Out: $0.30	60 RPM / 200K TPM	eu-west-1	Healthy	0ms
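
The arithmetic behind the table can be sketched in a few lines: the sort key is input price plus output price, a request's cost scales with its token counts, and the usable request rate is capped by whichever of RPM or TPM runs out first. The prices and limits below are copied from the table. The helper names are hypothetical, and the sketch assumes "400K" means 400,000 and that TPM counts input and output tokens together, neither of which the page confirms.

```typescript
interface Provider {
  name: string;
  inputPerMillion: number;  // USD per 1M input tokens
  outputPerMillion: number; // USD per 1M output tokens
  rpm: number;              // requests per minute
  tpm: number;              // tokens per minute (assumed: input + output)
}

// Values copied from the provider table.
const providers: Provider[] = [
  { name: "Scaleway", inputPerMillion: 0.3, outputPerMillion: 0.3, rpm: 60, tpm: 200000 },
  { name: "Modal", inputPerMillion: 0.2, outputPerMillion: 0.2, rpm: 120, tpm: 400000 },
];

// Sort key used by the table: input + output price per 1M tokens.
function totalPerMillion(p: Provider): number {
  return p.inputPerMillion + p.outputPerMillion;
}

// Hypothetical helper: estimated USD cost of a single request.
function estimateCostUsd(p: Provider, inputTokens: number, outputTokens: number): number {
  return (inputTokens * p.inputPerMillion + outputTokens * p.outputPerMillion) / 1e6;
}

// Hypothetical helper: requests per minute that stay under both limits
// when every request uses roughly `tokensPerRequest` tokens.
function sustainableRequestsPerMinute(p: Provider, tokensPerRequest: number): number {
  return Math.min(p.rpm, Math.floor(p.tpm / tokensPerRequest));
}

// Cheapest first, without mutating the original array.
const byTotalCost = [...providers].sort((a, b) => totalPerMillion(a) - totalPerMillion(b));

console.log(byTotalCost.map((p) => p.name)); // [ 'Modal', 'Scaleway' ]
console.log(estimateCostUsd(byTotalCost[0], 2000, 500).toFixed(6)); // 0.000500
console.log(sustainableRequestsPerMinute(byTotalCost[0], 5000)); // 80
```

With 5,000-token requests the token limit binds before the request limit, so the effective ceiling on Modal is 80 requests per minute rather than 120.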

Quick Start

Use this model via Modal with an OpenAI-compatible SDK.

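import OpenAI from "openai";

// Point the OpenAI SDK at the provider's endpoint instead of api.openai.com.
const client = new OpenAI({
  baseURL: "https://api.modal.com/v1",
  apiKey: process.env.MODAL_API_KEY, // read the key from the environment
});

// Top-level await requires this file to run as an ES module.
const response = await client.chat.completions.create({
  model: "meta-llama/Llama-3.1-8B-Instruct",
  messages: [
    { role: "user", content: "Hello!" }
  ],
});

// Print the assistant's reply from the first choice.
console.log(response.choices[0].message.content);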
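
Since function-calling is listed among the capabilities, the same request can carry tool definitions in the OpenAI chat-completions format. The sketch below is illustrative only: the `get_weather` tool and the `parseToolCalls` helper are hypothetical, the types are declared locally so the block runs without the SDK, and whether a given provider enables tool calls for this model is an assumption to verify against its documentation.

```typescript
// Local copy of the tool-call shape used by the OpenAI chat-completions format.
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string }; // arguments is a JSON string
}

// Hypothetical tool definition (not from this page).
const tools = [
  {
    type: "function" as const,
    function: {
      name: "get_weather",
      description: "Look up the current weather for a city.",
      parameters: {
        type: "object",
        properties: { city: { type: "string" } },
        required: ["city"],
      },
    },
  },
];

// Request body in the shape accepted by client.chat.completions.create(...).
const toolRequest = {
  model: "meta-llama/Llama-3.1-8B-Instruct",
  messages: [{ role: "user" as const, content: "What is the weather in Paris?" }],
  tools,
  tool_choice: "auto" as const,
};

// Decode any tool calls found on an assistant message.
// JSON.parse throws if the model emitted malformed arguments.
function parseToolCalls(
  message: { tool_calls?: ToolCall[] | null },
): { name: string; args: unknown }[] {
  return (message.tool_calls ?? []).map((call) => ({
    name: call.function.name,
    args: JSON.parse(call.function.arguments),
  }));
}

// Hand-written message in the same shape as response.choices[0].message.
const sampleMessage = {
  tool_calls: [
    {
      id: "call_1",
      type: "function" as const,
      function: { name: "get_weather", arguments: '{"city":"Paris"}' },
    },
  ],
};

console.log(toolRequest.tools.length); // 1
console.log(parseToolCalls(sampleMessage)); // [ { name: 'get_weather', args: { city: 'Paris' } } ]
```

In a live call, `parseToolCalls(response.choices[0].message)` would return the decoded calls, or an empty array when the model answers in plain text.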
