Google's medium-sized open-weight model from the Gemma 4 family, with 12 billion parameters. Encoder-free unified multimodal architecture that natively processes text, image, audio, and video inputs without dedicated encoders. Features a 256K-token context window (262,144 tokens) and supports 140+ languages. First medium-sized model capable of natively ingesting audio. Suitable for local deployment on GPUs with 16GB VRAM.
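As a rough sanity check on the 16GB VRAM figure, the weight footprint can be estimated from the parameter count and the number of bits stored per parameter. The sketch below counts weights only (no KV cache, activations, or runtime overhead), and the precision levels are illustrative assumptions, not formats named on this page.

```typescript
// Rough weight-memory estimate for a 12B-parameter model.
// Assumption: weights only; KV cache, activations and runtime overhead are ignored.
const PARAMS = 12e9;

function weightGiB(params: number, bitsPerParam: number): number {
  const bytes = params * (bitsPerParam / 8);
  return bytes / 2 ** 30; // convert bytes to GiB
}

// Hypothetical precision levels, for illustration only.
const estimates = [16, 8, 4].map((bits) => {
  const gib = weightGiB(PARAMS, bits);
  return { bits, gib, fitsIn16GiB: gib < 16 };
});

for (const e of estimates) {
  console.log(`${e.bits}-bit: ~${e.gib.toFixed(1)} GiB, fits in 16 GiB: ${e.fitsIn16GiB}`);
}
```

Under these assumptions, 16-bit weights alone come to roughly 22 GiB, so the 16GB figure presumably refers to a quantized build; the page does not say which.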
Context window: 262K tokens
Providers: 3 available (Google AI Studio, Hugging Face Inference, NVIDIA NIM)
Cheapest: Google AI Studio, $0.00/1M tokens
Sorted by total cost (input + output per 1M tokens).
| Provider | Pricing (per 1M) | Rate Limits | Regions | Health | Latency |
|---|---|---|---|---|---|
| | In: Free, Out: Free | 15 RPM / 500K TPM | us-east-1, eu-west-1, global | Healthy | 0ms |
| | In: Free, Out: Free | 200 RPM / 500K TPM | us-east-1, us-west-2 | Healthy | 0ms |
| | In: $0.10, Out: $0.10 | 300 RPM / 500K TPM | us-east-1, eu-west-1 | Healthy | 0ms |
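The ordering rule stated above the table (total cost = input price + output price per 1M tokens) can be sketched as follows. The `ProviderRow` type and the `row N` labels are hypothetical placeholders, since the table rows carry no provider names; the prices are the ones listed.

```typescript
// Each entry mirrors one row of the provider table; prices are USD per 1M tokens.
// Assumption: labels are placeholders, because the rows do not name their provider.
interface ProviderRow {
  label: string;
  inputPer1M: number;
  outputPer1M: number;
}

const rows: ProviderRow[] = [
  { label: "row 3", inputPer1M: 0.10, outputPer1M: 0.10 },
  { label: "row 1", inputPer1M: 0, outputPer1M: 0 },
  { label: "row 2", inputPer1M: 0, outputPer1M: 0 },
];

// Total cost as the page defines it: input + output per 1M tokens.
const totalCost = (r: ProviderRow): number => r.inputPer1M + r.outputPer1M;

// Copy before sorting so the original array is left untouched.
const sorted = [...rows].sort((a, b) => totalCost(a) - totalCost(b));

console.log(sorted.map((r) => `${r.label}: $${totalCost(r).toFixed(2)}`));
```

Both free rows tie at a total of $0.00, so their relative order depends on the sort being stable rather than on price.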
Use this model via Google AI Studio with an OpenAI-compatible SDK.
```javascript
import OpenAI from "openai";

// Point the OpenAI SDK at the Google AI Studio endpoint.
const client = new OpenAI({
  baseURL: "https://api.google-ai-studio.com/v1",
  apiKey: process.env.GOOGLE_AI_STUDIO_API_KEY, // read the key from the environment
});

// Send a single-turn chat request to the model.
const response = await client.chat.completions.create({
  model: "gemma-4-12b-it",
  messages: [
    { role: "user", content: "Hello!" }
  ],
});

// Print the text of the first completion choice.
console.log(response.choices[0].message.content);
```
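The provider table lists per-minute request caps (15, 200, and 300 RPM). A client that wants to stay under such a cap can space its calls out. The following is a minimal sketch of that idea; `minIntervalMs` and `runPaced` are hypothetical helpers, not part of the OpenAI SDK or any provider API, and the sketch does not account for the token-per-minute limit.

```typescript
// Minimum spacing between requests that keeps a client at or under an RPM cap.
function minIntervalMs(rpm: number): number {
  return Math.ceil(60000 / rpm);
}

// Hypothetical helper: runs tasks sequentially and waits the full interval
// after each task finishes before starting the next (a conservative choice).
async function runPaced<T>(
  tasks: Array<() => Promise<T>>,
  rpm: number,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T[]> {
  const gap = minIntervalMs(rpm);
  const results: T[] = [];
  for (let i = 0; i < tasks.length; i++) {
    if (i > 0) await sleep(gap);
    results.push(await tasks[i]());
  }
  return results;
}
```

With the 15 RPM limit, `minIntervalMs(15)` gives 4000 ms between requests. Each task passed to `runPaced` could wrap one `client.chat.completions.create` call.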