Flagship open-weight coding model from Z.ai (formerly Zhipu AI), with a 1M-token context window. It uses a Mixture-of-Experts architecture with 753B total parameters, of which ~40B are active per token, and offers two reasoning modes for balancing cost. It tops several coding benchmarks at a fraction of the cost of comparable proprietary frontier models. The weights are MIT-licensed.
Context window: 1.0M tokens
Providers: 2 available (OpenRouter, Zhipu AI)
Cheapest: OpenRouter at $2.60 per 1M tokens
Providers are sorted by total cost (input + output price per 1M tokens).
| Provider | Pricing (per 1M tokens) | Rate Limits | Regions | Health | Latency |
|---|---|---|---|---|---|
|  | In: $0.60 / Out: $2.00 | 200 RPM / 300K TPM | us-east-1 | Healthy | 0ms |
|  | In: $0.60 / Out: $2.00 | 300 RPM / 500K TPM | us-east-1, global | Healthy | 0ms |
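
To make the pricing concrete, here is a small sketch that estimates the cost of a single request from the per-1M-token rates listed in the table. The `Pricing` type and the `estimateCostUSD` helper are illustrative names, not part of any SDK, and the token counts in the example are made up.

```typescript
// Per-1M-token prices in USD, copied from the provider table.
interface Pricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

const LISTED_PRICING: Pricing = { inputPerMillion: 0.6, outputPerMillion: 2.0 };

// Hypothetical helper: estimate the dollar cost of one request.
function estimateCostUSD(
  inputTokens: number,
  outputTokens: number,
  pricing: Pricing = LISTED_PRICING,
): number {
  const inputCost = (inputTokens / 1e6) * pricing.inputPerMillion;
  const outputCost = (outputTokens / 1e6) * pricing.outputPerMillion;
  return inputCost + outputCost;
}

// Example (assumed workload): 200K input tokens and 4K output tokens.
console.log(estimateCostUSD(200000, 4000).toFixed(4)); // prints "0.1280"
```

The $2.60 headline figure corresponds to the special case of exactly 1M input tokens plus 1M output tokens; real workloads that are input-heavy will land much closer to the $0.60 input rate.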
Use this model via OpenRouter with an OpenAI-compatible SDK.
    import OpenAI from "openai";

    // Point the OpenAI SDK at OpenRouter's OpenAI-compatible endpoint.
    const client = new OpenAI({
      baseURL: "https://openrouter.ai/api/v1",
      apiKey: process.env.OPENROUTER_API_KEY,
    });

    // Send a single-turn chat completion request to the model.
    const response = await client.chat.completions.create({
      model: "z-ai/glm-5.2",
      messages: [
        { role: "user", content: "Hello!" }
      ],
    });

    // Print the text of the first returned choice.
    console.log(response.choices[0].message.content);
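
Because the endpoint is OpenAI-compatible, the same request can presumably be sent without the SDK as a plain HTTP POST. The sketch below assumes the `/chat/completions` path under the base URL used in the SDK example and Bearer-token authentication. `buildChatRequest` and `ChatMessage` are hypothetical names introduced for illustration, the API key is a placeholder, and the network call itself is left as a comment.

```typescript
// Minimal message shape used by OpenAI-compatible chat endpoints.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical helper: build the URL and fetch options for one chat completion.
function buildChatRequest(
  apiKey: string,
  messages: ChatMessage[],
  model: string = "z-ai/glm-5.2",
) {
  return {
    // Assumed path: the SDK's baseURL plus "/chat/completions".
    url: "https://openrouter.ai/api/v1/chat/completions",
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Build (but do not send) a request with a placeholder key.
const demoRequest = buildChatRequest("sk-placeholder", [
  { role: "user", content: "Hello!" },
]);
console.log(demoRequest.url);

// To send it: const res = await fetch(demoRequest.url, demoRequest.init);
//             const data = await res.json();
```

Keeping request construction separate from the network call makes the payload easy to inspect or log before any tokens are billed.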