Back to providers
Cerebras
HealthyTelemetry updated 38m agoUltra-fast inference provider powered by the Cerebras Wafer Scale Engine, known for extremely high tokens/sec throughput. Offers OpenAI-compatible API with free tier access.
API Base URL
https://api.cerebras.ai/v1
Authentication
api-key
Uptime (24h)
100.0%
Uptime (7d)
100.0%
Supported Regions
us-east-1
Latency (TTFT)
Time to first token percentiles
No latency data available
Health History
Uptime over the last 7 days
7-Day Uptime100.00% — Excellent
24-Hour Uptime100.00% — Excellent
Current Status
healthy
Last Checked
38m ago
Supported Models (3)
Models available through this provider. Click a model to view details.
| Model | Pricing (per 1M) | Rate Limits | Regions |
|---|---|---|---|
K2 Think k2-think | In: $0.60 Out: $0.60 | 30 RPM / 60K TPM | us-east-1 |
Llama 3.3 70B Instruct llama-3-3-70b | In: $0.60 Out: $0.60 | 30 RPM / 60K TPM | us-east-1 |
Llama 4 Scout llama-4-scout | In: $0.60 Out: $0.60 | 30 RPM / 60K TPM | us-east-1 |