Back to providers
Groq
HealthyTelemetry updated 38m agoUltra-fast LPU (Language Processing Unit) inference provider offering extremely low latency. Supports streaming, function calling, and audio transcription via Whisper models. Per-model rate limits apply.
API Base URL
https://api.groq.com/openai/v1
Authentication
bearer
Uptime (24h)
100.0%
Uptime (7d)
100.0%
Supported Regions
us-east-1eu-west-1
Latency (TTFT)
Time to first token percentiles
Avg TTFT80ms
P5080ms
P9581ms
P9981ms
Avg Total Time
80ms
Avg TTFT
80ms
Health History
Uptime over the last 7 days
7-Day Uptime100.00% — Excellent
24-Hour Uptime100.00% — Excellent
Current Status
healthy
Last Checked
38m ago
Supported Models (4)
Models available through this provider. Click a model to view details.
| Model | Pricing (per 1M) | Rate Limits | Regions |
|---|---|---|---|
DeepSeek R1 deepseek-r1 | In: $0.75 Out: $0.99 | 30 RPM / 100K TPM | us-east-1 |
Llama 3.3 70B Instruct llama-3-3-70b | In: $0.59 Out: $0.79 | 30 RPM / 100K TPM | us-east-1eu-west-1 |
Llama 4 Scout llama-4-scout | In: $0.11 Out: $0.34 | 30 RPM / 100K TPM | us-east-1eu-west-1 |
Qwen3 32B qwen3-32b | In: $0.29 Out: $0.39 | 30 RPM / 100K TPM | us-east-1 |