Groq

HealthyTelemetry updated 49m ago

Ultra-fast LPU (Language Processing Unit) inference provider offering extremely low latency. Supports streaming, function calling, and audio transcription via Whisper models. Per-model rate limits apply.

API Base URL

https://api.groq.com/openai/v1

Authentication

bearer

Uptime (24h)

100.0%

Uptime (7d)

100.0%

Supported Regions

us-east-1eu-west-1

Latency (TTFT)

Time to first token percentiles

Avg TTFT165ms

P50165ms

P95169ms

P99170ms

Avg Total Time

165ms

Avg TTFT

165ms

Health History

Uptime over the last 7 days

7-Day Uptime100.00% — Excellent

24-Hour Uptime100.00% — Excellent

Current Status

healthy

Last Checked

49m ago

Supported Models (4)

Models available through this provider. Click a model to view details.

Model	Pricing (per 1M)	Rate Limits	Regions
DeepSeek R1 deepseek-r1	In: $0.75 Out: $0.99	30 RPM / 100K TPM	us-east-1
Llama 3.3 70B Instruct llama-3-3-70b	In: $0.59 Out: $0.79	30 RPM / 100K TPM	us-east-1eu-west-1
Llama 4 Scout llama-4-scout	In: $0.11 Out: $0.34	30 RPM / 100K TPM	us-east-1eu-west-1
Qwen3 32B qwen3-32b	In: $0.29 Out: $0.39	30 RPM / 100K TPM	us-east-1