Groq

Healthy · Telemetry updated 38m ago

Ultra-fast LPU (Language Processing Unit) inference provider offering extremely low latency. Supports streaming, function calling, and audio transcription via Whisper models. Per-model rate limits apply.
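Because per-model rate limits apply, a client may want to throttle itself before sending requests. The following is a minimal sketch of a sliding-window guard, assuming limits are expressed as requests per minute (RPM) and tokens per minute (TPM), as in the model listing on this page. The class name `SlidingWindowLimiter` and its methods are hypothetical, and the provider's actual server-side accounting may differ.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical client-side guard for an RPM and TPM budget."""

    def __init__(self, rpm, tpm, clock=time.monotonic):
        self.rpm = rpm
        self.tpm = tpm
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.events = deque()  # (timestamp, tokens) for requests in the last 60 s
        self.tokens_in_window = 0

    def _evict_expired(self, now):
        # Drop requests that are 60 seconds old or older.
        while self.events and now - self.events[0][0] >= 60.0:
            _, tokens = self.events.popleft()
            self.tokens_in_window -= tokens

    def try_acquire(self, tokens):
        """Return True and record the request if it fits both budgets."""
        now = self.clock()
        self._evict_expired(now)
        if len(self.events) >= self.rpm:
            return False
        if self.tokens_in_window + tokens > self.tpm:
            return False
        self.events.append((now, tokens))
        self.tokens_in_window += tokens
        return True
```

For example, `SlidingWindowLimiter(rpm=30, tpm=100_000)` would mirror the "30 RPM / 100K TPM" figure shown for the models here; a caller checks `try_acquire(estimated_tokens)` and waits or retries later when it returns `False`.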

API Base URL

https://api.groq.com/openai/v1

Authentication

Bearer token (sent in the Authorization header)
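As an illustration of how the base URL and bearer authentication fit together, here is a minimal sketch that builds (but does not send) a chat request using only the standard library. The `/chat/completions` path and the request body shape are assumptions based on the OpenAI-style base URL; `build_chat_request` is a hypothetical helper, and the model ID passed in is this catalog's identifier, which may not match the name the provider's API expects.

```python
import json
import urllib.request

BASE_URL = "https://api.groq.com/openai/v1"


def build_chat_request(api_key, model, messages, stream=False):
    """Build an authenticated POST request; the caller decides when to send it."""
    payload = {"model": model, "messages": messages, "stream": stream}
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",  # assumed OpenAI-style path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # bearer authentication
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call; keeping construction separate makes the headers and URL easy to inspect first.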

Uptime (24h)

100.0%

Uptime (7d)

100.0%

Supported Regions

us-east-1, eu-west-1

Latency (TTFT)

Time to first token percentiles

Avg TTFT: 80 ms
P50: 80 ms
P95: 81 ms
P99: 81 ms
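For readers unfamiliar with percentile figures, the sketch below shows one common way to derive P50, P95, and P99 from raw time-to-first-token samples, using the nearest-rank method. This page does not say which method its telemetry uses, so treat the choice as an assumption; the `percentile` helper is hypothetical.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]
```

With only a handful of samples, P95 and P99 collapse to the same (largest) value, which is one reason tightly clustered latencies can report identical upper percentiles.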

Avg Total Time

80ms

Health History

Uptime over the last 7 days

7-Day Uptime: 100.00% (Excellent)
24-Hour Uptime: 100.00% (Excellent)

Current Status

healthy

Last Checked

38m ago

Supported Models (4)

Models available through this provider.

Model                   Model ID       Input (per 1M tokens)  Output (per 1M tokens)  Rate Limits        Regions
DeepSeek R1             deepseek-r1    $0.75                  $0.99                   30 RPM / 100K TPM  us-east-1
Llama 3.3 70B Instruct  llama-3-3-70b  $0.59                  $0.79                   30 RPM / 100K TPM  us-east-1, eu-west-1
Llama 4 Scout           llama-4-scout  $0.11                  $0.34                   30 RPM / 100K TPM  us-east-1, eu-west-1
Qwen3 32B               qwen3-32b      $0.29                  $0.39                   30 RPM / 100K TPM  us-east-1
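
To make the pricing figures concrete, here is a small sketch that estimates the cost of a single request from the listed input and output prices. It assumes "per 1M" means per one million tokens and that billing is linear in token count; `PRICING` and `estimate_cost` are hypothetical names, and the prices are copied from the listing above.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the Supported Models listing.
PRICING = {
    "deepseek-r1": (Decimal("0.75"), Decimal("0.99")),
    "llama-3-3-70b": (Decimal("0.59"), Decimal("0.79")),
    "llama-4-scout": (Decimal("0.11"), Decimal("0.34")),
    "qwen3-32b": (Decimal("0.29"), Decimal("0.39")),
}


def estimate_cost(model_id, input_tokens, output_tokens):
    """Return the estimated USD cost of one request as a Decimal."""
    price_in, price_out = PRICING[model_id]
    total = price_in * input_tokens + price_out * output_tokens
    return total / Decimal(1_000_000)
```

`Decimal` is used instead of floats so that small per-request amounts add up without binary rounding drift when summed over many calls.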