Hugging Face Inference

Unhealthy (telemetry updated 38m ago)

Inference API and dedicated endpoints for open-source models hosted on the Hugging Face Hub. Offers serverless inference for popular models and dedicated GPU endpoints for production workloads.

API Base URL

https://api-inference.huggingface.co

Authentication

Bearer token (sent in the Authorization header)
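
As a rough illustration of how the base URL and bearer authentication fit together, the sketch below builds (but does not send) an authenticated request using only the Python standard library. The `/models/<model_path>` route, the `{"inputs": ...}` payload shape, the `HF_TOKEN` environment variable, and the helper name `build_request` are assumptions made for this example and are not stated on this page; the model ID shown in this listing may also differ from the path the API expects.

```python
import json
import os
import urllib.request

API_BASE_URL = "https://api-inference.huggingface.co"  # base URL from this listing


def build_request(model_path: str, prompt: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that uses bearer authentication."""
    # ASSUMPTION: the "/models/<model_path>" route and the {"inputs": ...}
    # payload are illustrative; confirm both against the provider's API docs.
    url = f"{API_BASE_URL}/models/{model_path}"
    body = json.dumps({"inputs": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # Bearer scheme: the token goes in the Authorization header.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # HF_TOKEN is a hypothetical environment variable name for this sketch.
    token = os.environ.get("HF_TOKEN", "<token>")
    req = build_request("deepseek-v4", "Hello", token)
    print(req.get_method(), req.full_url)
    # Sending is deliberately left out; with a real token it would be:
    # urllib.request.urlopen(req, timeout=30)
```

Keeping request construction separate from sending makes the header logic easy to check offline, which is useful while the provider is reported as unhealthy.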

Uptime (24h)

0.0%

Uptime (7d)

36.3%

Supported Regions

us-east-1, eu-west-1

Latency (TTFT)

Time to first token percentiles

No latency data available

Health History

Uptime over the last 7 days

7-Day Uptime: 36.31% (Critical)
24-Hour Uptime: 0.00% (Critical)
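
To put the uptime percentages above in concrete terms, the following sketch converts an uptime percentage into hours of downtime over a window. It assumes the percentage is time-weighted (the share of the window during which the provider was reachable), which this page does not confirm; if it is instead the share of successful health checks, the result is only an approximation. The function name `downtime_hours` is made up for this example.

```python
def downtime_hours(uptime_percent: float, window_hours: float) -> float:
    """Hours of downtime implied by an uptime percentage over a window.

    ASSUMPTION: uptime_percent is time-weighted over the whole window.
    """
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    if window_hours < 0:
        raise ValueError("window_hours must be non-negative")
    return window_hours * (1.0 - uptime_percent / 100.0)


if __name__ == "__main__":
    # 7 days = 168 hours at 36.31% uptime.
    print(f"{downtime_hours(36.31, 7 * 24):.1f} h down over 7 days")  # 107.0 h
    # 24 hours at 0.00% uptime.
    print(f"{downtime_hours(0.0, 24):.1f} h down over 24 hours")  # 24.0 h
```

Under that assumption, the 7-day figure corresponds to roughly 107 of 168 hours unavailable, and the 24-hour figure to the full 24 hours.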

Current Status

unhealthy

Last Checked

38m ago

Supported Models (3)

Models available through this provider.

Model             ID                Input (USD per 1M tokens)  Output (USD per 1M tokens)  Rate Limits         Regions
DeepSeek V4       deepseek-v4       $0.18                      $0.36                       300 RPM / 500K TPM  us-east-1, eu-west-1
Llama 4 Maverick  llama-4-maverick  $0.30                      $0.30                       300 RPM / 500K TPM  us-east-1, eu-west-1
Qwen3 32B         qwen3-32b         $0.18                      $0.18                       300 RPM / 500K TPM  us-east-1, eu-west-1
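
The pricing and rate limits listed above can be turned into quick estimates. The sketch below computes the cost of a single request and the number of requests per minute that fit under both limits. It assumes that "per 1M" means per one million tokens, that RPM and TPM mean requests and tokens per minute, and that TPM counts input and output tokens together; none of this is spelled out on the page. The names `ModelListing`, `request_cost_usd`, and `max_requests_per_minute` are hypothetical helpers for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelListing:
    input_usd_per_1m: float   # price per 1M input tokens (assumed unit)
    output_usd_per_1m: float  # price per 1M output tokens (assumed unit)
    rpm: int                  # requests per minute
    tpm: int                  # tokens per minute (assumed input + output)


# Values copied from the model listing on this page.
LISTINGS = {
    "deepseek-v4": ModelListing(0.18, 0.36, 300, 500_000),
    "llama-4-maverick": ModelListing(0.30, 0.30, 300, 500_000),
    "qwen3-32b": ModelListing(0.18, 0.18, 300, 500_000),
}


def request_cost_usd(model_id: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD of one request at the listed per-1M prices."""
    m = LISTINGS[model_id]
    total = input_tokens * m.input_usd_per_1m + output_tokens * m.output_usd_per_1m
    return total / 1_000_000


def max_requests_per_minute(model_id: str, tokens_per_request: int) -> int:
    """Requests per minute that fit under both the RPM and the TPM limit."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    m = LISTINGS[model_id]
    return min(m.rpm, m.tpm // tokens_per_request)


if __name__ == "__main__":
    # 2,000 input tokens and 500 output tokens on deepseek-v4.
    print(f"${request_cost_usd('deepseek-v4', 2000, 500):.5f}")  # $0.00054
    # At 2,500 tokens per request the TPM limit binds before the RPM limit.
    print(max_requests_per_minute("qwen3-32b", 2500))  # 200
```

With small requests the 300 RPM limit is the binding one; once a request averages more than about 1,666 tokens, the 500K TPM limit becomes the tighter constraint.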