Hugging Face Inference

Healthy

Telemetry updated 19m ago

Inference API and dedicated endpoints for open-source models hosted on the Hugging Face Hub. Offers serverless inference for popular models and dedicated GPU endpoints for production workloads.

Authentication

bearer
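
Bearer authentication usually means sending the token in an `Authorization` header on every request. The following is a minimal sketch of that convention only; the helper name `build_auth_headers` and the `HF_TOKEN` environment variable are assumptions for illustration, and this page does not specify an endpoint URL or token source.

```python
import os


def build_auth_headers(token: str) -> dict[str, str]:
    """Return HTTP headers that carry a bearer token."""
    if not token:
        raise ValueError("a non-empty token is required")
    return {"Authorization": f"Bearer {token}"}


if __name__ == "__main__":
    # HF_TOKEN is an assumed variable name; substitute your own secret source.
    token = os.environ.get("HF_TOKEN", "")
    if token:
        # Print only the header names so the secret never reaches the console.
        print(sorted(build_auth_headers(token)))
```

The resulting dictionary can be passed as the headers argument of whichever HTTP client is in use.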

Uptime (24h)

100.0%

Uptime (7d)

100.0%

Supported Regions

us-east-1, eu-west-1

Latency (TTFT)

Time to first token percentiles

No latency data available

Health History

Uptime over the last 7 days

7-Day Uptime: 100.00% (Excellent)
24-Hour Uptime: 100.00% (Excellent)

Current Status

healthy

Last Checked

19m ago

Supported Models (11)

Models available through this provider.

Model                  ID                     In (per 1M)  Out (per 1M)  Rate Limits         Regions
AlemLLM                alemllm                $0.00        $0.00         60 RPM / 300K TPM   us-east-1, eu-west-1
Alloma 8B Instruct     alloma-8b              $0.10        $0.10         60 RPM / 200K TPM   us-east-1
DeepSeek V4            deepseek-v4            $0.18        $0.36         300 RPM / 500K TPM  us-east-1, eu-west-1
Gemma 4 12B            gemma-4-12b            $0.10        $0.10         300 RPM / 500K TPM  us-east-1, eu-west-1
ISSAI KazLLM 1.0 70B   kazllm-1-0-70b         $0.00        $0.00         60 RPM / 300K TPM   us-east-1, eu-west-1
Kumru 7B               kumru-7b               $0.00        $0.00         60 RPM / 300K TPM   us-east-1, eu-west-1
Llama 4 Maverick       llama-4-maverick       $0.30        $0.30         300 RPM / 500K TPM  us-east-1, eu-west-1
Qwen3 32B              qwen3-32b              $0.18        $0.18         300 RPM / 500K TPM  us-east-1, eu-west-1
StableLM 2 12B         stablelm-2-12b         $0.10        $0.10         60 RPM / 200K TPM   us-east-1
Trendyol LLM 8B T1     trendyol-llm-8b        $0.00        $0.00         60 RPM / 300K TPM   us-east-1, eu-west-1
WiroAI Turkish LLM 9B  wiroai-turkish-llm-9b  $0.00        $0.00         60 RPM / 300K TPM   us-east-1, eu-west-1
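
The pricing and rate-limit columns in the model listing above can be combined into a quick budget check. The sketch below is illustrative only: it assumes "per 1M" means per one million tokens, that the TPM limit counts all tokens in a request, and it copies figures for just three of the listed models. The function names `estimate_cost` and `fits_rate_limits` are hypothetical.

```python
from decimal import Decimal

# (input price, output price) in USD per 1M tokens, copied from the listing.
PRICING_PER_1M = {
    "deepseek-v4": (Decimal("0.18"), Decimal("0.36")),
    "llama-4-maverick": (Decimal("0.30"), Decimal("0.30")),
    "alemllm": (Decimal("0.00"), Decimal("0.00")),
}

# (requests per minute, tokens per minute) from the same listing.
RATE_LIMITS = {
    "deepseek-v4": (300, 500_000),
    "llama-4-maverick": (300, 500_000),
    "alemllm": (60, 300_000),
}

ONE_MILLION = Decimal(1_000_000)


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request from per-1M-token prices."""
    price_in, price_out = PRICING_PER_1M[model]
    return (price_in * input_tokens + price_out * output_tokens) / ONE_MILLION


def fits_rate_limits(model: str, requests_per_minute: int, tokens_per_request: int) -> bool:
    """Check a steady workload against both the RPM and TPM limits."""
    rpm, tpm = RATE_LIMITS[model]
    within_rpm = requests_per_minute <= rpm
    # Assumption: every request consumes the same number of tokens.
    within_tpm = requests_per_minute * tokens_per_request <= tpm
    return within_rpm and within_tpm


if __name__ == "__main__":
    print(estimate_cost("deepseek-v4", 2_000, 500))
    print(fits_rate_limits("alemllm", 60, 5_000))
```

`Decimal` is used instead of floats so that small per-request amounts are not distorted by binary rounding. Note that a workload can stay under the RPM limit and still exceed TPM when individual requests are large.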