Providers

Browse 42 inference providers and their offerings

Alibaba Model Studio

Active

Alibaba Cloud AI platform providing access to Qwen family models and other large language models through DashScope API, offering OpenAI-compatible endpoints with multi-region availability.

https://dashscope-intl.aliyuncs.com/compatible-mode/v1
api-key
us-east-1eu-west-1
openai compatible

Amazon Bedrock

Active

Fully managed AWS service offering foundation models from leading AI companies including Anthropic, Meta, and Mistral through a unified API. Supports OpenAI-compatible endpoints via Mantle inference engine with cross-region inference capabilities.

https://bedrock-runtime.us-east-1.amazonaws.com
bearer
us-east-1us-west-2eu-west-1ap-southeast-1
openai compatible

Anthropic

Active

AI safety company providing API access to the Claude family of models, known for helpfulness, harmlessness, and honesty with strong reasoning and analysis capabilities.

https://api.anthropic.com/v1
api-key
us-east-1eu-west-1
anthropic compatible

Anyscale

Active

Serverless inference platform built on Ray, offering high-throughput access to popular open-weight models. OpenAI-compatible API with competitive pricing and enterprise-grade scalability.

https://api.endpoints.anyscale.com/v1
bearer
us-east-1us-west-2
openai compatible

Azure AI

Active

Microsoft's cloud AI platform providing access to OpenAI models, open-source models, and enterprise AI services through Azure AI Studio. Offers global deployment with enterprise-grade security and compliance.

https://models.inference.ai.azure.com
api-key
us-east-1us-west-2eu-west-1ap-southeast-1
openai compatible

Baseten

Active

ML infrastructure platform for deploying and serving machine learning models at scale, offering managed GPU inference with auto-scaling and OpenAI-compatible API endpoints for popular LLMs.

https://bridge.baseten.co/v1
api-key
us-east-1us-west-2
openai compatible

Cerebras

Active

Ultra-fast inference provider powered by the Cerebras Wafer Scale Engine, known for extremely high tokens/sec throughput. Offers OpenAI-compatible API with free tier access.

https://api.cerebras.ai/v1
api-key
us-east-1
openai compatible

Cloudflare Workers AI

Active

Serverless AI inference platform running on Cloudflare's global edge network. Offers 10,000 neurons/day free allocation with OpenAI-compatible API and wide model selection including vision models.

https://api.cloudflare.com/client/v4/accounts
api-key
us-east-1eu-west-1global
openai compatible

Cohere

Active

Enterprise AI platform specializing in language models for business applications including RAG, search, and text generation. Offers Command and Embed model families with strong multilingual support.

https://api.cohere.com/v2
bearer
us-east-1eu-west-1global
cohere compatible

Deep Infra

Active

Serverless inference platform offering fast and cost-effective access to popular open-weight models. OpenAI-compatible API with pay-per-token pricing and no minimum commitments.

https://api.deepinfra.com/v1/openai
bearer
us-east-1eu-west-1
openai compatible

DeepSeek

Active

Chinese AI research company providing direct API access to their DeepSeek family of models with competitive pricing and strong performance on coding and reasoning tasks.

https://api.deepseek.com/v1
bearer
us-east-1global
openai compatible

Fireworks

Active

High-speed AI inference platform optimized for low-latency serving of open-source models, offering OpenAI-compatible API endpoints with custom model deployment and fine-tuning capabilities.

https://api.fireworks.ai/inference/v1
api-key
us-east-1us-west-2
openai compatible

Google AI Studio

Active

Google's developer platform for accessing Gemini and Gemma models via OpenAI-compatible API. Free tier available with generous rate limits. Data may be used for training outside EU/EEA/UK/CH regions.

https://generativelanguage.googleapis.com/v1beta/openai
api-key
us-east-1eu-west-1global
openai compatible

Google Cloud Vertex AI

Active

Google Cloud's enterprise AI platform providing access to Gemini models with enterprise-grade security, compliance, and global infrastructure. Supports streaming, function calling, and multimodal inputs.

https://generativelanguage.googleapis.com/v1
oauth2
us-east-1us-west-2eu-west-1global
google compatible

Groq

Active

Ultra-fast LPU (Language Processing Unit) inference provider offering extremely low latency. Supports streaming, function calling, and audio transcription via Whisper models. Per-model rate limits apply.

https://api.groq.com/openai/v1
bearer
us-east-1eu-west-1
openai compatible

Hugging Face Inference

Active

Inference API and dedicated endpoints for open-source models hosted on the Hugging Face Hub. Offers serverless inference for popular models and dedicated GPU endpoints for production workloads.

https://api-inference.huggingface.co
bearer
us-east-1eu-west-1
openai compatible

Hyperbolic

Active

Decentralized AI compute platform offering affordable GPU inference for open-source models, providing OpenAI-compatible API endpoints with competitive pricing and global availability.

https://api.hyperbolic.xyz/v1
api-key
us-west-2
openai compatible

IBM watsonx.ai

Active

IBM's enterprise AI platform providing access to Granite foundation models and third-party models via a REST API. Supports text generation, embeddings, and fine-tuning with regional deployments across IBM Cloud.

https://us-south.ml.cloud.ibm.com/ml/v1
bearer
us-south-1eu-de-1eu-gb-1
custom compatible

Inference.net

Active

Distributed AI inference network providing affordable access to open-source language models through a decentralized GPU marketplace, offering OpenAI-compatible API endpoints with competitive per-token pricing.

https://api.inference.net/v1
api-key
us-east-1eu-west-1
openai compatible

Meta AI

Active

Meta's AI research division providing the open-weight Llama family of models. Models are available through various inference providers and can be self-hosted.

https://www.meta.ai/api
none
global
custom compatible

MiniMax

Active

Chinese AI company providing large language models with strong multilingual and multimodal capabilities. Known for competitive pricing and high-quality text generation.

https://api.minimax.chat/v1
api-key
us-east-1global
openai compatible

Mistral AI

Active

European AI company providing high-performance language models with strong multilingual capabilities. Offers both open-weight and proprietary models through an OpenAI-compatible API.

https://api.mistral.ai/v1
bearer
eu-west-1us-east-1global
openai compatible

Modal

Active

Serverless cloud platform for running AI workloads with on-demand GPU access, offering custom model deployment and OpenAI-compatible inference endpoints with automatic scaling and pay-per-second pricing.

https://api.modal.com/v1
bearer
us-east-1us-west-2
openai compatible

Moonshot AI

Active

Chinese AI company behind the Kimi series of models, known for ultra-long context windows and strong reasoning capabilities. Offers OpenAI-compatible API access.

https://api.moonshot.cn/v1
api-key
us-east-1global
openai compatible

Nebius

Active

Cloud AI platform providing scalable GPU infrastructure and managed inference services for large language models, with data centers in Europe and competitive pricing for open-source model hosting.

https://api.studio.nebius.ai/v1
api-key
eu-west-1
openai compatible

NLP Cloud

Active

Production-ready AI inference API offering managed deployment of open-source and proprietary language models with dedicated GPU instances, providing high availability and data privacy compliance.

https://api.nlpcloud.io/v1
bearer
us-east-1eu-west-1
custom compatible

Novita

Active

AI model inference platform providing affordable access to open-source language models with OpenAI-compatible API endpoints, offering pay-per-token pricing and global availability.

https://api.novita.ai/v3/openai
api-key
us-east-1
openai compatible

NVIDIA NIM

Active

NVIDIA's inference microservice platform providing optimized deployment of LLMs on GPU infrastructure. Offers free endpoints for select models and partner endpoints through Deep Infra, Together AI, Bitdeer, GMI Cloud, and CoreWeave.

https://integrate.api.nvidia.com/v1
api-key
us-east-1us-west-2global
openai compatible

OpenAI

Active

Leading AI research company providing API access to GPT-4, GPT-3.5, DALL-E, and other foundation models through a developer-friendly REST API with global availability.

https://api.openai.com/v1
bearer
us-east-1eu-west-1
openai compatible

OpenRouter

Active

Unified API gateway providing access to hundreds of models from multiple providers through a single OpenAI-compatible endpoint. Free models available with shared quota, up to 1000 requests/day.

https://openrouter.ai/api/v1
api-key
us-east-1
openai compatible

Perplexity

Active

AI-powered answer engine offering API access to proprietary and open-source models with built-in web search grounding. Specializes in providing accurate, cited responses with real-time information access.

https://api.perplexity.ai
bearer
us-east-1us-west-2
openai compatible

Replicate

Active

Cloud platform for running open-source AI models with a simple API. Hosts over 1000 community models with serverless GPU inference, pay-per-second pricing, and no infrastructure management required.

https://api.replicate.com/v1
bearer
us-east-1us-west-2
custom compatible

SambaNova

Active

AI hardware and software platform offering high-performance inference services powered by custom DataScale systems, providing OpenAI-compatible API endpoints for open-source models with industry-leading throughput.

https://api.sambanova.ai/v1
api-key
us-west-2
openai compatible

Sber

Active

Russian banking and technology conglomerate providing access to GigaChat family of language models through a dedicated API. Offers models ranging from compact Lightning to flagship Ultra with strong multilingual and reasoning capabilities.

https://gigachat.devices.sberbank.ru/api/v1
oauth2
ru-central-1
custom compatible

Scaleway

Active

European cloud provider offering managed AI inference endpoints with GPU instances across European data centers, providing OpenAI-compatible API access to popular open-source models.

https://api.scaleway.ai/v1
api-key
eu-west-1
openai compatible

SiliconFlow

Active

High-performance AI inference platform offering ultra-low latency and cost-effective access to open-source models. Supports models with up to 1M token context windows and OpenAI-compatible API endpoints.

https://api.siliconflow.cn/v1
bearer
ap-east-1us-west-2global
openai compatible

Together AI

Active

Cloud platform for running and fine-tuning open-source AI models, offering competitive pricing and OpenAI-compatible API endpoints for popular open-weight models.

https://api.together.xyz/v1
bearer
us-east-1us-west-2
openai compatible

Upstage

Active

South Korean AI company providing enterprise-grade language models optimized for Korean, English, and Japanese. Offers the Solar model family through a direct API with competitive pricing and high throughput.

https://api.upstage.ai/v1/solar
api-key
global
openai compatible

xAI

Active

AI company founded by Elon Musk providing the Grok family of models. Known for real-time information access and strong reasoning capabilities with OpenAI-compatible API.

https://api.x.ai/v1
bearer
us-east-1global
openai compatible

Xiaomi MiMo

Active

Xiaomi's AI inference platform providing access to the MiMo family of models via an OpenAI-compatible API endpoint. Offers flagship agentic and multimodal models with competitive pricing.

https://mimo.xiaomi.com/api/v1
api-key
global
openai compatible

Yandex Cloud

Active

Russian cloud platform providing access to YandexGPT foundation models through Yandex Cloud AI Studio. Offers OpenAI-compatible API endpoints with strong Russian and English language capabilities.

https://ai.api.cloud.yandex.net/v1
api-key
ru-central-1
openai compatible

Zhipu AI

Active

Chinese AI company providing the GLM family of models with strong bilingual (Chinese/English) capabilities. Known for competitive performance on reasoning and coding benchmarks.

https://open.bigmodel.cn/api/paas/v4
api-key
us-east-1global
openai compatible