Providers

AI21 Labs

https://api.ai21.com/studio/v1

Israeli AI company specializing in enterprise LLMs with their unique hybrid SSM-Transformer Jamba architecture. Offers models with 256K context windows optimized for grounding, instruction-following, and long-context enterprise tasks.

Alibaba Model Studio

https://dashscope-intl.aliyuncs.com/compatible-mode/v1

Alibaba Cloud AI platform providing access to Qwen family models and other large language models through DashScope API, offering OpenAI-compatible endpoints with multi-region availability.

ap-east-1us-east-1eu-west-1

Amazon Bedrock

https://bedrock-runtime.us-east-1.amazonaws.com

Fully managed AWS service offering foundation models from leading AI companies including Anthropic, Meta, and Mistral through a unified API. Supports OpenAI-compatible endpoints via Mantle inference engine with cross-region inference capabilities.

us-east-1us-west-2eu-west-1ap-southeast-1

Anthropic

https://api.anthropic.com/v1

AI safety company providing API access to the Claude family of models, known for helpfulness, harmlessness, and honesty with strong reasoning and analysis capabilities.

anthropic compatible

Anyscale

https://api.endpoints.anyscale.com/v1

Serverless inference platform built on Ray, offering high-throughput access to popular open-weight models. OpenAI-compatible API with competitive pricing and enterprise-grade scalability.

Azure AI

https://models.inference.ai.azure.com

Microsoft's cloud AI platform providing access to OpenAI models, open-source models, and enterprise AI services through Azure AI Studio. Offers global deployment with enterprise-grade security and compliance.

us-east-1us-west-2eu-west-1ap-southeast-1

Baseten

https://bridge.baseten.co/v1

ML infrastructure platform for deploying and serving machine learning models at scale, offering managed GPU inference with auto-scaling and OpenAI-compatible API endpoints for popular LLMs.

Cerebras

https://api.cerebras.ai/v1

Ultra-fast inference provider powered by the Cerebras Wafer Scale Engine, known for extremely high tokens/sec throughput. Offers OpenAI-compatible API with free tier access.

us-east-1

Cloudflare Workers AI

https://api.cloudflare.com/client/v4/accounts

Serverless AI inference platform running on Cloudflare's global edge network. Offers 10,000 neurons/day free allocation with OpenAI-compatible API and wide model selection including vision models.

us-east-1eu-west-1global

Cohere

https://api.cohere.com/v2

Enterprise AI platform specializing in language models for business applications including RAG, search, and text generation. Offers Command and Embed model families with strong multilingual support.

us-east-1eu-west-1global

cohere compatible

Deep Infra

https://api.deepinfra.com/v1/openai

Serverless inference platform offering fast and cost-effective access to popular open-weight models. OpenAI-compatible API with pay-per-token pricing and no minimum commitments.

DeepSeek

https://api.deepseek.com/v1

Chinese AI research company providing direct API access to their DeepSeek family of models with competitive pricing and strong performance on coding and reasoning tasks.

Featherless

https://api.featherless.ai/v1

Serverless LLM inference platform hosting 20,000+ open-source models from Hugging Face with flat-rate subscription pricing and unlimited token usage. OpenAI-compatible API with no per-token billing — access any model up to a given size based on subscription tier. Largest Hugging Face inference provider by model count.

Fireworks

https://api.fireworks.ai/inference/v1

High-speed AI inference platform optimized for low-latency serving of open-source models, offering OpenAI-compatible API endpoints with custom model deployment and fine-tuning capabilities.

Google AI Studio

https://generativelanguage.googleapis.com/v1beta/openai

Google's developer platform for accessing Gemini and Gemma models via OpenAI-compatible API. Free tier available with generous rate limits. Data may be used for training outside EU/EEA/UK/CH regions.

us-east-1eu-west-1global

Google Cloud Vertex AI

https://generativelanguage.googleapis.com/v1

Google Cloud's enterprise AI platform providing access to Gemini models with enterprise-grade security, compliance, and global infrastructure. Supports streaming, function calling, and multimodal inputs.

oauth2

us-east-1us-west-2eu-west-1global

google compatible

Groq

https://api.groq.com/openai/v1

Ultra-fast LPU (Language Processing Unit) inference provider offering extremely low latency. Supports streaming, function calling, and audio transcription via Whisper models. Per-model rate limits apply.

Hugging Face Inference

https://router.huggingface.co/hf-inference/v1

Inference API and dedicated endpoints for open-source models hosted on the Hugging Face Hub. Offers serverless inference for popular models and dedicated GPU endpoints for production workloads.

Hyperbolic

https://api.hyperbolic.xyz/v1

Decentralized AI compute platform offering affordable GPU inference for open-source models, providing OpenAI-compatible API endpoints with competitive pricing and global availability.

us-west-2

IBM watsonx.ai

https://us-south.ml.cloud.ibm.com/ml/v1

IBM's enterprise AI platform providing access to Granite foundation models and third-party models via a REST API. Supports text generation, embeddings, and fine-tuning with regional deployments across IBM Cloud.

us-south-1eu-de-1eu-gb-1

InclusionAI

Ant Group's AI research lab focused on open-source large language models. Offers inference via ZenMux platform with OpenAI-compatible API. Develops the Ling (non-thinking) and Ring (reasoning) model families at trillion-parameter scale.

https://zenmux.ai/api/v1

cn-east-1global

Inference.net

https://api.inference.net/v1

Distributed AI inference network providing affordable access to open-source language models through a decentralized GPU marketplace, offering OpenAI-compatible API endpoints with competitive per-token pricing.

Lambda

https://api.lambda.chat/v1

GPU cloud and inference provider offering on-demand access to NVIDIA GPUs for AI training and inference. Provides both cloud instances and managed inference API for open-source LLMs with competitive pricing.

Meta AI

Meta's AI research division providing the open-weight Llama family of models. Models are available through various inference providers and can be self-hosted.

https://www.meta.ai/api

none

MiniMax

https://api.minimax.chat/v1

Chinese AI company providing large language models with strong multilingual and multimodal capabilities. Known for competitive pricing and high-quality text generation.

Mistral AI

https://api.mistral.ai/v1

European AI company providing high-performance language models with strong multilingual capabilities. Offers both open-weight and proprietary models through an OpenAI-compatible API.

eu-west-1us-east-1global

Modal

Serverless cloud platform for running AI workloads with on-demand GPU access, offering custom model deployment and OpenAI-compatible inference endpoints with automatic scaling and pay-per-second pricing.

https://api.modal.com/v1

Moonshot AI

https://api.moonshot.cn/v1

Chinese AI company behind the Kimi series of models, known for ultra-long context windows and strong reasoning capabilities. Offers OpenAI-compatible API access.

Nebius

https://api.studio.nebius.ai/v1

Cloud AI platform providing scalable GPU infrastructure and managed inference services for large language models, with data centers in Europe and competitive pricing for open-source model hosting.

eu-west-1

NLP Cloud

https://api.nlpcloud.io/v1

Production-ready AI inference API offering managed deployment of open-source and proprietary language models with dedicated GPU instances, providing high availability and data privacy compliance.

Novita

https://api.novita.ai/v3/openai

AI model inference platform providing affordable access to open-source language models with OpenAI-compatible API endpoints, offering pay-per-token pricing and global availability.

us-east-1

NVIDIA NIM

https://integrate.api.nvidia.com/v1

NVIDIA's inference microservice platform providing optimized deployment of LLMs on GPU infrastructure. Offers free endpoints for select models and partner endpoints through Deep Infra, Together AI, Bitdeer, GMI Cloud, and CoreWeave.

us-east-1us-west-2global

OpenAI

https://api.openai.com/v1

Leading AI research company providing API access to GPT-4, GPT-3.5, DALL-E, and other foundation models through a developer-friendly REST API with global availability.

OpenRouter

https://openrouter.ai/api/v1

Unified API gateway providing access to hundreds of models from multiple providers through a single OpenAI-compatible endpoint. Free models available with shared quota, up to 1000 requests/day.

us-east-1

Perplexity

https://api.perplexity.ai

AI-powered answer engine offering API access to proprietary and open-source models with built-in web search grounding. Specializes in providing accurate, cited responses with real-time information access.

Reka AI

AI research company building multimodal language models that process text, images, video, and audio in a single model. Offers Flash (21B) and Edge (7B) models optimized for reasoning, coding, and physical AI applications.

https://api.reka.ai/v1

Replicate

https://api.replicate.com/v1

Cloud platform for running open-source AI models with a simple API. Hosts over 1000 community models with serverless GPU inference, pay-per-second pricing, and no infrastructure management required.

Sakana AI

Tokyo-based AI research lab building nature-inspired and evolutionary approaches to foundation models. Provides the Fugu family of multi-agent orchestration models through a single OpenAI-compatible API that coordinates a pool of specialist LLMs behind one endpoint, available via pay-as-you-go and subscription plans.

https://api.sakana.ai/v1

ap-northeast-1global

SambaNova

https://api.sambanova.ai/v1

AI hardware and software platform offering high-performance inference services powered by custom DataScale systems, providing OpenAI-compatible API endpoints for open-source models with industry-leading throughput.

us-west-2

Sarvam AI

Indian AI company building sovereign language models and a full-stack GenAI platform optimized for Indian languages. Provides an OpenAI-compatible API for the Sarvam model family (Sarvam-M, Sarvam-1, Sarvam-30B, Sarvam-105B) along with speech and translation services, hosted on India-based infrastructure.

https://api.sarvam.ai/v1

ap-south-1

Sber

https://gigachat.devices.sberbank.ru/api/v1

Russian banking and technology conglomerate providing access to GigaChat family of language models through a dedicated API. Offers models ranging from compact Lightning to flagship Ultra with strong multilingual and reasoning capabilities.

oauth2

ru-central-1

Scaleway

https://api.scaleway.ai/v1

European cloud provider offering managed AI inference endpoints with GPU instances across European data centers, providing OpenAI-compatible API access to popular open-source models.

eu-west-1

SiliconFlow

https://api.siliconflow.cn/v1

High-performance AI inference platform offering ultra-low latency and cost-effective access to open-source models. Supports models with up to 1M token context windows and OpenAI-compatible API endpoints.

ap-east-1us-west-2global

Snowflake Cortex AI

https://api.snowflake.com/v1

Snowflake's managed AI service providing access to Arctic family models and other LLMs through Cortex AI. Integrated with Snowflake's data platform for enterprise SQL generation, document understanding, and AI-powered analytics.

oauth2

us-east-1us-west-2eu-west-1

Together AI

https://api.together.xyz/v1

Cloud platform for running and fine-tuning open-source AI models, offering competitive pricing and OpenAI-compatible API endpoints for popular open-weight models.

Upstage

https://api.upstage.ai/v1/solar

South Korean AI company providing enterprise-grade language models optimized for Korean, English, and Japanese. Offers the Solar model family through a direct API with competitive pricing and high throughput.

xAI

AI company founded by Elon Musk providing the Grok family of models. Known for real-time information access and strong reasoning capabilities with OpenAI-compatible API.

Xiaomi MiMo

https://mimo.xiaomi.com/api/v1

Xiaomi's AI inference platform providing access to the MiMo family of models via an OpenAI-compatible API endpoint. Offers flagship agentic and multimodal models with competitive pricing.

Yandex Cloud

https://ai.api.cloud.yandex.net/v1

Russian cloud platform providing access to YandexGPT foundation models through Yandex Cloud AI Studio. Offers OpenAI-compatible API endpoints with strong Russian and English language capabilities.

ru-central-1