Models

Browse 32 canonical LLMs across all providers

Gemini 3 Flash

2 providers · 1.0M ctx

Google's balanced model combining Gemini 3 Pro's reasoning capabilities with the Flash line's latency, efficiency, and cost. Features configurable thinking levels, multimodal function responses, and streaming function calling for complex agentic workflows.

$0.50 – $3.00 / 1M tokens

chat · completion · function-calling · vision · +3

text · image · audio · video · code

Granite 4.1 30B

IBM's largest dense decoder-only language model in the Granite 4.1 family, with 30B parameters. Trained on approximately 15T tokens with long-context extension up to 512K tokens. Supports tool calling, RAG, code generation, and multilingual tasks across 12 languages. Released under Apache 2.0.

$0.60 – $1.20 / 1M tokens

chat · completion · function-calling · code-generation · +1

Laguna M.1

Poolside AI's flagship agentic coding model with 225B total parameters and 23B active (MoE). Trained from scratch in-house on 30T tokens across 6,144 NVIDIA Hopper GPUs. Optimized for complex multi-step software engineering tasks including codebase exploration, file editing, test running, and iterative debugging.

$0.00 – $0.00 / 1M tokens

chat · completion · function-calling · code-generation · +1

GPT-5.5

OpenAI's most capable model designed for complex real-world work including coding, online research, information analysis, and document creation. Features advanced agentic capabilities with tool search and multi-step task execution.

$12.00 – $48.00 / 1M tokens

chat · completion · function-calling · vision · +3

text · image · audio · code

GPT-5.4 Mini

OpenAI's compact reasoning model optimized for coding, computer use, and subagent tasks. Approaches GPT-5.4 performance on several benchmarks while running more than 2x faster.

$0.75 – $4.50 / 1M tokens

chat · completion · function-calling · vision · +2

Muse Spark

Meta Superintelligence Labs' first model, featuring advanced reasoning, multimodal understanding, and agentic capabilities. Processes voice, text, and image inputs with tool use and multi-agent orchestration. Powers Meta AI across its product ecosystem.

$5.00 – $25.00 / 1M tokens

chat · completion · function-calling · vision · +3

text · image · audio · code

Gemma 4 31B

Google's flagship open-weight dense model with 31 billion parameters from the Gemma 4 family. All parameters are active per forward pass, with top-tier performance on reasoning benchmarks including AIME 2026 and MMLU Pro. Supports vision and an extended 256K context window.

chat · completion · vision · code-generation · +1

Gemma 4 26B

Google's high-performance open-weight dense model with 26 billion parameters from the Gemma 4 family. Supports multimodal inputs including text and images with a 256K extended context window. Strong reasoning and code generation capabilities with all parameters active per forward pass.

chat · completion · vision · code-generation · +1

Gemma 4 31B

4 providers · 262K ctx

Google's flagship open-weight dense model with 31B parameters. All parameters active per forward pass. Ranks among top open models with strong performance on AIME 2026 (89.2%) and MMLU Pro (85.2%). Supports vision and extended context.

$0.00 – $0.50 / 1M tokens

chat · completion · vision · code-generation · +1

GPT-OSS 120B

2 providers · 131K ctx

OpenAI's first open-weight large model with 120 billion parameters. Released under the Apache 2.0 license, it offers strong performance on reasoning and coding tasks while being fully self-hostable.

$1.80 – $6.00 / 1M tokens

chat · completion · function-calling · code-generation · +1

Claude Opus 4.7

Anthropic's latest and most advanced model with state-of-the-art reasoning, coding, and analysis capabilities. Features improved tool use, extended thinking, and enhanced safety alignment.

$15.00 – $75.00 / 1M tokens

chat · completion · function-calling · vision · +2

Grok 4.3

xAI's latest and most intelligent model with strong agentic tool calling, minimal hallucinations, and configurable reasoning. Supports a 1M-token context window with competitive pricing.

$1.25 – $2.50 / 1M tokens

chat · completion · function-calling · vision · +2

Nemotron 3 Super 120B

NVIDIA's open hybrid Mamba-Transformer MoE model with 120B total parameters (12B active). Features a 1M-token context window and excels at agentic reasoning, coding, planning, and tool calling.

$0.00 – $0.00 / 1M tokens

chat · completion · function-calling · code-generation · +1

GPT-5.4

2 providers · 1.1M ctx

OpenAI's frontier reasoning model combining advances in coding, reasoning, and agentic workflows. Features a 1.1M-token context window and strong performance on complex multi-step problems.

$2.50 – $15.00 / 1M tokens

chat · completion · function-calling · vision · +2

GPT-5.5 Pro

OpenAI's premium tier model with extended reasoning capabilities, higher accuracy on complex tasks, and priority access. Optimized for professional and enterprise workloads requiring maximum quality.

$30.00 – $120.00 / 1M tokens

chat · completion · function-calling · vision · +3

text · image · audio · code

Gemini 3.1 Pro

2 providers · 2.0M ctx

Google's latest flagship multimodal model with state-of-the-art performance on reasoning, coding, and multimodal understanding. Features native tool use, grounding, and a million-token context window.

$7.00 – $21.00 / 1M tokens

chat · completion · function-calling · vision · +3

text · image · audio · video · code

Grok 4

xAI's latest model with real-time information access, strong reasoning capabilities, and competitive performance on coding and analysis tasks. Features improved tool use and multimodal understanding.

$10.00 – $30.00 / 1M tokens

chat · completion · function-calling · vision · +2

Grok 4.20

xAI's multi-agent-capable model with a 2M-token context window. Available in reasoning, non-reasoning, and multi-agent variants for diverse enterprise workloads.

$1.25 – $2.50 / 1M tokens

chat · completion · function-calling · vision · +2

Claude Opus 4.6

2 providers · 300K ctx

Anthropic's most capable model in the Claude 4 family, excelling at complex analysis, extended reasoning, scientific research, and advanced code generation. Features significantly improved accuracy and reduced hallucinations.

$15.00 – $75.00 / 1M tokens

chat · completion · function-calling · vision · +2

Claude Sonnet 4.6

2 providers · 200K ctx

Anthropic's balanced model offering strong performance at lower cost and latency than Opus. Excellent for everyday coding, analysis, and content generation tasks with good reasoning capabilities.

$3.00 – $15.00 / 1M tokens

chat · completion · function-calling · vision · +2

Grok 4.1 Fast

xAI's fast and cost-effective model with a 2M-token context window. Offers both reasoning and non-reasoning modes at significantly lower pricing than flagship models.

$0.20 – $0.50 / 1M tokens

chat · completion · function-calling · vision · +2

Claude Haiku 4.5

2 providers · 200K ctx

Anthropic's fastest model with near-frontier intelligence. Optimized for high-throughput, low-latency applications requiring quick responses at minimal cost. Supports extended thinking.

$0.80 – $5.00 / 1M tokens

chat · completion · function-calling · vision · +2

Claude Sonnet 4.5

Anthropic's previous-generation balanced model with strong coding and analysis capabilities. Offers excellent price-performance ratio for production workloads requiring reliable quality.

$3.00 – $15.00 / 1M tokens

chat · completion · function-calling · vision · +2

Gemini 2.5 Pro

2 providers · 1.0M ctx

Google's high-capability reasoning model with adaptive thinking for complex agentic and multimodal challenges. Features a 1M-token context window and strong performance on coding and scientific tasks.

$2.50 – $15.00 / 1M tokens

chat · completion · function-calling · vision · +3

text · image · audio · video · code

Gemini 2.5 Flash

2 providers · 1.0M ctx

Google's cost-effective model optimized for high-throughput tasks. Balances speed and intelligence with strong multimodal capabilities and a 1M-token context window.

$0.15 – $0.60 / 1M tokens

chat · completion · function-calling · vision · +3

text · image · audio · video · code

GPT-5

2 providers · 256K ctx

OpenAI's fifth-generation flagship model with significant improvements in reasoning, multimodal understanding, and code generation. Features enhanced instruction following and an expanded context window.

$10.00 – $40.00 / 1M tokens

chat · completion · function-calling · vision · +3

text · image · audio · code

Nemotron Nano 9B v2

NVIDIA's compact 9B parameter model trained from scratch for both reasoning and non-reasoning tasks. Generates reasoning traces before final responses. Efficient for edge and on-device deployment.

$0.00 – $0.00 / 1M tokens

chat · completion · code-generation · reasoning

Llama 4 Maverick

8 providers · 1.0M ctx

Meta's quality-focused MoE model with 17B active parameters (400B total, 128 experts). Targets quality-critical tasks with benchmark scores competitive with GPT-4o and Gemini 2.5 Pro.

$0.20 – $0.99 / 1M tokens

chat · completion · function-calling · vision · +2

Gemma 3 27B

Google's largest open-weight model in the Gemma 3 family with 27 billion parameters. Supports multimodal inputs including text and images with a 128K context window. Delivers strong performance across reasoning, code generation, and vision tasks, competitive with larger proprietary models.

chat · completion · vision · code-generation · +1

Gemma 3 12B

Google's mid-size open-weight model with 12 billion parameters from the Gemma 3 family. Supports multimodal inputs including text and images with a 128K context window. Strong performance on reasoning and code generation tasks at moderate compute cost.

chat · completion · vision · code-generation · +1

Command A

Cohere's flagship 111B parameter model optimized for demanding enterprises requiring fast, secure, and high-quality AI. Excels at RAG, tool use, and multilingual tasks with strong reasoning capabilities.

$2.50 – $10.00 / 1M tokens

chat · completion · function-calling · code-generation · +1

Phi-4

Microsoft's 14B-parameter Phi-4 model excels at reasoning and code generation tasks, delivering strong performance relative to its compact size with efficient inference characteristics.

chat · completion · reasoning · code-generation
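
The price labels in the entries above are quoted as a range per 1M tokens. As a rough illustration of the arithmetic, the sketch below turns such a label into lower and upper cost bounds for a given token count. The listing does not say whether the two figures are input versus output prices or the cheapest versus most expensive provider, so this sketch assumes neither and only reports bounds. The function names `parse_price_range` and `cost_bounds` are hypothetical helpers, not part of any catalog API.

```python
import re

TOKENS_PER_UNIT = 1_000_000  # prices in the listing are quoted per 1M tokens


def parse_price_range(label: str) -> tuple[float, float]:
    """Extract the (low, high) dollar figures from a catalog price label."""
    prices = [float(p) for p in re.findall(r"\$(\d+(?:\.\d+)?)", label)]
    if len(prices) != 2:
        raise ValueError(f"expected two prices in {label!r}")
    return prices[0], prices[1]


def cost_bounds(tokens: int, label: str) -> tuple[float, float]:
    """Return lower and upper cost bounds in dollars for a token count.

    Assumption: every token is billed at a rate somewhere between the two
    listed figures, so the true cost falls inside the returned interval.
    """
    low, high = parse_price_range(label)
    units = tokens / TOKENS_PER_UNIT
    return units * low, units * high


if __name__ == "__main__":
    # Price label copied from the Gemini 3 Flash entry.
    print(cost_bounds(250_000, "$0.50 – $3.00 / 1M tokens"))
```

Entries priced at $0.00 – $0.00 yield bounds of zero, and entries with no price label would need to be skipped before calling `cost_bounds`.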