Models
Browse 32 canonical LLMs across all providers
Gemini 3 Flash
Google's balanced model combining Gemini 3 Pro's reasoning capabilities with the Flash line's latency, efficiency, and cost. Features configurable thinking levels, multimodal function responses, and streaming function calling for complex agentic workflows.
$0.50 – $3.00 / 1M tokens
Granite 4.1 30B
IBM's largest dense decoder-only language model in the Granite 4.1 family, with 30B parameters. Trained on approximately 15T tokens with long-context extension up to 512K tokens. Supports tool calling, RAG, code generation, and multilingual tasks across 12 languages. Released under Apache 2.0.
$0.60 – $1.20 / 1M tokens
Laguna M.1
Poolside AI's flagship agentic coding model with 225B total parameters and 23B active (MoE). Trained from scratch in-house on 30T tokens across 6,144 NVIDIA Hopper GPUs. Optimized for complex multi-step software engineering tasks including codebase exploration, file editing, test running, and iterative debugging.
$0.00 – $0.00 / 1M tokens
GPT-5.5
OpenAI's most capable model designed for complex real-world work including coding, online research, information analysis, and document creation. Features advanced agentic capabilities with tool search and multi-step task execution.
$12.00 – $48.00 / 1M tokens
GPT-5.4 Mini
OpenAI's compact reasoning model optimized for coding, computer use, and subagent tasks. Approaches GPT-5.4 performance on several benchmarks while running more than 2x faster.
$0.75 – $4.50 / 1M tokens
Muse Spark
Meta Superintelligence Labs' first model, featuring advanced reasoning, multimodal understanding, and agentic capabilities. Processes voice, text, and image inputs with tool use and multi-agent orchestration. Powers Meta AI across its product ecosystem.
$5.00 – $25.00 / 1M tokens
Gemma 4 31B
Google's flagship open-weight dense model with 31 billion parameters from the Gemma 4 family. All parameters are active per forward pass, with top-tier performance on reasoning benchmarks including AIME 2026 and MMLU Pro. Supports vision and an extended 256K context window.
Gemma 4 26B
Google's high-performance open-weight dense model with 26 billion parameters from the Gemma 4 family. Supports multimodal inputs including text and images with a 256K extended context window. Strong reasoning and code generation capabilities with all parameters active per forward pass.
Gemma 4 31B
Google's flagship open-weight dense model with 31B parameters. All parameters active per forward pass. Ranks among top open models with strong performance on AIME 2026 (89.2%) and MMLU Pro (85.2%). Supports vision and extended context.
$0.00 – $0.50 / 1M tokens
GPT-OSS 120B
OpenAI's first open-weight large model with 120 billion parameters. Released under the Apache 2.0 license, it offers strong performance on reasoning and coding tasks while being fully self-hostable.
$1.80 – $6.00 / 1M tokens
Claude Opus 4.7
Anthropic's latest and most advanced model with state-of-the-art reasoning, coding, and analysis capabilities. Features improved tool use, extended thinking, and enhanced safety alignment.
$15.00 – $75.00 / 1M tokens
Grok 4.3
xAI's latest and most intelligent model with strong agentic tool calling, minimal hallucinations, and configurable reasoning. Supports a 1M-token context window with competitive pricing.
$1.25 – $2.50 / 1M tokens
Nemotron 3 Super 120B
NVIDIA's open hybrid Mamba-Transformer MoE model with 120B total parameters (12B active). Features a 1M-token context window and excels at agentic reasoning, coding, planning, and tool calling.
$0.00 – $0.00 / 1M tokens
GPT-5.4
OpenAI's frontier reasoning model combining advances in coding, reasoning, and agentic workflows. Features a 1.1M-token context window and strong performance on complex multi-step problems.
$2.50 – $15.00 / 1M tokens
GPT-5.5 Pro
OpenAI's premium tier model with extended reasoning capabilities, higher accuracy on complex tasks, and priority access. Optimized for professional and enterprise workloads requiring maximum quality.
$30.00 – $120.00 / 1M tokens
Gemini 3.1 Pro
Google's latest flagship multimodal model with state-of-the-art performance on reasoning, coding, and multimodal understanding. Features native tool use, grounding, and a million-token context window.
$7.00 – $21.00 / 1M tokens
Grok 4
xAI's latest model with real-time information access, strong reasoning capabilities, and competitive performance on coding and analysis tasks. Features improved tool use and multimodal understanding.
$10.00 – $30.00 / 1M tokens
Grok 4.20
xAI's multi-agent capable model with a 2M-token context window. Available in reasoning, non-reasoning, and multi-agent variants for diverse enterprise workloads.
$1.25 – $2.50 / 1M tokens
Claude Opus 4.6
Anthropic's most capable model in the Claude 4 family, excelling at complex analysis, extended reasoning, scientific research, and advanced code generation. Features significantly improved accuracy and reduced hallucinations.
$15.00 – $75.00 / 1M tokens
Claude Sonnet 4.6
Anthropic's balanced model offering strong performance at lower cost and latency than Opus. Excellent for everyday coding, analysis, and content generation tasks with good reasoning capabilities.
$3.00 – $15.00 / 1M tokens
Grok 4.1 Fast
xAI's fast and cost-effective model with a 2M-token context window. Offers both reasoning and non-reasoning modes at significantly lower pricing than flagship models.
$0.20 – $0.50 / 1M tokens
Claude Haiku 4.5
Anthropic's fastest model with near-frontier intelligence. Optimized for high-throughput, low-latency applications requiring quick responses at minimal cost. Supports extended thinking.
$0.80 – $5.00 / 1M tokens
Claude Sonnet 4.5
Anthropic's previous-generation balanced model with strong coding and analysis capabilities. Offers excellent price-performance ratio for production workloads requiring reliable quality.
$3.00 – $15.00 / 1M tokens
Gemini 2.5 Pro
Google's high-capability reasoning model with adaptive thinking for complex agentic and multimodal challenges. Features a 1M-token context window and strong performance on coding and scientific tasks.
$2.50 – $15.00 / 1M tokens
Gemini 2.5 Flash
Google's cost-effective model optimized for high-throughput tasks. Balances speed and intelligence with strong multimodal capabilities and a 1M-token context window.
$0.15 – $0.60 / 1M tokens
GPT-5
OpenAI's fifth-generation flagship model with significant improvements in reasoning, multimodal understanding, and code generation. Features enhanced instruction following and an expanded context window.
$10.00 – $40.00 / 1M tokens
Nemotron Nano 9B v2
NVIDIA's compact 9B parameter model trained from scratch for both reasoning and non-reasoning tasks. Generates reasoning traces before final responses. Efficient for edge and on-device deployment.
$0.00 – $0.00 / 1M tokens
Llama 4 Maverick
Meta's quality-focused MoE model with 17B active parameters (400B total, 128 experts). Targets quality-critical tasks with benchmark scores competitive with GPT-4o and Gemini 2.5 Pro.
$0.20 – $0.99 / 1M tokens
Gemma 3 27B
Google's largest open-weight model in the Gemma 3 family with 27 billion parameters. Supports multimodal inputs including text and images with a 128K context window. Delivers strong performance across reasoning, code generation, and vision tasks, competitive with larger proprietary models.
Gemma 3 12B
Google's mid-size open-weight model with 12 billion parameters from the Gemma 3 family. Supports multimodal inputs including text and images with a 128K context window. Strong performance on reasoning and code generation tasks at moderate compute cost.
Command A
Cohere's flagship 111B parameter model optimized for demanding enterprises requiring fast, secure, and high-quality AI. Excels at RAG, tool use, and multilingual tasks with strong reasoning capabilities.
$2.50 – $10.00 / 1M tokens
Phi-4
Microsoft's 14B-parameter Phi-4 model excels at reasoning and code generation tasks. Delivers strong performance relative to its compact size, with efficient inference characteristics.
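The price ranges in the listing are quoted per 1M tokens. As a rough sketch of how such a range could translate into the cost of a request, the snippet below assumes the lower figure is the input-token price and the upper figure is the output-token price. The listing does not state this, so treat it as an assumption. The function name `estimate_cost` and the token counts are hypothetical and used only for illustration.

```python
def estimate_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Estimate the USD cost of a request from per-1M-token prices.

    Assumption (not stated in the listing): the low end of a price range
    applies to input tokens and the high end to output tokens.
    """
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Hypothetical workload priced with the Gemini 3 Flash listing
# ($0.50 to $3.00 per 1M tokens): 200K input tokens, 50K output tokens.
cost = estimate_cost(200_000, 50_000, 0.50, 3.00)
print(f"${cost:.2f}")  # prints $0.25
```

Models listed at $0.00 would yield a cost of zero under this formula. Actual billing may include other factors, such as caching or reasoning tokens, that this sketch does not model.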