Models

Alibaba's multimodal variant in the Qwen 3.7 family, optimized for vision understanding and multimodal tasks.

$0.80 – $2.40 / 1M tokens

Gemini 3.1 Flash-Lite

Google's most cost-efficient Gemini model optimized for high-volume, low-latency use cases. Delivers 2.5x faster time to first token versus Gemini 2.5 Flash with full multimodal support. Ideal for agentic tasks, data extraction, translation, and classification.

$0.25 – $1.50 / 1M tokens

Gemini 3 Flash

Google's balanced model combining Gemini 3 Pro's reasoning capabilities with the Flash line's latency, efficiency, and cost. Features configurable thinking levels, multimodal function responses, and streaming function calling for complex agentic workflows.

$0.50 – $3.00 / 1M tokens

GPT-5.5

1.0M ctx

OpenAI's most capable model designed for complex real-world work including coding, online research, information analysis, and document creation. Features advanced agentic capabilities with tool search and multi-step task execution.

$12.00 – $48.00 / 1M tokens

Qwen 3.6 27B

Alibaba's dense 27B-parameter model that outperforms its own 397B MoE predecessor on agentic coding benchmarks. Offers strong multilingual and reasoning capabilities and is released under Apache 2.0.

$0.20 – $0.60 / 1M tokens

Qwen 3.6 35B-A3B

Alibaba's efficient Mixture-of-Experts model with 35B total parameters and 3B active per token. Frontier-level agentic coding performance with 73.4% on SWE-bench Verified and 92.7 on AIME 2026. Released under Apache 2.0.

$0.14 – $0.42 / 1M tokens

GPT-5.4 Mini

1.1M ctx

OpenAI's compact reasoning model optimized for coding, computer use, and subagent tasks. Approaches GPT-5.4 performance on several benchmarks while running more than 2x faster.

$0.75 – $4.50 / 1M tokens

Muse Spark

256K ctx

Meta Superintelligence Labs' first model, featuring advanced reasoning, multimodal understanding, and agentic capabilities. Processes voice, text, and image inputs with tool use and multi-agent orchestration. Powers Meta AI across its product ecosystem.

$5.00 – $25.00 / 1M tokens

Qwen 3.6 Plus

Alibaba's proprietary flagship model in the Qwen 3.6 family, targeting enterprise AI workflows with stronger agentic coding capability, visual coding support, and end-to-end enterprise engineering features.

$0.80 – $2.40 / 1M tokens

Grok 4.3

1.0M ctx

xAI's latest and most intelligent model with strong agentic tool calling, minimal hallucinations, and configurable reasoning. Supports a 1M token context window with competitive pricing.

$1.25 – $2.50 / 1M tokens

Claude Opus 4.7

300K ctx

Anthropic's latest and most advanced model with state-of-the-art reasoning, coding, and analysis capabilities. Features improved tool use, extended thinking, and enhanced safety alignment.

$15.00 – $75.00 / 1M tokens

GPT-5.4

2 providers · 1.1M ctx

OpenAI's frontier reasoning model combining advances in coding, reasoning, and agentic workflows. Features a 1.1M token context window and strong performance on complex multi-step problems.

$2.50 – $15.00 / 1M tokens

Gemini 3.1 Pro

2 providers · 2.0M ctx

Google's latest flagship multimodal model with state-of-the-art performance on reasoning, coding, and multimodal understanding. Features native tool use, grounding, and a million-token context window.

$7.00 – $21.00 / 1M tokens

GPT-5.5 Pro

256K ctx

OpenAI's premium tier model with extended reasoning capabilities, higher accuracy on complex tasks, and priority access. Optimized for professional and enterprise workloads requiring maximum quality.

$30.00 – $120.00 / 1M tokens

Grok 4.20

2.0M ctx

xAI's multi-agent capable model with a 2M token context window. Available in reasoning, non-reasoning, and multi-agent variants for diverse enterprise workloads.

$1.25 – $2.50 / 1M tokens

Grok 4

256K ctx

xAI's latest model with real-time information access, strong reasoning capabilities, and competitive performance on coding and analysis tasks. Features improved tool use and multimodal understanding.

$10.00 – $30.00 / 1M tokens

Claude Opus 4.6

2 providers · 300K ctx

Anthropic's most capable model in the Claude 4 family, excelling at complex analysis, extended reasoning, scientific research, and advanced code generation. Features significantly improved accuracy and reduced hallucinations.

$15.00 – $75.00 / 1M tokens

Claude Sonnet 4.6

2 providers · 200K ctx

Anthropic's balanced model offering strong performance at lower cost and latency than Opus. Excellent for everyday coding, analysis, and content generation tasks with good reasoning capabilities.

$3.00 – $15.00 / 1M tokens

Mistral Large 3

4 providers · 256K ctx

Mistral AI's largest open-weight model with 41B active parameters (675B total MoE). A state-of-the-art general-purpose multimodal model with a 256K context window and powerful agentic capabilities. Released under Apache 2.0.

$1.80 – $6.00 / 1M tokens

Grok 4.1 Fast

2.0M ctx

xAI's fast and cost-effective model with a 2M token context window. Offers both reasoning and non-reasoning modes at significantly lower pricing than flagship models.

$0.20 – $0.50 / 1M tokens

Claude Haiku 4.5

2 providers · 200K ctx

Anthropic's fastest model with near-frontier intelligence. Optimized for high-throughput, low-latency applications requiring quick responses at minimal cost. Supports extended thinking.

$0.80 – $5.00 / 1M tokens

Claude Sonnet 4.5

200K ctx

Anthropic's previous-generation balanced model with strong coding and analysis capabilities. Offers an excellent price-performance ratio for production workloads requiring reliable quality.

$3.00 – $15.00 / 1M tokens

Gemini 2.5 Flash

Google's cost-effective model optimized for high-throughput tasks. Balances speed and intelligence with strong multimodal capabilities and a 1M token context window.

$0.15 – $0.60 / 1M tokens

Gemini 2.5 Pro

Google's high-capability reasoning model with adaptive thinking for complex agentic and multimodal challenges. Features a 1M token context window and strong performance on coding and scientific tasks.

$2.50 – $15.00 / 1M tokens

GPT-5

2 providers · 256K ctx

OpenAI's fifth-generation flagship model with significant improvements in reasoning, multimodal understanding, and code generation. Features enhanced instruction following and expanded context window.

$10.00 – $40.00 / 1M tokens

chat · completion · function-calling · vision · +1

Llama 4 Scout

5 providers · 10.0M ctx

Meta's efficient MoE model with 17B active parameters (109B total, 16 experts). Supports up to a 10M token context, the longest of any production model. Strong performance on reasoning and multilingual tasks.

$0.06 – $0.60 / 1M tokens

Llama 4 Maverick

8 providers · 1.0M ctx

Meta's quality-focused MoE model with 17B active parameters (400B total, 128 experts). Targets quality-critical tasks with benchmark scores competitive with GPT-4o and Gemini 2.5 Pro.

$0.20 – $0.99 / 1M tokens

chat · completion · function-calling · vision · +1

Mistral Small 3.1

128K ctx

Mistral AI's 24B-parameter Small 3.1 model, offering efficient multimodal capabilities including vision, function calling, and code generation, with a 128K context window.

Claude 3 Opus

200K ctx

Anthropic's most powerful model in the Claude 3 family, excelling at complex analysis, nuanced content generation, scientific reasoning, and code generation with extended context support.

$15.00 – $75.00 / 1M tokens
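
The price ranges above are quoted per 1M tokens, so a rough cost estimate is a simple proportion. The sketch below is illustrative only: the helper names `parse_price_range` and `cost_bounds` are hypothetical, and it treats each range purely as a lower and upper bound, since the listing does not say what the two figures represent (for example input versus output pricing, or cheapest versus most expensive provider).

```python
import re


def parse_price_range(text):
    """Extract (low, high) USD prices per 1M tokens from a listing line.

    Assumes the line contains exactly two dollar amounts, as in
    "$0.25 – $1.50 / 1M tokens".
    """
    prices = [float(p) for p in re.findall(r"\$(\d+(?:\.\d+)?)", text)]
    if len(prices) != 2:
        raise ValueError(f"expected two prices, got: {text!r}")
    return prices[0], prices[1]


def cost_bounds(tokens, low, high):
    """Return (min, max) USD cost for `tokens` tokens at per-1M-token prices."""
    millions = tokens / 1_000_000
    return millions * low, millions * high


# Example: 2M tokens at the Gemini 3.1 Flash-Lite range listed above.
low, high = parse_price_range("$0.25 – $1.50 / 1M tokens")
cheapest, priciest = cost_bounds(2_000_000, low, high)
print(f"${cheapest:.2f} to ${priciest:.2f}")  # prints "$0.50 to $3.00"
```

The actual bill depends on how a given provider splits input and output tokens, so these bounds are best read as a bracket rather than a quote.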