Models

Browse 43 canonical LLM models across all providers

Sort by

Compare Models

Showing 1–24 of 43 models

Claude Sonnet 5

USA1.0M ctx

Anthropic's most capable Sonnet-class model, bringing frontier coding, agentic, and professional-work performance to the midsize tier while closing the gap with Opus 4.8 at a lower price. Supports adaptive thinking with selectable reasoning effort levels, a 1M-token context window, and text, image, and file inputs. Codenamed Fennec.

$2.00 – $10.00 / 1M tokens

Command A+

USA256K ctx

Cohere's enterprise flagship model building on Command A with stronger reasoning, agentic tool use, and multilingual performance across 23 languages. Optimized for secure, high-throughput RAG, retrieval, and long-horizon agent workflows in regulated environments, with private and on-premise deployment options.

$2.50 – $10.00 / 1M tokens

DiffusionGemma

USA2 providers262K ctx

Google DeepMind's experimental diffusion-based member of the Gemma 4 open model family. Unlike autoregressive models that generate text one token at a time, DiffusionGemma denoises a canvas of placeholder tokens to produce up to 256 tokens in parallel, finalizing output in one block. A Mixture-of-Experts model with 26B total parameters and 3.8B active per inference, delivering roughly 4x the throughput of similarly sized autoregressive Gemma models on local hardware. Excels at non-linear tasks like in-line editing, molecular sequencing, mathematical graphing, and self-correcting puzzles.

$0.00 – $0.15 / 1M tokens

Claude Fable 5

USA300K ctx

Anthropic's first publicly available Mythos-class model, exceeding the capabilities of any model the company has previously made generally available. State-of-the-art on nearly all tested benchmarks, with exceptional performance in software engineering, knowledge work, vision, and scientific research. Its lead grows on longer and more complex tasks. Ships with built-in safeguards that route sensitive cybersecurity, biology, chemistry, and distillation queries to Claude Opus 4.8.

$15.00 – $75.00 / 1M tokens

Claude Mythos 5

USA300K ctx

Anthropic's frontier Mythos-class model — the same underlying model as Claude Fable 5 but with safeguards lifted in some areas. It has the strongest cybersecurity capabilities of any model in the world, alongside state-of-the-art performance in software engineering, knowledge work, vision, and scientific research. Access is restricted to a small group of trusted cyberdefenders and infrastructure providers through Project Glasswing.

Nemotron 3 Ultra

USA4 providers1.0M ctx

NVIDIA's flagship open 550B-parameter Mixture-of-Experts model with 55B active parameters, built for frontier reasoning and orchestration in long-running agentic systems. Features hybrid Mamba-Transformer architecture, LatentMoE routing, multi-token prediction, and NVFP4 precision for 5x higher throughput. Achieves 30% lower cost-to-task-completion on agentic benchmarks. Supports 1M+ token context window with 95% accuracy on Ruler@1M.

$0.00 – $1.60 / 1M tokens

Gemma 4 12B

USA3 providers262K ctx

Google's medium-size open-weight model with 12 billion parameters from the Gemma 4 family. Encoder-free unified multimodal architecture that natively processes text, image, audio, and video inputs without dedicated encoders. Features a 256K context window and supports 140+ languages. First medium-sized model capable of natively ingesting audio. Suitable for local deployment on GPUs with 16GB VRAM.

$0.00 – $0.10 / 1M tokens

Claude Opus 4.8

USA300K ctx

Anthropic's most advanced model, building on Opus 4.7 with improvements across benchmarks in coding, agentic skills, reasoning, and knowledge work. Features enhanced honesty, better tool use efficiency, dynamic workflows support, and improved alignment.

$15.00 – $75.00 / 1M tokens

Palmyra X5

USA1.0M ctx

Writer's most advanced adaptive reasoning model with a 1 million token context window. Processes full million-token prompts in approximately 22 seconds with multi-turn function calls in 300ms. Optimized for enterprise agentic AI workflows at 3-4x lower cost than GPT-4.1.

$5.00 – $15.00 / 1M tokens

Gemini 3 Flash

USA2 providers1.0M ctx

Google's balanced model combining Gemini 3 Pro's reasoning capabilities with the Flash line's latency, efficiency, and cost. Features configurable thinking levels, multimodal function responses, and streaming function calling for complex agentic workflows.

$0.50 – $3.00 / 1M tokens

Granite 4.1 30B

USA524K ctx

IBM's largest dense decoder-only 30B parameter language model from the Granite 4.1 family. Trained on approximately 15T tokens with long-context extension up to 512K tokens. Supports tool calling, RAG, code generation, multilingual tasks across 12 languages. Released under Apache 2.0.

$0.60 – $1.20 / 1M tokens

Laguna M.1

USA128K ctx

Poolside AI's flagship agentic coding model with 225B total parameters and 23B active (MoE). Trained from scratch in-house on 30T tokens across 6,144 NVIDIA Hopper GPUs. Optimized for complex multi-step software engineering tasks including codebase exploration, file editing, test running, and iterative debugging.

$0.00 – $0.00 / 1M tokens

GPT-5.5

USA1.0M ctx

OpenAI's most capable model designed for complex real-world work including coding, online research, information analysis, and document creation. Features advanced agentic capabilities with tool search and multi-step task execution.

$12.00 – $48.00 / 1M tokens

GPT-5.4 Mini

USA1.1M ctx

OpenAI's compact reasoning model optimized for coding, computer use, and subagent tasks. Approaches GPT-5.4 performance on several benchmarks while running more than 2x faster.

$0.75 – $4.50 / 1M tokens

Muse Spark

USA256K ctx

Meta Superintelligence Labs' first model, featuring advanced reasoning, multimodal understanding, and agentic capabilities. Processes voice, text, and image inputs with tool use and multi-agent orchestration. Powers Meta AI across its product ecosystem.

$5.00 – $25.00 / 1M tokens

Gemma 4 31B

USA4 providers262K ctx

Google's flagship open-weight dense model with 31B parameters. All parameters active per forward pass. Ranks among top open models with strong performance on AIME 2026 (89.2%) and MMLU Pro (85.2%). Supports vision and extended context.

$0.00 – $0.50 / 1M tokens

Gemma 4 31B

USA262K ctx

Google's flagship open-weight dense model with 31 billion parameters from the Gemma 4 family. All parameters active per forward pass with top-tier performance on reasoning benchmarks including AIME 2026 and MMLU Pro. Supports vision and extended 256K context window.

Gemma 4 26B

USA262K ctx

Google's high-performance open-weight dense model with 26 billion parameters from the Gemma 4 family. Supports multimodal inputs including text and images with a 256K extended context window. Strong reasoning and code generation capabilities with all parameters active per forward pass.

GPT-OSS 120B

USA2 providers131K ctx

OpenAI's first open-weight large model with 120 billion parameters. Released under Apache 2.0 license, offering strong performance on reasoning and coding tasks while being fully self-hostable.

$1.80 – $6.00 / 1M tokens

Claude Opus 4.7

USA300K ctx

Anthropic's latest and most advanced model with state-of-the-art reasoning, coding, and analysis capabilities. Features improved tool use, extended thinking, and enhanced safety alignment.

$15.00 – $75.00 / 1M tokens

Nemotron 3 Super 120B

USA1.0M ctx

NVIDIA's open hybrid Mamba-Transformer MoE model with 120B total parameters (12B active). Features 1M token context window and excels at agentic reasoning, coding, planning, and tool calling.

$0.00 – $0.00 / 1M tokens

Grok 4.3

USA1.0M ctx

xAI's latest and most intelligent model with strong agentic tool calling, minimal hallucinations, and configurable reasoning. Supports 1M token context window with competitive pricing.

$1.25 – $2.50 / 1M tokens

GPT-5.4

USA2 providers1.1M ctx

OpenAI's frontier reasoning model combining advances in coding, reasoning, and agentic workflows. Features 1.1M token context window and strong performance on complex multi-step problems.

$2.50 – $15.00 / 1M tokens

Gemini 3.1 Pro

USA2 providers2.0M ctx

Google's latest flagship multimodal model with state-of-the-art performance on reasoning, coding, and multimodal understanding. Features native tool use, grounding, and million-token context window.

$7.00 – $21.00 / 1M tokens

1 2 Next →