Models
Browse 34 canonical LLMs across all providers
Google's balanced model combining Gemini 3 Pro's reasoning capabilities with the Flash line's latency, efficiency, and cost. Features configurable thinking levels, multimodal function responses, and streaming function calling for complex agentic workflows.
Google's most cost-efficient Gemini model optimized for high-volume, low-latency use cases. Delivers 2.5x faster time to first token versus Gemini 2.5 Flash with full multimodal support. Ideal for agentic tasks, data extraction, translation, and classification.
IBM's dense decoder-only 8B parameter language model from the Granite 4.1 family. Supports 131K-token context, tool calling, RAG, code generation with fill-in-the-middle, text summarization, classification, and extraction across 12 languages. Released under Apache 2.0.
IBM's largest dense decoder-only 30B parameter language model from the Granite 4.1 family. Trained on approximately 15T tokens with long-context extension up to 512K tokens. Supports tool calling, RAG, code generation, and multilingual tasks across 12 languages. Released under Apache 2.0.
Poolside AI's flagship agentic coding model with 225B total parameters and 23B active (MoE). Trained from scratch in-house on 30T tokens across 6,144 NVIDIA Hopper GPUs. Optimized for complex multi-step software engineering tasks including codebase exploration, file editing, test running, and iterative debugging.
OpenAI's most capable model designed for complex real-world work including coding, online research, information analysis, and document creation. Features advanced agentic capabilities with tool search and multi-step task execution.
OpenAI's compact reasoning model optimized for coding, computer use, and subagent tasks. Approaches GPT-5.4 performance on several benchmarks while running more than 2x faster.
Meta Superintelligence Labs' first model, featuring advanced reasoning, multimodal understanding, and agentic capabilities. Processes voice, text, and image inputs with tool use and multi-agent orchestration. Powers Meta AI across its product ecosystem.
NVIDIA's open hybrid Mamba-Transformer MoE model with 120B total parameters (12B active). Features a 1M-token context window and excels at agentic reasoning, coding, planning, and tool calling.
Anthropic's latest and most advanced model with state-of-the-art reasoning, coding, and analysis capabilities. Features improved tool use, extended thinking, and enhanced safety alignment.
OpenAI's first open-weight large model with 120 billion parameters. Released under the Apache 2.0 license, offering strong performance on reasoning and coding tasks while being fully self-hostable.
OpenAI's compact open-weight model with 20 billion parameters. Released under the Apache 2.0 license, designed for efficient deployment on consumer hardware while maintaining strong coding and reasoning capabilities.
xAI's latest and most intelligent model with strong agentic tool calling, minimal hallucinations, and configurable reasoning. Supports a 1M-token context window with competitive pricing.
OpenAI's frontier reasoning model combining advances in coding, reasoning, and agentic workflows. Features a 1.1M-token context window and strong performance on complex multi-step problems.
OpenAI's premium tier model with extended reasoning capabilities, higher accuracy on complex tasks, and priority access. Optimized for professional and enterprise workloads requiring maximum quality.
Google's latest flagship multimodal model with state-of-the-art performance on reasoning, coding, and multimodal understanding. Features native tool use, grounding, and a million-token context window.
xAI's latest model with real-time information access, strong reasoning capabilities, and competitive performance on coding and analysis tasks. Features improved tool use and multimodal understanding.
xAI's multi-agent capable model with a 2M-token context window. Available in reasoning, non-reasoning, and multi-agent variants for diverse enterprise workloads.
Anthropic's most capable model in the Claude 4 family, excelling at complex analysis, extended reasoning, scientific research, and advanced code generation. Features significantly improved accuracy and reduced hallucinations.
Anthropic's balanced model offering strong performance at lower cost and latency than Opus. Excellent for everyday coding, analysis, and content generation tasks with good reasoning capabilities.
xAI's fast and cost-effective model with a 2M-token context window. Offers both reasoning and non-reasoning modes at significantly lower pricing than flagship models.
Anthropic's fastest model with near-frontier intelligence. Optimized for high-throughput, low-latency applications requiring quick responses at minimal cost. Supports extended thinking.
Anthropic's previous-generation balanced model with strong coding and analysis capabilities. Offers excellent price-performance ratio for production workloads requiring reliable quality.
Google's cost-effective model optimized for high-throughput tasks. Balances speed and intelligence with strong multimodal capabilities and a 1M-token context window.
Google's high-capability reasoning model with adaptive thinking for complex agentic and multimodal challenges. Features a 1M-token context window and strong performance on coding and scientific tasks.
OpenAI's fifth-generation flagship model with significant improvements in reasoning, multimodal understanding, and code generation. Features enhanced instruction following and expanded context window.
Meta's quality-focused MoE model with 17B active parameters (400B total, 128 experts). Targets quality-critical tasks with benchmark scores competitive with GPT-4o and Gemini 2.5 Pro.
Meta's efficient MoE model with 17B active parameters (109B total, 16 experts). Supports a context of up to 10M tokens, the longest of any production model. Strong performance on reasoning and multilingual tasks.
Cohere's flagship 111B parameter model optimized for demanding enterprises requiring fast, secure, and high-quality AI. Excels at RAG, tool use, and multilingual tasks with strong reasoning capabilities.
Meta's flagship open-weight model with 70 billion parameters. Strong multilingual capabilities with competitive performance on reasoning and coding benchmarks. Available for self-hosting and through various inference providers.
Cohere's compact 7B parameter model optimized for RAG, tool use, and code tasks. Delivers top-tier speed and efficiency on commodity GPUs and edge devices with a 128K-token context window.
Meta's efficient open-weight model with 8 billion parameters from the Llama 3.1 family. Optimized for instruction following with strong performance on general tasks, coding, and multilingual benchmarks. Ideal for cost-effective deployment and edge inference scenarios.
Anthropic's most powerful model in the Claude 3 family, excelling at complex analysis, nuanced content generation, scientific reasoning, and code generation with extended context support.
OpenAI's flagship large language model with advanced reasoning, instruction following, and code generation capabilities. Supports multimodal inputs including text and images.