Models
Browse 6 canonical LLM models across all providers
Gemini 3.1 Flash-Lite
Google's most cost-efficient Gemini model optimized for high-volume, low-latency use cases. Delivers 2.5x faster time to first token versus Gemini 2.5 Flash with full multimodal support. Ideal for agentic tasks, data extraction, translation, and classification.
$0.25 – $1.50 / 1M tokens
Gemini 3 Flash
Google's balanced model combining Gemini 3 Pro's reasoning capabilities with the Flash line's latency, efficiency, and cost. Features configurable thinking levels, multimodal function responses, and streaming function calling for complex agentic workflows.
$0.50 – $3.00 / 1M tokens
MiniCPM-V 4.6
Ultra-efficient multimodal language model from OpenBMB built on SigLIP2-400M and Qwen3.5-0.8B (~1B parameters). Supports single-image, multi-image, and video understanding with mixed 4x/16x visual token compression. Designed for edge deployment on iOS, Android, and HarmonyOS.
Gemini 3.1 Pro
Google's latest flagship multimodal model with state-of-the-art performance on reasoning, coding, and multimodal understanding. Features native tool use, grounding, and million-token context window.
$7.00 – $21.00 / 1M tokens
Gemini 2.5 Flash
Google's cost-effective model optimized for high throughput tasks. Balances speed and intelligence with strong multimodal capabilities and 1M token context window.
$0.15 – $0.60 / 1M tokens
Gemini 2.5 Pro
Google's high-capability reasoning model with adaptive thinking for complex agentic and multimodal challenges. Features 1M token context window and strong performance on coding and scientific tasks.
$2.50 – $15.00 / 1M tokens