LLM API Pricing in 2026: Live Numbers from Every Provider
What every major LLM costs in 2026 and what it can actually do, in one table. GPT-5.5, Claude Opus 4.7, Gemini 2.5 Pro, Llama 4, DeepSeek, Grok, plus the smaller variants worth shipping. Pricing syncs daily from provider APIs. We're LLMTest, the AI proxy that benchmarks all of them on real tasks so you don't have to.
Verified 2026-05-02 · Pricing synced daily · Need capability data? See the LLM capabilities matrix.
Don't pick the perfect model. Ship it rough.
LLMTest is an AI proxy. On every call, we auto-pick the cheapest model that hits your quality bar. We also rewrite weak prompts, handle fallbacks when an API goes down, and run weekly benchmarks across 340+ models so we know what's actually working right now. Drop it in once. Ship features instead of comparing pricing tables.
Start optimizing

| Model | Tier | Provider · Slug | Input $/M | Output $/M | Context | Vision | Tools | JSON | Cache | Max Out |
|---|---|---|---|---|---|---|---|---|---|---|
| GPT-5.5 | Flagship | OpenAI · openai/gpt-5.5 | $5.00 | $30.00 | 1.1M | ✓ | ✓ | ✓ | ✓ | 16K |
| GPT-5 | Flagship | OpenAI · openai/gpt-5 | $1.25 | $10.00 | 400K | ✓ | ✓ | ✓ | ✓ | 16K |
| GPT-4.1 | Flagship | OpenAI · openai/gpt-4.1 | $2.00 | $8.00 | 1M | ✓ | ✓ | ✓ | ✓ | 16K |
| Claude Opus 4.7 | Flagship | Anthropic · anthropic/claude-opus-4.7 | $5.00 | $25.00 | 1M | ✓ | ✓ | ✓ | ✓ | 8K |
| Claude Opus 4 | Flagship | Anthropic · anthropic/claude-opus-4 | $15.00 | $75.00 | 200K | ✓ | ✓ | ✓ | ✓ | 8K |
| Gemini 2.5 Pro | Flagship | Google · google/gemini-2.5-pro | $1.25 | $10.00 | 1M | ✓ | ✓ | ✓ | ✓ | 66K |
| Grok 4 | Flagship | xAI · x-ai/grok-4 | $3.00 | $15.00 | 256K | ✓ | ✓ | ✓ | — | 8K |
| Sonar Pro | Flagship | Perplexity · perplexity/sonar-pro | $3.00 | $15.00 | 200K | — | — | ✓ | — | 8K |
| o3 | Reasoning | OpenAI · openai/o3 | $2.00 | $8.00 | 200K | ✓ | ✓ | ✓ | ✓ | 100K |
| o3-mini | Reasoning | OpenAI · openai/o3-mini | $1.10 | $4.40 | 200K | — | ✓ | ✓ | ✓ | 100K |
| GPT-5 Mini | Mid | OpenAI · openai/gpt-5-mini | $0.25 | $2.00 | 400K | ✓ | ✓ | ✓ | ✓ | 8K |
| GPT-4o | Mid | OpenAI · openai/gpt-4o | $2.50 | $10.00 | 128K | ✓ | ✓ | ✓ | ✓ | 16K |
| Claude Sonnet 4.6 | Mid | Anthropic · anthropic/claude-sonnet-4.6 | $3.00 | $15.00 | 1M | ✓ | ✓ | ✓ | ✓ | 8K |
| Claude Sonnet 4 | Mid | Anthropic · anthropic/claude-sonnet-4 | $3.00 | $15.00 | 1M | ✓ | ✓ | ✓ | ✓ | 8K |
| Gemini 2.5 Flash | Mid | Google · google/gemini-2.5-flash | $0.30 | $2.50 | 1M | ✓ | ✓ | ✓ | ✓ | 66K |
| Mistral Medium 3 | Mid | Mistral · mistralai/mistral-medium-3 | $0.40 | $2.00 | 131K | — | ✓ | ✓ | — | 8K |
| GPT-5 Nano | Small | OpenAI · openai/gpt-5-nano | $0.05 | $0.40 | 400K | — | ✓ | ✓ | ✓ | 4K |
| GPT-4o Mini | Small | OpenAI · openai/gpt-4o-mini | $0.15 | $0.60 | 128K | ✓ | ✓ | ✓ | ✓ | 16K |
| Claude Haiku 4.5 | Small | Anthropic · anthropic/claude-haiku-4.5 | $1.00 | $5.00 | 200K | ✓ | ✓ | ✓ | ✓ | 8K |
| Gemini 2.5 Flash Lite | Small | Google · google/gemini-2.5-flash-lite | $0.10 | $0.40 | 1M | ✓ | ✓ | ✓ | ✓ | 8K |
Prices in USD per million tokens. Audio input, batch API, and knowledge cutoff aren't shown here. Check provider docs for those.
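Reading the table into a real dollar figure takes one line of arithmetic. A minimal sketch, using the GPT-5 rates from the table above ($1.25/M input, $10.00/M output); the token counts are made-up example values:

```python
def call_cost(input_tokens: int, output_tokens: int,
              input_per_m: float, output_per_m: float) -> float:
    """Cost of one API call in USD, given per-million-token rates."""
    return (input_tokens / 1e6) * input_per_m + (output_tokens / 1e6) * output_per_m

# GPT-5 from the table: $1.25/M input, $10.00/M output.
# A call with a 3,000-token prompt and a 500-token completion:
cost = call_cost(3_000, 500, 1.25, 10.00)
print(f"${cost:.5f}")  # → $0.00875
```

Note that output tokens dominate for most models: at an 8× input/output price gap, a verbose completion costs more than a long prompt.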
How to choose: price is half the story
The cheapest model usually isn't the right one. If a model costs half as much but only gets things right 60% of the time, you're spending the savings on retries plus the humans who clean up its mess. Real-world cost is cost per correct answer, not cost per million tokens.
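The retry-and-cleanup math above can be made concrete. A sketch with hypothetical numbers (the per-call prices, accuracies, and review cost below are illustrative, not benchmark results):

```python
def cost_per_correct(price_per_call: float, accuracy: float,
                     review_cost_per_failure: float = 0.0) -> float:
    """Effective cost per correct answer: the raw price divided by the
    success rate, plus the expected human-review cost for each failure."""
    failures_per_success = (1 - accuracy) / accuracy
    return price_per_call / accuracy + failures_per_success * review_cost_per_failure

# A cheap model at $0.001/call with 60% accuracy vs. a pricier model at
# $0.004/call with 95% accuracy, with $0.05 of human review per failure:
cheap = cost_per_correct(0.001, 0.60, 0.05)
strong = cost_per_correct(0.004, 0.95, 0.05)
# cheap ≈ $0.0350, strong ≈ $0.0068 — the pricier model wins per correct answer.
```

Once cleanup labor enters the equation, the 4× sticker-price gap inverts into a 5× effective-cost advantage for the stronger model.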
A few things the table above won't tell you:
- Match the model to the task. A model that's flagship-grade at coding can be mid-tier at translation. Our SQL benchmark caught a $0.15/M model producing output 5× cleaner than a $5/M model's.
- Use prompt caching if your system prompt is stable across calls. It cuts production costs 60% to 80%. That's a bigger lever than picking a cheaper base model.
- Cascade for cost. Send each call to a small model first, escalate to a flagship only when the small one can't hack it. Almost no one does this. Why fallback chains beat single-model setups.
- Test on your data. No leaderboard predicts how a model behaves on your specific prompts. Not even ours. Pick the top 2 or 3 candidates and run 50 to 100 representative samples through each.
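The caching lever above is easy to quantify. A sketch assuming cached input tokens bill at 10% of the normal input rate (discounts vary by provider, so check your provider's actual cached-input price); the token counts and call volume are illustrative:

```python
def monthly_input_cost(system_tokens: int, user_tokens: int, calls: int,
                       input_per_m: float, cache_discount: float = 0.10):
    """Input spend with and without prompt caching, in USD.

    Assumes cached tokens bill at `cache_discount` × the normal input rate."""
    per_call_plain = (system_tokens + user_tokens) / 1e6 * input_per_m
    per_call_cached = (system_tokens * cache_discount + user_tokens) / 1e6 * input_per_m
    return per_call_plain * calls, per_call_cached * calls

# An 8K-token stable system prompt, 500-token user turns, 1M calls/month,
# at $1.25/M input (GPT-5's rate from the table):
plain, cached = monthly_input_cost(8_000, 500, 1_000_000, 1.25)
# plain = $10,625, cached = $1,625 — roughly an 85% cut in input spend.
```

The longer and more stable your system prompt relative to the user turn, the closer savings approach the full cache discount.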
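The cascade pattern is a few lines of routing logic. A minimal sketch: `call_small`, `call_flagship`, and the quality gate are placeholders for your actual client calls and acceptance check, not any real API:

```python
from typing import Callable

def cascade(prompt: str,
            call_small: Callable[[str], str],
            call_flagship: Callable[[str], str],
            is_good_enough: Callable[[str], bool]) -> tuple[str, str]:
    """Try the cheap model first; escalate only when its answer fails the gate."""
    answer = call_small(prompt)
    if is_good_enough(answer):
        return answer, "small"
    return call_flagship(prompt), "flagship"

# Toy usage with stub models and a gate that rejects empty answers:
result, used = cascade(
    "Translate 'bonjour' to English.",
    call_small=lambda p: "hello",
    call_flagship=lambda p: "Hello.",
    is_good_enough=lambda a: bool(a.strip()),
)
# → ("hello", "small"): the flagship is never called, and never billed.
```

The gate is the hard part in practice: a length or format check is cheap, while a grader model adds cost but catches subtler failures. Either way, most traffic stays on the small model's prices.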
For the full framework, read How to choose an LLM in 2026: the definitive guide.