LLM API Pricing in 2026: Live Numbers from Every Provider

What every major LLM costs in 2026 and what it can actually do, in one table. GPT-5.5, Claude Opus 4.7, Gemini 2.5 Pro, Llama 4, DeepSeek, Grok, plus the smaller variants worth shipping. Pricing syncs daily from provider APIs. We're LLMTest, the AI proxy that benchmarks all of them on real tasks so you don't have to.

Verified 2026-05-02 · Pricing synced daily · Need capability data? See the LLM capabilities matrix.

Don't pick the perfect model. Ship it rough.

LLMTest is an AI proxy. On every call, we auto-pick the cheapest model that hits your quality bar. We also rewrite weak prompts, handle fallbacks when an API goes down, and run weekly benchmarks across 340+ models so we know what's actually working right now. Drop it in once. Ship features instead of comparing pricing tables.
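One of the behaviors described above, falling back when an API goes down, is easy to sketch in plain Python. The model IDs and the `call_model` stub below are illustrative stand-ins, not LLMTest's actual API:

```python
# Sketch of a fallback chain: try models in order, move on when one fails.
# call_model is a stand-in for a real provider API call; here it simulates
# an outage on the first model.
def call_model(model: str, prompt: str) -> str:
    if model == "openai/gpt-5":  # simulated outage
        raise TimeoutError(f"{model} unavailable")
    return f"{model} answered: {prompt}"

def complete_with_fallback(prompt: str, chain: list[str]) -> str:
    errors = []
    for model in chain:
        try:
            return call_model(model, prompt)
        except Exception as e:  # a real chain would catch specific error types
            errors.append((model, repr(e)))
    raise RuntimeError(f"all models failed: {errors}")

# Falls through to the second model when the first times out.
print(complete_with_fallback("hello", ["openai/gpt-5", "anthropic/claude-sonnet-4.6"]))
```

A production version would also distinguish retryable errors (timeouts, 429s) from permanent ones (invalid request) before escalating down the chain.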

Start optimizing
| Model | Tier | Provider | Model ID | Input $/M | Output $/M | Context | Max Out |
|---|---|---|---|---|---|---|---|
| GPT-5.5 | Flagship | OpenAI | openai/gpt-5.5 | $5.00 | $30.00 | 1.1M | 16K |
| GPT-5 | Flagship | OpenAI | openai/gpt-5 | $1.25 | $10.00 | 400K | 16K |
| GPT-4.1 | Flagship | OpenAI | openai/gpt-4.1 | $2.00 | $8.00 | 1M | 16K |
| Claude Opus 4.7 | Flagship | Anthropic | anthropic/claude-opus-4.7 | $5.00 | $25.00 | 1M | 8K |
| Claude Opus 4 | Flagship | Anthropic | anthropic/claude-opus-4 | $15.00 | $75.00 | 200K | 8K |
| Gemini 2.5 Pro | Flagship | Google | google/gemini-2.5-pro | $1.25 | $10.00 | 1M | 66K |
| Grok 4 | Flagship | xAI | x-ai/grok-4 | $3.00 | $15.00 | 256K | 8K |
| Sonar Pro | Flagship | Perplexity | perplexity/sonar-pro | $3.00 | $15.00 | 200K | 8K |
| o3 | Reasoning | OpenAI | openai/o3 | $2.00 | $8.00 | 200K | 100K |
| o3-mini | Reasoning | OpenAI | openai/o3-mini | $1.10 | $4.40 | 200K | 100K |
| GPT-5 Mini | Mid | OpenAI | openai/gpt-5-mini | $0.25 | $2.00 | 400K | 8K |
| GPT-4o | Mid | OpenAI | openai/gpt-4o | $2.50 | $10.00 | 128K | 16K |
| Claude Sonnet 4.6 | Mid | Anthropic | anthropic/claude-sonnet-4.6 | $3.00 | $15.00 | 1M | 8K |
| Claude Sonnet 4 | Mid | Anthropic | anthropic/claude-sonnet-4 | $3.00 | $15.00 | 1M | 8K |
| Gemini 2.5 Flash | Mid | Google | google/gemini-2.5-flash | $0.30 | $2.50 | 1M | 66K |
| Mistral Medium 3 | Mid | Mistral | mistralai/mistral-medium-3 | $0.40 | $2.00 | 131K | 8K |
| GPT-5 Nano | Small | OpenAI | openai/gpt-5-nano | $0.05 | $0.40 | 400K | 4K |
| GPT-4o Mini | Small | OpenAI | openai/gpt-4o-mini | $0.15 | $0.60 | 128K | 16K |
| Claude Haiku 4.5 | Small | Anthropic | anthropic/claude-haiku-4.5 | $1.00 | $5.00 | 200K | 8K |
| Gemini 2.5 Flash Lite | Small | Google | google/gemini-2.5-flash-lite | $0.10 | $0.40 | 1M | 8K |

Prices in USD per million tokens. Capability flags (vision, tool use, JSON mode, prompt caching), audio input, batch API pricing, and knowledge cutoffs aren't shown here. See the LLM capabilities matrix or provider docs for those.
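Turning the table's per-million rates into per-call dollars is one line of arithmetic; a small helper makes it repeatable. Prices below are GPT-5's from the table as of the sync date above; check the live table before relying on them:

```python
def call_cost(input_tokens: int, output_tokens: int,
              input_per_m: float, output_per_m: float) -> float:
    """Cost in USD for one call, given per-million-token prices."""
    return input_tokens / 1e6 * input_per_m + output_tokens / 1e6 * output_per_m

# GPT-5 from the table: $1.25/M input, $10.00/M output.
# A call with 3,000 prompt tokens and 800 completion tokens:
print(call_cost(3000, 800, 1.25, 10.00))  # 0.01175 — just over a cent
```

Note that output tokens dominate at these prices: 800 completion tokens cost more than twice the 3,000-token prompt.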

How to choose: price is half the story

The cheapest model usually isn't the right one. If a model costs half as much but only gets things right 60% of the time, you're spending the savings on retries plus the humans who clean up its mess. Real-world cost is cost per correct answer, not cost per million tokens.
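That framing is easy to make concrete. In the simple case where every wrong answer must be fixed by a human, effective cost is the call price plus the expected cleanup cost. All numbers below are made-up illustrations, not benchmark results:

```python
def effective_cost(cost_per_call: float, accuracy: float, cleanup_cost: float) -> float:
    # One call per task; each wrong answer incurs a human cleanup cost.
    return cost_per_call + (1 - accuracy) * cleanup_cost

# Hypothetical numbers: $0.50 of human time per bad answer.
cheap  = effective_cost(0.001, 0.60, 0.50)  # 60% accurate at $0.001/call -> 0.201
strong = effective_cost(0.002, 0.95, 0.50)  # 95% accurate at $0.002/call -> 0.027
print(cheap, strong)
```

With those assumptions, the model that costs 2× more per call is roughly 7× cheaper per task, because cleanup dwarfs token spend.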

A few things the table above won't tell you:

  • Match the model to the task. A model that's flagship at coding can be mid-tier at translation. In our SQL benchmark, a $0.15/M model beat a $5/M model while producing 5× cleaner output.
  • Use prompt caching if your system prompt is stable across calls. It cuts production costs 60% to 80%. That's a bigger lever than picking a cheaper base model.
  • Cascade for cost. Send each call to a small model first and escalate to a flagship only when the small one can't hack it. Almost no one does this. See: Why fallback chains beat single-model setups.
  • Test on your data. No leaderboard predicts how a model behaves on your specific prompts. Not even ours. Pick the top 2 or 3 candidates and run 50 to 100 representative samples through each.
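The cascade idea above reduces to simple expected-value math: every call pays for the small model, and only the escalated fraction also pays for the flagship. The escalation rate and per-call costs here are illustrative:

```python
def cascade_cost(small_cost: float, flagship_cost: float, escalation_rate: float) -> float:
    # Every call hits the small model; a fraction escalates and
    # additionally pays the flagship price.
    return small_cost + escalation_rate * flagship_cost

# Hypothetical per-call costs: $0.0004 small model, $0.012 flagship.
always_flagship = 0.012
cascaded = cascade_cost(0.0004, 0.012, 0.20)  # 20% of calls escalate
print(f"savings: {1 - cascaded / always_flagship:.0%}")  # prints "savings: 77%"
```

The lever is the escalation rate: even at 50% escalation the cascade still roughly halves spend, which is why routing quality matters more than the exact price gap.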

For the full framework, read How to choose an LLM in 2026: the definitive guide.