AI Economy Hub

AI API pricing comparison

Per-million-token input, output, and cache prices across OpenAI, Anthropic, Google, Mistral, xAI, and Cohere.

Frequently asked questions

1. Who's actually cheapest for production in April 2026?

For bulk workloads, Gemini 3 Flash at $0.15 / $0.60 per MTok (input / output). For medium quality, Gemini 3 Pro or Sonnet 4.5. For top quality, the cheapest option per unit of quality is almost always Sonnet with prompt caching.

2. Do I get enterprise discounts?

At ~$50k/mo+ committed spend, all three of Anthropic, OpenAI, and Google will negotiate 15-30% off sticker on an annual commit. Below that, focus on caching and routing; it's cheaper than procurement.

3. How often do prices change?

Major model launches (roughly quarterly) usually include a price reset: sometimes cheaper at the same quality, occasionally higher for new top tiers. Check per-MTok pricing quarterly.

4. Is there a true cross-provider comparison tool?

OpenRouter and LiteLLM both give you one API with multiple providers behind it. Use them for A/B testing and failover, not as a sole vendor, since they add a small margin.

5. What about open-source alternatives?

Llama 4 and Qwen 3 at comparable sizes match Sonnet 4.5 on many benchmarks. Self-hosting costs $0.20-0.80 per MTok all-in on an H100 at decent utilization. Worth it above ~50M tokens/day of steady traffic, usually not below.

Every major AI API, priced (April 2026)

One table, one decision rule: the lowest sticker price rarely wins. Pick the model that clears your quality bar, then optimize cost with caching, routing, and response-length caps. Here is the April 2026 snapshot across OpenAI, Anthropic, Google, Mistral, xAI, and Cohere.

| Provider | Model | Input $/MTok | Output $/MTok | Context | Notable feature |
|---|---|---|---|---|---|
| OpenAI | GPT-5 | $5.00 | $20.00 | 400k | Best structured output / tool use |
| OpenAI | GPT-5 mini | $0.40 | $1.60 | 400k | Cheap OpenAI tier |
| OpenAI | o4 | $12.00 | $48.00 | 200k | Reasoning model (slow, accurate) |
| Anthropic | Claude Opus 4.7 | $15.00 | $75.00 | 200k | Top coding + long agent |
| Anthropic | Claude Sonnet 4.5 | $3.00 | $15.00 | 200k | Production default |
| Anthropic | Claude Haiku 4 | $0.80 | $4.00 | 200k | Router / classifier |
| Google | Gemini 3 Pro | $1.25 | $10.00 | 2M | 2M context, multimodal |
| Google | Gemini 3 Flash | $0.15 | $0.60 | 1M | Cheapest production model |
| Mistral | Mistral Large 3 | $2.00 | $6.00 | 128k | EU-hosted, multilingual |
| xAI | Grok 4 | $3.00 | $15.00 | 256k | Fresh data, X integration |
| Cohere | Command R+ | $2.50 | $10.00 | 128k | RAG-optimized + private deploy |

The real unit cost is not sticker

Three adjustments turn sticker into unit cost, and the adjustments are often larger than the spread between models:

  1. Prompt caching. Claude cache reads are 10% of input price. A 6,000-token system prompt cached at a 75% hit rate drops the effective input cost of that prefix by 65-70%. Every major provider now offers caching; most teams forget to turn it on.
  2. Output token ratio. Output tokens cost 4-5× input for most models in the table above. A feature that returns 1,200 tokens when 300 would do is paying 4× too much for output. Cap max_tokens aggressively.
  3. Retry rate. Schema validation failures, "please try again" wrappers, and agent loops can silently multiply your effective cost by 2-3×. Measure retries as a first-class metric.
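
The three adjustments above can be folded into one rough unit-cost estimate. The sketch below is an illustration under stated assumptions, not any provider's billing formula: the function name `effective_cost_per_call` and its parameters are hypothetical, cache reads are assumed to cost 10% of the input price, cache-write surcharges are ignored, and retries are modeled as a flat multiplier on the whole call.

```python
def effective_cost_per_call(
    input_price,               # sticker $ per million input tokens
    output_price,              # sticker $ per million output tokens
    cached_tokens,             # prompt tokens eligible for caching (e.g. system prompt)
    fresh_tokens,              # prompt tokens that are never cached (e.g. user turn)
    output_tokens,             # tokens returned per call
    cache_hit_rate=0.0,        # fraction of calls that hit the cache
    cache_read_discount=0.10,  # assumption: cache reads billed at 10% of input price
    retry_multiplier=1.0,      # average attempts per successful call
):
    """Rough dollars per successful call; ignores cache-write surcharges."""
    # Cached prefix: hits pay the discounted rate, misses pay full price.
    cached_rate = cache_hit_rate * cache_read_discount + (1 - cache_hit_rate)
    input_cost = (cached_tokens * cached_rate + fresh_tokens) * input_price / 1e6
    output_cost = output_tokens * output_price / 1e6
    return (input_cost + output_cost) * retry_multiplier


# Sonnet 4.5 sticker prices from the table: $3 in, $15 out per MTok.
baseline = effective_cost_per_call(3.00, 15.00, 6000, 0, 1200)
tuned = effective_cost_per_call(3.00, 15.00, 6000, 0, 300, cache_hit_rate=0.75)
print(f"baseline ${baseline:.4f} per call, tuned ${tuned:.4f} per call")
```

With these inputs the cached prefix is billed at 32.5% of sticker (0.75 × 0.10 + 0.25), a 67.5% reduction that falls inside the 65-70% range quoted above. Passing `retry_multiplier=2.0` shows how quickly retries erase those savings.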

Decision rule for picking a provider

  • Already on AWS / enterprise deal: Claude via Bedrock is often the shortest path.
  • EU-only data residency: Mistral Large 3 on La Plateforme or Gemini on Vertex EU.
  • On-prem / private-cloud: Cohere Command R+ (AWS/Oracle) or self-hosted Llama 4 / Qwen 3.
  • Multimodal (video / audio / charts): Gemini 3 Pro is the strongest multimodal tier.
  • Bulk classification / ETL: Gemini 3 Flash or GPT-5 mini.
  • Everything else: Sonnet 4.5 with an Opus 4.7 escalator.
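
The decision rule above can be written as a first-match lookup. This is a minimal sketch: the function `pick_provider` and its flag names are hypothetical, and the order in which the checks are evaluated is an assumption, since the list does not say which constraint wins when several apply.

```python
def pick_provider(on_aws=False, eu_only=False, on_prem=False,
                  multimodal=False, bulk=False):
    """Return the first matching recommendation, checked in the listed order.

    The precedence between rules is an assumption of this sketch.
    """
    if on_aws:
        return "Claude via Bedrock"
    if eu_only:
        return "Mistral Large 3 on La Plateforme or Gemini on Vertex EU"
    if on_prem:
        return "Cohere Command R+ or self-hosted Llama 4 / Qwen 3"
    if multimodal:
        return "Gemini 3 Pro"
    if bulk:
        return "Gemini 3 Flash or GPT-5 mini"
    # No special constraint: fall through to the default recommendation.
    return "Sonnet 4.5 with an Opus 4.7 escalator"


print(pick_provider(bulk=True))
print(pick_provider())
```

Hard constraints such as data residency are checked before workload type here because they usually rule options out entirely; reorder the checks if your priorities differ.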

Per-workload cost benchmarks

Typical monthly spend for three common workloads, running the cheapest production-grade pick in each family (assumes caching on):

| Workload | Daily calls | Sonnet 4.5 | GPT-5 | Gemini 3 Pro |
|---|---|---|---|---|
| Support chatbot (2k in, 400 out) | 5,000 | ~$1,350 | ~$2,100 | ~$850 |
| Coding assistant (10k in, 1.5k out) | 2,000 | ~$1,800 | ~$3,000 | ~$1,300 |
| Bulk extraction (3k in, 200 out) | 50,000 | ~$1,700 | ~$1,250 | ~$780 |
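
To sanity-check a workload of your own, it helps to compute the uncached sticker cost first and treat it as an upper bound. The sketch below does that for the support-chatbot workload using the sticker prices listed earlier; the `PRICES` dictionary, the function name `monthly_sticker_cost`, and the 30-day month are assumptions made for illustration.

```python
# Sticker prices as (input, output) in $ per MTok, copied from the pricing table.
PRICES = {
    "Claude Sonnet 4.5": (3.00, 15.00),
    "GPT-5": (5.00, 20.00),
    "Gemini 3 Pro": (1.25, 10.00),
}


def monthly_sticker_cost(model, input_tokens, output_tokens, daily_calls, days=30):
    """Uncached monthly cost in dollars; assumes a 30-day month by default."""
    input_price, output_price = PRICES[model]
    per_call = (input_tokens * input_price + output_tokens * output_price) / 1e6
    return per_call * daily_calls * days


# Support chatbot: 2k tokens in, 400 out, 5,000 calls per day.
for model in PRICES:
    cost = monthly_sticker_cost(model, 2000, 400, 5000)
    print(f"{model}: ${cost:,.0f}/month before caching")
```

For this workload the uncached figures come to about $1,800, $2,700, and $975 per month, so the gap between those and the benchmark numbers is the saving the caching assumption is contributing.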