AI Economy Hub

2026 AI Pricing Cheat Sheet

Updated April 2026

Every major AI model on one page — text, image, voice, video. Verified against provider public pricing pages. Re-checked monthly. From AI Economy Hub.

Text models — $ per million tokens

Model | Input | Output | Context | Pick when
Claude Opus 4.7 | $15.00 | $75.00 | 200k | Reasoning, hard agent loops
Claude Sonnet 4.5 | $3.00 | $15.00 | 200k | Default workhorse; cache drops input to $0.30
Claude Haiku 4 | $0.80 | $4.00 | 200k | Classifier / router tier
GPT-5 | $5.00 | $20.00 | 400k | Tool use + structured output king
GPT-5 mini | $0.40 | $1.60 | 400k | Direct Haiku competition
GPT-4o | $2.50 | $10.00 | 128k | Still the default for many mid-tier flows
o4 (reasoning) | $15.00 | $60.00 | 200k | Best-in-class on math + planning
Gemini 2.5 Pro | $1.25 | $10.00 | 1M | Cheap long-context; uneven on hard reasoning
Gemini 2.5 Flash | $0.15 | $0.60 | 1M | Throughput champ for bulk classification
Mistral Large 2 | $2.00 | $6.00 | 128k | EU-hosted; strong on multilingual
Grok 3 (xAI) | $3.00 | $15.00 | 131k | Realtime web + X integration
Cohere Command R+ | $2.50 | $10.00 | 128k | Enterprise RAG; strong on grounded answers
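
To turn the per-million prices above into a per-request figure, multiply each token count by its rate and divide by one million. The following is a minimal Python sketch: the `PRICES` dict copies three rows from the table, while the function name `request_cost` and the token counts are illustrative assumptions, not part of any provider SDK.

```python
# USD per million tokens (input, output), copied from the table above.
PRICES = {
    "Claude Sonnet 4.5": (3.00, 15.00),
    "GPT-5 mini": (0.40, 1.60),
    "Gemini 2.5 Flash": (0.15, 0.60),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the list-price USD cost of one request (no cache or batch discount)."""
    input_rate, output_rate = PRICES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 2,000 input tokens and 500 output tokens.
    for name in PRICES:
        print(f"{name}: ${request_cost(name, 2_000, 500):.6f}")
```

For that hypothetical request, the Sonnet row works out to $0.0135, of which $0.0075 is output, even though output is only a fifth of the tokens.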

Image — per generation

Model | Price | Strength
Midjourney v7 | $0.020/image (Pro) | Best-in-class style, no API
DALL·E 4 (OpenAI) | $0.040/image (1024×1024 HD) | Strong API, strict policy filter
Flux 1.1 Pro | $0.040/image | Best open-weights flagship via Replicate/Together
Stable Diffusion 4 | $0.005–0.020/image (API) | Cheapest at scale; self-host saves more
Imagen 4 (Google) | $0.030/image | Cleanest text-in-image, strong realism

Voice (TTS) — per 1k characters

Model | Price | Strength
ElevenLabs v3 | $0.18/1k chars (Creator) | Best voice cloning + emotion
OpenAI TTS HD | $0.030/1k chars ($30/M chars) | Cheap, lifelike, no clone
PlayHT 3.0 | $0.012/1k chars | Good clone, real-time API
Cartesia Sonic | $0.05/1k chars | Sub-300ms latency, real-time
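
The voice prices mix units ($ per 1k characters and $ per million characters), so it helps to normalize before comparing. The sketch below converts everything to $ per 1k characters and prices a script of a given length. The 60,000-character script is a hypothetical workload, and `narration_cost` is an illustrative helper name.

```python
# USD per 1,000 characters, from the table above.
# OpenAI TTS HD is listed at $30 per million characters, i.e. 30 / 1000 per 1k.
TTS_PRICE_PER_1K = {
    "ElevenLabs v3": 0.18,
    "OpenAI TTS HD": 30 / 1000,
    "PlayHT 3.0": 0.012,
    "Cartesia Sonic": 0.05,
}


def narration_cost(model: str, characters: int) -> float:
    """Return the USD cost of synthesizing `characters` characters of text."""
    return characters / 1000 * TTS_PRICE_PER_1K[model]


if __name__ == "__main__":
    script_length = 60_000  # hypothetical script length in characters
    for name in TTS_PRICE_PER_1K:
        print(f"{name}: ${narration_cost(name, script_length):.2f}")
```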

Video — per generated clip

Model | Price | Strength
OpenAI Sora 2 | $1.20–$3.00 / 10s | Best fidelity, 10–20s clips
Runway Gen-4 | $0.50–$1.00 / 5s | Director-grade controls
Pika 2.0 | $0.20–$0.40 / 5s | Cheap social-video factory
Kling 2.0 | $0.25–$0.60 / 5s | Best motion realism on humans
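
Because Sora is quoted per 10 seconds and the others per 5 seconds, the ranges are easier to compare per second of output. The sketch below does that normalization using the figures in the table. It assumes price scales linearly with clip length, which the table does not confirm.

```python
# (low USD, high USD, clip length in seconds), copied from the table above.
VIDEO_PRICES = {
    "OpenAI Sora 2": (1.20, 3.00, 10),
    "Runway Gen-4": (0.50, 1.00, 5),
    "Pika 2.0": (0.20, 0.40, 5),
    "Kling 2.0": (0.25, 0.60, 5),
}


def per_second(model: str) -> tuple[float, float]:
    """Return the (low, high) USD price per second of generated video."""
    low, high, seconds = VIDEO_PRICES[model]
    return low / seconds, high / seconds


if __name__ == "__main__":
    for name in VIDEO_PRICES:
        low, high = per_second(name)
        print(f"{name}: ${low:.2f}-${high:.2f} per second")
```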

Five rules that move the bill

  1. Output tokens typically cost 4–5× input (3–8× across the text models above), so shorter responses save serious money.
  2. Prompt caching on Anthropic and OpenAI cuts up to 90% off the input cost of repeated context (system prompts, RAG).
  3. Batch APIs (OpenAI, Anthropic, Gemini) give a flat 50% discount for non-interactive jobs.
  4. Fine-tuning rarely beats a well-cached system prompt + retrieval until you exceed ~5M monthly calls.
  5. Embedding cost is a rounding error; vector-DB hosting is not. At scale, Pinecone serverless costs more than self-hosted pgvector.
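
Rules 1 to 3 can be checked with a few lines of arithmetic. The sketch below is a simplified model, not a provider billing formula. It assumes every call hits the cache, ignores any cache-write surcharge, and applies the batch discount to the whole bill. The workload (100,000 calls, a 4,000-token shared prefix, 500 fresh input tokens, 300 output tokens) is hypothetical, and the rates are the Claude Sonnet 4.5 row from the text table.

```python
def monthly_cost(
    calls: int,
    prefix_tokens: int,    # repeated context (system prompt, RAG) per call
    fresh_tokens: int,     # non-repeated input tokens per call
    output_tokens: int,    # generated tokens per call
    input_rate: float,     # USD per million input tokens
    output_rate: float,    # USD per million output tokens
    cache_discount: float = 0.0,  # e.g. 0.9 for "up to 90% off" (assumed always hit)
    batch_discount: float = 0.0,  # e.g. 0.5 for the flat batch discount
) -> float:
    """Return the estimated USD bill for `calls` identical requests."""
    prefix = prefix_tokens * input_rate * (1.0 - cache_discount)
    fresh = fresh_tokens * input_rate
    output = output_tokens * output_rate
    per_call = (prefix + fresh + output) / 1_000_000
    return calls * per_call * (1.0 - batch_discount)


if __name__ == "__main__":
    workload = dict(calls=100_000, prefix_tokens=4_000, fresh_tokens=500,
                    input_rate=3.00, output_rate=15.00)
    scenarios = {
        "list price": monthly_cost(output_tokens=300, **workload),
        "half the output": monthly_cost(output_tokens=150, **workload),
        "cached prefix": monthly_cost(output_tokens=300, cache_discount=0.9, **workload),
        "batch API": monthly_cost(output_tokens=300, batch_discount=0.5, **workload),
    }
    for label, cost in scenarios.items():
        print(f"{label}: ${cost:,.2f}")
```

Under these assumptions the list-price bill is $1,800. Halving the output brings it to $1,575, the batch discount to $900, and caching the prefix to $720. Whether cache and batch discounts stack depends on the provider and is not modeled here.
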
© 2026 AI Economy Hub · https://aieconomyhub.co · Sources: provider public pricing pages, verified April 2026.