AI Economy Hub

AI startup runway

How many months of runway you have at current burn plus AI costs.

Results

Runway
15.2 months
Net burn
$33,000.00
Cash
$500,000.00
AI as % of burn
17%


Frequently asked questions

1. When should I worry?

Under 9 months of runway means you're already raising or cutting. Under 6 is an emergency.

AI startup runway math: the AI tax nobody budgeted for in 2022

Traditional SaaS startups in 2022 modeled ~$8k–$15k/month in variable infra at Series A. AI-native startups in 2026 model $30k–$200k/month at the equivalent stage, dominated by LLM API and inference infrastructure. For any founder running runway scenarios, the AI cost line has moved from footnote to top-three expense and needs explicit modeling.

Typical monthly burn at each stage (AI-native startup)

Stage | Team size | Payroll | AI / infra | Other | Total burn
Pre-seed | 2–4 | $30–50k | $2–10k | $5k | ~$50–75k
Seed | 4–10 | $80–180k | $10–40k | $15k | ~$125–250k
Series A | 10–30 | $300–700k | $30–150k | $40–80k | ~$450k–1M
Series B | 30–80 | $900k–2.3M | $100–500k | $150–400k | ~$1.3–3.5M

The four AI-cost drivers you must model separately

  1. Per-user variable cost. This scales with growth. It is the most dangerous line: if you don't model it explicitly, a hot quarter burns cash faster than revenue grows.
  2. Training / fine-tuning runs. Episodic — $5k–$100k each, usually once or twice a quarter.
  3. Evals + experimentation. Fixed but real. Expect 20–30% of production LLM spend to go to internal engineering experimentation.
  4. Reserved infra commitments. GPU reservations, Anthropic/OpenAI enterprise commits — usually 12-month obligations that can't flex with usage.
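A minimal sketch of how these four drivers can be kept as separate line items in a monthly model, rather than one flat number. Every default figure below is a placeholder assumption, not a benchmark:

```python
def monthly_ai_cost(active_users: int,
                    cost_per_user: float = 1.20,       # 1. per-user variable cost (assumed)
                    runs_per_quarter: int = 1,         # 2. training / fine-tune runs
                    cost_per_run: float = 30_000.0,    #    (assumed per-run cost)
                    evals_fraction: float = 0.25,      # 3. evals as share of variable spend
                    reserved_commits: float = 5_000.0  # 4. fixed GPU/API commits per month
                    ) -> dict:
    """Monthly AI budget with the four drivers kept as separate lines."""
    variable = active_users * cost_per_user
    training = runs_per_quarter * cost_per_run / 3     # amortized across the quarter
    evals = evals_fraction * variable
    total = variable + training + evals + reserved_commits
    return {"variable": variable, "training": training,
            "evals": evals, "reserved": reserved_commits, "total": total}

# 10,000 active users under these placeholder assumptions:
print(monthly_ai_cost(10_000)["total"])  # → 30000.0
```

The point of the structure: when usage doubles, only `variable` and `evals` move; `training` and `reserved` don't, which is exactly the distinction a flat burn number hides.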

Runway scenarios that move the decision

  • Best case: Growth track closes Series A/B before cash runs out. 18+ months of runway.
  • Base case: 15 months. Enough to hit a milestone, but a soft quarter shrinks it fast.
  • Danger zone: <12 months. Either cut now, raise on weak metrics, or bridge with a convertible.
  • <6 months: You are fundraising, not operating.
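These cutoffs reduce to two lines of arithmetic; the example numbers match the results shown at the top of this page:

```python
def runway_months(cash: float, net_burn: float) -> float:
    """Months of runway at a constant monthly net burn."""
    return cash / net_burn

def runway_status(months: float) -> str:
    """Bucket a runway figure into the scenarios above."""
    if months < 6:
        return "fundraising, not operating"
    if months < 12:
        return "danger zone"
    if months < 18:
        return "base case"
    return "best case"

# The worked example at the top of the page: $500k cash, $33k/mo net burn.
months = runway_months(500_000, 33_000)
print(round(months, 1), runway_status(months))  # → 15.2 base case
```

Note that `runway_months` assumes burn stays flat; a soft quarter raises `net_burn` and shrinks the answer, which is why the base case above carries a warning.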

Where AI-native startups most often under-plan

  1. Scaling with a loss-leader plan. Launching at $20/month to win share, then discovering gross margin is negative at P80 usage. Fix: usage caps + transparency.
  2. Fine-tune dependency without budget. Shipping a product that requires a monthly fine-tune pass on growing data — easy to budget at $8k/run, harder when it's $50k/run 9 months later.
  3. Ignoring eval + observability cost. Langfuse, Braintrust, Helicone, or custom — budget $1k–$10k/mo at seed and $10k–$50k/mo at Series A for this line.
  4. Counting on price drops. Model prices did drop 3–10× between 2022 and 2026. Betting on another 10× drop to rescue unit economics is a coin flip.

Runway-extending moves that actually work

  • Prompt caching + model routing: 30–60% API cost reduction in 2–4 weeks of engineering.
  • Self-host bulk workloads on Together or Modal, or on rented L40S GPUs.
  • Switch from Opus/GPT-5 to Sonnet/GPT-5-mini with evals-validated quality.
  • Eliminate stuck / unused seats and consolidate tool stack (20–35% SaaS line reduction).
  • Negotiate Anthropic / OpenAI volume commits for 10–20% rate discounts.
  • If customer acquisition cost is healthy, borrow against revenue (Pipe, Capchase, Arc) before dilutive raise.

Three burn scenarios with real token math

Abstract burn numbers are useless. Model the AI line against concrete usage. These three stages are drawn from actual Series Seed / Series A plans run in the last two quarters.

Scenario 1: Support chatbot product, seed stage, 250,000 requests/month

One anchor customer plus a closed-beta cohort driving 250k requests/mo total. Per request: 2,350 input + 280 output tokens on Claude Sonnet 4.5 ($3/$15). Uncached: 587.5M × $3 + 70M × $15 = $1,762 + $1,050 = $2,812/mo. Turn on Anthropic prompt caching on the 800-token system prefix (90% cache-read discount, ~73% hit rate) and the bill drops to roughly $1,657. Route FAQ-style intents (65% of traffic) to Haiku 4 ($0.80/$4): final bill $1,062/mo. Add $300/mo Langfuse + $200/mo Pinecone = ~$1,600/mo AI-line at seed stage. If the next six months grow traffic 10×, the AI line becomes $16k/mo — plan for it now.
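The uncached figure above can be checked with a one-line cost formula (prices are per million tokens; the cached and routed figures additionally depend on hit rates and traffic mix, so they are not reproduced here):

```python
def api_cost_usd(requests: int, in_tokens: int, out_tokens: int,
                 in_price_per_m: float, out_price_per_m: float) -> float:
    """Monthly API cost from per-request token counts and $/M-token prices."""
    return (requests * in_tokens / 1e6) * in_price_per_m \
         + (requests * out_tokens / 1e6) * out_price_per_m

# Scenario 1 uncached, Claude Sonnet 4.5 at $3/$15 per million tokens:
print(api_cost_usd(250_000, 2_350, 280, 3, 15))  # → 2812.5
```

The same formula reproduces Scenario 2's uncached number ($1,495.50 for 50k queries at 7,220 input / 550 output tokens), which is a quick way to audit any per-request plan before layering on caching assumptions.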

Scenario 2: Internal RAG assistant, Series A, 50,000 queries/month

A 1,200-employee customer running a policy-docs assistant. Per query: 3,200-token system prompt + 3,900 tokens of retrieved chunks + 120-token user + 550-token response. Uncached Sonnet 4.5: $1,496/mo. With 92% cache hit on the system prefix during business hours: $1,108/mo. Add Cohere Rerank 3.5 ($50/mo), Pinecone ($700/mo), Langfuse ($400/mo): AI line ~$2,250/mo for this workload. Model it per-customer and multiply by the ARR plan.

Scenario 3: Code assistant, 10 devs × 40 queries/day

8,800 queries/mo × 5,600 input + 900 output on Sonnet 4.5 = $267/mo. Add an occasional Opus 4.1 escalation (<5% of queries) and you land at $320/mo. Scale to 500 devs and the same pattern produces $16k/mo. The lever that actually matters: if Opus creeps from 5% to 15% of queries, the bill triples.

Cost levers with math

  • Anthropic prompt cache (90% read discount). A 1,000-token system prompt at 200,000 queries per month generates 200M cached-read tokens — $60 at $0.30/M vs $600 uncached at $3/M. Free 10× savings on the static prefix.
  • OpenAI automatic cache (50% discount). GPT-5 drops from $5 to $2.50/M on matching prefix tokens over 1,024 in length. No code change required.
  • Gemini explicit context cache (75% discount). Gemini 2.5 Pro drops from $1.25 to roughly $0.31/M on cached input. Worth setting up for long-context RAG.
  • Batch APIs (50% off, up to 24h latency). For eval runs, offline tagging, nightly enrichment — always on. For user-facing traffic — never.
  • Model routing. A Haiku-first router with escalation to Sonnet saves 55–75% on typical support workloads with less than 3 percentage points of quality loss.
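The first bullet's cache math is worth checking yourself, since the static-prefix volume is what makes caching worthwhile. This sketch ignores cache-write surcharges and assumes every read hits, so treat it as a best-case bound:

```python
def prefix_cost_usd(queries: int, prefix_tokens: int, price_per_m: float) -> float:
    """Monthly cost of (re)sending a static prompt prefix at a given $/M rate."""
    return queries * prefix_tokens / 1e6 * price_per_m

# 1,000-token static prefix, 200k queries/mo: $3/M uncached vs $0.30/M cache reads.
print(prefix_cost_usd(200_000, 1_000, 3.0))   # → 600.0
print(prefix_cost_usd(200_000, 1_000, 0.30))  # assumes a 100% cache-hit rate
```

Real bills land between the two numbers: misses pay the full (or surcharged) write rate, so multiply the savings by your observed hit rate before putting it in the runway model.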

Model selection rules at each burn tier

  • Pre-seed: Default to Sonnet 4.5. Do not optimize for cost; optimize for product velocity. The delta between "it works" and "doesn't work" is worth ten times whatever you save switching to Haiku.
  • Seed, >$2k/mo spend: Add Haiku 4 for classifiers and routers. Turn on prompt caching everywhere. Build a one-day router that sends 60–70% of traffic to Haiku.
  • Series A, >$20k/mo spend: Multi-provider fallback. Negotiate Anthropic and OpenAI volume commits for 10–20% rate discounts. Consider Gemini 2.5 Flash for bulk enrichment workloads.
  • Series B, >$100k/mo spend: Evaluate self-hosting Llama 4 70B or Qwen 3 on Modal/Together for the bulk workload. Break-even is typically 10–40M output tokens/mo.
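The Series B break-even point can be sanity-checked with a toy comparison of a flat self-hosting bill against API output pricing. All inputs here are placeholder assumptions, not quotes; plug in your own reserved-capacity cost and blended API rate:

```python
def breakeven_output_mtokens(monthly_selfhost_cost: float,
                             api_price_per_m: float,
                             selfhost_marginal_per_m: float) -> float:
    """Output volume (millions of tokens/mo) above which a flat self-hosting
    bill beats pure API pricing. All arguments are assumptions."""
    return monthly_selfhost_cost / (api_price_per_m - selfhost_marginal_per_m)

# e.g. an assumed $500/mo of reserved capacity vs $15/M API output tokens,
# with ~$0.50/M assumed marginal self-host serving cost:
print(round(breakeven_output_mtokens(500, 15.0, 0.50)))  # → 34
```

The placeholder inputs land inside the 10–40M tokens/mo range quoted above; a larger GPU bill or a cheaper API model pushes the break-even point out fast, which is why this is a Series B exercise rather than a seed one.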

Production patterns that preserve runway

Wrap every provider call in a retry budget (3–5 attempts, absolute token ceiling) so a malformed prompt cannot loop 50 times and eat $80. Put a circuit breaker on upstream providers — trip at 20% error rate over a 2-minute window and fail over to the secondary. Maintain a fallback chain (Sonnet 4.5 → GPT-5 → Haiku 4 + simplified prompt → static "email us"). Enforce per-tenant monthly token caps; one runaway B2B customer can burn your month's budget in 48 hours without one. Every AI line in the runway model should be the sum of a fixed monthly floor (evals, observability, reserved commits) and a variable cost-per-active-user that scales with growth — not a single flat number the CFO later discovers was a three-month-old estimate.
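A minimal sketch of the retry budget and per-tenant cap described above. The provider call is a stub and the 5,000-token per-attempt estimate is an assumption; a real implementation would meter actual token usage from provider responses:

```python
class TenantBudget:
    """Per-tenant monthly token cap."""
    def __init__(self, monthly_cap_tokens: int):
        self.cap = monthly_cap_tokens
        self.used = 0

    def try_spend(self, tokens: int) -> bool:
        """Reserve tokens against this tenant's monthly cap."""
        if self.used + tokens > self.cap:
            return False
        self.used += tokens
        return True

def call_with_budget(call, budget: TenantBudget,
                     max_attempts: int = 3,
                     max_call_tokens: int = 20_000,
                     est_tokens_per_attempt: int = 5_000):
    """Retry at most max_attempts times under an absolute token ceiling,
    so a malformed prompt cannot loop and eat the month's budget."""
    spent = 0
    for _ in range(max_attempts):
        if spent + est_tokens_per_attempt > max_call_tokens:
            break
        if not budget.try_spend(est_tokens_per_attempt):
            raise RuntimeError("tenant over monthly token cap")
        spent += est_tokens_per_attempt
        try:
            return call()
        except Exception:
            continue  # transient provider error: retry within budget
    raise RuntimeError("retry budget exhausted")
```

The circuit breaker and fallback chain sit one layer above this: when `call` trips the breaker, the same wrapper is pointed at the next provider in the chain, with the tenant cap shared across all of them.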

Frequently asked questions

How much runway should I target at each stage? 18+ months at seed, 24+ at Series A, 30+ at Series B. AI-native fundraising cycles are longer because unit economics questions take longer to answer; build in buffer.

Is it ever right to over-spend on model cost early? Yes. For the first 90 days post-launch, defaulting to the strongest model available removes one variable from debugging. Cost-optimize in month 4+ once you know the product works.

Should I model model-price cuts into my plan? Budget as if prices stay flat. Sonnet 4.5 is 40% cheaper than its 3.7 predecessor was a year ago, but betting runway on another cut is gambling.

What are typical provider commit rates? Anthropic enterprise commits in 2026 offer 10–20% discounts on $10k+/mo commits. OpenAI similar. Both require 6–12 month terms. Only commit if your traffic is stable.

What line does the board actually care about? Variable cost per paying customer, trended over 6 months. Flat or declining is healthy; rising is an existential problem regardless of absolute burn.

How do I budget fine-tune runs? Model each run as a one-off capex line. A full fine-tune pass on Llama 4 70B with 5B tokens runs $15k-$40k on Together depending on hardware. Assume one run per quarter at Series A scale.

Does observability really cost $10-50k/mo at Series A? It can, if you are running millions of traces. Langfuse Cloud is roughly $0.0002/trace; 50M traces/mo is $10k. Helicone and Braintrust are comparable. Rolling your own into Datadog is a one-time engineering sprint.

When should I hire a dedicated AI ops engineer? Around $20-40k/mo AI spend or 10+ production features on LLMs. Before that, distribute the work across existing engineers.
