AI Economy Hub

AI startup runway

How many months of runway you have at current burn plus AI costs.

Results

Runway
15.2 months
Net burn
$33,000.00
Cash
$500,000.00
AI as % of burn
17%


Frequently asked questions

1. When should I worry?

Under 9 months of runway means you're already raising or cutting. Under 6 is an emergency.

AI startup runway math: the AI tax nobody budgeted for in 2022

Traditional SaaS startups in 2022 modeled ~$8k–$15k/month in variable infra at Series A. AI-native startups in 2026 model $30k–$200k/month at the equivalent stage, dominated by LLM API and inference infrastructure. For any founder running runway scenarios, the AI cost line has moved from footnote to top-three expense and needs explicit modeling.

Typical monthly burn at each stage (AI-native startup)

Stage | Team size | Payroll | AI / infra | Other | Total burn
Pre-seed | 2–4 | $30–50k | $2–10k | $5k | ~$50–75k
Seed | 4–10 | $80–180k | $10–40k | $15k | ~$125–250k
Series A | 10–30 | $300–700k | $30–150k | $40–80k | ~$450k–1M
Series B | 30–80 | $900k–2.3M | $100–500k | $150–400k | ~$1.3–3.5M

The four AI-cost drivers you must model separately

  1. Per-user variable cost. This scales with growth. It is the most dangerous line: if you don't model it explicitly, a hot quarter burns cash faster than revenue grows.
  2. Training / fine-tuning runs. Episodic — $5k–$100k each, usually once or twice a quarter.
  3. Evals + experimentation. Fixed but real. Expect 20–30% of production LLM spend to go to internal engineering experimentation.
  4. Reserved infra commitments. GPU reservations, Anthropic/OpenAI enterprise commits — usually 12-month obligations that can't flex with usage.
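A minimal sketch of how these four drivers can be kept as separate line items in a monthly model, rather than one flat number. Every default figure below is a placeholder assumption, not a benchmark:

```python
def monthly_ai_cost(active_users: int,
                    cost_per_user: float = 1.20,       # 1. per-user variable cost (assumed)
                    runs_per_quarter: int = 1,         # 2. training / fine-tune runs
                    cost_per_run: float = 30_000.0,    #    (assumed per-run cost)
                    evals_fraction: float = 0.25,      # 3. evals as share of variable spend
                    reserved_commits: float = 5_000.0  # 4. fixed GPU/API commits per month
                    ) -> dict:
    """Monthly AI budget with the four drivers kept as separate lines."""
    variable = active_users * cost_per_user
    training = runs_per_quarter * cost_per_run / 3     # amortized across the quarter
    evals = evals_fraction * variable
    total = variable + training + evals + reserved_commits
    return {"variable": variable, "training": training,
            "evals": evals, "reserved": reserved_commits, "total": total}

# 10,000 active users under these placeholder assumptions:
print(monthly_ai_cost(10_000)["total"])  # → 30000.0
```

The point of the structure: when usage doubles, only `variable` and `evals` move; `training` and `reserved` don't, which is exactly the distinction a flat burn number hides.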

Runway scenarios that move the decision

  • Best case: Growth track closes Series A/B before cash runs out. 18+ months of runway.
  • Base case: 15 months. Enough to hit a milestone, but a soft quarter shrinks it fast.
  • Danger zone: <12 months. Either cut now, raise on weak metrics, or bridge with a convertible.
  • <6 months: You are fundraising, not operating.
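These cutoffs reduce to two lines of arithmetic; the example numbers match the results shown at the top of this page:

```python
def runway_months(cash: float, net_burn: float) -> float:
    """Months of runway at a constant monthly net burn."""
    return cash / net_burn

def runway_status(months: float) -> str:
    """Bucket a runway figure into the scenarios above."""
    if months < 6:
        return "fundraising, not operating"
    if months < 12:
        return "danger zone"
    if months < 18:
        return "base case"
    return "best case"

# The worked example at the top of the page: $500k cash, $33k/mo net burn.
months = runway_months(500_000, 33_000)
print(round(months, 1), runway_status(months))  # → 15.2 base case
```

Note that `runway_months` assumes burn stays flat; a soft quarter raises `net_burn` and shrinks the answer, which is why the base case above carries a warning.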

Where AI-native startups most often under-plan

  1. Scaling with a loss-leader plan. Launching at $20/month to win share, then discovering gross margin is negative at P80 usage. Fix: usage caps + transparency.
  2. Fine-tune dependency without budget. Shipping a product that requires a monthly fine-tune pass on growing data — easy to budget at $8k/run, harder when it's $50k/run 9 months later.
  3. Ignoring eval + observability cost. Langfuse, Braintrust, Helicone, or custom — budget $1k–$10k/mo at seed and $10k–$50k/mo at Series A for this line.
  4. Counting on price drops. Model prices did drop 3–10× between 2022 and 2026. Betting on another 10× drop to rescue unit economics is a coin flip.

Runway-extending moves that actually work

  • Prompt caching + model routing: 30–60% API cost reduction in 2–4 weeks of engineering.
  • Self-host bulk workloads on Together or Modal, or on rented L40S GPUs.
  • Switch from Opus/GPT-5 to Sonnet/GPT-5-mini with evals-validated quality.
  • Eliminate stuck / unused seats and consolidate tool stack (20–35% SaaS line reduction).
  • Negotiate Anthropic / OpenAI volume commits for 10–20% rate discounts.
  • If customer acquisition cost is healthy, borrow against revenue (Pipe, Capchase, Arc) before dilutive raise.

Three burn scenarios with real token math

Abstract burn numbers are useless. Model the AI line against concrete usage. These three stages are drawn from actual Series Seed / Series A plans run in the last two quarters.

Scenario 1: Support chatbot product, seed stage, 250,000 requests/month

One anchor customer plus a closed-beta cohort driving 250k requests/mo total. Per request: 2,350 input + 280 output tokens on Claude Sonnet 4.5 ($3/$15). Uncached: 587.5M × $3 + 70M × $15 = $1,762 + $1,050 = $2,812/mo. Turn on Anthropic prompt caching on the 800-token system prefix (90% cache-read discount, ~73% hit rate) and the bill drops to roughly $1,657. Route FAQ-style intents (65% of traffic) to Haiku 4 ($0.80/$4): final bill $1,062/mo. Add $300/mo Langfuse + $200/mo Pinecone = ~$1,600/mo AI-line at seed stage. If the next six months grow traffic 10×, the AI line becomes $16k/mo — plan for it now.
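The uncached figure above can be checked with a one-line cost formula (prices are per million tokens; the cached and routed figures additionally depend on hit rates and traffic mix, so they are not reproduced here):

```python
def api_cost_usd(requests: int, in_tokens: int, out_tokens: int,
                 in_price_per_m: float, out_price_per_m: float) -> float:
    """Monthly API cost from per-request token counts and $/M-token prices."""
    return (requests * in_tokens / 1e6) * in_price_per_m \
         + (requests * out_tokens / 1e6) * out_price_per_m

# Scenario 1 uncached, Claude Sonnet 4.5 at $3/$15 per million tokens:
print(api_cost_usd(250_000, 2_350, 280, 3, 15))  # → 2812.5
```

The same formula reproduces Scenario 2's uncached number ($1,495.50 for 50k queries at 7,220 input / 550 output tokens), which is a quick way to audit any per-request plan before layering on caching assumptions.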

Scenario 2: Internal RAG assistant, Series A, 50,000 queries/month

A 1,200-employee customer running a policy-docs assistant. Per query: 3,200-token system prompt + 3,900 tokens of retrieved chunks + 120-token user + 550-token response. Uncached Sonnet 4.5: $1,496/mo. With 92% cache hit on the system prefix during business hours: $1,108/mo. Add Cohere Rerank 3.5 ($50/mo), Pinecone ($700/mo), Langfuse ($400/mo): AI line ~$2,250/mo for this workload. Model it per-customer and multiply by the ARR plan.

Scenario 3: Code assistant, 10 devs × 40 queries/day

8,800 queries/mo × 5,600 input + 900 output on Sonnet 4.5 = $267/mo. Add an occasional Opus 4.1 escalation (<5% of queries) and you land at $320/mo. Scale to 500 devs and the same pattern produces $16k/mo. The lever that actually matters: if Opus creeps from 5% to 15% of queries, the bill triples.

Cost levers with math

  • Anthropic prompt cache (90% read discount). A 1,000-token system prompt at 200,000 queries per month generates 200M cached-read tokens — $60 at $0.30/M vs $600 uncached at $3/M. Free 10× savings on the static prefix.
  • OpenAI automatic cache (50% discount). GPT-5 drops from $5 to $2.50/M on matching prefix tokens over 1,024 in length. No code change required.
  • Gemini explicit context cache (75% discount). Gemini 2.5 Pro drops from $1.25 to roughly $0.31/M on cached input. Worth setting up for long-context RAG.
  • Batch APIs (50% off, up to 24h latency). For eval runs, offline tagging, nightly enrichment — always on. For user-facing traffic — never.
  • Model routing. A Haiku-first router with escalation to Sonnet saves 55–75% on typical support workloads with less than 3 percentage points of quality loss.
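The first bullet's cache math is worth checking yourself, since the static-prefix volume is what makes caching worthwhile. This sketch ignores cache-write surcharges and assumes every read hits, so treat it as a best-case bound:

```python
def prefix_cost_usd(queries: int, prefix_tokens: int, price_per_m: float) -> float:
    """Monthly cost of (re)sending a static prompt prefix at a given $/M rate."""
    return queries * prefix_tokens / 1e6 * price_per_m

# 1,000-token static prefix, 200k queries/mo: $3/M uncached vs $0.30/M cache reads.
print(prefix_cost_usd(200_000, 1_000, 3.0))   # → 600.0
print(prefix_cost_usd(200_000, 1_000, 0.30))  # assumes a 100% cache-hit rate
```

Real bills land between the two numbers: misses pay the full (or surcharged) write rate, so multiply the savings by your observed hit rate before putting it in the runway model.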

Model selection rules at each burn tier

  • Pre-seed: Default to Sonnet 4.5. Do not optimize for cost; optimize for product velocity. The delta between "it works" and "doesn't work" is worth ten times whatever you save switching to Haiku.
  • Seed, >$2k/mo spend: Add Haiku 4 for classifiers and routers. Turn on prompt caching everywhere. Build a one-day router that sends 60–70% of traffic to Haiku.
  • Series A, >$20k/mo spend: Multi-provider fallback. Negotiate Anthropic and OpenAI volume commits for 10–20% rate discounts. Consider Gemini 2.5 Flash for bulk enrichment workloads.
  • Series B, >$100k/mo spend: Evaluate self-hosting Llama 4 70B or Qwen 3 on Modal/Together for the bulk workload. Break-even is typically 10–40M output tokens/mo.
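The Series B break-even point can be sanity-checked with a toy comparison of a flat self-hosting bill against API output pricing. All inputs here are placeholder assumptions, not quotes; plug in your own reserved-capacity cost and blended API rate:

```python
def breakeven_output_mtokens(monthly_selfhost_cost: float,
                             api_price_per_m: float,
                             selfhost_marginal_per_m: float) -> float:
    """Output volume (millions of tokens/mo) above which a flat self-hosting
    bill beats pure API pricing. All arguments are assumptions."""
    return monthly_selfhost_cost / (api_price_per_m - selfhost_marginal_per_m)

# e.g. an assumed $500/mo of reserved capacity vs $15/M API output tokens,
# with ~$0.50/M assumed marginal self-host serving cost:
print(round(breakeven_output_mtokens(500, 15.0, 0.50)))  # → 34
```

The placeholder inputs land inside the 10–40M tokens/mo range quoted above; a larger GPU bill or a cheaper API model pushes the break-even point out fast, which is why this is a Series B exercise rather than a seed one.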

Production patterns that preserve runway

Wrap every provider call in a retry budget (3–5 attempts, absolute token ceiling) so a malformed prompt cannot loop 50 times and eat $80. Put a circuit breaker on upstream providers — trip at 20% error rate over a 2-minute window and fail over to the secondary. Maintain a fallback chain (Sonnet 4.5 → GPT-5 → Haiku 4 + simplified prompt → static "email us"). Enforce per-tenant monthly token caps; one runaway B2B customer can burn your month's budget in 48 hours without one. Every AI line in the runway model should be the sum of a fixed monthly floor (evals, observability, reserved commits) and a variable cost-per-active-user that scales with growth — not a single flat number the CFO later discovers was a three-month-old estimate.
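A minimal sketch of the retry budget and per-tenant cap described above. The provider call is a stub and the 5,000-token per-attempt estimate is an assumption; a real implementation would meter actual token usage from provider responses:

```python
class TenantBudget:
    """Per-tenant monthly token cap."""
    def __init__(self, monthly_cap_tokens: int):
        self.cap = monthly_cap_tokens
        self.used = 0

    def try_spend(self, tokens: int) -> bool:
        """Reserve tokens against this tenant's monthly cap."""
        if self.used + tokens > self.cap:
            return False
        self.used += tokens
        return True

def call_with_budget(call, budget: TenantBudget,
                     max_attempts: int = 3,
                     max_call_tokens: int = 20_000,
                     est_tokens_per_attempt: int = 5_000):
    """Retry at most max_attempts times under an absolute token ceiling,
    so a malformed prompt cannot loop and eat the month's budget."""
    spent = 0
    for _ in range(max_attempts):
        if spent + est_tokens_per_attempt > max_call_tokens:
            break
        if not budget.try_spend(est_tokens_per_attempt):
            raise RuntimeError("tenant over monthly token cap")
        spent += est_tokens_per_attempt
        try:
            return call()
        except Exception:
            continue  # transient provider error: retry within budget
    raise RuntimeError("retry budget exhausted")
```

The circuit breaker and fallback chain sit one layer above this: when `call` trips the breaker, the same wrapper is pointed at the next provider in the chain, with the tenant cap shared across all of them.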

Frequently asked questions

How much runway should I target at each stage? 18+ months at seed, 24+ at Series A, 30+ at Series B. AI-native fundraising cycles are longer because unit economics questions take longer to answer; build in buffer.

Is it ever right to over-spend on model cost early? Yes. For the first 90 days post-launch, defaulting to the strongest model available removes one variable from debugging. Cost-optimize in month 4+ once you know the product works.

Should I model model-price cuts into my plan? Budget as if prices stay flat. Sonnet 4.5 is 40% cheaper than its 3.7 predecessor was a year ago, but betting runway on another cut is gambling.

What are typical provider commit rates? Anthropic enterprise commits in 2026 offer 10–20% discounts on $10k+/mo commits. OpenAI similar. Both require 6–12 month terms. Only commit if your traffic is stable.

What line does the board actually care about? Variable cost per paying customer, trended over 6 months. Flat or declining is healthy; rising is an existential problem regardless of absolute burn.

How do I budget fine-tune runs? Model each run as a one-off capex line. A full fine-tune pass on Llama 4 70B with 5B tokens runs $15k-$40k on Together depending on hardware. Assume one run per quarter at Series A scale.

Does observability really cost $10-50k/mo at Series A? It can, if you are running millions of traces. Langfuse Cloud is roughly $0.0002/trace; 50M traces/mo is $10k. Helicone and Braintrust are comparable. Rolling your own into Datadog is a one-time engineering sprint.

When should I hire a dedicated AI ops engineer? Around $20-40k/mo AI spend or 10+ production features on LLMs. Before that, distribute the work across existing engineers.
