LLM migration planner

Plan a safe migration from one LLM to another — eval set, shadow traffic, rollback, deprecation.

Frequently asked questions

1. How long should a migration take?

4 weeks is the minimum for a production-grade workload. Week 1 baseline, weeks 2-3 shadow eval, week 4 cutover with canary. Shorter timelines skip the shadow period and land in pain.

2. Can I skip the shadow phase if both models are on the same vendor?

No. Same-vendor model updates (e.g. Opus 4.1 → 4.7) still have behavior changes. Shadow eval catches regressions before users do.

3. What's a good quality gate?

Pass rate within 2 percentage points of baseline, P95 latency within 20%, no Sev1 incidents during canary. Tighten for user-facing surfaces.

4. How long should I keep the old model deployable after cutover?

30 days minimum. Rolling back fast is cheaper than debugging in production.

5. When should I NOT migrate?

If sticker price is your only reason and quality hasn't been measured. A 30% cost cut bought with an 8-point pass-rate drop is usually net negative once you count the cost of errors.

How to swap LLMs without breaking production

Model swaps break things. Not every time, but often enough that you need a migration process, not a prayer. This planner is the process: 4 weeks, 12 items, 3 phases. Used as written, it takes a model swap from "we launched and hope" to "we shadowed, evaluated, canaried, and kept rollback capability in reserve."

Why migrations fail

  • No baseline — you don't know if the new model is better or worse than the old one.
  • No eval set — "it looks the same in testing" is a vibe, not a measurement.
  • Hard cutover — when users hit regression, you can't roll back fast enough.
  • No shadow period — you ship without ever having seen the new model under real traffic.

The 4-week plan

Week 1: scope + baseline

  • Inventory: which workloads use the old model today and at what volume.
  • Baseline: pass rate, cost, and P95 latency per workload.
  • Quality gate: non-regression threshold per workload (e.g. pass rate within 2 points).
  • Go/no-go criteria: document what would trigger a rollback.
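The baseline and gate from Week 1 can be sketched as a small record per workload. This is a minimal illustration, not the planner's actual data model; the field names and default thresholds (2 percentage points of pass rate, 20% P95 growth, per the quality-gate FAQ above) are assumptions you would tune per workload.

```python
from dataclasses import dataclass

@dataclass
class WorkloadBaseline:
    """Week 1 snapshot for one workload, plus its non-regression gate."""
    workload: str
    pass_rate_pct: float            # golden-set pass rate of the old model
    cost_per_1k_requests: float     # blended cost at current volume
    p95_latency_ms: float
    gate_pass_rate_delta: float = 2.0   # allowed pass-rate drop, in points
    gate_p95_multiplier: float = 1.20   # allowed P95 latency growth

    def go(self, new_pass_rate_pct: float, new_p95_ms: float) -> bool:
        """Go/no-go: False means the documented rollback trigger fires."""
        return (new_pass_rate_pct >= self.pass_rate_pct - self.gate_pass_rate_delta
                and new_p95_ms <= self.p95_latency_ms * self.gate_p95_multiplier)
```

Writing the gate down as code (or config) in Week 1 is the point: the cutover decision in Week 4 becomes a lookup, not a debate.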

Weeks 2-3: shadow eval

  • Deploy the new model in shadow mode: 100% of traffic, both models run, only the old one serves.
  • Score both against the golden eval set nightly.
  • Diff outputs on 200 real requests with human review.
  • Red-team the new model for injection, jailbreak, and output leak.
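The shadow-mode item above boils down to one invariant: both models run on every request, users only ever see the old model's output, and a shadow failure must never break serving. A minimal sketch, assuming the two models are plain callables and `log` is any list-like sink you later score from:

```python
import concurrent.futures

def shadow_call(old_model, new_model, request, log):
    """Run both models on the same request; serve the old, record the new."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
        old_future = pool.submit(old_model, request)
        new_future = pool.submit(new_model, request)
        served = old_future.result()  # users only ever see this
        try:
            # A shadow timeout or error is logged, never surfaced.
            shadow = new_future.result(timeout=10)
        except Exception as exc:
            shadow = f"<shadow error: {exc}>"
    log.append({"request": request, "served": served, "shadow": shadow})
    return served
```

The logged pairs feed the nightly golden-set scoring and the 200-request human diff.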

Week 4: cutover

  • Canary 1% for 48 hours. If metrics stable, promote.
  • Promote to 10% → 50% → 100% with 24-hour pauses.
  • Rollback plan rehearsed and wired (one-click feature flag).
  • Old model stays deployable for 30 days after 100% rollout.
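The cutover ladder above is simple enough to encode directly. A sketch of the promotion logic, with the ramp steps from this plan hard-coded as an assumption; any unstable reading drops straight back to 0%, i.e. the one-click feature flag flips and the old model takes all traffic again:

```python
# (% of traffic on the new model, hold time in hours) per this plan
RAMP = [(1, 48), (10, 24), (50, 24), (100, 0)]

def next_step(current_pct: int, metrics_stable: bool) -> int:
    """Return the next traffic split for the new model."""
    if not metrics_stable:
        return 0  # rollback: old model serves 100% immediately
    for pct, _hold in RAMP:
        if pct > current_pct:
            return pct
    return 100  # already fully rolled out
```

Rehearse the `metrics_stable == False` path before Week 4; a rollback mechanism that has never fired is a rollback mechanism you don't have.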

Common migration scenarios

| From | To | Typical risk | Typical saving |
| --- | --- | --- | --- |
| Opus 4.1 | Opus 4.7 | Low — same family, newer version | Similar cost, ~10-15% quality lift |
| Opus 4.x | Sonnet 4.5 | Medium — quality regression possible | 80% cost cut |
| GPT-4o | GPT-5 | Low — OpenAI minimizes API breaks | Similar cost, quality lift |
| GPT-5 | Sonnet 4.5 | High — cross-vendor, different tool-use behavior | 40-60% cost cut |
| Sonnet 4.5 | Haiku 4 (routing) | Medium — needs confidence gate | 60-85% cost cut |
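The last row's "confidence gate" deserves a sketch. The idea: try the cheap model first and escalate to the strong model only when a confidence score falls below a threshold. The scorer here is a hypothetical callable, not a real API; in practice it might be a small classifier, a logprob heuristic, or a rubric check, and the 0.8 threshold is an assumption to tune against your eval set.

```python
def route(request, cheap_model, strong_model, confidence, threshold=0.8):
    """Confidence-gated routing: cheap first, escalate on low confidence.

    `confidence(request, answer)` returns a score in [0, 1]; both models
    are plain callables. Returns (answer, tier) so you can track the mix.
    """
    answer = cheap_model(request)
    if confidence(request, answer) >= threshold:
        return answer, "cheap"
    return strong_model(request), "strong"
```

The cost saving then depends on the escalation rate, which is exactly what the shadow period measures before you commit.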

What the planner gives you

The interactive plan above tracks each item per phase and produces a downloadable markdown plan you can drop into your project tracker. Tick items as you complete them; the progress bar updates and the final export is a timestamped checklist of what was done.
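For readers building their own tracker rather than using the tool, the export described above is easy to reproduce. A minimal sketch, assuming items are `(phase, done, text)` tuples; the function name and layout are illustrative, not the planner's actual format:

```python
from datetime import datetime, timezone

def export_plan(items):
    """Render checklist items as a timestamped markdown document."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    lines = [f"# LLM migration plan ({stamp})", ""]
    for phase, done, text in items:
        box = "x" if done else " "  # GitHub-style task-list checkbox
        lines.append(f"- [{box}] **{phase}**: {text}")
    return "\n".join(lines)
```

The output pastes cleanly into any tracker that renders markdown task lists.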
