AI Economy Hub

AI chatbot cost

Build your own LangChain bot vs. Intercom AI, Drift, or Ada — cost and time-to-launch.

Results

Build total (over horizon): $46,400.00
Buy total (over horizon): $19,200.00
Savings from build: -$27,200.00
Break-even (months): 160
Insight: Build wins long-term if break-even is under 18 months. Beyond that, the maintenance cost usually erodes the advantage.
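The break-even logic the calculator applies can be sketched as below. The inputs in the example are hypothetical, not the figures shown above; the function simply asks how many months of savings it takes to recover the initial build spend.

```python
def break_even_months(initial_build_cost, monthly_build_ops, monthly_buy_cost):
    """Months until cumulative build spend drops below cumulative buy spend.

    Returns None when the build never pays back (ops cost >= buy cost).
    """
    monthly_savings = monthly_buy_cost - monthly_build_ops
    if monthly_savings <= 0:
        return None  # build never breaks even
    # Ceiling division: first full month where savings cover the build cost.
    return -(-initial_build_cost // monthly_savings)

# Hypothetical inputs: $40k build, $1.2k/mo ops vs $1.6k/mo SaaS fees.
print(break_even_months(40_000, 1_200, 1_600))  # 100 months
```

Note the `None` branch: when platform fees are lower than your own ops cost, the break-even is infinite, which is exactly the situation the negative "savings from build" above describes.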


Frequently asked questions

1. Which SaaS chatbot is best?

Intercom AI for support, Drift for sales, Chatbase for quick site Q&A, Ada for enterprise. Test on real company data, not demo data — quality varies wildly.

2. What's the right build stack?

LangChain or LlamaIndex + Claude or GPT-4o + Pinecone or pgvector. Vercel AI SDK simplifies the frontend; Fluid Compute handles the backend without cold starts.

3. Do I need a vector DB?

Only if you have a large, changing knowledge base. Under 100 docs, stuff them in the system prompt with prompt caching and skip the vector store entirely.

4. What about AI Gateway?

Vercel AI Gateway lets you swap providers via string changes and gives observability — worth using even if you don't want multi-provider today.

5. How do I measure quality?

Build a gold-set of 50–100 real queries and expected-good answers. Re-run it weekly. Quality drifts as you add content.
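A minimal version of that gold-set regression check looks like the sketch below. `bot` and `judge` are placeholders for your chatbot call and whatever grading you use (LLM-as-judge, embedding similarity, exact-match); none of the names are from the text.

```python
# Hypothetical gold set: real queries paired with expected-good answers.
GOLD_SET = [
    {"query": "How do I reset my password?",
     "expected": "Link to the reset page; mention the 24h token expiry."},
    # ... grow this to 50-100 real queries
]

def run_eval(bot, judge, threshold=0.9):
    """Return the fraction of gold-set cases scoring at or above threshold."""
    scores = [judge(bot(case["query"]), case["expected"]) for case in GOLD_SET]
    return sum(s >= threshold for s in scores) / len(scores)
```

Run it weekly, store the pass rate, and alert on week-over-week drops; the absolute number matters less than the trend.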

Build vs. buy for AI chatbots in 2026

The AI chatbot platform market (Intercom Fin, Zendesk AI Agents, Ada, Decagon, Cresta, Maven AGI) has matured to the point where out-of-the-box deflection beats an ad-hoc LangChain build for most B2C use cases. The economics have inverted since 2023: off-the-shelf is now usually cheaper and more capable for typical ticket-deflection work, and custom builds win only under specific conditions.

Indicative pricing, April 2026

| Option | Pricing | Time to ship | Best for |
| --- | --- | --- | --- |
| Intercom Fin | $0.99/resolution + seats | 2-6 weeks | Existing Intercom customers |
| Zendesk AI Agents | Bundled w/ Zendesk + $0.50-1/resolution | 3-8 weeks | Zendesk stack |
| Ada | ~$2,500+/mo + volume | 4-10 weeks | Mid-large B2C |
| Decagon | Custom enterprise, $50-250k/yr | 6-12 weeks | Large enterprises with complex KBs |
| Maven AGI | Custom, usage-based | 4-10 weeks | Enterprise with advanced RAG needs |
| Custom LangChain + Sonnet 4.5 | $40-200k build + $2-20k/mo ops | 2-4 months | Specific IP, complex workflow |
| Custom no-code (Voiceflow, Botpress) | $50-500/mo + build time | 2-6 weeks | Very simple bots, SMB |

When to buy

  • Your use case is standard ticket-deflection with a reasonable KB.
  • You need to ship in under 2 months.
  • You have <500k tickets/year — the resolution fees are manageable.
  • You don't have engineering capacity for ongoing bot maintenance.
  • Your helpdesk (Zendesk, Intercom, Front) has a tightly integrated AI option.

When to build

  • Volume is high enough that resolution fees > $100k/year. At that point, ~$100k of build is easily justified.
  • Your product has proprietary workflows (account actions, multi-step transactions, specialized tool use) that off-the-shelf can't execute.
  • You're in a regulated vertical (financial services, healthcare) where data residency, audit, or model-selection control matters more than out-of-box convenience.
  • The bot is a core product feature, not just an internal cost-saver.

Realistic custom build cost breakdown

  • Initial build (8–16 weeks of senior eng × 1–2 engineers): $40k–$150k.
  • Evals infrastructure (often missed): $15k–$50k.
  • Integration with helpdesk + auth + analytics: $10k–$40k.
  • Ongoing ops + tuning (0.5–1 FTE): $75k–$200k/year.
  • LLM + vector DB runtime: $2k–$30k/mo at scale.
  • Total year-1 TCO: $200k–$800k for a real production custom bot.
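Summing those line items (runtime annualized at 12×) gives the arithmetic bounds; note the strict low end comes out near $164k, slightly under the stated $200k floor, which makes sense since every item rarely lands at its minimum simultaneously.

```python
# Year-1 TCO bounds from the line items above (USD; runtime is per month).
line_items = {
    "initial_build":   (40_000, 150_000),
    "evals_infra":     (15_000, 50_000),
    "integration":     (10_000, 40_000),
    "ops_tuning_yr":   (75_000, 200_000),
    "runtime_monthly": (2_000,  30_000),
}

low  = sum(v[0] for k, v in line_items.items() if k != "runtime_monthly") \
       + line_items["runtime_monthly"][0] * 12
high = sum(v[1] for k, v in line_items.items() if k != "runtime_monthly") \
       + line_items["runtime_monthly"][1] * 12

print(f"Year-1 TCO: ${low:,} - ${high:,}")  # → Year-1 TCO: $164,000 - $800,000
```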

Hybrid: the quiet winner

A common architecture in 2026 is buying a platform (Intercom Fin, Ada) for the 70% of standard deflection use cases, and building custom integrations for the 30% involving proprietary actions. This gets you fast time-to-value plus IP control where it matters.

Three worked scenarios with real token math

Grounding the build-vs-buy decision in actual run cost requires working through the token arithmetic of each deployment. Three representative workloads follow.

Scenario 1: B2C support bot, 250,000 tickets/month

Per request: 2,350 input + 280 output on Sonnet 4.5. Uncached: $2,812/mo. With Anthropic prompt caching on the 800-token system prefix (90% read discount, 73% hit): $1,657/mo. Route 65% of FAQ intents to Haiku 4 ($0.80/$4): $1,062/mo. Add Pinecone ($700/mo), Langfuse ($400/mo), and ops time (~$3k/mo of 0.25 FTE): ~$5.2k/mo at 250k ticket volume. Compare against Intercom Fin at $0.99/resolution × 137,500 resolved (55% deflection): $136k/mo. Custom build is $130k/mo cheaper — but it took 4 months and $120k in initial eng. Break-even vs Intercom: month 1 of operation.
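The baseline arithmetic behind these scenarios fits in one cost function. The $3/$15 per million token rates for Sonnet 4.5 are an assumption, not stated in the text, and the exact cached-token span behind the article's $1,657 figure isn't given, so the sketch parameterizes it:

```python
def monthly_llm_cost(requests, in_tok, out_tok, in_rate, out_rate,
                     cached_tok=0, hit_rate=0.0, cache_discount=0.9):
    """Monthly LLM spend in dollars; rates are $ per million tokens."""
    base = requests * (in_tok * in_rate + out_tok * out_rate) / 1e6
    # Cache hits re-read `cached_tok` tokens at (1 - discount) of the
    # input rate, so each hit saves in_rate * discount per cached token.
    saving = requests * hit_rate * cached_tok * in_rate * cache_discount / 1e6
    return base - saving

# Scenario 1 uncached baseline: 250k tickets, 2,350 in / 280 out.
print(monthly_llm_cost(250_000, 2_350, 280, 3, 15))  # 2812.5
```

The same call reproduces Scenario 2's uncached baseline: 50k queries at 7,220/550 tokens comes to $1,495.50/mo.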

Scenario 2: Internal RAG-based IT helpdesk bot, 50,000 queries/month

Per query: 7,220 input + 550 output = uncached $1,496/mo. With 92% cache hit on 3,200-tok system prompt: $1,108/mo. Add Cohere Rerank 3.5 ($50/mo), Pinecone ($700/mo), Langfuse ($400/mo): $2,258/mo all-in. A platform alternative (Moveworks, Aisera) would run $60-120k annually plus per-resolution fees. Custom wins on cost but requires a small eng ops team.

Scenario 3: Code-assistant bot for 10 devs × 40 queries/day

8,800 queries/mo × 5,600 input + 900 output on Sonnet 4.5 = $267/mo. Add 5% Opus escalations: $320/mo. A buy option (Cursor Business at $40/seat × 10) is $400/mo. Basically equivalent. Buy wins on operational simplicity every time for this scale.

Cost levers with math on run cost

  • Anthropic prompt cache (90% read discount): a 1,000-token system prompt at 200k queries/month saves $540/mo per tenant ($600 → $60). Multiplied across a 30-tenant deployment, that is $16,200/mo of recaptured margin.
  • OpenAI 50% automatic cache on ≥1,024-token matching prefix. Works without code changes.
  • Gemini 75% context cache on long-context deployments. Good fit for multi-document RAG.
  • Haiku 4 routing on 60-70% of simple intents: saves ~70% on the routed portion.
  • Batch API (50% off) for eval and retraining runs, not for user-facing traffic.
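The first lever's arithmetic, written out. The $3/MTok input rate is an assumption, and the figures assume every request hits the cache, matching the bullet as stated:

```python
# A 1,000-token system prompt at 200k queries/month, assumed $3/MTok input.
PROMPT_TOK = 1_000
QUERIES    = 200_000
RATE       = 3  # $ per million input tokens

full   = PROMPT_TOK * QUERIES * RATE / 1e6  # uncached cost of the prefix
cached = full / 10                          # billed at 10% after the 90% read discount
print(full, cached, full - cached)          # 600.0 60.0 540.0
print(30 * (full - cached))                 # 16200.0 across 30 tenants
```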

Model selection rules for chatbots

  • Haiku 4 for intent classification, FAQ lookups, PII scrubbing. The router tier.
  • Sonnet 4.5 for natural-language synthesis over retrieved context. The workhorse.
  • GPT-5 mini ($0.40/$1.60) for strict JSON tool calls and OpenAI-native pipelines.
  • Opus 4.1 almost never in a chatbot. Wrong latency profile and 5× the cost for 2-3pp quality.
  • Gemini 2.5 Flash for bulk summarization of transcripts, offline tagging, cheap enrichment pipelines around the bot.

Production patterns for custom chatbot builds

The 20% of work that kills "simple" custom bots is hardening. Before launch, you need: (1) a fallback chain (Sonnet 4.5 → GPT-5 → Haiku 4 + simplified prompt → static escalation to human); (2) circuit breakers per provider at 20% error rate over 2-minute windows; (3) retry budgets on every agent-like call (3 attempts, hard token ceiling); (4) per-tenant monthly token caps so a runaway customer cannot burn your margin; (5) PII scrubbing on both input and output with a deterministic redaction pipeline, not a model call; (6) an eval harness that runs nightly against 200 held-out tickets and alerts on regressions. Shipping a bot without these costs less than $50k; productionizing them costs $60-200k and 3-6 months. Budget accordingly.
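Items (1) and (3) of that hardening list can be sketched as below. `call_model` is a placeholder for your provider client, and the model names, attempt counts, and backoff are illustrative assumptions, not prescriptions:

```python
import time

# Fallback chain: try each (model, prompt style) in order, with a
# per-model retry budget, before escalating to a human.
FALLBACK_CHAIN = [
    ("sonnet-4.5", "full_prompt"),
    ("gpt-5", "full_prompt"),
    ("haiku-4", "simplified_prompt"),
]
MAX_ATTEMPTS_PER_MODEL = 3

def answer(query, call_model):
    for model, prompt_style in FALLBACK_CHAIN:
        for attempt in range(MAX_ATTEMPTS_PER_MODEL):
            try:
                return call_model(model, prompt_style, query)
            except Exception:
                time.sleep(0.01 * 2 ** attempt)  # short exponential backoff
    # Every model exhausted its retry budget: static escalation.
    return {"action": "escalate_to_human", "query": query}
```

Circuit breakers, token ceilings, and per-tenant caps layer on top of this skeleton; the point is that the escalation path is deterministic, never another model call.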

Frequently asked questions

Is Intercom Fin actually worth $0.99/resolution? Under 50k resolutions/mo, yes — the build-cost crossover is not there. Above 100k resolutions/mo, custom becomes compelling.

How long does a custom chatbot really take? 2-4 months to demo, 6-10 months to production-hardened. The long tail is the expensive part.

Can I ship with LangChain in a weekend? You can demo. Production requires evals, retries, fallbacks, observability — all of which LangChain does not give you out of box.

What is a realistic deflection rate? 40-65% for mature B2C with a good KB. 25-45% for B2B with complex products. Higher numbers almost always mean the bot is escalating too aggressively.

Do I need a vector DB? Yes if you have more than 50 help-center articles. Pinecone Serverless, pgvector, or Weaviate Cloud all work. Budget $50-$700/mo depending on scale.

How much does a fine-tune help? Usually 3-8pp in deflection rate at $500-$5k cost. Worth it above ~30k resolutions/mo where margin matters.

What does ops actually look like post-launch? 0.25-0.5 FTE weekly: monitoring eval drift, reviewing escalations, updating the KB, tuning prompts. Skip this and quality decays measurably within 3 months.

When should I switch from buy to build? When platform resolution fees exceed $100k/year. At that point, $100k of build easily amortizes.

Does multi-tenant custom chatbot architecture share caches? Only if the system prompt is tenant-agnostic. Tenant-specific prefixes break cache and triple input cost. Architect for a shared-prefix + per-tenant-delta pattern.
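A sketch of that shared-prefix + per-tenant-delta pattern, using Anthropic-style `cache_control` content blocks. The block shape follows Anthropic's prompt-caching convention, but treat the exact field names as an assumption to verify against current API docs:

```python
# Tenant-agnostic prefix: identical bytes on every request, so it can
# be served from a single shared cache entry across all tenants.
SHARED_PREFIX = "You are a support assistant. Policy: ..."

def build_system_blocks(tenant_delta: str) -> list[dict]:
    return [
        # Cache breakpoint sits at the end of the shared prefix.
        {"type": "text", "text": SHARED_PREFIX,
         "cache_control": {"type": "ephemeral"}},
        # Tenant-specific suffix lives *after* the breakpoint, so it
        # never fragments the shared cache entry.
        {"type": "text", "text": tenant_delta},
    ]

blocks = build_system_blocks("Tenant: Acme. Refund window: 30 days.")
```

Putting tenant details before the breakpoint would give every tenant a distinct prefix and a distinct cache entry, which is exactly the cost blowup the answer above warns about.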

Do vision/multimodal inputs cost extra? Images on Anthropic are billed by size, at roughly 1,500 tokens per 1024×1024 image. Budget accordingly if your bot handles screenshots or receipts.

How much does a voice layer add? ElevenLabs Turbo at $0.10/1k chars for TTS plus Deepgram Nova-3 at $0.0043/min for STT adds roughly $0.04-0.08/ticket for a typical voice bot. Not trivial at volume.

Can customers bring their own keys on custom builds? Yes, and many enterprise buyers now require it. Adds 2-3 weeks of integration work and simplifies compliance review.
