Build a chatbot via VerticalAPI

Building a production chatbot in 2026 means picking the right model for tone, latency and cost — and switching between providers as your traffic patterns evolve. VerticalAPI gives you one OpenAI-compatible endpoint that you can repoint at GPT-4o, Claude Haiku, Gemini Flash or open-weights Llama without changing your code. Here's the recommended setup.

How it fits together

Frontend (React/Next) → POST /api/chat → your backend → VerticalAPI /v1/chat/completions (streaming) → token-streamed response back to frontend. Add a Redis cache for system prompts and a per-user rate limit at your edge.

Working example in python

chatbot.pythonPython
from openai import OpenAI

client = OpenAI(
    base_url="https://api.verticalapi.com/v1",
    api_key="vapi_...",
    default_headers={"X-Provider-Key": "sk-ant-..."}
)

response = client.chat.completions.create(
    model="claude-haiku-4-5",
    messages=[
        {"role": "system", "content": "You are Acme's friendly support bot."},
        {"role": "user", "content": user_message}
    ],
    stream=True,
)
for chunk in response:
    yield chunk.choices[0].delta.content or ""

Typical cost at production volume

Typical chatbot serving 100K conversations/month at ~10 turns each (~500 tokens per turn) costs roughly $50-150/month on Claude Haiku 4.5, $20-60 on Gemini Flash, or $30-80 on Llama 3.3 70B via Groq. <!-- TODO Hugo: refine with actual conversion data -->

See VerticalAPI plan pricing →

Common questions

Should I stream responses?

Yes — streaming reduces perceived latency dramatically. VerticalAPI passes streaming through unchanged; the OpenAI SDK's stream=True works on every supported provider.

How do I switch model based on user tier?

Pass model='claude-sonnet-4-5' for paid users, model='claude-haiku-4-5' for free. Same endpoint, same auth, just a different model field.

What about tool calling / function calling?

Standard OpenAI tools[] array works on Claude, GPT-4o, Gemini, Mistral, Llama (via Together/Groq/Fireworks). VerticalAPI normalizes provider differences so your tool definitions are portable.