AI21 Labs via VerticalAPI

AI21 Jamba 1.5 Large and Mini (hybrid Mamba/Transformer, 256K context) via VerticalAPI's OpenAI-compatible endpoint. BYOK, zero markup.

Start free with your AI21 Labs key · Read the docs

Endpoint: https://api.verticalapi.com/v1/chat/completions · BYOK header: X-Provider-Key: <ai21-key>
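For clients that do not use an OpenAI SDK, the endpoint and BYOK header above can be exercised with plain HTTP. The following is a minimal sketch using only the Python standard library. It assumes the VerticalAPI key is sent as an OpenAI-style `Authorization: Bearer` token (which is how the OpenAI SDK transmits `api_key`); the function names `build_request` and `send` are illustrative, not part of any VerticalAPI client.

```python
import json
import urllib.request

ENDPOINT = "https://api.verticalapi.com/v1/chat/completions"


def build_request(vapi_key: str, ai21_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat-completion request for Jamba 1.5 Mini."""
    body = {
        "model": "jamba-1.5-mini",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Assumption: the VerticalAPI key travels as an OpenAI-style bearer token.
            "Authorization": f"Bearer {vapi_key}",
            # BYOK header: your own AI21 Labs key.
            "X-Provider-Key": ai21_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=60) as resp:
        return json.load(resp)
```

Calling `send(build_request(vapi_key, ai21_key, "Hello"))` should return an OpenAI-style completion object, provided both keys are valid.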

Supported models

AI21 Labs models routed by VerticalAPI

Pass the model ID below as model in any OpenAI-compatible request. New AI21 Labs models are typically supported within 24h of release.

Model ID	Name	Context	Provider pricing (input / output)
`jamba-1.5-large`	Jamba 1.5 Large	256K	$2 / $8 per 1M tokens
`jamba-1.5-mini`	Jamba 1.5 Mini	256K	$0.20 / $0.40 per 1M tokens

Pricing reflects AI21 Labs's rates — you pay AI21 Labs directly. VerticalAPI adds zero markup on tokens.
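To budget a workload before sending it, the rates in the table can be turned into a rough per-request estimate. This sketch reads the two figures in each row as the input price and the output price per 1M tokens; `estimate_cost` is a hypothetical helper, and actual billing is whatever AI21 Labs charges on your account.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the pricing table.
PRICES = {
    "jamba-1.5-large": (Decimal("2"), Decimal("8")),
    "jamba-1.5-mini": (Decimal("0.20"), Decimal("0.40")),
}
MILLION = Decimal(1_000_000)


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the provider cost in USD for a single request."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / MILLION
```

For example, a 200K-token document with a 1K-token answer comes to about $0.408 on `jamba-1.5-large` and about $0.0404 on `jamba-1.5-mini`.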

Quickstart

A minimal AI21 Labs call via VerticalAPI

Drop-in replacement for the OpenAI API endpoint. Works with the OpenAI Python client, Node, Go, curl, or anything else that speaks HTTP.

ai21_quickstart.py (Python)

from openai import OpenAI

client = OpenAI(
    base_url="https://api.verticalapi.com/v1",
    api_key="vapi_...",  # your VerticalAPI key
    default_headers={"X-Provider-Key": "..."},  # BYOK: your AI21 Labs key
)

response = client.chat.completions.create(
    model="jamba-1.5-mini",  # AI21 Labs
    messages=[{"role": "user", "content": "Hello"}]
)
print(response.choices[0].message.content)

Why use AI21 Labs via VerticalAPI

Four reasons developers route AI21 Labs through us

Zero token markup

You pay AI21 Labs directly with your own key. VerticalAPI's revenue is the gateway subscription, not a tax on your tokens.

One key, every provider

AI21 Labs sits alongside OpenAI, Anthropic, Gemini and 11 more providers on the same OpenAI-compatible endpoint and the same SDK, switchable per request.

Latency & cost monitoring

Per-request token counts, p50/p95 latency and cost dashboards out of the box. Compare AI21 Labs to other providers on identical prompts.
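If you want to sanity-check the dashboard numbers from your own code, latency percentiles are easy to compute client-side. The sketch below is not VerticalAPI's implementation; it uses the nearest-rank definition of a percentile, and `timed` and `percentile` are illustrative helper names.

```python
import math
import time


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of the data at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def timed(call):
    """Run call() and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = call()
    return result, (time.perf_counter() - start) * 1000
```

Wrapping a request as `timed(lambda: client.chat.completions.create(...))` gives one latency sample per call. Token counts can be read from `response.usage.prompt_tokens` and `response.usage.completion_tokens`, assuming the gateway passes the standard OpenAI `usage` field through.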

Observability built in

Every AI21 Labs call gets a trace ID, replayable payload and audit log entry. Wire to Datadog or Sentry via OpenTelemetry.

Best for

Where AI21 Labs shines

256K context at low cost · structured JSON output · long-doc QA · hybrid architecture experiments
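As an illustration of the long-document QA and structured-output use cases, here is a minimal sketch that builds the chat messages and parses a JSON reply defensively. The prompt wording, the `answer`/`quote` schema, and both helper names are assumptions made for this example; nothing here relies on a provider-specific JSON mode.

```python
import json


def build_qa_messages(document: str, question: str) -> list[dict]:
    """Build chat messages asking for a JSON answer grounded in the document."""
    system = (
        "Answer using only the supplied document. "
        'Reply with a JSON object: {"answer": string, "quote": string}.'
    )
    user = f"Document:\n{document}\n\nQuestion: {question}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]


def parse_answer(reply_text: str) -> dict:
    """Extract the JSON object from a reply, ignoring any text around it."""
    start, end = reply_text.find("{"), reply_text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object in reply")
    data = json.loads(reply_text[start:end + 1])
    missing = {"answer", "quote"} - data.keys()
    if missing:
        raise ValueError(f"reply is missing keys: {sorted(missing)}")
    return data
```

Pass the result of `build_qa_messages(...)` as `messages` in a chat-completion request, then feed `response.choices[0].message.content` to `parse_answer`.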

FAQ

Common questions about AI21 Labs on VerticalAPI

What makes Jamba different?

Jamba's hybrid Mamba/Transformer architecture is designed to process long-context inputs faster than comparable pure-Transformer models, with the gap most noticeable beyond roughly 100K tokens.

Switch providers

All supported LLM providers

Same endpoint, same SDK — just change the model and the BYOK header.

OpenAI · Anthropic · Google Gemini · Mistral AI · Meta Llama · xAI Grok · Groq · Together AI · Fireworks AI · Perplexity Sonar · Cohere · AI21 Labs · AWS Bedrock · Azure OpenAI · Google Vertex AI
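Switching providers per request can be sketched as a small routing table that swaps the model ID and the BYOK header. The environment variable names and the `gpt-4o-mini` entry are assumptions for illustration (check the docs for the model IDs VerticalAPI accepts for each provider). `extra_headers` is the OpenAI Python SDK's per-request header option.

```python
import os

# Hypothetical routing table: provider -> model ID and the env var holding that provider's key.
ROUTES = {
    "ai21": {"model": "jamba-1.5-mini", "key_env": "AI21_API_KEY"},
    "openai": {"model": "gpt-4o-mini", "key_env": "OPENAI_API_KEY"},  # assumed model ID
}


def request_kwargs(provider: str, prompt: str, env=os.environ) -> dict:
    """Build keyword arguments for client.chat.completions.create()."""
    route = ROUTES[provider]
    return {
        "model": route["model"],
        "messages": [{"role": "user", "content": prompt}],
        # Per-request BYOK header carrying the selected provider's key.
        "extra_headers": {"X-Provider-Key": env[route["key_env"]]},
    }
```

With an OpenAI client pointed at `https://api.verticalapi.com/v1`, a call looks like `client.chat.completions.create(**request_kwargs("ai21", "Hello"))`; changing `"ai21"` to another key in `ROUTES` sends the same prompt to a different provider.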

Ship on AI21 Labs in 60 seconds

Free tier — bring your own AI21 Labs key, zero markup, OpenAI-compatible endpoint.

Get your VerticalAPI key →