Cohere via VerticalAPI

Cohere Command R+, Embed v3 and Rerank via VerticalAPI's OpenAI-compatible endpoint. BYOK with your Cohere key, zero markup, RAG-first toolkit.

Start free with your Cohere key → Read the docs

Endpoint: https://api.verticalapi.com/v1/chat/completions · BYOK header: X-Provider-Key: <cohere-key>

Supported models

Cohere models routed by VerticalAPI

Pass the model ID below as model in any OpenAI-compatible request. New Cohere models are typically supported within 24h of release.

Model ID	Name	Context	Pricing (provider)
`command-r-plus`	Command R+	128K	$2.50 / $10 per 1M tok
`command-r`	Command R	128K	$0.15 / $0.60 per 1M tok
`embed-english-v3`	Embed v3	512	$0.10 per 1M tok
`rerank-v3.5`	Rerank 3.5	—	$2 per 1K queries

Pricing reflects Cohere's rates — you pay Cohere directly. VerticalAPI adds zero markup on tokens.

Quickstart

5-line Cohere call via VerticalAPI

Drop-in replacement for the OpenAI SDK. Works with the OpenAI Python client, Node, Go, curl — anything that speaks HTTP.

                cohere_quickstart.py
                Python
            
from openai import OpenAI

client = OpenAI(
    base_url="https://api.verticalapi.com/v1",
    api_key="vapi_...",
    default_headers={"X-Provider-Key": "..."}
)

response = client.chat.completions.create(
    model="command-r",  # Cohere
    messages=[{"role": "user", "content": "Hello"}]
)
print(response.choices[0].message.content)

Why use Cohere via VerticalAPI

Four reasons developers route Cohere through us

Zero token markup

You pay Cohere directly with your own key. VerticalAPI's revenue is the gateway subscription, not a tax on your tokens.

One key, every provider

Cohere alongside OpenAI, Anthropic, Gemini and 12 more — same OpenAI-compatible endpoint, same SDK, switchable per-request.

Latency & cost monitoring

Per-request token counts, p50/p95 latency and cost dashboards out of the box. Compare Cohere to other providers on identical prompts.

Observability built in

Every Cohere call gets a trace ID, replayable payload and audit log entry. Wire to Datadog or Sentry via OpenTelemetry.

Best for

Where Cohere shines

RAG with citations embedding pipelines rerank for retrieval multilingual

FAQ

Common questions about Cohere on VerticalAPI

Does Cohere's RAG citation feature work?

Yes. Command R+ returns citations and documents arrays — VerticalAPI surfaces them in the response payload alongside the standard chat fields.

Can I use Cohere Embed and Rerank too?

Yes. POST /v1/embeddings (OpenAI-format) is routed to Cohere's Embed v3. Rerank is exposed at /v1/rerank with Cohere's native shape.

Switch providers

All supported LLM providers

Same endpoint, same SDK — just change the model and the BYOK header.

OpenAI Anthropic Google Gemini Mistral AI Meta Llama xAI Grok Groq Together AI Fireworks AI Perplexity Sonar Cohere AI21 Labs AWS Bedrock Azure OpenAI Google Vertex AI

Ship on Cohere in 60 seconds

Free tier — bring your own Cohere key, zero markup, OpenAI-compatible endpoint.

Get your VerticalAPI key →