OctoAI via VerticalAPI

Updated May 04, 2026 · By VerticalAPI Team

Access OctoAI's optimized open-weights inference and image generation through VerticalAPI's OpenAI-compatible endpoint. Bring your own OctoAI key (BYOK), pay zero markup, and host custom models.


Endpoint: https://api.verticalapi.com/v1/chat/completions · BYOK header: X-Provider-Key: <octoai-key>
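
As a rough illustration of the endpoint and BYOK header above, the sketch below builds (but does not send) a raw HTTP request using only Python's standard library. It assumes the VerticalAPI key travels as a standard `Authorization: Bearer` header, which is how the OpenAI SDK transmits its `api_key`; the `build_request` helper and the placeholder key values are hypothetical.

```python
import json
import urllib.request

ENDPOINT = "https://api.verticalapi.com/v1/chat/completions"


def build_request(vapi_key: str, octoai_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (without sending) an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the gateway key is sent as a Bearer token.
            "Authorization": f"Bearer {vapi_key}",
            # BYOK header named on this page.
            "X-Provider-Key": octoai_key,
        },
        method="POST",
    )


# Placeholder keys; substitute real values before sending.
req = build_request("vapi_...", "<octoai-key>", "meta-llama-3.3-70b-instruct", "Hello")
print(req.get_method(), req.full_url)
```

Once real keys are substituted, `urllib.request.urlopen(req)` would send the request.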

Supported models

OctoAI models routed by VerticalAPI

Pass a model ID from the table below as the `model` parameter in any OpenAI-compatible request. New OctoAI models are typically supported within 24h of release.

Model ID	Name	Context	Pricing (provider)
`meta-llama-3.3-70b-instruct`	Llama 3.3 70B (Octo)	128K	$0.90 per 1M tokens
`qwen2.5-32b-instruct`	Qwen 2.5 32B (Octo)	32K	$0.50 per 1M tokens
`stable-diffusion-xl`	Stable Diffusion XL	image	$0.005 per image

Pricing reflects OctoAI's rates — you pay OctoAI directly. VerticalAPI adds zero markup on tokens.
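
To make the rates above concrete, here is a small back-of-the-envelope estimator. It is a sketch only: the helper names are hypothetical, and it assumes a single flat rate covers prompt and completion tokens alike, which this page does not state.

```python
# Rates copied from the table above (USD).
PER_MILLION_TOKENS = {
    "meta-llama-3.3-70b-instruct": 0.90,
    "qwen2.5-32b-instruct": 0.50,
}
PER_IMAGE = {"stable-diffusion-xl": 0.005}


def estimate_token_cost(model: str, total_tokens: int) -> float:
    """Estimated provider cost in USD for a text model.

    Assumption: one flat rate applies to input and output tokens.
    """
    return PER_MILLION_TOKENS[model] * total_tokens / 1_000_000


def estimate_image_cost(model: str, images: int) -> float:
    """Estimated provider cost in USD for an image model."""
    return PER_IMAGE[model] * images


print(f"${estimate_token_cost('meta-llama-3.3-70b-instruct', 2_000_000):.2f}")
print(f"${estimate_image_cost('stable-diffusion-xl', 200):.2f}")
```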

Quickstart

A minimal OctoAI call via VerticalAPI

Drop-in compatible with the OpenAI API: point the OpenAI Python client, or any Node, Go, or curl client that speaks HTTP, at the VerticalAPI base URL.

octoai_quickstart.py (Python)

from openai import OpenAI

client = OpenAI(
    base_url="https://api.verticalapi.com/v1",  # VerticalAPI gateway, not api.openai.com
    api_key="vapi_...",  # your VerticalAPI key
    default_headers={"X-Provider-Key": "..."},  # your OctoAI key (BYOK)
)

response = client.chat.completions.create(
    model="meta-llama-3.3-70b-instruct",  # OctoAI
    messages=[{"role": "user", "content": "Hello"}]
)
print(response.choices[0].message.content)

Why use OctoAI via VerticalAPI

Four reasons developers route OctoAI through us

Zero token markup

You pay OctoAI directly with your own key. VerticalAPI's revenue is the gateway subscription, not a tax on your tokens.

One key, every provider

OctoAI alongside OpenAI, Anthropic, Gemini and 12 more — same OpenAI-compatible endpoint, same SDK, switchable per-request.

Latency & cost monitoring

Per-request token counts, p50/p95 latency and cost dashboards out of the box. Compare OctoAI to other providers on identical prompts.

Observability built in

Every OctoAI call gets a trace ID, replayable payload and audit log entry. Wire to Datadog or Sentry via OpenTelemetry.
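
For readers who want to sanity-check the p50/p95 latency figures mentioned under "Latency & cost monitoring" from the client side, the sketch below times repeated calls and reports nearest-rank percentiles. It is not how VerticalAPI computes its dashboards (the page does not say); the helper names are hypothetical, and the lambda is a local stand-in for a real API call.

```python
import math
import time
from typing import Callable, List, Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of the data at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def time_calls(call: Callable[[], object], runs: int) -> List[float]:
    """Run `call` repeatedly and return wall-clock latencies in milliseconds."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        latencies.append((time.perf_counter() - start) * 1000)
    return latencies


# Stand-in workload; replace the lambda with a real request to compare providers.
latencies_ms = time_calls(lambda: sum(range(10_000)), runs=20)
print(f"p50={percentile(latencies_ms, 50):.3f} ms  p95={percentile(latencies_ms, 95):.3f} ms")
```

Running the same prompt against two model IDs and comparing the two percentile pairs gives a rough client-side counterpart to the dashboard comparison.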

Best for

Where OctoAI shines

custom-model hosting · image generation (SDXL) · OctoStack on-prem deploy · fine-tuned variants

FAQ

Common questions about OctoAI on VerticalAPI

Is OctoAI still independent?

OctoAI was acquired by NVIDIA in late 2024. The hosted inference API remains operational; VerticalAPI tracks endpoint changes and surfaces deprecation notices in the dashboard.

Switch providers

All supported LLM providers

Same endpoint, same SDK — just change the model and the BYOK header.
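
Changing only the model and the BYOK header could look like the helper below. It is a minimal sketch: `ask` is a hypothetical name, `client` is expected to be an `openai.OpenAI` instance pointed at `https://api.verticalapi.com/v1`, and `extra_headers` is the OpenAI Python SDK's per-request header option. Model IDs for providers other than OctoAI are not listed on this page and would need to be looked up.

```python
from typing import Any


def ask(client: Any, model: str, provider_key: str, prompt: str) -> str:
    """Send one chat request, choosing the provider for this call only.

    The provider key is passed per request, so a single client can serve
    several providers without being rebuilt.
    """
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        # Overrides any X-Provider-Key set in the client's default headers.
        extra_headers={"X-Provider-Key": provider_key},
    )
    return response.choices[0].message.content
```

With a client built as in the quickstart, `ask(client, "meta-llama-3.3-70b-instruct", octoai_key, "Hello")` targets OctoAI, and the same call with another provider's model ID and key would go through the same endpoint.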

Ship on OctoAI in 60 seconds

Free tier — bring your own OctoAI key, zero markup, OpenAI-compatible endpoint.

Get your VerticalAPI key →