OctoAI via VerticalAPI

OctoAI's optimized open-weights inference and image generation via VerticalAPI's OpenAI-compatible endpoint. Bring your own OctoAI key (BYOK), pay zero markup, and keep custom-model hosting.

Endpoint: https://api.verticalapi.com/v1/chat/completions  ·  BYOK header: X-Provider-Key: <octoai-key>

OctoAI models routed by VerticalAPI

Pass the model ID below as model in any OpenAI-compatible request. New OctoAI models are typically supported within 24h of release.

Model ID · Name · Context · Pricing (provider)
meta-llama-3.3-70b-instruct · Llama 3.3 70B (Octo) · 128K · $0.90 per 1M tok
qwen2.5-32b-instruct · Qwen 2.5 32B (Octo) · 32K · $0.50 per 1M tok
stable-diffusion-xl · Stable Diffusion XL · image · $0.005 per image

Pricing reflects OctoAI's rates — you pay OctoAI directly. VerticalAPI adds zero markup on tokens.
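For clients that don't use an OpenAI SDK, the same BYOK call is plain HTTP. A minimal sketch of the request shape: the helper name and key values are illustrative, and we assume the gateway key goes in the standard OpenAI-style `Authorization: Bearer` header (which is what the SDK's `api_key` maps to).

```python
# Sketch: the raw HTTP shape of a BYOK request through VerticalAPI.
def build_octoai_request(model, messages, vapi_key, octoai_key):
    """Assemble URL, headers, and JSON body for a chat-completions call."""
    url = "https://api.verticalapi.com/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {vapi_key}",  # VerticalAPI gateway key
        "X-Provider-Key": octoai_key,           # your OctoAI key (BYOK)
        "Content-Type": "application/json",
    }
    body = {"model": model, "messages": messages}
    return url, headers, body

url, headers, body = build_octoai_request(
    "meta-llama-3.3-70b-instruct",
    [{"role": "user", "content": "Hello"}],
    vapi_key="vapi_example",
    octoai_key="octo_example",
)
# POST with any HTTP client, e.g. requests.post(url, headers=headers, json=body)
```

Any HTTP stack works; only the two headers and the JSON body matter.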

Quickstart: an OctoAI call via VerticalAPI

Drop-in replacement for the OpenAI SDK. Works with the OpenAI Python client, Node, Go, curl — anything that speaks HTTP.

octoai_quickstart.py
from openai import OpenAI

client = OpenAI(
    base_url="https://api.verticalapi.com/v1",
    api_key="vapi_...",                        # your VerticalAPI gateway key
    default_headers={"X-Provider-Key": "..."}  # your OctoAI key (BYOK)
)

response = client.chat.completions.create(
    model="meta-llama-3.3-70b-instruct",  # OctoAI model ID from the table above
    messages=[{"role": "user", "content": "Hello"}]
)
print(response.choices[0].message.content)

Four reasons developers route OctoAI through us

Zero token markup

You pay OctoAI directly with your own key. VerticalAPI's revenue is the gateway subscription, not a tax on your tokens.

One key, every provider

OctoAI alongside OpenAI, Anthropic, Gemini and 12 more — same OpenAI-compatible endpoint, same SDK, switchable per-request.
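Because routing is per-request, switching providers or models is just a change of the `model` string on the same client. A minimal sketch using the OctoAI IDs listed above; the routing helper is ours, not part of the SDK:

```python
# Hypothetical routing helper: choose an OctoAI model per request.
OCTOAI_MODELS = {
    "large": "meta-llama-3.3-70b-instruct",  # 128K context, $0.90 / 1M tok
    "small": "qwen2.5-32b-instruct",         # 32K context,  $0.50 / 1M tok
}

def pick_model(tier: str) -> str:
    """Return the OctoAI model ID for a routing tier."""
    return OCTOAI_MODELS[tier]

# Same client as the quickstart; only the model string changes:
# client.chat.completions.create(model=pick_model("small"), messages=[...])
```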

Latency & cost monitoring

Per-request token counts, p50/p95 latency and cost dashboards out of the box. Compare OctoAI to other providers on identical prompts.

Observability built in

Every OctoAI call gets a trace ID, replayable payload and audit log entry. Wire to Datadog or Sentry via OpenTelemetry.

Where OctoAI shines

custom-model hosting · image generation (SDXL) · OctoStack on-prem deploy · fine-tuned variants

Common questions about OctoAI on VerticalAPI

Is OctoAI still independent?

OctoAI was acquired by NVIDIA in late 2024. The hosted inference API remains operational; VerticalAPI tracks endpoint changes and surfaces deprecation notices in the dashboard.