OctoAI via VerticalAPI
OctoAI's optimized open-weights inference and image-gen via VerticalAPI's OpenAI-compatible endpoint. BYOK with your OctoAI key, zero markup, custom-model hosting.
OctoAI models routed by VerticalAPI
Pass the model ID below as the `model` field in any OpenAI-compatible request. New OctoAI models are typically supported within 24 hours of release.
| Model ID | Name | Context | Pricing (provider) |
|---|---|---|---|
| `meta-llama-3.3-70b-instruct` | Llama 3.3 70B (Octo) | 128K | $0.90 per 1M tok |
| `qwen2.5-32b-instruct` | Qwen 2.5 32B (Octo) | 32K | $0.50 per 1M tok |
| `stable-diffusion-xl` | Stable Diffusion XL | image | $0.005 per image |
Pricing reflects OctoAI's rates — you pay OctoAI directly. VerticalAPI adds zero markup on tokens.
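At the listed rates, per-request token cost is just tokens × rate. A quick back-of-the-envelope helper, with rates copied from the table above (not fetched from any billing API):

```python
# Provider rates per 1M tokens, copied from the table above.
RATES_PER_1M_TOK = {
    "meta-llama-3.3-70b-instruct": 0.90,
    "qwen2.5-32b-instruct": 0.50,
}

def estimate_cost_usd(model: str, total_tokens: int) -> float:
    """Rough token cost in USD at the listed OctoAI rate."""
    return RATES_PER_1M_TOK[model] * total_tokens / 1_000_000

# e.g. a 2,000-token Llama 3.3 70B request costs about $0.0018
cost = estimate_cost_usd("meta-llama-3.3-70b-instruct", 2_000)
```

Your actual bill comes from OctoAI directly; this is only a sanity check against the dashboard's cost figures.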
5-line OctoAI call via VerticalAPI
Drop-in replacement for the OpenAI SDK. Works with the OpenAI Python client, Node, Go, curl — anything that speaks HTTP.
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.verticalapi.com/v1",
    api_key="vapi_...",
    default_headers={"X-Provider-Key": "..."},  # your OctoAI key
)

response = client.chat.completions.create(
    model="meta-llama-3.3-70b-instruct",  # OctoAI
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```
Four reasons developers route OctoAI through us
Zero token markup
You pay OctoAI directly with your own key. VerticalAPI's revenue is the gateway subscription, not a tax on your tokens.
One key, every provider
OctoAI alongside OpenAI, Anthropic, Gemini and 12 more — same OpenAI-compatible endpoint, same SDK, switchable per-request.
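Switching per request only means changing `model` (and the BYOK header for whichever provider you're targeting). A stdlib-only sketch of the raw HTTP request shape — the `X-Provider-Key` header follows the Python example on this page, and the key values are placeholders:

```python
import json

CHAT_URL = "https://api.verticalapi.com/v1/chat/completions"

def build_request(model: str, prompt: str, provider_key: str) -> dict:
    """OpenAI-compatible chat payload plus the BYOK header pair."""
    return {
        "url": CHAT_URL,
        "headers": {
            "Authorization": "Bearer vapi_...",   # your VerticalAPI key
            "X-Provider-Key": provider_key,       # the upstream provider's key
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }

# Same endpoint, same shape -- only `model` and the provider key change:
llama = build_request("meta-llama-3.3-70b-instruct", "Hello", "<octoai-key>")
qwen = build_request("qwen2.5-32b-instruct", "Hello", "<octoai-key>")
```

Send either dict with any HTTP client; nothing else in your integration changes when you swap providers.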
Latency & cost monitoring
Per-request token counts, p50/p95 latency and cost dashboards out of the box. Compare OctoAI to other providers on identical prompts.
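p50/p95 are ordinary latency percentiles. If you want to sanity-check the dashboard numbers against your own timings, here is a minimal nearest-rank percentile — an illustration, not VerticalAPI's exact aggregation method:

```python
def percentile_ms(samples: list[float], p: float) -> float:
    """Nearest-rank percentile over latency samples (dashboards may interpolate)."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    idx = int(round(p / 100 * (len(ordered) - 1)))
    return ordered[idx]

# Latencies in ms, e.g. measured around each call with time.perf_counter()
latencies = [112.0, 98.0, 105.0, 240.0, 101.0]
p50 = percentile_ms(latencies, 50)  # 105.0
p95 = percentile_ms(latencies, 95)  # 240.0
```

Running identical prompts against two model IDs and comparing their p50/p95 this way mirrors what the dashboard does per provider.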
Observability built in
Every OctoAI call gets a trace ID, replayable payload and audit log entry. Wire to Datadog or Sentry via OpenTelemetry.
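Conceptually, each audit entry pairs a trace ID with the verbatim request payload so the call can be replayed later. A sketch of that shape — the field names here are assumptions for illustration, not VerticalAPI's actual log schema:

```python
import datetime
import json
import uuid

def audit_entry(model: str, payload: dict) -> dict:
    """One replayable audit record (illustrative field names)."""
    return {
        "trace_id": str(uuid.uuid4()),  # unique per call, propagated to Datadog/Sentry
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "model": model,
        "payload": json.dumps(payload),  # stored verbatim so the request can be replayed
    }

entry = audit_entry(
    "meta-llama-3.3-70b-instruct",
    {"messages": [{"role": "user", "content": "Hello"}]},
)
```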
Common questions about OctoAI on VerticalAPI
Is OctoAI still independent?
OctoAI was acquired by NVIDIA in late 2024. The hosted inference API remains operational; VerticalAPI tracks endpoint changes and surfaces deprecation notices in the dashboard. <!-- TODO Hugo: confirm OctoAI inference API SLA post-acquisition -->
All supported LLM providers
Same endpoint, same SDK — just change the model and the BYOK header.
Ship on OctoAI in 60 seconds
Free tier — bring your own OctoAI key, zero markup, OpenAI-compatible endpoint.
Get your VerticalAPI key →