Fireworks AI via VerticalAPI

Access Fireworks AI's optimized Llama 3.3, DeepSeek V3, and function-calling models through VerticalAPI's OpenAI-compatible endpoint. Bring your own Fireworks key (BYOK); zero token markup.

Endpoint: https://api.verticalapi.com/v1/chat/completions  ·  BYOK header: X-Provider-Key: fw_...

Fireworks AI models routed by VerticalAPI

Pass a model ID from the table below as the model field in any OpenAI-compatible request. New Fireworks AI models are typically supported within 24 hours of release.

Model ID                                           | Name               | Context | Pricing (provider)
accounts/fireworks/models/llama-v3p3-70b-instruct  | Llama 3.3 70B (FW) | 128K    | $0.90 per 1M tok
accounts/fireworks/models/deepseek-v3               | DeepSeek V3 (FW)   | 64K     | $1.20 per 1M tok
accounts/fireworks/models/firefunction-v2           | FireFunction v2    | 32K     | $0.90 per 1M tok (tool-tuned)

Pricing reflects Fireworks AI's rates — you pay Fireworks AI directly. VerticalAPI adds zero markup on tokens.
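As a quick sanity check on the table above, here is a minimal sketch of estimating token spend. The rates are copied from the table and treated as a flat blended per-token price (an assumption; actual billing follows Fireworks AI's invoice):

```python
# Estimate Fireworks AI spend from the per-1M-token rates in the table above.
# Flat blended rate per token is an assumption for illustration.
RATES_PER_1M = {
    "accounts/fireworks/models/llama-v3p3-70b-instruct": 0.90,
    "accounts/fireworks/models/deepseek-v3": 1.20,
    "accounts/fireworks/models/firefunction-v2": 0.90,
}

def estimate_cost(model: str, total_tokens: int) -> float:
    """Dollar cost for total_tokens (prompt + completion) at the flat rate."""
    return RATES_PER_1M[model] * total_tokens / 1_000_000

# e.g. 250K tokens on Llama 3.3 70B
print(estimate_cost("accounts/fireworks/models/llama-v3p3-70b-instruct", 250_000))
```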

5-line Fireworks AI call via VerticalAPI

Drop-in replacement for the OpenAI SDK. Works with the OpenAI Python client, Node, Go, curl — anything that speaks HTTP.

fireworks_quickstart.py
from openai import OpenAI

client = OpenAI(
    base_url="https://api.verticalapi.com/v1",
    api_key="vapi_...",
    default_headers={"X-Provider-Key": "fw_..."}
)

response = client.chat.completions.create(
    model="accounts/fireworks/models/llama-v3p3-70b-instruct",  # Fireworks AI
    messages=[{"role": "user", "content": "Hello"}]
)
print(response.choices[0].message.content)

Four reasons developers route Fireworks AI through us

Zero token markup

You pay Fireworks AI directly with your own key. VerticalAPI's revenue is the gateway subscription, not a tax on your tokens.

One key, every provider

Fireworks AI alongside OpenAI, Anthropic, Gemini and 12 more — same OpenAI-compatible endpoint, same SDK, switchable per-request.
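Because every provider sits behind the same OpenAI-compatible endpoint, switching per request is just a different model ID and a different X-Provider-Key header. A small sketch of building per-request kwargs (the helper name is ours; the OpenAI Python client accepts per-call extra_headers):

```python
# Per-request provider switching: same endpoint, different model + BYOK header.
# request_kwargs is an illustrative helper, not part of any SDK.
def request_kwargs(model: str, provider_key: str, prompt: str) -> dict:
    """Build OpenAI-compatible chat.completions kwargs for one request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "extra_headers": {"X-Provider-Key": provider_key},  # BYOK, per request
    }

fw = request_kwargs(
    "accounts/fireworks/models/llama-v3p3-70b-instruct", "fw_...", "Hello"
)
# Same client as the quickstart: client.chat.completions.create(**fw)
```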

Latency & cost monitoring

Per-request token counts, p50/p95 latency and cost dashboards out of the box. Compare Fireworks AI to other providers on identical prompts.
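For intuition, this is the computation behind a p50/p95 latency panel: percentiles over per-request wall-clock times. The sample values are made up for illustration:

```python
# p50/p95 over per-request latencies, as a latency dashboard would report them.
# Sample data is illustrative only.
from statistics import quantiles

latencies_ms = [210, 190, 850, 230, 205, 1900, 220, 240, 198, 215]

def p50_p95(samples: list[float]) -> tuple[float, float]:
    """Median and 95th percentile using inclusive interpolation."""
    qs = quantiles(samples, n=100, method="inclusive")  # 99 cut points
    return qs[49], qs[94]  # 50th and 95th percentiles

p50, p95 = p50_p95(latencies_ms)
```

Note how a single slow outlier (1900 ms) barely moves the p50 but dominates the p95, which is why both are worth tracking when comparing providers on identical prompts.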

Observability built in

Every Fireworks AI call gets a trace ID, replayable payload and audit log entry. Wire to Datadog or Sentry via OpenTelemetry.

Where Fireworks AI shines

Function-calling-tuned models · Hosted DeepSeek V3 · FireOptimizer fine-tunes

Common questions about Fireworks AI on VerticalAPI

What's FireFunction v2?

Fireworks' Llama-based model fine-tuned for high-accuracy function/tool calling — often >90% on tool-call benchmarks. Available via VerticalAPI's standard tools[] interface.
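A minimal sketch of calling FireFunction v2 through the standard tools[] interface. The get_weather schema is a made-up example; the payload shape is the OpenAI chat-completions tool format that VerticalAPI passes through:

```python
# OpenAI-style tools[] payload for FireFunction v2.
# get_weather is a hypothetical function for illustration.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

request = {
    "model": "accounts/fireworks/models/firefunction-v2",
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
    "tools": tools,
    "tool_choice": "auto",
}
# client.chat.completions.create(**request) returns tool_calls on the message
# when the model decides to invoke get_weather.
```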

All supported LLM providers

Same endpoint, same SDK — just change the model and the BYOK header.