Fireworks AI via VerticalAPI

Access Fireworks AI's optimized Llama 3.3, DeepSeek V3 and function-calling models through VerticalAPI's OpenAI-compatible endpoint. Bring your own key (BYOK), zero markup.

Start free with your Fireworks AI key → · Read the docs

Endpoint: https://api.verticalapi.com/v1/chat/completions · BYOK header: X-Provider-Key: fw_...

Supported models

Fireworks AI models routed by VerticalAPI

Pass a model ID from the table below as the model parameter in any OpenAI-compatible request. New Fireworks AI models are typically supported within 24 hours of release.

Model ID	Name	Context (tokens)	Provider pricing (USD)
`accounts/fireworks/models/llama-v3p3-70b-instruct`	Llama 3.3 70B (FW)	128K	$0.90 per 1M tokens
`accounts/fireworks/models/deepseek-v3`	DeepSeek V3 (FW)	64K	$1.20 per 1M tokens
`accounts/fireworks/models/firefunction-v2`	FireFunction v2	32K	$0.90 per 1M tokens (tool-tuned)

Pricing reflects Fireworks AI's rates — you pay Fireworks AI directly. VerticalAPI adds zero markup on tokens.
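
As a rough illustration of how the per-token rates in the table translate into spend, the sketch below estimates the provider cost of one request from its total token count. `PRICE_PER_1M_USD` and `estimate_cost_usd` are hypothetical names, not part of any VerticalAPI or Fireworks AI SDK, and the rates are simply copied from the table, so treat them as a snapshot.

```python
# Illustrative only: rates copied from the table on this page.
# Check Fireworks AI's current pricing before relying on them.
PRICE_PER_1M_USD = {
    "accounts/fireworks/models/llama-v3p3-70b-instruct": 0.90,
    "accounts/fireworks/models/deepseek-v3": 1.20,
    "accounts/fireworks/models/firefunction-v2": 0.90,
}


def estimate_cost_usd(model: str, total_tokens: int) -> float:
    """Estimate the provider cost of one request (hypothetical helper)."""
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    # Cost scales linearly: tokens times the price per one million tokens.
    return total_tokens * PRICE_PER_1M_USD[model] / 1_000_000
```

With the OpenAI Python client, the token count is usually available as `response.usage.total_tokens`, provided the provider returns usage data.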

Quickstart

A minimal Fireworks AI call via VerticalAPI

A drop-in replacement for the OpenAI endpoint: it works with the OpenAI Python client, Node, Go, curl, or anything else that speaks HTTP.

fireworks_quickstart.py (Python)

from openai import OpenAI

# Point the OpenAI client at VerticalAPI instead of api.openai.com.
client = OpenAI(
    base_url="https://api.verticalapi.com/v1",
    api_key="vapi_...",  # your VerticalAPI key
    default_headers={"X-Provider-Key": "fw_..."},  # your Fireworks AI key (BYOK)
)

response = client.chat.completions.create(
    model="accounts/fireworks/models/llama-v3p3-70b-instruct",  # Fireworks AI
    messages=[{"role": "user", "content": "Hello"}],
)
# Print the assistant's reply from the first choice.
print(response.choices[0].message.content)
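
For stacks without an OpenAI SDK, the same call can be sketched as a plain HTTP POST using only the Python standard library. This is a minimal sketch, not an official client: it assumes the VerticalAPI key is sent as a bearer token (the scheme the OpenAI SDK uses for `api_key`), and the helper names `build_chat_request` and `send_chat_request` are hypothetical.

```python
import json
import urllib.request

ENDPOINT = "https://api.verticalapi.com/v1/chat/completions"


def build_chat_request(vapi_key: str, provider_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Assumption: bearer auth, as the OpenAI SDK does with api_key.
            "Authorization": f"Bearer {vapi_key}",
            # BYOK header from this page: your Fireworks AI key.
            "X-Provider-Key": provider_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(req: urllib.request.Request) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(req, timeout=30) as resp:
        reply = json.load(resp)
    return reply["choices"][0]["message"]["content"]
```

Calling `send_chat_request(build_chat_request(vapi_key, fireworks_key, model_id, "Hello"))` with real keys should return the same text as the SDK version, as long as the gateway mirrors the OpenAI response shape.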

Why use Fireworks AI via VerticalAPI

Four reasons developers route Fireworks AI through us

Zero token markup

You pay Fireworks AI directly with your own key. VerticalAPI's revenue is the gateway subscription, not a tax on your tokens.

One key, every provider

Fireworks AI alongside OpenAI, Anthropic, Gemini and 12 more — same OpenAI-compatible endpoint, same SDK, switchable per-request.

Latency & cost monitoring

Per-request token counts, p50/p95 latency and cost dashboards out of the box. Compare Fireworks AI to other providers on identical prompts.

Observability built in

Every Fireworks AI call gets a trace ID, replayable payload and audit log entry. Wire to Datadog or Sentry via OpenTelemetry.

Best for

Where Fireworks AI shines

Function-calling tuned models · DeepSeek hosted · FireOptimizer fine-tunes

FAQ

Common questions about Fireworks AI on VerticalAPI

What's FireFunction v2?

FireFunction v2 is Fireworks AI's Llama-based model fine-tuned for function/tool calling, reported to often score above 90% on tool-call benchmarks. It is available through VerticalAPI's standard tools[] interface.
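
To make the tools[] interface concrete, here is a minimal sketch of a tool-calling round trip in the OpenAI chat format. The `get_weather` tool, the `LOCAL_TOOLS` registry and the helper names are hypothetical, and the sketch assumes VerticalAPI passes the `tools` array and the resulting `tool_calls` through unchanged.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical local tool; a real one would query a weather service."""
    return f"Sunny in {city}"


# Tool schema in the OpenAI "tools" format (JSON Schema parameters).
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Maps tool names the model may request to local Python callables.
LOCAL_TOOLS = {"get_weather": get_weather}


def build_payload(prompt: str) -> dict:
    """Request body for a FireFunction v2 call that advertises TOOLS."""
    return {
        "model": "accounts/fireworks/models/firefunction-v2",
        "messages": [{"role": "user", "content": prompt}],
        "tools": TOOLS,
    }


def run_tool_call(tool_call: dict) -> dict:
    """Execute one tool call and return the matching 'tool' role message."""
    fn = tool_call["function"]
    args = json.loads(fn["arguments"])  # arguments arrive as a JSON string
    result = LOCAL_TOOLS[fn["name"]](**args)
    return {"role": "tool", "tool_call_id": tool_call["id"], "content": result}
```

In a full loop, each entry of the assistant message's `tool_calls` would go through `run_tool_call`, and the returned messages would be appended to the conversation before asking the model for its final answer.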

Switch providers

All supported LLM providers

Same endpoint, same SDK — just change the model and the BYOK header.
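
One way to picture per-request switching is a small routing table that yields the model and BYOK header for each call. This is a sketch under assumptions: `ROUTES`, `completion_kwargs` and the environment variable names are hypothetical, and the OpenAI model ID shown is an assumed example rather than a value confirmed for VerticalAPI.

```python
import os

# Hypothetical routing table. The Fireworks AI model ID comes from this page;
# the OpenAI model ID and both environment variable names are assumptions.
ROUTES = {
    "fireworks": {
        "model": "accounts/fireworks/models/llama-v3p3-70b-instruct",
        "key_env": "FIREWORKS_API_KEY",
    },
    "openai": {
        "model": "gpt-4o-mini",
        "key_env": "OPENAI_API_KEY",
    },
}


def completion_kwargs(provider: str, prompt: str) -> dict:
    """Keyword arguments for client.chat.completions.create()."""
    route = ROUTES[provider]
    return {
        "model": route["model"],
        "messages": [{"role": "user", "content": prompt}],
        # extra_headers applies to this request only; a missing key
        # raises KeyError instead of sending an empty header.
        "extra_headers": {"X-Provider-Key": os.environ[route["key_env"]]},
    }
```

With an OpenAI Python client pointed at VerticalAPI, `client.chat.completions.create(**completion_kwargs("fireworks", "Hello"))` and the same call with `"openai"` would then differ only in the model and the provider key.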

OpenAI · Anthropic · Google Gemini · Mistral AI · Meta Llama · xAI Grok · Groq · Together AI · Fireworks AI · Perplexity Sonar · Cohere · AI21 Labs · AWS Bedrock · Azure OpenAI · Google Vertex AI

Ship on Fireworks AI in 60 seconds

Free tier — bring your own Fireworks AI key, zero markup, OpenAI-compatible endpoint.

Get your VerticalAPI key →