All LLM provider verticals — VerticalAPI BYOK gateway

OpenAI openai

GPT-4o, GPT-4 Turbo, o1 reasoning models

View OpenAI integration →

Anthropic anthropic

Claude Sonnet 4.5, Opus 4.6, Haiku 4.5

View Anthropic integration →

Google Gemini google

Gemini 2.5 Pro, 2.5 Flash, Flash-8B

View Google Gemini integration →

Mistral AI mistral

Mistral Large, Codestral, Pixtral

View Mistral AI integration →

Meta Llama meta

Llama 3.3, Llama 3.2 Vision, Llama 4

View Meta Llama integration →

xAI Grok xai

Grok-3, Grok-2 Vision

View xAI Grok integration →

Groq groq

Sub-100ms inference for Llama, Mixtral, Whisper

View Groq integration →

Together AI together-ai

200+ open-weights models — Llama, Qwen, DeepSeek

View Together AI integration →

Fireworks AI fireworks

Fast inference for Llama and DeepSeek, with function calling

View Fireworks AI integration →

Perplexity Sonar perplexity

Sonar Pro — web-grounded answers with citations

View Perplexity Sonar integration →

Cohere cohere

Command R+, Embed v3, Rerank

View Cohere integration →

AI21 Labs ai21

Jamba 1.5 Large — hybrid Mamba-Transformer

View AI21 Labs integration →

AWS Bedrock aws-bedrock

Claude, Llama, Titan, Mistral — through your AWS account

View AWS Bedrock integration →

Azure OpenAI azure-openai

GPT-4o on Azure — your Azure subscription, your data residency

View Azure OpenAI integration →

Google Vertex AI vertex-ai

Gemini, Claude, and Llama on GCP, in your project with your data residency

View Google Vertex AI integration →

OpenRouter openrouter

300+ models across providers — universal router

View OpenRouter integration →

DeepInfra deepinfra

Cheap open-weights inference — Llama, Qwen, Mixtral

View DeepInfra integration →

Replicate replicate

Run any open-weights model — Llama, FLUX, Whisper, ComfyUI

View Replicate integration →

Cerebras cerebras

Wafer-scale inference built for very high tokens/sec throughput

View Cerebras integration →

Lambda Labs lambdalabs

GPU-cloud inference — Hermes 3, Llama 3.3

View Lambda Labs integration →

OctoAI octoai

Optimized inference for Llama, Stable Diffusion, and custom models

View OctoAI integration →

Lepton AI lepton

Production-grade open-weights inference

View Lepton AI integration →

NVIDIA NIM nvidia-nim

NVIDIA-optimized microservices for open-weights LLMs

View NVIDIA NIM integration →

Databricks Mosaic databricks-mosaic

DBRX, Llama, Mixtral served on Databricks

View Databricks Mosaic integration →

AI21 Jamba jamba

Hybrid Mamba-Transformer with 256K context — open weights

View AI21 Jamba integration →