Overview

Apifuse — OpenAI-Compatible API Gateway (GPT-5, Claude, Gemini)

Apifuse exposes one unified OpenAI-compatible endpoint for GPT-5, Claude and Gemini. Migrate from OpenAI in minutes by changing your base URL to https://api.apifuse.net/v1 and keeping your existing SDKs; no code rewrite is needed. Requests are routed across multiple providers for low latency and high uptime, backed by an enterprise SLA and global CDN acceleration. Pricing is transparent and pay-as-you-go.

Quick Start

The Chat API is OpenAI-compatible and serves GPT-5, Claude Sonnet 4.5 and Gemini 2.0. Switch the base URL; no other code changes are needed.

Create an API key, point your client at the Apifuse base URL, and call any of the 500+ models from the Model Market.
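
As a rough sketch of those steps using only the Python standard library: the base URL and endpoint path come from this page, while the environment variable name APIFUSE_API_KEY and the helper build_chat_request are illustrative assumptions, not Apifuse names. The request is only constructed here, not sent.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.apifuse.net/v1"  # base URL given on this page


def build_chat_request(prompt, model="gpt-5", api_key=None):
    """Build (but do not send) a chat completion request."""
    # APIFUSE_API_KEY is a hypothetical variable name chosen for this sketch.
    key = api_key or os.environ.get("APIFUSE_API_KEY", "<token>")
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request("Hello", api_key="test-key")
print(req.get_method(), req.full_url)
```

To actually send it, pass req to urllib.request.urlopen and parse the JSON body of the response.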

Text Series

Every text endpoint mirrors a provider's native request and response shape, so existing tooling, SDKs and prompts work unchanged. Pick the schema you already use — OpenAI, Anthropic or Google — and only the base URL changes.
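
One way to picture the choice of schema is a small lookup from schema to endpoint path. The paths are the ones listed in this section; the dictionary keys and the endpoint_url helper are labels made up for this sketch.

```python
BASE = "https://api.apifuse.net"

# Paths as listed in this section; the keys are labels chosen for this sketch.
ENDPOINTS = {
    "openai": "/v1/chat/completions",
    "anthropic": "/v1/messages",
    "google": "/v1beta/models/{model}:generateContent",
}


def endpoint_url(schema, model=""):
    """Return the full URL for a schema; only the Google path embeds the model."""
    return BASE + ENDPOINTS[schema].format(model=model)


print(endpoint_url("google", "gemini-2.5-pro"))
```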

General Chat API (Streaming)

Call the standard /v1/chat/completions endpoint with "stream": true. Tokens arrive as server-sent events, which keeps time-to-first-token low. GPT-5, Claude and Gemini are all reachable through the same OpenAI schema.

# Streaming chat completion — OpenAI-compatible
curl https://api.apifuse.net/v1/chat/completions \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5",
    "messages": [{ "role": "user", "content": "Hello" }],
    "stream": true
  }'
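
On the client side, a streamed response has to be reassembled from its events. The sketch below assumes Apifuse emits the same chunk format as OpenAI's streaming API (data: lines carrying JSON with choices[].delta.content, ended by data: [DONE]); the function name iter_stream_text and the sample events are illustrative.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from OpenAI-style server-sent event lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel in the OpenAI format
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # prints "Hello"
```

When reading from a real HTTP response the lines arrive as bytes, so decode each one as UTF-8 before passing it in.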

General Chat API (Non-Streaming)

The same /v1/chat/completions endpoint with "stream": false returns one complete JSON response — the simplest option for batch jobs and server-to-server calls.

# Non-streaming chat completion
curl https://api.apifuse.net/v1/chat/completions \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5",
    "messages": [{ "role": "user", "content": "Hello" }],
    "stream": false
  }'

Claude Messages API

Call Claude models with Anthropic's native /v1/messages schema — system prompts, multi-turn messages and max_tokens behave exactly as they do against Anthropic directly.

# Claude Messages API — Anthropic-native schema
curl https://api.apifuse.net/v1/messages \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "max_tokens": 1024,
    "messages": [{ "role": "user", "content": "Hello" }]
  }'
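
A minimal sketch of building a multi-turn body with a system prompt, and of reading the reply, assuming the request and response shapes match Anthropic's Messages schema as this section states. The helper names and the sample response are made up for illustration.

```python
def build_messages_body(model, turns, system=None, max_tokens=1024):
    """Build an Anthropic-style Messages body from (role, text) turns."""
    body = {
        "model": model,
        "max_tokens": max_tokens,  # required by the Messages schema
        "messages": [{"role": role, "content": text} for role, text in turns],
    }
    if system:
        body["system"] = system  # top-level field, not a message with role "system"
    return body


def response_text(response):
    """Join the text blocks of a Messages response."""
    blocks = response.get("content", [])
    return "".join(b["text"] for b in blocks if b.get("type") == "text")


body = build_messages_body(
    "claude-sonnet-4-5",
    [("user", "Hello"), ("assistant", "Hi! How can I help?"), ("user", "Name one prime.")],
    system="Answer in one short sentence.",
)
# Hand-written sample in the Messages response shape, not real model output.
sample_response = {"role": "assistant", "content": [{"type": "text", "text": "2 is prime."}]}
print(len(body["messages"]), response_text(sample_response))
```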

Gemini Native Format

Use Google's native /v1beta/models/{model}:generateContent format with contents and parts — ideal if your code already targets the Gemini SDK.

# Gemini native generateContent
curl https://api.apifuse.net/v1beta/models/gemini-2.5-pro:generateContent \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{ "parts": [{ "text": "Hello" }] }]
  }'
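
The contents and parts nesting can be sketched as follows, assuming the gateway returns Google's usual generateContent response shape (candidates[0].content.parts[].text). The helper names and the sample response are illustrative.

```python
def build_generate_content_body(turns):
    """Build a generateContent body from (role, text) turns.

    In the Gemini schema the roles are "user" and "model".
    """
    return {
        "contents": [
            {"role": role, "parts": [{"text": text}]} for role, text in turns
        ]
    }


def candidate_text(response):
    """Concatenate the text parts of the first candidate, if any."""
    candidates = response.get("candidates") or []
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    return "".join(p.get("text", "") for p in parts)


body = build_generate_content_body([("user", "Hello")])
# Hand-written sample in the generateContent response shape, not real model output.
sample_response = {
    "candidates": [{"content": {"role": "model", "parts": [{"text": "Hi "}, {"text": "there"}]}}]
}
print(candidate_text(sample_response))  # prints "Hi there"
```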

OpenAI Multimodal Responses API

The newer /v1/responses endpoint accepts mixed text and image input and returns structured multimodal output — a drop-in for OpenAI's Responses API.

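# OpenAI Responses API: mixed text and image input (<image-url> is a placeholder)
curl https://api.apifuse.net/v1/responses \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5",
    "input": [{
      "role": "user",
      "content": [
        { "type": "input_text", "text": "Describe this product in one line." },
        { "type": "input_image", "image_url": "<image-url>" }
      ]
    }]
  }'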
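
Reading the result takes one more step than with chat completions, because a Responses payload holds a list of output items. A hedged sketch, assuming the gateway returns OpenAI's Responses shape (output[] items of type "message" whose content[] parts have type "output_text"); the helper name and the sample payload are illustrative.

```python
def output_text(response):
    """Collect the text parts from the message items of a Responses payload."""
    texts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip non-message items such as reasoning traces
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                texts.append(part["text"])
    return "".join(texts)


# Hand-written sample in the Responses shape, not real model output.
sample_response = {
    "output": [
        {"type": "reasoning", "summary": []},
        {
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "A compact steel water bottle."}],
        },
    ]
}
print(output_text(sample_response))
```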

Image Series

Generate and edit images via /v1/images/generations across Nano Banana, GPT Image and Imagen — one schema, per-resolution pricing tiers, and billing only on successful renders. Switch model by changing the model field.

Nano Banana 2

Google's gemini-3-pro-image-preview (alias nano-banana-2) for high-fidelity 1K–4K generation with strong text rendering.

# Image generation — Nano Banana 2
curl https://api.apifuse.net/v1/images/generations \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nano-banana-2",
    "prompt": "A red panda barista, studio lighting",
    "size": "1024x1024"
  }'

GPT Image 2

OpenAI's gpt-image-2 — near-perfect text rendering and native 4K output. Pass "model": "gpt-image-2" to the same generations endpoint.

Imagen 4.0

Google DeepMind's flagship text-to-image model — up to 2K resolution with improved prompt adherence. Use "model": "imagen-4.0".
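
Because all three models share one schema, switching is a one-field change. The sketch below builds the request body per model and reads the first image from a response. The response shape (data[0] holding either b64_json or url) is an assumption borrowed from OpenAI's Images API and is not documented on this page; the helper names are illustrative.

```python
import base64

IMAGE_MODELS = ("nano-banana-2", "gpt-image-2", "imagen-4.0")  # ids named on this page


def image_request(model, prompt, size="1024x1024"):
    """Body for /v1/images/generations; only `model` differs between providers."""
    return {"model": model, "prompt": prompt, "size": size}


def first_image(response):
    """Return ("bytes", data) or ("url", link) for the first image.

    Assumes an OpenAI-style response: {"data": [{"b64_json": ...} or {"url": ...}]}.
    """
    item = response["data"][0]
    if item.get("b64_json"):
        return "bytes", base64.b64decode(item["b64_json"])
    return "url", item["url"]


bodies = [image_request(m, "A red panda barista, studio lighting") for m in IMAGE_MODELS]
print([b["model"] for b in bodies])
```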

Video Series

Asynchronous video generation via /v1/videos for Sora, Veo and Kling. Submit a job, poll the returned id for completion, and pay only for successful renders.

Sora 2 (Text-to-Video)

OpenAI's sora-2 for cinematic motion, synced audio and up to 20-second 1080p clips. Submit a render job, then poll for the result URL.

# Submit a Sora 2 render job
curl https://api.apifuse.net/v1/videos \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sora-2",
    "prompt": "A timelapse of a city at dusk",
    "duration": 5,
    "resolution": "1080p"
  }'

# Poll for completion
curl https://api.apifuse.net/v1/videos/<id> \
  -H "Authorization: Bearer <token>"
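
The submit-and-poll flow could be wrapped in a small helper along these lines. This is a sketch under assumptions: the page does not document the job object, so the status field and its "completed" and "failed" values are guesses modeled on common asynchronous job APIs, and fetch_job is a hypothetical callable you supply (for example, a GET to /v1/videos/<id> that returns parsed JSON).

```python
import time


def wait_for_video(fetch_job, job_id, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll until the job finishes; fetch_job(job_id) must return the job as a dict."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job(job_id)
        status = job.get("status")
        if status == "completed":  # assumed terminal success value
            return job
        if status == "failed":  # assumed terminal failure value
            raise RuntimeError(f"video job {job_id} failed: {job.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"video job {job_id} still {status!r} after {timeout}s")
        sleep(interval)


# Offline demonstration with a stand-in for the HTTP call.
_states = iter([
    {"status": "queued"},
    {"status": "in_progress"},
    {"status": "completed", "id": "job-1"},
])
job = wait_for_video(lambda job_id: next(_states), "job-1", sleep=lambda s: None)
print(job["status"])  # prints "completed"
```

A fixed interval keeps the sketch simple; for long renders a growing interval would cut down on polling requests.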

Veo 3.1

Google DeepMind's veo-3.1 — low-latency generation with strong physics and prompt fidelity. Same submit-and-poll flow with "model": "veo-3.1".

Kling V3

Kuaishou's kling-v3 — high-motion video with extended duration and precise camera control. Use "model": "kling-v3" on the same endpoint.

Platform Benefits

Transparent Pricing

Per-model, per-spec rates you can forecast, up to 70% below the providers' official prices and billed only on successful requests.

Multi-Provider Routing & SLA

Requests routed across redundant upstreams for low latency and an enterprise-grade SLA.

99.9% Uptime SLA

Resilient infrastructure with automatic failover and a public real-time status page.

Global CDN Acceleration

Edge points of presence worldwide keep round-trips short wherever your users are.

Rate Limit Management

Generous defaults with elastic, per-key limits you can raise as you scale.
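
Until a limit is raised, clients usually handle throttling by retrying with backoff. The sketch below assumes the gateway signals a rate limit with the standard HTTP 429 status, possibly with a Retry-After header; neither is documented on this page. RateLimited and call_with_backoff are hypothetical names, and send stands for whatever function performs your request and raises RateLimited on a 429.

```python
import random
import time


class RateLimited(Exception):
    """Raised by the caller's `send` function when the gateway answers HTTP 429."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after  # seconds from a Retry-After header, if any


def call_with_backoff(send, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send() and retry on RateLimited with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimited as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            if exc.retry_after is not None:
                delay = exc.retry_after
            else:
                delay = base_delay * 2 ** attempt  # 0.5s, 1s, 2s, ...
            sleep(delay + random.uniform(0, 0.1))


# Offline demonstration: fail twice with a simulated 429, then succeed.
attempts = {"n": 0}
delays = []


def flaky_send():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimited()
    return "ok"


result = call_with_backoff(flaky_send, sleep=delays.append)
print(result, len(delays))  # prints "ok 2"
```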

Real-Time Status

Live latency and availability metrics per model, updated continuously.

SDKs & Code Examples

Use the official OpenAI SDKs: set your Apifuse API key and change the base URL, and the rest of your code stays the same.

Python SDK

from openai import OpenAI

client = OpenAI(
    api_key="<token>",
    base_url="https://api.apifuse.net/v1",
)

resp = client.chat.completions.create(
    model="gpt-5",
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)

Node.js SDK

// Run as an ES module (for example a .mjs file) so that top-level await works.
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "<token>",
  baseURL: "https://api.apifuse.net/v1",
});

const resp = await client.chat.completions.create({
  model: "gpt-5",
  messages: [{ role: "user", content: "Hello" }],
});
console.log(resp.choices[0].message.content);