Kimi K2
Long-context agentic model optimized for tool use and document workflows.
Try Kimi K2
Prototype a request in the browser, then ship it with one base-URL change.
This is a UI preview. Create a free key →
Pricing details
Per-spec rates for Kimi K2 — every tier up to 20% below official.
| Spec | Our price | Official | Save |
|---|---|---|---|
| Input · per 1M tokens | $0.6000 | $0.7500 | 20% ↓ |
| Cached input · per 1M | $0.2000 | $0.2500 | 20% ↓ |
| Output · per 1M tokens | $2.00 | $2.50 | 20% ↓ |
Core capabilities
Why teams call Kimi K2 through Apifuse.
OpenAI-compatible
Call Kimi K2 with the exact request and response schema you already use.
Transparent pricing
$2.00/1M per 1M tokens — about 20% below the $2.50/1M official rate.
Multi-provider routing
Low latency with automatic failover across redundant upstreams.
Fast integration
One base-URL change, no SDK rewrite, working in minutes.
Volume discounts
Tiered savings apply automatically as your usage grows.
Human support
Engineers on chat when an integration or invoice needs attention.
Start using Kimi K2 in minutes
Three steps, no migration headaches.
What developers say
Teams shipping with Kimi K2 on Apifuse.
“Switched my base URL and the bill dropped instantly. Same outputs, a fraction of the cost.”
“One key, one invoice across every model we use. Procurement finally stopped complaining.”
“Routing has been rock solid under load — latency is consistently lower than going direct.”
“Dropped it into our toolchain in an afternoon. The OpenAI-compatible schema just works.”
Kimi K2 FAQ
Common questions before you integrate.
What is Kimi K2 and what can it do? +
How much does Kimi K2 cost on Apifuse? +
How do I call Kimi K2? +
Is there an SLA for Kimi K2? +
Related models
More llm models on Apifuse.
GPT-5
Frontier reasoning and coding model with long context and strong tool use.
Claude Sonnet 4.5
Balanced flagship for agents and coding — fast, steerable, and reliable.
Gemini 2.5 Pro
Multimodal reasoning with a very large context window and native tool calling.
Grok-4
Real-time-aware model with strong reasoning and a playful, direct voice.