API reference
WorldRouter exposes an OpenAI-compatible /chat/completions endpoint. Any SDK or HTTP client that accepts a custom base URL can call it with no code changes.
New to WorldRouter? Run through the Quickstart first for an API key, the base URL, and a connection test. This page assumes you already have a working key and the base URL.
Endpoint
```
POST https://inference-api.worldrouter.ai/v1/chat/completions
```

Authentication
Pass your API key as a Bearer token in the Authorization header:
```
Authorization: Bearer your_api_key
```

Keys are scoped to your account. Create and rotate them in the API Keys dashboard.
Request body
| Field | Required | Description |
|---|---|---|
| model | yes | Model ID to route to, e.g. gpt-5.4. See the Models catalog. |
| messages | yes | Array of chat messages. Must contain at least one entry. |
| temperature | no | Sampling temperature. Defaults depend on the model. |
| max_tokens | no | Upper bound on output tokens. On reasoning models the budget can be fully consumed by hidden reasoning tokens, which returns empty content with finish_reason: "length". |
| stream | no | When true, the response is a Server-Sent Events stream of deltas. |
| tools / tool_choice | no | Function calling, same schema as OpenAI. |
Any other field in the OpenAI /chat/completions schema (top_p, stop, seed, response_format, …) is accepted unchanged.
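Because extra OpenAI-schema fields pass through unchanged, a request body can be assembled directly. A sketch combining required and optional fields — the values are illustrative, and whether a given optional field is honored depends on the model being routed to:

```python
import json

# Sketch of a /chat/completions payload using fields from the table above.
# Values are illustrative only; optional-field support varies by model.
payload = {
    "model": "gpt-5.4",
    "messages": [{"role": "user", "content": "Reply in JSON."}],
    "temperature": 0.2,                          # optional sampling control
    "max_tokens": 256,                           # optional output-token cap
    "seed": 7,                                   # optional, best-effort determinism
    "response_format": {"type": "json_object"},  # optional structured output
}
# Send as the POST body with Content-Type: application/json.
body = json.dumps(payload)
```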
Response
A non-streaming response matches the OpenAI shape:
```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1738960610,
  "model": "gpt-5.4",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Hello! How can I help you today?" },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 13,
    "completion_tokens": 9,
    "total_tokens": 22
  }
}
```

Examples
```shell
curl https://inference-api.worldrouter.ai/v1/chat/completions \
  -H "Authorization: Bearer your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "messages": [{ "role": "user", "content": "Hello" }]
  }'
```

```python
from openai import OpenAI

client = OpenAI(
    api_key="your_api_key",
    base_url="https://inference-api.worldrouter.ai/v1",
)
response = client.chat.completions.create(
    model="gpt-5.4",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "your_api_key",
  baseURL: "https://inference-api.worldrouter.ai/v1",
});
const response = await client.chat.completions.create({
  model: "gpt-5.4",
  messages: [{ role: "user", content: "Hello" }],
});
console.log(response.choices[0].message.content);
```

Streaming
Set stream: true in the request body. The response becomes a Server-Sent Events stream. Each chunk follows the OpenAI chat.completion.chunk shape, and the stream terminates with a data: [DONE] line:
```shell
curl https://inference-api.worldrouter.ai/v1/chat/completions \
  -H "Authorization: Bearer your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "stream": true,
    "messages": [{ "role": "user", "content": "Hello" }]
  }'
```

```python
from openai import OpenAI

client = OpenAI(
    api_key="your_api_key",
    base_url="https://inference-api.worldrouter.ai/v1",
)
stream = client.chat.completions.create(
    model="gpt-5.4",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "your_api_key",
  baseURL: "https://inference-api.worldrouter.ai/v1",
});
const stream = await client.chat.completions.create({
  model: "gpt-5.4",
  messages: [{ role: "user", content: "Hello" }],
  stream: true,
});
for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta?.content;
  if (delta) process.stdout.write(delta);
}
```

Error codes
| Code | Meaning | Fix |
|---|---|---|
| 400 | Invalid request (unknown model, malformed body, unsupported parameter) | Verify the model ID matches the Models page (IDs are case-sensitive) and check the request body against the field table above. |
| 401 | Invalid or missing API key | Check that your key is set correctly and has not been revoked in the dashboard. |
| 402 | Insufficient credits | Top up on the Credits page, then retry. |
| 429 | Rate limited | Back off and retry with exponential delay. Consider spreading load across models. |
| 500 | Server error | Retry the request. If it persists, try a different model or contact support. |
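The retry advice for 429 and 500 can be wrapped in a small helper. A sketch — the helper and its names are illustrative, not part of the WorldRouter API; `send` stands in for whatever HTTP call your client makes:

```python
import random
import time

def send_with_backoff(send, max_retries: int = 5, base_delay: float = 1.0):
    """Retry `send` on 429/500 with exponential backoff plus jitter.

    `send` is a zero-argument callable returning (status_code, body);
    it stands in for whatever HTTP client performs the actual request.
    """
    for attempt in range(max_retries):
        status, body = send()
        if status not in (429, 500):
            return status, body
        # Delay doubles each attempt (base, 2*base, 4*base, ...) with
        # random jitter so concurrent clients don't retry in lockstep.
        time.sleep(base_delay * (2 ** attempt) + random.random() * base_delay)
    return status, body
```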
See also
- OpenAI Chat Completions reference: WorldRouter mirrors this schema.
- Models: full catalog with live pricing.
- Quickstart: API key, base URL, and first call.