Chat Completions
The Chat Completions API is the primary endpoint for generating text responses from AI models. It is fully compatible with the OpenAI Chat Completions API format.
Endpoint
POST https://toprouter.cc/chat/completions
Request Format
```json
{
  "model": "google/gemini-3.5-flash",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7,
  "max_tokens": 1024,
  "stream": false
}
```
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| `model` | string | ✅ | Model ID to use (e.g., `google/gemini-3.5-flash`) |
| `messages` | array | ✅ | Array of message objects |
| `temperature` | number | ❌ | Sampling temperature (0 to 2); default varies by model |
| `max_tokens` | integer | ❌ | Maximum tokens in the response |
| `stream` | boolean | ❌ | Enable streaming response; default `false` |
| `top_p` | number | ❌ | Nucleus sampling parameter |
| `frequency_penalty` | number | ❌ | Frequency penalty (-2 to 2) |
| `presence_penalty` | number | ❌ | Presence penalty (-2 to 2) |
| `stop` | string/array | ❌ | Stop sequences |
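
As a rough sketch of how these parameters fit together, the snippet below assembles a request body and checks the documented ranges before anything is sent. The helper name `build_chat_request`, the `Authorization: Bearer` header, and the placeholder key are assumptions for illustration (the Bearer scheme is what OpenAI-compatible clients typically send); nothing here performs a network call.

```python
import json
import urllib.request

ENDPOINT = "https://toprouter.cc/chat/completions"

# Ranges taken from the parameter table above.
DOCUMENTED_RANGES = {
    "temperature": (0, 2),
    "frequency_penalty": (-2, 2),
    "presence_penalty": (-2, 2),
}


def build_chat_request(api_key, model, messages, **options):
    """Hypothetical helper: validate options and return an unsent Request."""
    if not model or not messages:
        raise ValueError("model and messages are required")
    for name, (low, high) in DOCUMENTED_RANGES.items():
        value = options.get(name)
        if value is not None and not (low <= value <= high):
            raise ValueError(f"{name} must be between {low} and {high}")
    # Drop unset options so the server applies its own defaults.
    body = {"model": model, "messages": messages}
    body.update({k: v for k, v in options.items() if v is not None})
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request(
    "sk-your-toprouter-key",
    "google/gemini-3.5-flash",
    [{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    max_tokens=1024,
)
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen` and decoding the JSON response body.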
Message Roles
| Role | Description |
|---|---|
| `system` | Sets the behavior and context for the AI |
| `user` | The user's input message |
| `assistant` | Previous AI responses (for multi-turn conversations) |
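
One way to keep these roles straight in client code is a small history object that rejects unknown roles and appends each turn in order. The class name `Conversation` and its `add` method are hypothetical, not part of the API; the sketch only builds the `messages` array and sends nothing.

```python
VALID_ROLES = {"system", "user", "assistant"}  # roles listed in the table above


class Conversation:
    """Hypothetical helper that accumulates a messages array."""

    def __init__(self, system_prompt=None):
        self.messages = []
        if system_prompt:
            self.add("system", system_prompt)

    def add(self, role, content):
        # Reject anything outside the documented roles before it reaches the API.
        if role not in VALID_ROLES:
            raise ValueError(f"unknown role: {role!r}")
        self.messages.append({"role": role, "content": content})
        return self


chat = Conversation("You are a helpful assistant.")
chat.add("user", "Hello!")
chat.add("assistant", "Hello! How can I help?")
chat.add("user", "Tell me a joke.")
print([m["role"] for m in chat.messages])
# → ['system', 'user', 'assistant', 'user']
```

The resulting `chat.messages` list has the shape expected by the `messages` parameter.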
Response Format
```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1717500000,
  "model": "google/gemini-3.5-flash",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I'm here to help. What can I do for you?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 15,
    "total_tokens": 35
  }
}
```
Streaming
Enable streaming to receive the response incrementally:
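```python
from openai import OpenAI

# Point the OpenAI SDK at TopRouter instead of the default OpenAI endpoint.
client = OpenAI(
    api_key="sk-your-toprouter-key",
    base_url="https://toprouter.cc"
)

# With stream=True the call returns an iterator of incremental chunks.
stream = client.chat.completions.create(
    model="anthropic/claude-4.6-sonnet",
    messages=[{"role": "user", "content": "Write a haiku about coding"}],
    stream=True
)

for chunk in stream:
    # Some chunks carry no choices or no text delta, so check before printing.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```
Multi-Turn Conversations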
Maintain conversation context by including previous messages:
```python
# Assumes `client` is an OpenAI client configured with the TopRouter base URL.
messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "What is a Python decorator?"},
    {"role": "assistant", "content": "A Python decorator is a function that modifies another function..."},
    {"role": "user", "content": "Can you show me an example?"}
]

response = client.chat.completions.create(
    model="anthropic/claude-4.6-sonnet",
    messages=messages
)

# The reply to the latest user message is in the first choice.
print(response.choices[0].message.content)
```
Vision (Multimodal)
Some models support image inputs:
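```python
# Assumes `client` is an OpenAI client configured with the TopRouter base URL.
response = client.chat.completions.create(
    model="openai/gpt-5.5",
    messages=[
        {
            "role": "user",
            # Content is a list of parts: one text part and one image part.
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/image.jpg"}
                }
            ]
        }
    ]
)
```
TIP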
Not all models support vision. Check the Models page for multimodal capabilities.
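
For images that are not hosted publicly, OpenAI-format clients commonly embed the file as a base64 data URL in the same `image_url` field. Whether a given model behind this endpoint accepts data URLs is an assumption here, and the helper name `image_message` is hypothetical; the sketch only builds the message and sends nothing.

```python
import base64
import mimetypes


def image_message(prompt, image_bytes, filename):
    """Hypothetical helper: build a user message with an inline image."""
    # Guess the MIME type from the file name; fall back to a generic type.
    mime_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                # Assumption: the model accepts data URLs in place of http(s) URLs.
                "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
            },
        ],
    }


# Placeholder bytes stand in for a real file read with open(path, "rb").
message = image_message("What's in this image?", b"\x89PNG...", "photo.png")
print(message["content"][1]["image_url"]["url"][:22])
```

The returned dictionary can be passed as one element of `messages` in the same way as the URL-based example above.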
