Llama, Qwen, DeepSeek, and Mistral models behind a drop-in OpenAI-compatible API.
curl --request POST \
  --url https://api.tera.gw/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "Qwen/Qwen2.5-7B-Instruct",
    "messages": [
      {
        "role": "system",
        "content": "<string>",
        "name": "<string>",
        "tool_call_id": "<string>",
        "tool_calls": [
          {
            "id": "<string>",
            "type": "function",
            "function": {
              "name": "<string>",
              "arguments": "<string>"
            }
          }
        ]
      }
    ],
    "max_tokens": 256,
    "temperature": 0.7,
    "top_p": 0.5,
    "top_k": 123,
    "stop": "<string>",
    "seed": 123,
    "frequency_penalty": 0,
    "presence_penalty": 0,
    "repetition_penalty": 123,
    "stream": false,
    "tools": [
      {
        "type": "function"
      }
    ],
    "tool_choice": "none",
    "response_format": {}
  }
'

Example response:

{
  "id": "<string>",
  "object": "chat.completion",
  "created": 123,
  "model": "<string>",
  "choices": [
    {
      "index": 123,
      "message": {
        "role": "<string>",
        "content": "<string>",
        "reasoning_content": "<string>",
        "tool_calls": [
          {
            "id": "<string>",
            "type": "function",
            "function": {
              "name": "<string>",
              "arguments": "<string>"
            }
          }
        ]
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 123,
    "total_tokens": 123
  }
}

OpenAI-compatible chat completions endpoint. Set "stream": true for Server-Sent Events. Some models also return a reasoning_content field with chain-of-thought traces; see Reasoning models.
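When "stream": true is set, the endpoint sends Server-Sent Events. A minimal sketch of assembling the streamed text, assuming the standard OpenAI-style SSE framing (each event is a `data: {json}` line carrying a `choices[0].delta`, terminated by a `data: [DONE]` sentinel); the helper name and the sample frames below are illustrative, not captured from this API:

```python
import json

def collect_stream_text(sse_lines):
    """Concatenate content deltas from OpenAI-style SSE lines.

    Each event looks like 'data: {json}'; the stream ends with
    the sentinel 'data: [DONE]'.
    """
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if delta.get("content"):
            parts.append(delta["content"])
    return "".join(parts)

# Illustrative frames (not real API output):
frames = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print(collect_stream_text(frames))  # → Hello
```

In a real client you would feed this the decoded lines of the HTTP response body instead of a hard-coded list.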
Parameters

Authorization (header, required): Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
model: HuggingFace model id, e.g. "Qwen/Qwen2.5-7B-Instruct". See /v1/models.
max_tokens: Maximum tokens to generate. Default: 256.
temperature: 0 <= x <= 2. Default: 0.7.
top_p: 0 <= x <= 1.
top_k: vLLM-specific. Top-k sampling.
seed: Deterministic seed for sampling.
frequency_penalty: -2 <= x <= 2.
presence_penalty: -2 <= x <= 2.
repetition_penalty: vLLM-specific. Penalty for repeated tokens.
tool_choice: One of none, auto, required.
response_format: Optional response constraints, e.g. {"type": "json_object"} for JSON mode, or {"type": "json_schema", "json_schema": {...}} for structured outputs.
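A sketch of a request body using the structured-outputs form of response_format described above, assuming the endpoint accepts OpenAI-style json_schema payloads; the schema name and fields here are hypothetical, and whether a given model honors the constraint is model-dependent:

```python
import json

# Hypothetical extraction schema; "city_extraction" and "city" are illustrative.
body = {
    "model": "Qwen/Qwen2.5-7B-Instruct",
    "messages": [
        {"role": "system", "content": "Reply only with JSON."},
        {"role": "user", "content": "Extract the city from: 'I live in Oslo.'"},
    ],
    "max_tokens": 256,
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "city_extraction",
            "schema": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    },
}

# This is what would be sent as the --data payload of the curl example.
print(json.dumps(body, indent=2))
```

For plain JSON mode, replace the response_format value with {"type": "json_object"} and keep the rest of the body unchanged.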