POST /v1/chat/completions

Create a chat completion.
curl --request POST \
  --url https://api.tera.gw/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "Qwen/Qwen2.5-7B-Instruct",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Hello!"
    }
  ],
  "max_tokens": 256,
  "temperature": 0.7,
  "top_p": 0.5,
  "top_k": 40,
  "seed": 123,
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "repetition_penalty": 1.1,
  "stream": false,
  "tool_choice": "none"
}
'

Example response:
{
  "id": "<string>",
  "object": "chat.completion",
  "created": 123,
  "model": "<string>",
  "choices": [
    {
      "index": 123,
      "message": {
        "role": "<string>",
        "content": "<string>",
        "reasoning_content": "<string>",
        "tool_calls": [
          {
            "id": "<string>",
            "type": "function",
            "function": {
              "name": "<string>",
              "arguments": "<string>"
            }
          }
        ]
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 123,
    "total_tokens": 123
  }
}
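The same request can be issued from Python using only the standard library. This is a minimal sketch: the URL, headers, and field names come from the reference below, and error handling is omitted.

```python
import json
import urllib.request

API_URL = "https://api.tera.gw/v1/chat/completions"

def build_chat_request(model, messages, **params):
    # Keep only the sampling parameters the caller actually set,
    # so server-side defaults apply to everything else.
    body = {"model": model, "messages": messages}
    body.update({k: v for k, v in params.items() if v is not None})
    return body

def post_chat(body, token):
    # POST the JSON body with a Bearer token and return the parsed reply.
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

body = build_chat_request(
    "Qwen/Qwen2.5-7B-Instruct",
    [{"role": "user", "content": "Hello!"}],
    max_tokens=256,
    temperature=0.7,
)
# post_chat(body, "<token>") would return a chat.completion object.
```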

Authorizations

Authorization (string, header, required)
  Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body (application/json)

model (string, required)
  HuggingFace model id. See /v1/models.
  Example: "Qwen/Qwen2.5-7B-Instruct"

messages (object[], required)

max_tokens (integer)
  Maximum tokens to generate.
  Example: 256

temperature (number, range 0 <= x <= 2)
  Example: 0.7

top_p (number, range 0 <= x <= 1)

top_k (integer)
  vLLM-specific. Top-k sampling.

stop (string)

seed (integer)
  Deterministic seed for sampling.

frequency_penalty (number, range -2 <= x <= 2)

presence_penalty (number, range -2 <= x <= 2)

repetition_penalty (number)
  vLLM-specific. Penalty for repeated tokens.

stream (boolean, default: false)
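With stream: true the endpoint returns Server-Sent Events instead of a single JSON object. A small client-side parsing sketch, assuming the common OpenAI-compatible framing of `data: <json-chunk>` lines terminated by `data: [DONE]` (the reference only states that an SSE stream is returned, so this framing is an assumption):

```python
import json

def iter_sse_chunks(lines):
    # Yield each JSON chunk from an OpenAI-style SSE stream:
    # every event is a "data: " line; "data: [DONE]" ends the stream.
    for line in lines:
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return
        yield json.loads(payload)

# Example with a canned stream of two delta chunks:
raw = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
text = "".join(
    chunk["choices"][0]["delta"].get("content", "")
    for chunk in iter_sse_chunks(raw)
)
```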
tools (object[])

tool_choice
  Available options: none, auto, required
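A request body that registers one callable tool might look like the sketch below. The nested function object (name, description, parameters) follows the usual OpenAI-compatible shape and is an assumption; the reference above only shows "type": "function" for a tool entry, and get_weather is a hypothetical name.

```python
# Sketch of a function-calling request (OpenAI-compatible tool shape assumed).
body = {
    "model": "Qwen/Qwen2.5-7B-Instruct",
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool name
                "description": "Look up current weather for a city.",
                "parameters": {  # JSON Schema for the tool's arguments
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "tool_choice": "auto",  # let the model decide whether to call the tool
}
```

If the model decides to call the tool, the response message carries a tool_calls array whose function.arguments is a JSON-encoded string, matching the response schema shown above.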
response_format (object)
  Optional response constraints, e.g. {"type": "json_object"} for JSON mode, or {"type": "json_schema", "json_schema": {...}} for structured outputs.
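Both forms as request fragments. The outer shapes come from the description above; the inner name/schema keys under json_schema follow OpenAI-style structured outputs and are an assumption, since the reference only shows {"type": "json_schema", "json_schema": {...}}.

```python
# JSON mode: constrain the completion to be valid JSON.
json_mode = {"type": "json_object"}

# Structured outputs: supply a JSON Schema the completion must match.
# (name/schema keys are an assumed OpenAI-style convention.)
schema_mode = {
    "type": "json_schema",
    "json_schema": {
        "name": "person",  # hypothetical schema name
        "schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "age": {"type": "integer"},
            },
            "required": ["name", "age"],
        },
    },
}

body = {
    "model": "Qwen/Qwen2.5-7B-Instruct",
    "messages": [{"role": "user", "content": "Describe a person as JSON."}],
    "response_format": json_mode,
}
```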

Response

A chat completion object (or an SSE stream when stream: true).

id (string)

object (string)
  Example: "chat.completion"

created (integer)

model (string)

choices (object[])

usage (object)
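Reading the reply text and token accounting out of a completion shaped like this schema; the concrete values below are canned for illustration.

```python
# A canned response matching the chat.completion schema above.
response = {
    "id": "cmpl-1",
    "object": "chat.completion",
    "created": 0,
    "model": "Qwen/Qwen2.5-7B-Instruct",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello!"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 9, "completion_tokens": 3, "total_tokens": 12},
}

# The reply lives on the first choice's message.
reply = response["choices"][0]["message"]["content"]

# usage totals prompt and completion tokens.
usage = response["usage"]
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
```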