Chat completions

curl --request POST \
  --url https://api.tera.gw/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "Qwen/Qwen2.5-7B-Instruct",
  "messages": [
    {
      "role": "system",
      "content": "<string>",
      "name": "<string>",
      "tool_call_id": "<string>",
      "tool_calls": [
        {
          "id": "<string>",
          "type": "function",
          "function": {
            "name": "<string>",
            "arguments": "<string>"
          }
        }
      ]
    }
  ],
  "max_tokens": 256,
  "temperature": 0.7,
  "top_p": 0.5,
  "top_k": 123,
  "stop": "<string>",
  "seed": 123,
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "repetition_penalty": 123,
  "stream": false,
  "tools": [
    {
      "type": "function"
    }
  ],
  "tool_choice": "none",
  "response_format": {}
}
'

{
  "id": "<string>",
  "object": "chat.completion",
  "created": 123,
  "model": "<string>",
  "choices": [
    {
      "index": 123,
      "message": {
        "role": "<string>",
        "content": "<string>",
        "reasoning_content": "<string>",
        "tool_calls": [
          {
            "id": "<string>",
            "type": "function",
            "function": {
              "name": "<string>",
              "arguments": "<string>"
            }
          }
        ]
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 123,
    "total_tokens": 123
  }
}

Authorizations

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
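As a rough illustration of the header described above, the following Python sketch builds (but does not send) an authenticated request using only the standard library. The helper name `build_chat_request`, the token placeholder, and the `user` role in the sample message are assumptions made for this example; only the URL, the header names, and the model ID come from this page.

```python
import json
import urllib.request

# Endpoint taken from the curl example on this page.
API_URL = "https://api.tera.gw/v1/chat/completions"


def build_chat_request(token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request with Bearer authentication.

    Hypothetical helper for illustration; it is not part of any SDK.
    """
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request(
    "YOUR_TOKEN",  # placeholder, not a real credential
    {
        "model": "Qwen/Qwen2.5-7B-Instruct",
        # The "user" role is assumed here; the example on this page only shows "system".
        "messages": [{"role": "user", "content": "Hello"}],
    },
)
```

Passing `request` to `urllib.request.urlopen` would send it; that step is left out so the sketch can run without network access or a valid token.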

Body

application/json

model

string

required

Hugging Face model ID. See /v1/models.

Example:

"Qwen/Qwen2.5-7B-Instruct"

messages

object[]

required

max_tokens

integer

Maximum tokens to generate.

Example:

256

temperature

number

Required range: 0 <= x <= 2

Example:

0.7

top_p

number

Required range: 0 <= x <= 1

top_k

integer

vLLM-specific. Top-k sampling.

stop

seed

integer

Deterministic seed for sampling.

frequency_penalty

number

Required range: -2 <= x <= 2

presence_penalty

number

Required range: -2 <= x <= 2

repetition_penalty

number

vLLM-specific. Penalty for repeated tokens.
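The documented ranges for the sampling parameters can be checked on the client before a request is sent. The sketch below is one possible way to do that; the function name `check_sampling_params` is hypothetical, and whether the server rejects or clamps out-of-range values is not stated on this page.

```python
# Ranges copied from the parameter list above. Parameters without a documented
# range (top_k, repetition_penalty, seed, max_tokens) are intentionally not checked.
DOCUMENTED_RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
}


def check_sampling_params(params: dict) -> list[str]:
    """Return a list of problems; an empty list means every documented range is respected."""
    problems = []
    for name, (low, high) in DOCUMENTED_RANGES.items():
        if name in params and not (low <= params[name] <= high):
            problems.append(f"{name}={params[name]} is outside [{low}, {high}]")
    return problems


ok = check_sampling_params({"temperature": 0.7, "top_p": 0.5})
bad = check_sampling_params({"temperature": 2.5, "top_p": 1.5})
```

Here `ok` is empty, while `bad` contains one entry each for `temperature` and `top_p`.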

stream

boolean

default: false

tools

object[]

tool_choice

Available options: none, auto, required
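A minimal sketch of attaching tools to a request body follows. This page only shows `{"type": "function"}` for a tool entry, so the nested `function` object (`name`, `description`, `parameters`) is an assumption borrowed from the common OpenAI-compatible layout, and `get_weather` is a made-up tool. The helper names are hypothetical as well.

```python
# Options listed for tool_choice on this page.
TOOL_CHOICES = ("none", "auto", "required")


def make_function_tool(name: str, description: str, parameters: dict) -> dict:
    """Build one tools[] entry.

    Assumption: the "function" sub-object follows the usual OpenAI-compatible
    shape; this page itself only documents {"type": "function"}.
    """
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": parameters,
        },
    }


def attach_tools(payload: dict, tools: list, tool_choice: str = "auto") -> dict:
    """Return a copy of the payload with tools and a validated tool_choice."""
    if tool_choice not in TOOL_CHOICES:
        raise ValueError(f"tool_choice must be one of {TOOL_CHOICES}, got {tool_choice!r}")
    return {**payload, "tools": tools, "tool_choice": tool_choice}


# Hypothetical tool used only for illustration.
weather_tool = make_function_tool(
    "get_weather",
    "Look up the current weather for a city.",
    {"type": "object", "properties": {"city": {"type": "string"}}, "required": ["city"]},
)
payload = attach_tools(
    {"model": "Qwen/Qwen2.5-7B-Instruct", "messages": []},
    [weather_tool],
    tool_choice="required",
)
```

Validating `tool_choice` locally catches typos early, since only the three string options above are documented.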

response_format

object

Optional response constraints — e.g. {"type": "json_object"} for JSON mode, or {"type": "json_schema", "json_schema": {...}} for structured outputs.
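The two documented shapes of `response_format` can be produced with small helpers such as the ones sketched here. The helper names are hypothetical. The contents of `json_schema` are elided on this page, so the helper passes the object through unchanged, and the `name` and `schema` keys in the usage example are an assumption rather than documented fields.

```python
def json_mode() -> dict:
    """response_format for JSON mode, as shown in the description above."""
    return {"type": "json_object"}


def structured_output(json_schema: dict) -> dict:
    """response_format for structured outputs.

    The json_schema object is passed through as-is because this page does
    not document its inner fields.
    """
    return {"type": "json_schema", "json_schema": json_schema}


# Assumed inner layout ("name" plus "schema"), shown only as an illustration.
city_format = structured_output(
    {
        "name": "city_info",
        "schema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }
)
```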

Response

A chat completion (or an SSE stream when stream: true).
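When `stream` is true, the response arrives as server-sent events. The sketch below parses `data:` lines from such a stream. The chunk layout (`choices[0].delta.content`) and the `[DONE]` end marker are assumptions based on the usual OpenAI-compatible streaming format; this page does not document them. The sample lines are invented for the example.

```python
import json
from typing import Iterable, Iterator


def iter_sse_json(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each SSE "data:" line."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed end-of-stream sentinel
            break
        yield json.loads(data)


# Invented sample stream; real chunks may carry more fields.
sample_stream = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]

text = "".join(
    chunk["choices"][0]["delta"].get("content", "")
    for chunk in iter_sse_json(sample_stream)
)
```

In a real client, `lines` would be the decoded lines of the HTTP response body read incrementally.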

id

string

object

string

Example:

"chat.completion"

created

integer

model

string

choices

object[]

usage

object

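To show how the response fields fit together, here is a small sketch that pulls the commonly needed values out of a non-streaming response. The function name `summarize_completion` is hypothetical, and decoding `arguments` with `json.loads` assumes the string holds JSON, which this page does not state explicitly. The sample response is invented and follows the field layout of the example response on this page.

```python
import json


def summarize_completion(response: dict) -> dict:
    """Extract content, finish reason, tool calls, and token usage from a response."""
    choice = response["choices"][0]
    message = choice["message"]
    tool_calls = [
        {
            "name": call["function"]["name"],
            # Assumption: "arguments" is a JSON-encoded string.
            "arguments": json.loads(call["function"]["arguments"]),
        }
        for call in (message.get("tool_calls") or [])
    ]
    return {
        "content": message.get("content"),
        "finish_reason": choice.get("finish_reason"),
        "tool_calls": tool_calls,
        "total_tokens": response.get("usage", {}).get("total_tokens"),
    }


# Invented sample data shaped like the example response.
sample_response = {
    "id": "example-id",
    "object": "chat.completion",
    "created": 0,
    "model": "Qwen/Qwen2.5-7B-Instruct",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hi there."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 5, "completion_tokens": 3, "total_tokens": 8},
}

summary = summarize_completion(sample_response)
```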
