reasoning_content field so they don’t pollute the visible response.
Thinking models in the catalog today (partial list):
Response shape
Non-streaming responses get areasoning_content sibling of content:
reasoning_content is the model’s internal trace; content is the user-facing answer.
Streaming
Whenstream: true, reasoning deltas arrive first, then content deltas:
Should you show reasoning to end users?
Up to you. Common patterns:- Hide entirely — drop
reasoning_content, display onlycontent. - Show collapsible — UI affordance like “Show reasoning” that reveals the trace.
- Use for logging only — keep traces server-side for debugging and feedback loops.
max_tokens to bound total generation length.
Why a separate field?
OpenAI clients expectcontent to be the user-facing answer. Mixing reasoning markers (the literal think tags emitted by the model) into content breaks downstream parsers. By separating the two, OpenAI SDKs work without modification and reasoning becomes an opt-in feature on the client side.