Responses
OpenAI-Compatible Endpoints
Responses
Create a response using OpenAI Responses API format
POST
Responses
Overview
The/v1/responses endpoint provides OpenAI Responses API compatibility. It accepts structured input with instructions and forwards requests to upstream with proper validation, sanitization, and error handling.
This endpoint supports both streaming and non-streaming modes, handles conversation context, and provides full tool calling capabilities.
Authentication
Bearer token for API authentication. Format:
Bearer YOUR_API_KEYRequest Body
ID of the model to use. Must be a valid model slug from the
/v1/models endpoint.Example: "gpt-4.1", "gpt-5.2"User input to the model.Can be:
- String: Plain text input (normalized to single
input_textitem) - Array: Structured input items with role-based messages
role(string):"user","assistant", or"tool"content(string | array): Message contenttype(string): Item type (e.g.,"input_text","input_image","function_call_output")
input_file.file_id is not supported and will return error.System-level instructions for the model. Equivalent to system/developer messages in Chat Completions.
Alternative to
input. Array of chat-formatted messages.Cannot be used together with input. Provide either input or messages, not both.Messages are coerced into instructions (for system/developer) and input items (for user/assistant/tool).Array of tool definitions available to the model.Each tool object:
type(string): Tool typename(string): Tool name (for function tools)description(string): Tool descriptionparameters(object): JSON Schema for parameters
function: Custom function callsweb_searchorweb_search_preview: Web search capability
file_search,code_interpreter,computer_use,computer_use_preview,image_generation
Controls which tool the model should use.Options:
"none": Model will not call tools"auto": Model decides whether to call tools"required": Model must call at least one tool- Object:
{"type": "...", "name": "..."}
Whether to enable parallel tool calling.
Reasoning controls for the model.Properties:
effort(string): Reasoning effort level (e.g.,"low","medium","high")summary(string): Reasoning summary mode
Text output controls.Properties:
verbosity(string): Output verbosity levelformat(object): Output format specificationtype(string):"text","json_object", or"json_schema"schema(object): JSON Schema (forjson_schematype)name(string): Schema namestrict(boolean): Strict schema adherence
Whether to stream the response as server-sent events.
true: Returnstext/event-streamwith Responses eventsfalse: Returns a single response object
Additional data to include in the response.Allowed values:
"code_interpreter_call.outputs""computer_call_output.output.image_url""file_search_call.results""message.input_image.image_url""message.output_text.logprobs""reasoning.encrypted_content""web_search_call.action.sources"
400 error.Conversation ID for multi-turn context.Cannot be used with
previous_response_id.Not supported. Returns
400 error.Use conversation instead for multi-turn context.Must be
false or omitted. Setting to true returns 400 error.Not supported. Returns
400 error if provided.Cache key for prompt caching optimization.
Response (Non-Streaming)
Whenstream is false or omitted, returns a response object:
Unique identifier for the response.
Always
"response".Response status:
"completed": Successfully completed"incomplete": Incomplete (e.g., max tokens reached)"failed": Failed with error
Array of output items generated by the model.Each output item has:
type(string): Output type (e.g.,"message","function_call","web_search_call")- Additional fields based on type
Token usage information.Properties:
input_tokens(integer): Tokens in the inputoutput_tokens(integer): Tokens in the outputtotal_tokens(integer): Total tokens usedinput_tokens_details(object | null):cached_tokens(integer): Cached input tokens
output_tokens_details(object | null):reasoning_tokens(integer): Tokens used for reasoning
Error information (only present when
status is "failed").Properties:message(string): Error messagetype(string): Error typecode(string): Error code
Response (Streaming)
Whenstream is true, returns text/event-stream with event objects:
Event Types
Emitted when response is created.Contains
response object with id and initial metadata.Emitted during response generation.May include partial
response data.Emitted for text output deltas.Properties:
delta(string): Text fragment
Emitted for refusal text deltas.Properties:
delta(string): Refusal fragment
Emitted for tool call deltas.Properties:
call_id(string): Tool call IDname(string): Tool namearguments(string): Arguments fragment
Emitted when response completes successfully.Contains full
response object with output and usage.Emitted when response is incomplete.Contains
response with incomplete_details:reason(string): Why incomplete (e.g.,"max_output_tokens","content_filter")
Emitted when response fails.Contains
response with error object.Emitted for immediate errors.Contains
error object with error details.Examples
Basic Text Response
Streaming Response
With Instructions
Structured Input with Messages
Tool Calling
Web Search
Reasoning with Summary
JSON Output Format
Conversation Context
Input Sanitization
The service automatically sanitizes input before forwarding to upstream:Interleaved Reasoning Removal
Unsupported interleaved reasoning fields are stripped from input:reasoning_contentreasoning_detailstool_calls(in input context)function_call(in input context)
reasoning controls are preserved.
Content Type Normalization
- Assistant text content is rewritten to use
output_texttype - Tool messages are converted to
function_call_outputformat withcall_id
Unsupported Field Removal
Before upstream forwarding, these fields are stripped:safety_identifierprompt_cache_retentiontemperaturemax_output_tokens
Error Handling
All errors return OpenAI-compatible error envelopes:invalid_request_error: Invalid request parametersmodel_not_allowed: API key lacks access to requested modelno_accounts: No upstream accounts available (503 status)upstream_error: Upstream service error (502 status)not_implemented: Feature not implemented (501 status)
response.failed or error events.
Validation Rules
Input Validation
- Either
inputormessagesrequired (not both) inputmust be string or arrayinput_file.file_idis rejected
Tool Validation
web_search_previewnormalized toweb_search- Unsupported tool types rejected:
file_search,code_interpreter,computer_use,image_generation
Conversation Validation
- Cannot provide both
conversationandprevious_response_id previous_response_idis not supported
Store Validation
storemust befalseor omitted- Setting to
truereturns error
Include Validation
- Only allowlisted include values accepted
- Unknown values return error
Truncation Validation
truncationis not supported- Any value returns error
Model Restrictions
If your API key hasallowed_models configured, only those models can be used. Requests for other models return:
/v1/models.