API Reference

Chat Completions

Description

Interact with the LLM model using a chat completions interface. The LLM model will respond with an assistant message.

{
  "model": "gpt-4o",
  "messages": [
    { "role": "system", "content": "You are a name generator. Do not generate anything other than names." },
    { "role": "user", "content": "Generate 5 names" }
  ]
}
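A minimal sketch of sending the request above from Python, using only the standard library. The base URL and API key below are placeholders, not values stated on this page; substitute your deployment's actual endpoint and credentials.

```python
import json
import urllib.request

# Hypothetical endpoint and key -- this page does not state a base URL.
BASE_URL = "https://api.example.com/v1"
API_KEY = "sk-..."

payload = {
    "model": "gpt-4o",
    "messages": [
        {"role": "system",
         "content": "You are a name generator. Do not generate anything other than names."},
        {"role": "user", "content": "Generate 5 names"},
    ],
}

req = urllib.request.Request(
    BASE_URL + "/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    },
)

# The assistant message arrives at choices[0].message.content:
# body = json.loads(urllib.request.urlopen(req).read())
# print(body["choices"][0]["message"]["content"])
```

The actual call is left commented out since it requires live credentials.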
Body Params
top_k
integer

Only sample from the top K options for each subsequent token.

frequency_penalty
number
-2 to 2

Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far.

function_call
object

Deprecated in favor of tool_choice. Controls which function is called by the model.

functions
array of objects

Deprecated in favor of tools. A list of functions the model may generate JSON inputs for.

logit_bias
object

Modify the likelihood of specified tokens appearing in the completion. Maps tokens to bias values from -100 to 100.
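A sketch of the shape logit_bias takes. The keys are tokenizer-specific token IDs (as strings), so the IDs below are purely illustrative, not real token IDs for any model.

```python
# logit_bias maps token IDs (tokenizer-specific, given as strings) to a
# bias from -100 (effectively ban the token) to 100 (effectively force it).
# Both IDs below are hypothetical examples.
logit_bias = {
    "50256": -100,  # hypothetical ID: never emit this token
    "1234": 25,     # hypothetical ID: make this token more likely
}

payload_fragment = {"logit_bias": logit_bias}
```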

logprobs
boolean

Whether to return log probabilities of the output tokens or not.

max_completion_tokens
integer

An upper bound for the number of tokens that can be generated, including visible output tokens and reasoning tokens.

max_tokens
integer

Deprecated in favor of max_completion_tokens. The maximum number of tokens to generate.

metadata
object

Developer-defined tags and values used for filtering completions in the dashboard.

modalities
array of strings

Output types that you would like the model to generate for this request.

n
integer

How many chat completion choices to generate for each input message.

parallel_tool_calls
boolean

Whether to enable parallel function calling during tool use.

prediction
object

Static predicted output content, such as the content of a text file being regenerated.

presence_penalty
number
-2 to 2

Number between -2.0 and 2.0. Positive values penalize tokens based on whether they appear in the text so far.

reasoning_effort
string

For o1 models only. Constrains effort on reasoning. Supported values: low, medium, high.

response_format
object

An object specifying the format that the model must output.
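A sketch of requesting JSON output via response_format, assuming the OpenAI-style "json_object" type; when using JSON mode, also instruct the model in a message to produce JSON.

```python
# Requesting JSON mode. The {"type": "json_object"} shape follows the
# OpenAI convention; check your deployment's supported formats.
payload = {
    "model": "gpt-4o",
    "messages": [
        {"role": "system",
         "content": "Reply with a JSON object mapping each name to its length."},
        {"role": "user", "content": "Generate 5 names"},
    ],
    "response_format": {"type": "json_object"},
}
```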

seed
integer

If specified, the system will attempt to sample deterministically, so that repeated requests with the same seed return the same result.

stop
string or array

Up to 4 sequences where the API will stop generating further tokens.

store
boolean

Whether to store the output for use in model distillation or evals products.

stream
boolean

If true, partial message deltas will be sent as server-sent events.

stream_options
object

Options for streaming response. Only set this when stream is true.
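When stream is true, chunks arrive as server-sent events: lines of the form `data: {...}` ending with `data: [DONE]`. A sketch of assembling the content deltas, with hand-written chunks in the shape streamed chat completions use:

```python
import json

def iter_deltas(sse_lines):
    """Yield content deltas from server-sent-event lines of a streamed
    chat completion. Each event line is 'data: {...json chunk...}';
    the stream terminates with 'data: [DONE]'."""
    for line in sse_lines:
        if not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]

# Illustrative chunks (the first carries only the role, no content):
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Al"}}]}',
    'data: {"choices": [{"delta": {"content": "ice"}}]}',
    'data: [DONE]',
]
print("".join(iter_deltas(sample)))  # -> Alice
```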

temperature
number
0 to 2

What sampling temperature to use, between 0 and 2. Higher values make the output more random; lower values make it more focused.

tool_choice
string or object

Controls which tool is called by the model. Values: none, auto, required, or a specific tool.

tools
array of objects

A list of tools the model may call. Currently, only functions are supported. Max 128 functions.

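A sketch of a single function entry in the tools array, following the OpenAI-style shape with a JSON Schema for parameters. The function name and schema below are illustrative, not part of this API reference.

```python
# One function tool in the OpenAI-style "tools" shape.
# get_weather and its parameters are hypothetical examples.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {  # JSON Schema for the function's arguments
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                },
                "required": ["city"],
            },
        },
    }
]

payload_fragment = {
    "tools": tools,
    "tool_choice": "auto",  # let the model decide whether to call a tool
}
```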
top_logprobs
integer
0 to 20

Number of most likely tokens to return at each position, each with an associated log probability.

top_p
number
0 to 1

Alternative to temperature. Only tokens comprising the top_p probability mass are considered.

audio
object

Parameters for audio output. Required when audio output is requested with modalities: ['audio'].

model
string
required

The model, specified as model_vendor/model, for example openai/gpt-4o.

messages
array of objects
required

A list of messages comprising the conversation so far, in the OpenAI standard message format.
