POST https://api.egp.scale.com/v4/models//chat-completions
Description
Send a chat request to the LLM identified by the specified model_deployment_id. The request can include the conversation history as a list of messages, each with one of the roles user, assistant, or system. If the chosen model does not support chat completion, the API falls back to simple completion and ignores the provided history. The endpoint handles context-length overflow optimistically: it first estimates the token count of the provided history and prompt. If that estimate exceeds the context length or approaches 80% of it, the endpoint calculates the exact token count and trims the history to fit the context.
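The optimistic context handling described above can be sketched roughly as follows. This is an illustration only, not the service's implementation: the names `rough_token_estimate`, `exact_token_count`, and `fit_history` are hypothetical, the four-characters-per-token heuristic and the whitespace word count are stand-ins for a real estimator and tokenizer, and dropping the oldest non-system message first is an assumed trimming policy that the description does not specify.

```python
from typing import Dict, List

Message = Dict[str, str]


def rough_token_estimate(text: str) -> int:
    # Cheap heuristic (assumption): about one token per four characters.
    return max(1, len(text) // 4)


def exact_token_count(text: str) -> int:
    # Stand-in for a real tokenizer (assumption): whitespace-separated words.
    return len(text.split())


def fit_history(
    prompt: str,
    chat_history: List[Message],
    context_limit: int,
    threshold: float = 0.8,
) -> List[Message]:
    """Return a copy of chat_history that fits within context_limit tokens."""
    texts = [m["content"] for m in chat_history] + [prompt]
    estimated = sum(rough_token_estimate(t) for t in texts)

    # Optimistic path: the estimate is well below the limit, so skip
    # the more expensive exact count and keep the history unchanged.
    if estimated < threshold * context_limit:
        return list(chat_history)

    trimmed = list(chat_history)

    def total_tokens() -> int:
        return exact_token_count(prompt) + sum(
            exact_token_count(m["content"]) for m in trimmed
        )

    # Near or over the limit: count exactly and trim until it fits.
    while trimmed and total_tokens() > context_limit:
        # Assumed policy: drop the oldest message that is not a system message.
        index = next(
            (i for i, m in enumerate(trimmed) if m["role"] != "system"), None
        )
        if index is None:
            break  # only system messages remain; nothing more to drop
        trimmed.pop(index)
    return trimmed
```

With a generous `context_limit` the history comes back unchanged. With a tight one, the oldest user and assistant turns are dropped first while the system message is kept.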
{
"prompt": "Generate 5 more",
"chat_history": [
{ "role": "system", "content": "You are a name generator. Do not generate anything else than names" },
{ "role": "user", "content": "Generate 5 names" },
{ "role": "assistant", "content": "1. Olivia Bennett\n2. Ethan Carter\n3. Sophia Ramirez\n4. Liam Thompson\n5. Ava Mitchell" }
]
}
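A minimal sketch of building this request in Python with the standard library is shown here. Several details are assumptions rather than facts from this page: the function name `build_chat_completion_request` is hypothetical, the deployment ID is assumed to fill the empty path segment in the endpoint URL, and the `x-api-key` authentication header is assumed, so check both against your account's API documentation.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict, List

BASE_URL = "https://api.egp.scale.com/v4/models"


def build_chat_completion_request(
    model_deployment_id: str,
    api_key: str,
    prompt: str,
    chat_history: List[Dict[str, str]],
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the chat-completions endpoint."""
    # Assumption: the deployment ID goes between /models/ and /chat-completions.
    deployment = urllib.parse.quote(model_deployment_id, safe="")
    url = f"{BASE_URL}/{deployment}/chat-completions"

    # json.dumps always emits valid JSON (no trailing commas).
    body = json.dumps({"prompt": prompt, "chat_history": chat_history})

    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "x-api-key": api_key,  # assumed header name for authentication
        },
    )


request = build_chat_completion_request(
    model_deployment_id="my-deployment-id",  # placeholder value
    api_key="YOUR_API_KEY",  # placeholder value
    prompt="Generate 5 more",
    chat_history=[
        {"role": "system", "content": "You are a name generator. Do not generate anything else than names"},
        {"role": "user", "content": "Generate 5 names"},
    ],
)
```

The request can then be sent with `urllib.request.urlopen(request)`. The response format is not described in this section, so the sketch stops before parsing it.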