Description
Model Deployments are unique endpoints created for custom models in the Scale GenAI Platform. They enable users to interact with and utilize specific instances of models through the API/SDK.
Each deployment is associated with a model instance that contains the necessary model template and model metadata. Model templates describe the creation parameters that are configured on the deployment.
Model deployments provide a way to call models for inference, log calls, and monitor usage.
Built-in models also have deployments, so that all models share a consistent interface. These do not represent real deployments; they are only a way to interact with the built-in models. They are created automatically when the model is created, and they are immutable.
Endpoint details
This endpoint is used to deploy a model instance. The request payload schema depends on the model_request_parameters_schema of the Model Template from which the model was created.
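Because the accepted payload varies with the Model Template, it can help to check request parameters against the template's schema before sending them. The following is a minimal sketch of that idea only. The schema layout (a "parameters" list with "name", "type", and "required" keys), the EXAMPLE_SCHEMA contents, and the validate_request_parameters helper are all assumptions made for illustration; they are not the platform's actual schema format or SDK.

```python
from typing import Any, Dict, List

# Hypothetical schema, standing in for a template's
# model_request_parameters_schema. The real layout may differ.
EXAMPLE_SCHEMA: Dict[str, Any] = {
    "parameters": [
        {"name": "prompt", "type": "string", "required": True},
        {"name": "temperature", "type": "number", "required": False},
    ]
}

# Maps the assumed schema type names to Python types.
_TYPE_MAP = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
}


def validate_request_parameters(
    schema: Dict[str, Any], params: Dict[str, Any]
) -> List[str]:
    """Return a list of problems found in params (empty if none)."""
    errors: List[str] = []
    declared = {p["name"]: p for p in schema["parameters"]}

    # Every required parameter must be present.
    for name, spec in declared.items():
        if spec.get("required") and name not in params:
            errors.append(f"missing required parameter: {name}")

    # Every supplied parameter must be declared and of the right type.
    for name, value in params.items():
        spec = declared.get(name)
        if spec is None:
            errors.append(f"unknown parameter: {name}")
            continue
        expected = _TYPE_MAP.get(spec["type"])
        if expected is None:
            continue  # Unrecognised type name: skip the type check.
        # bool is a subclass of int, so reject it for non-boolean types.
        is_bool_mismatch = isinstance(value, bool) and spec["type"] != "boolean"
        if is_bool_mismatch or not isinstance(value, expected):
            errors.append(f"parameter {name} should be of type {spec['type']}")
    return errors


if __name__ == "__main__":
    print(validate_request_parameters(EXAMPLE_SCHEMA, {"temperature": "hot"}))
```

Returning a list of problems rather than raising on the first one lets a caller report every mismatch at once before retrying the request.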