Description
Creates and hosts a model based on a model template.
Base embedding models, chunk ranking functions, and LLMs are often not sufficient for customer use cases. The following blog posts show that fine-tuning these models on customer data can significantly improve performance:
- We Fine-Tuned GPT-4 to Beat the Industry Standard for Text2SQL
- OpenAI Names Scale as Preferred Partner to Fine-Tune GPT-3.5
- How to Fine-Tune GPT-3.5 Turbo With OpenAI API
Details
Before creating a model, you must first create a model template. A model template serves two purposes. First, it provides common scaffolding that is static across multiple models. Second, it exposes several variables that can be injected at model creation time to customize the model.
For example, a model template can define a Docker image that contains code to run a HuggingFace or SentenceTransformers model. The code in this image can also read environment variables, which can be set to swap out the model weights or model name. Refer to the Create Model Template API for more details.
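As a rough illustration of the environment-variable pattern described above, the code inside such an image might resolve its model settings as follows. This is a minimal sketch, not the actual template code: the variable names MODEL_NAME and MODEL_WEIGHTS_URI, the default model name, and the function resolve_model_config are all assumptions made for this example.

```python
import os

# Assumed default; a real template image would choose its own.
DEFAULT_MODEL_NAME = "sentence-transformers/all-MiniLM-L6-v2"


def resolve_model_config(env=None):
    """Read model settings from environment variables, falling back to defaults.

    MODEL_NAME and MODEL_WEIGHTS_URI are hypothetical variable names.
    """
    # Accept an explicit mapping so the function is easy to test.
    env = os.environ if env is None else env
    return {
        "model_name": env.get("MODEL_NAME", DEFAULT_MODEL_NAME),
        # None means "use the stock weights for model_name".
        "weights_uri": env.get("MODEL_WEIGHTS_URI"),
    }
```

Keeping the lookup in one function means the same image can serve a base model or a fine-tuned one depending only on the values injected at model creation time.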
To create a new model, users must refer to an existing model template and provide the parameters that the model template requires in its model_creation_parameters_schema field. The combination of the model template and the model creation parameters will be used to create and deploy a new model.
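The sketch below shows one way a client could check its parameters against a template before sending a create request. Only the field name model_creation_parameters_schema comes from this page. The shape of the schema (a "parameters" list with "name" and "required" keys), the request field names, and the helper build_create_model_request are assumptions for illustration, so consult the Create Model Template API for the real format.

```python
def build_create_model_request(template, name, parameters):
    """Validate parameters against the template schema and build a request body.

    The schema layout and the returned field names are hypothetical.
    """
    schema = template["model_creation_parameters_schema"]
    declared = {p["name"]: p for p in schema["parameters"]}

    # Required parameters the caller did not supply.
    missing = sorted(
        n for n, p in declared.items() if p.get("required") and n not in parameters
    )
    # Parameters the template does not declare.
    unknown = sorted(n for n in parameters if n not in declared)
    if missing or unknown:
        raise ValueError(f"missing={missing}, unknown={unknown}")

    return {
        "name": name,
        "model_template_id": template["id"],
        "model_creation_parameters": dict(parameters),
    }


# Hypothetical template record used only for this example.
example_template = {
    "id": "template-123",
    "model_creation_parameters_schema": {
        "parameters": [
            {"name": "model_name", "required": True},
            {"name": "weights_uri", "required": False},
        ]
    },
}

request_body = build_create_model_request(
    example_template, "my-embedding-model", {"model_name": "my-org/my-model"}
)
```

Validating on the client side is optional, but it surfaces a missing or misspelled parameter before a deployment is attempted.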
Once a model has been created, it can be executed by calling the Execute Model API.
Coming Soon
Some of our EGP APIs depend on models: Knowledge Base APIs depend on embedding models, Chunk Ranking APIs depend on ranking models, and Completion APIs depend on LLMs.
In the near future, if a model is created from a model template that is compatible with one of these APIs (based on the model template's model_type field), the model will automatically be registered with the API. This will allow users to immediately start using the model with those APIs without any additional setup.
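Conceptually, the planned registration amounts to a lookup from the template's model_type to the API that consumes that kind of model. The sketch below is illustrative only: the model_type values "embedding", "reranking", and "completion" are guesses, and compatible_api is a hypothetical helper rather than part of any EGP client.

```python
# Hypothetical model_type values mapped to the API families named above.
API_BY_MODEL_TYPE = {
    "embedding": "Knowledge Base",
    "reranking": "Chunk Ranking",
    "completion": "Completion",
}


def compatible_api(model_template):
    """Return the API family a template's models could register with, or None."""
    return API_BY_MODEL_TYPE.get(model_template.get("model_type"))
```

A template whose model_type has no entry in the mapping simply returns None, which corresponds to a model that is created and hosted but not registered with any of these APIs.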