Model Deployments Documentation
Introduction
The Deployments module turns your machine learning models into production-ready services with configurable resources and scaling options. You can deploy a model from any of five sources and expose it as a scalable API endpoint:
- Model Hub: Preconfigured, ready-to-deploy models.
- Model Registry: Custom models registered within OICM.
- External Source: Models deployed directly from external repositories such as Hugging Face.
- Existing Volume: Models stored directly in your OICM data volumes.
- Custom Docker Image: Fully customized deployments using your own container images.
Key Concepts
Model Deployments
A model deployment is a running instance of a machine learning model that:
- Exposes the model as an API endpoint.
- Allocates specific computing resources.
- Manages scaling based on demand.
- Monitors performance metrics.
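To make these concepts concrete, the following minimal sketch models a deployment as a plain Python data structure: one of the five sources, a resource allocation, and scaling bounds. It is illustrative only and is not the OICM API. The class names (ModelSource, Resources, Scaling, DeploymentSpec), every field name, and the default values are assumptions chosen for this example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ModelSource(Enum):
    """The five deployment sources described in the introduction."""
    MODEL_HUB = "model_hub"
    MODEL_REGISTRY = "model_registry"
    EXTERNAL_SOURCE = "external_source"
    EXISTING_VOLUME = "existing_volume"
    CUSTOM_DOCKER_IMAGE = "custom_docker_image"


@dataclass
class Resources:
    # Hypothetical resource fields; the real options in OICM may differ.
    cpu_cores: float = 1.0
    memory_gb: float = 2.0
    gpus: int = 0


@dataclass
class Scaling:
    # Hypothetical replica bounds for demand-based scaling.
    min_replicas: int = 1
    max_replicas: int = 1


@dataclass
class DeploymentSpec:
    name: str
    source: ModelSource
    resources: Resources = field(default_factory=Resources)
    scaling: Scaling = field(default_factory=Scaling)

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the spec is consistent."""
        problems = []
        if not self.name:
            problems.append("name must not be empty")
        if self.resources.cpu_cores <= 0 or self.resources.memory_gb <= 0:
            problems.append("cpu_cores and memory_gb must be positive")
        if self.resources.gpus < 0:
            problems.append("gpus must not be negative")
        if self.scaling.min_replicas < 0:
            problems.append("min_replicas must not be negative")
        if self.scaling.max_replicas < max(1, self.scaling.min_replicas):
            problems.append("max_replicas must be at least 1 and at least min_replicas")
        return problems


# Example: a model pulled from an external repository, scaling between 1 and 4 replicas.
spec = DeploymentSpec(
    name="sentiment-api",
    source=ModelSource.EXTERNAL_SOURCE,
    scaling=Scaling(min_replicas=1, max_replicas=4),
)
print(spec.validate())
```

Keeping the source, resources, and scaling as separate fields mirrors the way the concepts are listed above, and a validation step like this one can catch inconsistent settings before anything is deployed.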