Model Deployments Documentation

Introduction

The Deployments module turns your machine learning models into production-ready services with configurable resources and scaling options. You can deploy a model from any of five sources, and each deployment exposes the model as a scalable API endpoint:

Model Hub: Preconfigured, ready-to-deploy models.
Model Registry: Custom models registered within OICM.
External Source: Models deployed directly from external repositories such as Hugging Face.
Existing Volume: Models stored directly in your OICM data volumes.
Custom Docker Image: Fully customized deployments using your own container images.
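As a rough illustration, the five sources listed above can be modeled as an enumeration. This is a minimal sketch, not OICM's API: the `ModelSource` class, its string values, and the `describe` helper are hypothetical names introduced only for this example.

```python
from enum import Enum


class ModelSource(Enum):
    """Hypothetical enumeration of the five deployment sources."""

    MODEL_HUB = "model_hub"
    MODEL_REGISTRY = "model_registry"
    EXTERNAL_SOURCE = "external_source"
    EXISTING_VOLUME = "existing_volume"
    CUSTOM_DOCKER_IMAGE = "custom_docker_image"


# Short descriptions taken from the list of sources in this document.
_DESCRIPTIONS = {
    ModelSource.MODEL_HUB: "Preconfigured, ready-to-deploy models",
    ModelSource.MODEL_REGISTRY: "Custom models registered within OICM",
    ModelSource.EXTERNAL_SOURCE: "Models from external repositories",
    ModelSource.EXISTING_VOLUME: "Models stored in OICM data volumes",
    ModelSource.CUSTOM_DOCKER_IMAGE: "Your own container image",
}


def describe(source: ModelSource) -> str:
    """Return the one-line description for a deployment source."""
    return _DESCRIPTIONS[source]


if __name__ == "__main__":
    for source in ModelSource:
        print(f"{source.value}: {describe(source)}")
```

Treating the source as a closed set of values makes it easy to validate a deployment request before any resources are allocated.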

Key Concepts

Model Deployments

A model deployment is a running instance of a machine learning model that:

Exposes the model as an API endpoint
Allocates specific computing resources
Manages scaling based on demand
Monitors performance metrics
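The properties above can be pictured as a small data structure. The sketch below is illustrative only: every class and field name (`Resources`, `ScalingPolicy`, `Deployment`, `target_rps_per_replica`, and so on) is an assumption, and the replica calculation is a generic request-rate rule, not necessarily how OICM scales deployments.

```python
import math
from dataclasses import dataclass


@dataclass
class Resources:
    """Hypothetical compute allocation for one replica."""

    cpu_cores: float
    memory_gib: float
    gpus: int = 0


@dataclass
class ScalingPolicy:
    """Hypothetical demand-based scaling rule."""

    min_replicas: int = 1
    max_replicas: int = 4
    target_rps_per_replica: float = 10.0

    def desired_replicas(self, observed_rps: float) -> int:
        # Replicas needed to keep each one at or below its target rate,
        # clamped to the configured bounds.
        needed = math.ceil(observed_rps / self.target_rps_per_replica)
        return max(self.min_replicas, min(self.max_replicas, needed))


@dataclass
class Deployment:
    """Hypothetical model deployment: endpoint, resources, scaling."""

    name: str
    endpoint_path: str
    resources: Resources
    scaling: ScalingPolicy


if __name__ == "__main__":
    deployment = Deployment(
        name="sentiment-classifier",          # illustrative name
        endpoint_path="/v1/predict",          # illustrative path
        resources=Resources(cpu_cores=2, memory_gib=8, gpus=1),
        scaling=ScalingPolicy(),
    )
    print(deployment.scaling.desired_replicas(observed_rps=25))
```

Keeping the scaling rule separate from the resource allocation reflects the split described above: resources are fixed per replica, while the replica count follows demand.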