Model Deployments Documentation

Introduction

The Deployments module turns your machine learning models into production-ready services with configurable resources and scaling options. You can deploy a model from any of five sources to create a scalable API endpoint:

  • Model Hub: Preconfigured, ready-to-deploy models.
  • Model Registry: Custom models registered within OICM.
  • External Source: Direct deployments from external repositories such as Hugging Face.
  • Existing Volume: Models stored directly in your OICM data volumes.
  • Custom Docker Image: Fully customized deployments using your own container images.

Key Concepts

Model Deployments

A model deployment is a running instance of a machine learning model that:

  • Exposes the model as an API endpoint
  • Allocates specific computing resources
  • Manages scaling based on demand
  • Monitors performance metrics
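
To make these concepts concrete, the sketch below models a deployment as a small data structure: a model source, a resource allocation, and replica bounds for scaling. It is an illustration only and not the OICM API. The names `ModelSource`, `DeploymentSpec`, and `desired_replicas`, all field names, and the demand-based scaling rule are assumptions introduced for this example.

```python
import math
from dataclasses import dataclass
from enum import Enum


class ModelSource(Enum):
    """The five deployment sources listed in the introduction."""
    MODEL_HUB = "model_hub"
    MODEL_REGISTRY = "model_registry"
    EXTERNAL_SOURCE = "external_source"
    EXISTING_VOLUME = "existing_volume"
    CUSTOM_DOCKER_IMAGE = "custom_docker_image"


@dataclass(frozen=True)
class DeploymentSpec:
    """Hypothetical deployment description; field names are illustrative."""
    name: str
    source: ModelSource
    cpu_cores: float       # computing resources allocated to each replica
    memory_gb: float
    min_replicas: int = 1  # lower scaling bound
    max_replicas: int = 1  # upper scaling bound

    def __post_init__(self) -> None:
        if self.cpu_cores <= 0 or self.memory_gb <= 0:
            raise ValueError("resources must be positive")
        if self.max_replicas < 1 or not 0 <= self.min_replicas <= self.max_replicas:
            raise ValueError("need 0 <= min_replicas <= max_replicas and max_replicas >= 1")


def desired_replicas(spec: DeploymentSpec,
                     requests_per_second: float,
                     capacity_per_replica: float) -> int:
    """Assumed scaling rule: replicas needed for the load, clamped to the bounds."""
    if capacity_per_replica <= 0:
        raise ValueError("capacity_per_replica must be positive")
    needed = math.ceil(requests_per_second / capacity_per_replica)
    return max(spec.min_replicas, min(spec.max_replicas, needed))


if __name__ == "__main__":
    spec = DeploymentSpec(
        name="demo",
        source=ModelSource.MODEL_HUB,
        cpu_cores=2,
        memory_gb=8,
        min_replicas=1,
        max_replicas=4,
    )
    print(desired_replicas(spec, requests_per_second=250, capacity_per_replica=100))
```

Keeping the replica bounds in the same object as the resource allocation reflects the idea that a deployment couples a model with its resources and scaling behavior. The actual scaling policy and configuration fields are defined by the platform and may differ from this sketch.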