Skip to content

Model Deployments Documentation

Introduction

The Model Deployments module enables you to transform your machine learning models into production-ready services with configurable resources and scaling options. This feature works closely with the Model Registry, allowing you to deploy registered models or custom Docker images as scalable API endpoints.

Key Concepts

Model Deployments

A model deployment is a running instance of a machine learning model that:

  • Exposes the model as an API endpoint
  • Allocates specific computing resources
  • Manages scaling based on demand
  • Monitors performance metrics