Welcome to OICM+ Documentation

OICM+ is a secure, flexible platform for orchestrating GPU infrastructure across multi-tenant, multi-cluster, and multi-cloud environments.

OICM – AI Infrastructure Orchestration

Core Capabilities

  • AI workload orchestration for training, inference, fine-tuning, and evaluation.
  • GPU resource scheduling, scaling, and job lifecycle management.
  • Role-Based Access Control (RBAC) across tenants, workspaces, and users.
  • Data & network isolation.
  • Integrated monitoring, logging, and usage tracking for cost and performance.

Infrastructure Flexibility

  • Multi-Cluster & Multi-Cloud: Manage workloads across clusters and environments.
  • Hardware-Agnostic: Compatible with NVIDIA, AMD (ROCm), and future accelerators (see the sketch after this list).
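
To illustrate what hardware-agnostic means from the workload side (this sketch is illustrative only and not part of OICM+ itself): PyTorch exposes both NVIDIA (CUDA) and AMD (ROCm) GPUs through the same torch.cuda API, so the same job code can run on either accelerator without vendor-specific branches.

```python
# Minimal sketch: a workload that stays hardware-agnostic across NVIDIA and AMD GPUs.
# PyTorch's ROCm builds expose AMD devices through the same torch.cuda namespace,
# so the same code path serves both backends.
import torch

def select_device() -> torch.device:
    """Pick a visible GPU if the job was scheduled onto one, otherwise fall back to CPU."""
    if torch.cuda.is_available():
        backend = "ROCm" if torch.version.hip is not None else "CUDA"
        print(f"Using {backend} device: {torch.cuda.get_device_name(0)}")
        return torch.device("cuda")
    print("No accelerator visible; falling back to CPU")
    return torch.device("cpu")

if __name__ == "__main__":
    device = select_device()
    # Identical tensor code regardless of the underlying accelerator.
    x = torch.randn(1024, 1024, device=device)
    y = x @ x.T
    print(y.shape)
```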

Integration Highlights

  • Identity systems: Keycloak, LDAP, Active Directory, and SSO (OAuth2, SAML).
  • Storage & networking: Support for on-prem and cloud-native components.
  • Model repositories: Hugging Face and internal model stores.
  • Billing system integration: Export GPU hours, token usage, API calls, storage, and job metrics to external billing engines.
  • Monitoring and logging: Prometheus, Grafana, and the ELK/EFK stack, with stream integration.
  • RESTful APIs & SDKs: Programmatic control over every platform feature (see the sketch after this list).
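
The sketch below shows how REST-based control and billing export might be driven from a script. The base URL, endpoint paths, payload fields, and the OICM_API_TOKEN variable are assumptions made for illustration, not the actual OICM+ API; consult the API reference for the real contract.

```python
# Hypothetical sketch of calling a platform REST API with Python's requests library.
# All URLs, paths, and fields below are assumed for illustration purposes.
import os
import requests

BASE_URL = os.environ.get("OICM_API_URL", "https://oicm.example.com/api/v1")  # assumed base URL
TOKEN = os.environ["OICM_API_TOKEN"]  # assumed bearer-token auth
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

def submit_job(payload: dict) -> dict:
    """Submit a GPU job description and return the server's response."""
    resp = requests.post(f"{BASE_URL}/jobs", json=payload, headers=HEADERS, timeout=30)
    resp.raise_for_status()
    return resp.json()

def export_usage(workspace: str) -> dict:
    """Fetch usage metrics (e.g. GPU hours) for hand-off to an external billing engine."""
    resp = requests.get(f"{BASE_URL}/workspaces/{workspace}/usage", headers=HEADERS, timeout=30)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    job = submit_job({"name": "fine-tune-demo", "gpus": 1, "image": "my-registry/trainer:latest"})
    print("submitted:", job)
    print("usage:", export_usage("default"))
```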

Explore OICM