Data Volumes
Data Volumes in OICM provide a convenient way to manage data such as models, datasets, and other files required for AI workloads.
OICM can be connected to external sources, for example Hugging Face or an S3-compatible Object Storage. Data from these sources is fetched and stored inside a Persistent Volume. In addition, Data Volumes can be used to store data generated within the platform such as trained or fine-tuned models. All stored data in Data Volumes contributes to the File Storage Quota.
Once created, a Data Volume makes its stored data available for attachment to workloads within the same workspace.

How it works
When you create a Data Volume, you define:
- Name: A unique identifier for the volume
- Data Type: Choose Model, Dataset, or Other
- File Storage Quota: The maximum size the volume can reach (in GiB)
  note: Make sure your workspace has enough FS quota for your allocations
- Tags (optional): Add searchable tags to organize and filter volumes later

After creating the volume, you can choose to import data from an external source. Supported sources are:
- Hugging Face (HF): Provide a saved secret blueprint containing your HF username and access token, then enter the repo name.

- OBS (S3): Provide the secret OBS blueprint, endpoint URL, bucket name, and directory.
Once configured, OICM automatically syncs the data into the Data Volume when you click Import.

Workloads and Data Volumes
Data Volumes can be attached to workloads for reading or writing data. Supported workload types are:
- Jobs
- Fine-tuning (coming soon)
- Model deployment
Jobs
Job workloads can read from or write to Data Volumes. To connect, configure your config.yaml with the following options based on your needs:
Read Model
# Read-only access mounted at "/data-volumes/input_model"
input_model_data_volume:
  name: <identifier of the data volume>
Read Dataset
# Read-only access mounted at "/data-volumes/input_dataset"
input_dataset_data_volume:
  name: <identifier of the data volume>
Write Data
# Read-write access mounted at "/data-volumes/output_checkpoints"
output_checkpoints_data_volume:
  name: <identifier of the data volume>
Write Model
# Read-write access mounted at "/data-volumes/output_model"
output_model_data_volume:
  name: <identifier of the data volume>
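For illustration, a training job that reads a base model and a dataset and writes checkpoints and a resulting model might declare all four entries together in its config.yaml. The volume names below are placeholders, and the rest of the job configuration is omitted:
# Hypothetical excerpt of a job config.yaml combining all four mounts;
# the volume names are placeholders, not values defined by OICM.
input_model_data_volume:
  name: base-model-volume
input_dataset_data_volume:
  name: training-dataset-volume
output_checkpoints_data_volume:
  name: checkpoints-volume
output_model_data_volume:
  name: finetuned-model-volume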
Testing Read-Only Access
You can verify that your job can read from Data Volumes with this script:
import os

def print_tree(folder_path, indent=""):
    try:
        items = os.listdir(folder_path)
    except FileNotFoundError:
        print(f"{folder_path} does not exist.")
        return
    except PermissionError:
        print(f"{folder_path}: Permission Denied")
        return
    print(f"Folder {folder_path} is mounted")
    for index, item in enumerate(sorted(items)):
        full_path = os.path.join(folder_path, item)
        connector = "├── " if index < len(items) - 1 else "└── "
        print(indent + connector + item)
        if os.path.isdir(full_path):
            new_indent = indent + ("│   " if index < len(items) - 1 else "    ")
            print_tree(full_path, new_indent)

if __name__ == "__main__":
    paths = ["/data-volumes/input_model", "/data-volumes/input_dataset"]
    for path in paths:
        if os.path.exists(path):
            print(f"\nDirectory structure for {path}:")
            print(path)
            print_tree(path)
        else:
            print(f"\n{path} does not exist.")
Testing Read-Write Access
Use the following script to check write, read, and delete permissions:
import os

def check_dir_permissions(folder_path):
    test_file = os.path.join(folder_path, "test_permission_file.txt")
    if not os.path.exists(folder_path):
        print(f"{folder_path} does not exist.")
        return
    try:
        # Test write
        with open(test_file, "w") as f:
            f.write("permission test")
        print(f"Write OK in {folder_path}")
        # Test read
        with open(test_file, "r") as f:
            content = f.read()
        if content == "permission test":
            print(f"Read OK in {folder_path}")
        else:
            print(f"Read FAILED in {folder_path}")
        # Test delete
        os.remove(test_file)
        print(f"Delete OK in {folder_path}")
    except PermissionError:
        print(f"Permission Denied in {folder_path}")
    except Exception as e:
        print(f"Error in {folder_path}: {e}")

if __name__ == "__main__":
    paths = ["/data-volumes/output_model", "/data-volumes/output_checkpoints"]
    for path in paths:
        print(f"\nChecking permissions in: {path}")
        check_dir_permissions(path)
Fine-Tuning (Coming Soon)
You can import a model from Hugging Face or from an external OBS into a Data Volume for later fine-tuning. Once you import your data into a Data Volume of data type Model, it will appear under Model Source and can be selected when creating a Fine-Tuning Task.

Model Deployment
You can deploy a model directly from a Data Volume. This is the fastest and most efficient way to deploy a model, since the data is already on the platform and does not need to be fetched from an external source for every deployment. A single Data Volume can be used for multiple model deployments.
note: When deploying from a Data Volume, use the deployment name as the model name when calling the OpenAI-compatible inference endpoint.
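For illustration, an inference request against the deployment's OpenAI-compatible endpoint might look like the sketch below. The endpoint URL, the API key handling, and the use of the openai Python client are assumptions; only the rule that the model field carries the deployment name comes from the note above.
# Hypothetical inference call; <deployment-endpoint>, <api-key>, and
# <deployment-name> are placeholders to fill in from your OICM deployment.
from openai import OpenAI

client = OpenAI(
    base_url="https://<deployment-endpoint>/v1",  # the deployment's OpenAI-compatible URL
    api_key="<api-key>",
)

response = client.chat.completions.create(
    model="<deployment-name>",  # use the deployment name, not the original model name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)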

Verifying Data Volume Checksum
This guide explains how to verify the checksum calculated by our platform for your data volume. The checksum ensures data integrity by creating a unique fingerprint of your entire directory structure and file contents.
How the Checksum Works
Our platform uses the BLAKE3 hashing algorithm to create a hierarchical checksum that includes:
- File contents (hashed using BLAKE3)
- Directory structure
- File and directory names
- Symlink targets
The algorithm processes entries in alphabetical order to ensure consistent results across different systems.
Data Structure
For each directory, we create a manifest with entries in this format:
- Files: f:<filename>:<blake3_hash_of_content>
- Directories: d:<dirname>:<blake3_hash_of_subdirectory>
- Symlinks: l:<linkname>:<target_path>
These entries are sorted alphabetically, joined with null bytes (\0), and then hashed with BLAKE3 to produce the final checksum.
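The sketch below re-implements this manifest scheme in Python to make the steps concrete. It assumes the third-party blake3 package and may not reproduce the platform's checksum byte-for-byte; for actual verification, use the tool described in the next section.
# Illustrative re-implementation of the manifest checksum described above.
# Assumes the third-party "blake3" package; use oip-checksum-validator for
# authoritative verification.
import os
from blake3 import blake3

def hash_file(path):
    # BLAKE3 hash of the file contents, streamed in 1 MiB chunks.
    h = blake3()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()

def hash_directory(path):
    # Build one manifest entry per directory member.
    entries = []
    for name in os.listdir(path):
        full = os.path.join(path, name)
        if os.path.islink(full):
            entries.append(f"l:{name}:{os.readlink(full)}")     # symlink target
        elif os.path.isdir(full):
            entries.append(f"d:{name}:{hash_directory(full)}")  # recurse into subdirectory
        else:
            entries.append(f"f:{name}:{hash_file(full)}")       # regular file content hash
    # Sort alphabetically, join with null bytes, and hash the manifest.
    manifest = "\0".join(sorted(entries)).encode()
    return blake3(manifest).hexdigest()

if __name__ == "__main__":
    print(hash_directory("/path/to/directory"))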
Verification Method
Installation
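The commands below come from the oip-checksum-validator package, which provides the oic command. Assuming the package is published on PyPI (as the uvx invocations below suggest), it can be installed with pip; alternatively, uvx runs it without a permanent install.
pip install oip-checksum-validator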
Usage
Generate a checksum:
oic /path/to/directory
# OR invoke oip-checksum-validator via uvx, no permanent install needed
uvx --from oip-checksum-validator oic /path/to/directory
Verify against a reference checksum:
oic /path/to/directory -c <expected_checksum>
# OR invoke oip-checksum-validator via uvx, no permanent install needed
uvx --from oip-checksum-validator oic /path/to/directory -c <expected_checksum>
Example
For a directory structure containing a regular file, a subdirectory, and a symlink:
file1.txt
subdir/
link -> file1.txt
The algorithm creates:
f:file1.txt:<blake3_hash_of_file1_content>
d:subdir:<blake3_hash_of_subdir_manifest>
l:link:file1.txt
These are sorted, joined with null bytes, and hashed to produce the final checksum.
Verification
Compare your calculated checksum with the one provided by our platform. If they match, your data integrity is confirmed.