Tracking API Client
The Tracking API helps data scientists and engineers log experiments, package code, and share models. It’s built on top of MLflow, offering extra features for easy reproducibility and collaboration.
1. Installation
Install the client via pip:
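The package name below is a placeholder; replace it with the actual distribution name for your installation:

pip install <TRACKING_CLIENT_PACKAGE>  # placeholder package name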
2. Library Import
Add the following import to your Python code:
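The module path below is a placeholder; use the import path documented for your distribution:

from <TRACKING_CLIENT_MODULE> import TrackingClient  # placeholder module name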
3. Initialization
Connect to the tracking server with:
api_host = "<API_HOST>"
api_key = "<API_KEY>"
workspace_name = "<TARGET_WORKSPACE_NAME>"
TrackingClient.connect(api_host, api_key, workspace_name)
- api_host (str, required) – Hostname of the API server.
- api_key (str, required) – Your API key.
- workspace_name (str, required) – Name of the target workspace.
4. Experiment Setting
After connecting, specify the experiment:
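experiment_name = "<EXPERIMENT_NAME>"
TrackingClient.set_experiment(experiment_name)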
- experiment_name (str, required) – Name of the experiment; created if it doesn’t exist.
For more info, see Experiment UI.
5. MLflow Compatibility
The TrackingClient extends MLflow’s functionality. All MLflow methods are accessible via TrackingClient.*; for details, refer to the Official MLflow Documentation.
Example: TrackingClient.autolog() is equivalent to mlflow.autolog().
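Because the client mirrors MLflow’s API, standard logging calls should carry over unchanged. A minimal sketch, assuming the usual mlflow.log_param and mlflow.log_metric signatures apply:

with TrackingClient.start_run():
    TrackingClient.log_param("learning_rate", 0.01)  # same behavior as mlflow.log_param
    TrackingClient.log_metric("rmse", 0.42)  # same behavior as mlflow.log_metric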
6. Tracking Experiments
Enable Auto-tracking to log parameters, metrics, and models automatically:
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

api_host = "<API_HOST>"
api_key = "<API_KEY>"
workspace_name = "<TARGET_WORKSPACE_NAME>"
experiment_name = "<EXPERIMENT_NAME>"

# Connect to the tracking server and select the experiment
TrackingClient.connect(api_host, api_key, workspace_name)
TrackingClient.set_experiment(experiment_name)

with TrackingClient.start_run():
    TrackingClient.set_run_name("YOUR_RUN_NAME")
    TrackingClient.autolog()  # automatically logs parameters, metrics, and the model

    # Train a model; autolog captures the fit
    db = load_diabetes()
    X_train, X_test, y_train, y_test = train_test_split(db.data, db.target)
    rf = RandomForestRegressor(n_estimators=10, max_depth=6, max_features=3)
    rf.fit(X_train, y_train)
    predictions = rf.predict(X_test)
7. Manual Model Logging
If you prefer finer control, you can use manual logging. Automated logging offers convenience, but manual logging provides a level of control that can be necessary in specific scenarios.
Manual logging is based on mlflow.<name_of_the_library>.log_model, so TrackingClient.<name_of_the_library>.log_model behaves in the same manner.
The arguments of log_model are:
- 1st Argument: The model.
- 2nd Argument: The artifact path, which should be set to "model".
- Signature: A named argument that represents the model's input and output schemas.
To infer the model's signature using samples of its input and output data, use:
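signature = TrackingClient.infer_signature(x_train, y_train)  # sample input, sample output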
Supported Libraries
Below is a list of the log_model methods available for various libraries:
TrackingClient.catboost.log_model
TrackingClient.diviner.log_model
TrackingClient.fastai.log_model
TrackingClient.gluon.log_model
TrackingClient.h2o.log_model
TrackingClient.johnsnowlabs.log_model
TrackingClient.langchain.log_model
TrackingClient.lightgbm.log_model
TrackingClient.mleap.log_model
TrackingClient.onnx.log_model
TrackingClient.openai.log_model
TrackingClient.paddle.log_model
TrackingClient.pmdarima.log_model
TrackingClient.prophet.log_model
TrackingClient.pyfunc.log_model
TrackingClient.pytorch.log_model
TrackingClient.sentence_transformers.log_model
TrackingClient.sklearn.log_model
TrackingClient.spacy.log_model
TrackingClient.spark.log_model
TrackingClient.statsmodels.log_model
TrackingClient.tensorflow.log_model
TrackingClient.transformers.log_model
TrackingClient.xgboost.log_model
For manually logging a scikit-learn model:
signature = TrackingClient.infer_signature(x_train, y_train)
TrackingClient.sklearn.log_model(model, "model", signature=signature)
8. Model Artifacts
Each run generates artifacts, metrics, parameters, and tags:
- Model Files – Serialized models (model.pkl, .h5, etc.) and metadata (MLmodel).
- Metrics & Params – Logged performance data and hyperparameters.
- Tags – Extra metadata for categorization.
Example MLmodel file:
artifact_path: model
flavors:
  python_function:
    env:
      conda: conda.yaml
      virtualenv: python_env.yaml
    loader_module: mlflow.sklearn
    model_path: model.pkl
    predict_fn: predict
    python_version: 3.9.7
  sklearn:
    code: null
    pickled_model: model.pkl
    serialization_format: cloudpickle
    sklearn_version: 1.2.2
mlflow_version: 2.2.1
model_uuid: cd3b0852791d48a2a67ce739b0b07070
run_id: fb1aeba6036345cca027d44d747d1aeb
signature:
  inputs: '[{"type": "tensor", "tensor-spec": {"dtype": "float64", "shape": [-1, 10]}}]'
  outputs: '[{"type": "tensor", "tensor-spec": {"dtype": "float64", "shape": [-1]}}]'
utc_time_created: '2023-07-20 09:16:16.898078'
9. Logging Artifacts
Beyond the standard MLflow features, the TrackingClient also supports logging additional artifact types for deeper analysis.
9.1 Image Tracking
from PIL import Image
image_data = Image.open("test_image.jpg")
extra = {"label": "car"}
TrackingClient.log_image_at_step(image_data, 'image_file.jpg', 1, extra=extra)
- image_data – A numpy.ndarray or PIL.Image.Image.
- extra – An optional dict for metadata.
9.2 Audio Tracking
import numpy as np
audio_data = np.random.random(1000)  # example: random mono signal
TrackingClient.log_audio_at_step(audio_data, 'audio_file.wav', 1, rate=44100)
- audio_data – A 1D (mono) or 2D (stereo) numpy array.
- rate – Sample rate of the audio in Hz.
9.3 Text Tracking
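A minimal sketch, assuming text logging follows the same log_*_at_step pattern as the image and figure methods (the method name and argument order below are assumptions, not a confirmed signature):

text_data = "Sample model output summary."
TrackingClient.log_text_at_step(text_data, 'text_file.txt', 1)  # assumed method, mirroring log_image_at_step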
9.4 Figure Tracking
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.plot([1, 2, 3, 4], [1, 4, 2, 3])
TrackingClient.log_figure_at_step(fig, 'figure_file.jpg', 1)
- fig – A matplotlib.figure.Figure or plotly.graph_objects.Figure.
9.5 JSON Tracking
dict_data = {"key1": "value1", "key2": "value2"}
TrackingClient.log_dict_at_step(dict_data, 'dict_file.json', 1)
- Saves a dictionary or list as JSON.
Extra Parameters
All log_*_at_step methods can include:
- file_name (str) – Name of the artifact file.
- step (int) – Numeric step index.
- extra (dict) – Additional metadata (keys as strings, values as int/float/str/bool/list/None).
extra = {"description": "This is a description."}
TrackingClient.log_audio_at_step(audio_data, 'audio_file.wav', 1, rate=44100, extra=extra)
Next Steps
- Run UI – See how your logged experiments appear in the platform’s UI.
- Experiment UI – Explore how experiments are organized and compared.
- Tracking Overview – Learn the fundamentals of OICM’s tracking server.
- Workspace Overview – Discover how workspaces facilitate isolation and security for ML workflows.