Usage Guide
Setting Up Locally
1. Clone the Repository
2. Create a Python Environment
CLI Interaction via ICOS Shell
The Intelligence Layer supports CLI-based interactions through the Export Metrics API, accessible within the ICOS Shell. You can use two main commands:
- train: Triggers model training with JSON-formatted input.
- predict: Generates predictions based on the provided data.
These commands map to backend API endpoints detailed in the Backend Table.
Using JupyterHub Inside the AI Support Container
- Access the container.
- Create a user.
- Launch JupyterHub.
- Log in via the browser using the created credentials.
Trustworthy AI Module
Explainable AI (XAI)
Uses SHAP for model interpretability.
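As a minimal sketch of how SHAP can be applied (the model, data, and feature count are illustrative, not part of the Intelligence Layer API):

# Illustrative SHAP usage on a toy XGBoost regressor.
import numpy as np
import shap
import xgboost

X = np.random.rand(200, 4)                                   # toy feature matrix
y = X @ np.array([0.5, -1.0, 2.0, 0.0]) + np.random.normal(scale=0.1, size=200)

model = xgboost.XGBRegressor(n_estimators=100).fit(X, y)

explainer = shap.Explainer(model)                            # auto-selects a tree explainer
shap_values = explainer(X)                                   # per-feature contribution for each prediction
print(np.abs(shap_values.values).mean(axis=0))               # mean absolute impact of each feature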
Tip: Use consistent MLflow tags to group experiment artifacts.
Prediction Confidence
Each prediction includes confidence scores and intervals for assessing result reliability.
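The exact format of the returned confidence information is defined by the backend API; purely as a generic illustration of the idea, a residual-based interval can be computed like this (toy values, not the layer's actual method):

# Generic illustration of a residual-based 95% prediction interval (toy values only).
import numpy as np

residuals = np.array([1.2, -0.8, 0.5, -1.5, 0.9, 0.3, -0.7])   # validation errors
point_forecast = 68.4                                           # model output
half_width = 1.96 * residuals.std()                             # normal-approximation interval
print(f"{point_forecast:.1f} ± {half_width:.1f}")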
Model Monitoring
Uses NannyML to monitor performance and detect drift. Retraining can be triggered automatically.
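The module itself relies on NannyML; purely to illustrate the underlying idea of drift detection, the sketch below compares a reference window against recent production data with a two-sample KS test (data and threshold are assumptions):

# Library-agnostic sketch of drift detection (the production module uses NannyML).
import numpy as np
from scipy.stats import ks_2samp

reference = np.random.normal(loc=60, scale=5, size=1000)    # e.g. CPU utilisation seen at training time
production = np.random.normal(loc=72, scale=5, size=1000)   # recent production values

result = ks_2samp(reference, production)
if result.pvalue < 0.05:                                     # significance threshold is an assumption
    print(f"Drift detected (KS statistic={result.statistic:.3f}); trigger retraining")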
Federated Learning
Implements federated learning via Flower, enabling privacy-preserving distributed training where raw data stays local.
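A minimal Flower client sketch is shown below, using a toy parameter vector in place of the layer's real models and a placeholder server address; it is an illustration of the flwr client interface, not the layer's actual federated setup.

# Minimal Flower client sketch (illustrative; real clients wrap the layer's models).
import numpy as np
import flwr as fl

class MetricsClient(fl.client.NumPyClient):
    def __init__(self):
        self.weights = np.zeros(4)                     # stand-in for real model parameters

    def get_parameters(self, config):
        return [self.weights]

    def fit(self, parameters, config):
        self.weights = parameters[0]
        # ... train locally; raw data never leaves this node ...
        self.weights = self.weights + 0.01             # placeholder local update
        return [self.weights], 100, {}                 # updated params, n_examples, metrics

    def evaluate(self, parameters, config):
        loss = float(np.sum(parameters[0] ** 2))       # placeholder loss
        return loss, 100, {"loss": loss}

if __name__ == "__main__":
    fl.client.start_numpy_client(server_address="127.0.0.1:8080", client=MetricsClient())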
AI Analytics Module
To support scalable, efficient, and interpretable machine learning workflows, the Intelligence Layer offers a robust analytics module designed for both research and production use cases.
The Intelligence Layer provides a flexible and extensible AI analytics pipeline that supports:
- Univariate & Multivariate Forecasting using LSTM models (see the sketch after this list):
  - Supports forecasting for n-size metrics: with a single metric it performs univariate forecasting; with multiple metrics it performs multivariate forecasting using a shared vanilla neural network architecture.
  - Users can define the number of steps ahead (x) for prediction, supporting both short- and long-term forecasts.
  - The main limitations are the volume and quality of data, and the need for customized models tuned to specific datasets.
- Experiment Tracking with MLflow (also illustrated in the sketch after this list):
  - Automatically logs training metrics, loss curves, and other relevant parameters.
  - Helps users visually inspect and compare model performance across multiple runs.
- Model Compression using quantization and distillation (detailed in the Model Compression section below).
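As a hedged illustration of the first two points, the sketch below trains a small "vanilla" LSTM forecaster (univariate here; it becomes multivariate when more input features are provided) on toy data and logs the run with MLflow. The experiment name, sizes, and data are assumptions for illustration, not the Intelligence Layer's internal code.

# Illustrative LSTM forecasting with MLflow tracking (toy data and names).
import mlflow
import numpy as np
import torch
import torch.nn as nn

class LSTMForecaster(nn.Module):
    def __init__(self, n_features, hidden_size=64, steps_ahead=1):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, steps_ahead)

    def forward(self, x):                               # x: (batch, steps_back, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])                 # forecast steps_ahead values from the last state

# Toy sliding-window dataset: 1 metric, 12 steps back, 1 step ahead.
series = np.sin(np.linspace(0, 20, 500)).astype(np.float32)
X = np.stack([series[i:i + 12] for i in range(len(series) - 13)])[..., None]
y = np.stack([series[i + 12:i + 13] for i in range(len(series) - 13)])

model = LSTMForecaster(n_features=1, hidden_size=64, steps_ahead=1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

mlflow.set_experiment("lstm-forecasting")               # experiment name is an example
with mlflow.start_run():
    mlflow.log_params({"hidden_size": 64, "steps_back": 12, "num_epochs": 50})
    mlflow.set_tag("model_type", "LSTM")                # consistent tags keep related runs grouped
    for epoch in range(50):
        pred = model(torch.from_numpy(X))
        loss = loss_fn(pred, torch.from_numpy(y))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        mlflow.log_metric("train_loss", loss.item(), step=epoch)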
Model Compression
The Intelligence Layer supports PyTorch-based model compression, which can be configured directly in the training request to the /train endpoint. This includes:
{
  "pytorch_model_parameters": {
    "hidden_size": 64,
    "num_epochs": 50,
    "quantize": true,
    "distill": false
  }
}
Available Compression Options:
- "quantize": true enables dynamic post-training quantization (INT8 conversion using PyTorch), reducing the model size by up to 3× (see the sketch after this list).
- "distill": true activates distillation using a teacher-student training setup, achieving model size reductions of up to 70×.
- Both techniques can be combined, yielding a smaller, faster INT8-optimized model with preserved accuracy.
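Conceptually, "quantize": true corresponds to PyTorch's dynamic post-training quantization; the stand-in model below is an assumption for illustration, not the layer's actual architecture.

# Illustration of dynamic post-training quantization with PyTorch.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(12, 64), nn.ReLU(), nn.Linear(64, 1))   # stand-in model
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear, nn.LSTM}, dtype=torch.qint8                      # convert weights to INT8
)
print(quantized)                                                         # Linear layers become dynamically quantized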
Output
- Trained models (whether quantized or full precision) are stored in the Intelligence Layer model registry.
- Additional parameters (e.g., hidden_size, num_epochs) can be adjusted to fine-tune training.
These compression techniques improve model efficiency for deployment while preserving transparency and accuracy, making them suitable for both edge and cloud environments.
All compression options are easily configured using the JSON payload shown above.
Local tests
# Example training script with argparse
import argparse
import json

# ModelTrain is assumed to come from the project's training module (import path is illustrative)
from train import ModelTrain

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Model Trainer')
    parser.add_argument('--steps_back', type=int, default=12)
    parser.add_argument('--model_type', default='XGB', choices=['XGB', 'ARIMA'])
    parser.add_argument('--dataset_name', default='ORANGECPU')
    parser.add_argument('--test_size', type=float, default=0.2)
    parser.add_argument('--model_parameters', type=json.loads,
                        default={"arima_model_parameters": {"p": 5, "d": 1, "q": 0},
                                 "xgboost_model_parameters": {"n_estimators": 1000, "max_depth": 7, "eta": 0.1}})
    args = parser.parse_args()

    train = ModelTrain(args)
    results = train.initiate_train()
- Inference via API: load trained models from the BentoML store via the service API.
# Example inference script
from inference import predict_cpu_utilisation, PredictFeatures
data = {
"model_tag": "cpu_utilization_model_xgb:latest",
"model_type": "XGB",
"steps_back": 12,
"input_series": [79.4, 67.9, 71.2, 46.5, 67.3, 65.7, 62.7, 70.5, 73.5, 69.9, 64.1, 61.0]
}
input_data = PredictFeatures(**data)
y_test_pred = predict_cpu_utilisation(input_data)
print(f"Predicted value: {y_test_pred}")
Usage
After running the API service, you can use tools like curl, Postman, or Swagger to interact with the endpoints.
Example
- To request a model training
- To request model inference
Both requests are illustrated in the sketch below.
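The sketch uses the requests library; the service URL, the /predict path, and the exact payload fields are assumptions pieced together from the examples above, not a definitive API reference.

# Illustrative client calls; URL, /predict path, and payload fields are assumptions.
import requests

BASE_URL = "http://localhost:3000"        # assumed service address

# 1. Request model training via the /train endpoint.
train_payload = {
    "model_type": "XGB",
    "dataset_name": "ORANGECPU",
    "steps_back": 12,
    "test_size": 0.2,
    "model_parameters": {
        "xgboost_model_parameters": {"n_estimators": 1000, "max_depth": 7, "eta": 0.1}
    },
}
print(requests.post(f"{BASE_URL}/train", json=train_payload).json())

# 2. Request model inference (endpoint path assumed).
predict_payload = {
    "model_tag": "cpu_utilization_model_xgb:latest",
    "model_type": "XGB",
    "steps_back": 12,
    "input_series": [79.4, 67.9, 71.2, 46.5, 67.3, 65.7, 62.7, 70.5, 73.5, 69.9, 64.1, 61.0],
}
print(requests.post(f"{BASE_URL}/predict", json=predict_payload).json())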