Usage Guide
Setting Up Locally
1. Clone the Repository
2. Create a Python Environment
CLI Interaction via ICOS Shell
The Intelligence Layer supports CLI-based interactions through the Export Metrics API, accessible within the ICOS Shell. You can use two main commands:
- train: Triggers model training with JSON-formatted input.
- predict: Generates predictions based on the provided data.
These commands map to backend API endpoints detailed in the Backend Table.
Using JupyterHub Inside the AI Support Container
- Access the container.
- Create a user.
- Launch JupyterHub.
- Log in via the browser using the created credentials.
Trustworthy AI Module
Explainable AI (XAI)
Uses SHAP for model interpretability.
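As a minimal sketch of how SHAP can be applied (the model, data, and feature count are illustrative, not part of the Intelligence Layer API):

# Illustrative SHAP usage on a toy XGBoost regressor.
import numpy as np
import shap
import xgboost

X = np.random.rand(200, 4)                                   # toy feature matrix
y = X @ np.array([0.5, -1.0, 2.0, 0.0]) + np.random.normal(scale=0.1, size=200)

model = xgboost.XGBRegressor(n_estimators=100).fit(X, y)

explainer = shap.Explainer(model)                            # auto-selects a tree explainer
shap_values = explainer(X)                                   # per-feature contribution for each prediction
print(np.abs(shap_values.values).mean(axis=0))               # mean absolute impact of each feature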
Tip: Use consistent MLflow tags to group experiment artifacts.
Prediction Confidence
Each prediction includes confidence scores and intervals for assessing result reliability.
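The exact format of the returned confidence information is defined by the backend API; purely as a generic illustration of the idea, a residual-based interval can be computed like this (toy values, not the layer's actual method):

# Generic illustration of a residual-based 95% prediction interval (toy values only).
import numpy as np

residuals = np.array([1.2, -0.8, 0.5, -1.5, 0.9, 0.3, -0.7])   # validation errors
point_forecast = 68.4                                           # model output
half_width = 1.96 * residuals.std()                             # normal-approximation interval
print(f"{point_forecast:.1f} ± {half_width:.1f}")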
Model Monitoring
Uses NannyML to monitor performance and detect drift. Retraining can be triggered automatically.
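The module itself relies on NannyML; purely to illustrate the underlying idea of drift detection, the sketch below compares a reference window against recent production data with a two-sample KS test (data and threshold are assumptions):

# Library-agnostic sketch of drift detection (the production module uses NannyML).
import numpy as np
from scipy.stats import ks_2samp

reference = np.random.normal(loc=60, scale=5, size=1000)    # e.g. CPU utilisation seen at training time
production = np.random.normal(loc=72, scale=5, size=1000)   # recent production values

result = ks_2samp(reference, production)
if result.pvalue < 0.05:                                     # significance threshold is an assumption
    print(f"Drift detected (KS statistic={result.statistic:.3f}); trigger retraining")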
Federated Learning
Implements federated learning via Flower, enabling privacy-preserving distributed training where raw data stays local.
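A minimal Flower client sketch is shown below, using a toy parameter vector in place of the layer's real models and a placeholder server address; it is an illustration of the flwr client interface, not the layer's actual federated setup.

# Minimal Flower client sketch (illustrative; real clients wrap the layer's models).
import numpy as np
import flwr as fl

class MetricsClient(fl.client.NumPyClient):
    def __init__(self):
        self.weights = np.zeros(4)                     # stand-in for real model parameters

    def get_parameters(self, config):
        return [self.weights]

    def fit(self, parameters, config):
        self.weights = parameters[0]
        # ... train locally; raw data never leaves this node ...
        self.weights = self.weights + 0.01             # placeholder local update
        return [self.weights], 100, {}                 # updated params, n_examples, metrics

    def evaluate(self, parameters, config):
        loss = float(np.sum(parameters[0] ** 2))       # placeholder loss
        return loss, 100, {"loss": loss}

if __name__ == "__main__":
    fl.client.start_numpy_client(server_address="127.0.0.1:8080", client=MetricsClient())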
AI Analytics Module
To support scalable, efficient, and interpretable machine learning workflows, the Intelligence Layer offers a robust analytics module designed for both research and production use cases.
The Intelligence Layer provides a flexible and extensible AI analytics pipeline that supports:
- Univariate & Multivariate Forecasting using LSTM models (see the sketch after this list):
  - Supports forecasting for n-size metrics: with a single metric it performs univariate forecasting; with multiple metrics it performs multivariate forecasting using a shared vanilla neural network architecture.
  - Users can define the number of steps ahead (x) for prediction, supporting both short- and long-term forecasts.
  - The main limitations are the volume and quality of data, and the need for customized models tuned to specific datasets.
- Experiment Tracking with MLflow (also illustrated in the sketch after this list):
  - Automatically logs training metrics, loss curves, and other relevant parameters.
  - Helps users visually inspect and compare model performance across multiple runs.
- Model Compression using quantization and distillation (detailed in the Model Compression section below).
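As a hedged illustration of the first two points, the sketch below trains a small "vanilla" LSTM forecaster (univariate here; it becomes multivariate when more input features are provided) on toy data and logs the run with MLflow. The experiment name, sizes, and data are assumptions for illustration, not the Intelligence Layer's internal code.

# Illustrative LSTM forecasting with MLflow tracking (toy data and names).
import mlflow
import numpy as np
import torch
import torch.nn as nn

class LSTMForecaster(nn.Module):
    def __init__(self, n_features, hidden_size=64, steps_ahead=1):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, steps_ahead)

    def forward(self, x):                               # x: (batch, steps_back, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])                 # forecast steps_ahead values from the last state

# Toy sliding-window dataset: 1 metric, 12 steps back, 1 step ahead.
series = np.sin(np.linspace(0, 20, 500)).astype(np.float32)
X = np.stack([series[i:i + 12] for i in range(len(series) - 13)])[..., None]
y = np.stack([series[i + 12:i + 13] for i in range(len(series) - 13)])

model = LSTMForecaster(n_features=1, hidden_size=64, steps_ahead=1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

mlflow.set_experiment("lstm-forecasting")               # experiment name is an example
with mlflow.start_run():
    mlflow.log_params({"hidden_size": 64, "steps_back": 12, "num_epochs": 50})
    mlflow.set_tag("model_type", "LSTM")                # consistent tags keep related runs grouped
    for epoch in range(50):
        pred = model(torch.from_numpy(X))
        loss = loss_fn(pred, torch.from_numpy(y))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        mlflow.log_metric("train_loss", loss.item(), step=epoch)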
Model Compression
The Intelligence Layer supports PyTorch-based model compression, which can be configured directly in the training request to the /train endpoint. This includes:
{
  "pytorch_model_parameters": {
    "hidden_size": 64,
    "num_epochs": 50,
    "quantize": true,
    "distill": false
  }
}
Available Compression Options:
- "quantize": true enables dynamic post-training quantization (INT8 conversion using PyTorch), reducing the model size by up to 3× (see the sketch after this list).
- "distill": true activates distillation using a teacher-student training setup, achieving model size reductions of up to 70×.
- Both techniques can be combined, yielding a smaller, faster INT8-optimized model with preserved accuracy.
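Conceptually, "quantize": true corresponds to PyTorch's dynamic post-training quantization; the stand-in model below is an assumption for illustration, not the layer's actual architecture.

# Illustration of dynamic post-training quantization with PyTorch.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(12, 64), nn.ReLU(), nn.Linear(64, 1))   # stand-in model
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear, nn.LSTM}, dtype=torch.qint8                      # convert weights to INT8
)
print(quantized)                                                         # Linear layers become dynamically quantized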
Output
- Trained models (whether quantized or full precision) are stored in the Intelligence Layer model registry.
- Additional parameters (e.g., hidden_size, num_epochs) can be adjusted to fine-tune training.
These compression techniques improve model efficiency for deployment while preserving transparency and accuracy, making them suitable for both edge and cloud environments.
All compression options are easily configured using the JSON payload shown above.
Local tests
# Example training script with argparse
import argparse
import json

# ModelTrain is assumed to come from the project's training module (import path is illustrative)
from train import ModelTrain

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Model Trainer')
    parser.add_argument('--steps_back', type=int, default=12)
    parser.add_argument('--model_type', default='XGB', choices=['XGB', 'ARIMA'])
    parser.add_argument('--dataset_name', default='ORANGECPU')
    parser.add_argument('--test_size', type=float, default=0.2)
    parser.add_argument('--model_parameters', type=json.loads,
                        default={"arima_model_parameters": {"p": 5, "d": 1, "q": 0},
                                 "xgboost_model_parameters": {"n_estimators": 1000, "max_depth": 7, "eta": 0.1}})
    args = parser.parse_args()

    train = ModelTrain(args)
    results = train.initiate_train()
- Inference via API: load trained models from the BentoML store via the service API.
# Example inference script
from inference import predict_cpu_utilisation, PredictFeatures
data = {
"model_tag": "cpu_utilization_model_xgb:latest",
"model_type": "XGB",
"steps_back": 12,
"input_series": [79.4, 67.9, 71.2, 46.5, 67.3, 65.7, 62.7, 70.5, 73.5, 69.9, 64.1, 61.0]
}
input_data = PredictFeatures(**data)
y_test_pred = predict_cpu_utilisation(input_data)
print(f"Predicted value: {y_test_pred}")
Usage
After running the API service, you can use tools like curl, Postman, or Swagger to interact with the endpoints.
Example
- To request a model training
- To request model inference
Both requests are illustrated in the sketch below.
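The sketch uses the requests library; the service URL, the /predict path, and the exact payload fields are assumptions pieced together from the examples above, not a definitive API reference.

# Illustrative client calls; URL, /predict path, and payload fields are assumptions.
import requests

BASE_URL = "http://localhost:3000"        # assumed service address

# 1. Request model training via the /train endpoint.
train_payload = {
    "model_type": "XGB",
    "dataset_name": "ORANGECPU",
    "steps_back": 12,
    "test_size": 0.2,
    "model_parameters": {
        "xgboost_model_parameters": {"n_estimators": 1000, "max_depth": 7, "eta": 0.1}
    },
}
print(requests.post(f"{BASE_URL}/train", json=train_payload).json())

# 2. Request model inference (endpoint path assumed).
predict_payload = {
    "model_tag": "cpu_utilization_model_xgb:latest",
    "model_type": "XGB",
    "steps_back": 12,
    "input_series": [79.4, 67.9, 71.2, 46.5, 67.3, 65.7, 62.7, 70.5, 73.5, 69.9, 64.1, 61.0],
}
print(requests.post(f"{BASE_URL}/predict", json=predict_payload).json())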