ICOS

The ICOS Dynamic Policy Manager

by ENG | November 2024

The ICOS System aims to be the next-generation meta-operating system for the Cloud-Edge-IoT Continuum, offering a complete platform for orchestrating the lifecycle of user applications in a secure, smart and efficient way. At the core of the ICOS orchestration capabilities are i) a powerful observability framework that efficiently reports the status and events of the infrastructure and the applications (e.g. topology, resource characteristics and usage, security assessments, power consumption), ii) a policy-based engine that compares the current status of the system with the desired policies for a given component of the infrastructure and/or applications, and iii) a matchmaking mechanism that finds an orchestration solution satisfying all the requirements.

The Dynamic Policy Manager (DPM), a.k.a. Polman, is the ICOS component responsible for ensuring that the system meets the desired thresholds of efficiency, performance and security. It integrates across the entire application lifecycle, from deployment to decommissioning, enforcing consistent policies at every stage. The key features of the DPM are:

  • Flexible Policy Model: Defining different types of policies for applications and infrastructure, covering aspects such as performance, security and efficiency.
  • Continuous Monitoring: Observing the system through the ICOS observability framework.
  • Non-conformance Detection: Identifying deviations from defined standards and triggering notifications for corrective actions.
  • Policy Enforcement: Enabling automatic corrective measures to optimize or restore performance.

One of the strengths of the DPM is its close integration with the ICOS observability framework: policies are translated into queries on the metrics published in the observability system. This makes it easy for the DPM to continuously evaluate the enforcement status of the policies against up-to-date data and to react quickly to violations.
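As a rough illustration of how a policy expressed as a metric query can be checked, the sketch below assumes that the observability backend exposes a Prometheus-compatible instant-query endpoint (`/api/v1/query`). The base URL, the function names and the sample response are assumptions made for illustration and are not taken from the DPM code. In PromQL, a comparison such as `... > 0.5` on an instant vector returns only the series that satisfy it, so a non-empty result can be read as a violation.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url: str, expr: str) -> str:
    """Build an instant-query URL for a Prometheus-compatible API (assumed backend)."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': expr})}"


def violating_series(response_body: str) -> list:
    """Return the label sets of the series returned by a threshold query.

    For an expression ending in a comparison, the backend returns only the
    series that satisfy the comparison, so each returned series is one that
    currently violates the policy.
    """
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected an instant vector, got {data['resultType']}")
    return [sample["metric"] for sample in data["result"]]


def is_violated(response_body: str) -> bool:
    """A policy is considered violated when at least one series is returned."""
    return bool(violating_series(response_body))


# Hypothetical response body; the labels and the value are made up for illustration.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"icos_agent_id": "icos-agent-1"}, "value": [1730000000, "0.73"]}
        ],
    },
})
```

In a real deployment the URL from `build_query_url` would be fetched periodically and the response body passed to `is_violated`; here the response is parsed offline so the sketch does not depend on a running backend.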

Some examples of policies that can currently be expressed and enforced with the ICOS DPM are:

  • “Run my application only on nodes with an SCA score greater than 70”;
  • “Migrate my application if unauthorized accesses are detected on the nodes where it is running”;
  • “Scale up my job if the predicted completion time is greater than 30 minutes”;
  • “Scale out my application’s database if there are more than 100 users connected” (see footnote 1);
  • “Migrate my application if the CPU usage predicted on the node in 10 minutes is greater than 80%”.

Implementation

The simplified architectural diagram below shows the main internal components of the DPM and its main interactions with the ICOS System. The Dynamic Policy Manager service is written in Python 3.11 and provides a REST API for managing policies, including creating and reading them programmatically, which facilitates integration with other ICOS components such as the Job Manager, App Descriptor, and DataClay. More details can be found in deliverable D3.2 - Meta-Kernel Layer Module Developed (IT-2). The tool can be deployed in diverse environments: as a Docker container, through Helm charts (for Kubernetes deployments), or by manual execution with Python 3.11. The latest release of the Dynamic Policy Manager is available in the dynamic policy manager repository on the ICOS GitHub.

DPM Architecture
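To give an idea of what programmatic access through the REST API could look like, here is a minimal sketch that only builds the HTTP requests for creating and reading policies. The base URL, the `/policies` path and the bearer-token header are hypothetical placeholders: the actual routes and authentication scheme are defined by the DPM API and are not described in this article.

```python
import json
from typing import Optional
from urllib.request import Request

# Hypothetical values: replace them with the routes documented for your DPM deployment.
BASE_URL = "http://localhost:8000"
POLICIES_PATH = "/policies"


def _headers(token: Optional[str]) -> dict:
    """Common headers; the bearer token is an assumed authentication scheme."""
    headers = {"Accept": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


def create_policy_request(policy: dict, token: Optional[str] = None) -> Request:
    """Build (but do not send) a request that submits a policy document."""
    headers = _headers(token)
    headers["Content-Type"] = "application/json"
    body = json.dumps(policy).encode("utf-8")
    return Request(BASE_URL + POLICIES_PATH, data=body, headers=headers, method="POST")


def read_policies_request(token: Optional[str] = None) -> Request:
    """Build (but do not send) a request that lists the stored policies."""
    return Request(BASE_URL + POLICIES_PATH, headers=_headers(token), method="GET")
```

Either request could then be sent with `urllib.request.urlopen(request)`; keeping construction separate from sending makes the sketch easy to inspect without a running service.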

DPM policies consist of three main parts:

  1. Subject: Defines the entity to which the policy applies (e.g., an application or a host).
  2. Specification: Describes the policy's details, including conditions and triggers.
  3. Action: Specifies the action to be taken when a policy violation occurs (e.g., sending a webhook).

In addition, policies can include variables and properties for further customization. The example below outlines a policy that monitors CPU usage on a specific host and triggers a webhook action if the usage exceeds a predefined threshold:

{
  "name": "cpu_usage-for-agent",
  "subject": {
    "type": "host",
    "hostId": "57e17cac94714bf6976f1e071d64d586",
    "agentId": "icos-agent-1"
  },
  "spec": {
    "description": "Monitor CPU usage",
    "type": "telemetryQuery",
    "expr": "avg without (mode,cpu) (1 - rate(node_cpu_seconds_total{mode=\"idle\", icos_agent_id=\"icos-agent-1\", icos_host_id=\"unique_node_id\"}[2m])) > 0.5",
    "violatedIf": null,
    "thresholds": null
  },
  "action": {
    "type": "webhook",
    "url": "https://localhost:3246/",
    "httpMethod": "POST",
    "extraParams": {},
    "includeAccessToken": false
  },
  "variables": {
    "maxCpu": "0.5"
  },
  "properties": {}
}
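The structure above can be mirrored in a few lines of Python. The sketch below is not the DPM's internal model: the `Policy` class, the set of required parts, the rule that each part carries a `type`, and the shape of the webhook payload are all assumptions made for illustration, while the field names come from the JSON example. The `extraParams` and `includeAccessToken` fields are left out because their semantics are not described here.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict
from urllib.request import Request

REQUIRED_PARTS = ("name", "subject", "spec", "action")  # assumed to be mandatory


@dataclass
class Policy:
    name: str
    subject: Dict[str, Any]
    spec: Dict[str, Any]
    action: Dict[str, Any]
    variables: Dict[str, Any] = field(default_factory=dict)
    properties: Dict[str, Any] = field(default_factory=dict)


def parse_policy(text: str) -> Policy:
    """Parse a policy document and check that its three main parts are present."""
    raw = json.loads(text)
    missing = [key for key in REQUIRED_PARTS if key not in raw]
    if missing:
        raise ValueError(f"policy is missing required parts: {', '.join(missing)}")
    for part in ("subject", "spec", "action"):
        if "type" not in raw[part]:
            raise ValueError(f"'{part}' needs a 'type' field")
    return Policy(
        name=raw["name"],
        subject=raw["subject"],
        spec=raw["spec"],
        action=raw["action"],
        variables=raw.get("variables") or {},
        properties=raw.get("properties") or {},
    )


def webhook_request(policy: Policy, violation: Dict[str, Any]) -> Request:
    """Build (but do not send) the webhook call for a violated policy.

    The payload layout is hypothetical; only `url` and `httpMethod` are read
    from the policy's action.
    """
    action = policy.action
    if action["type"] != "webhook":
        raise ValueError(f"unsupported action type: {action['type']}")
    body = json.dumps(
        {"policy": policy.name, "subject": policy.subject, "violation": violation}
    ).encode("utf-8")
    return Request(
        action["url"],
        data=body,
        headers={"Content-Type": "application/json"},
        method=action.get("httpMethod", "POST"),
    )


# Abridged version of the example policy (shorter label selector in `expr`).
EXAMPLE_POLICY = json.dumps({
    "name": "cpu_usage-for-agent",
    "subject": {"type": "host", "agentId": "icos-agent-1"},
    "spec": {
        "type": "telemetryQuery",
        "expr": 'avg without (mode,cpu) (1 - rate(node_cpu_seconds_total{mode="idle"}[2m])) > 0.5',
    },
    "action": {"type": "webhook", "url": "https://localhost:3246/", "httpMethod": "POST"},
})
```

Calling `parse_policy(EXAMPLE_POLICY)` yields a `Policy` whose `action` can be passed to `webhook_request` together with whatever violation details the evaluator produced.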

The Dynamic Policy Manager’s GUI simplifies policy management for applications and services with an intuitive design and accessible navigation. It supports secure access and controlled permissions, making policy management both efficient and secure. The design is user-friendly, accommodating technical and non-technical users alike. The GUI is built with React JS to provide a fast, flexible and stable interface designed for dynamic data handling. The combination of React, Material UI, and our custom-built components gives the UI a cohesive and professional look, while remaining adaptable to various user needs.

With its secure, intuitive, and highly functional design, the Dynamic Policy Manager’s GUI is designed to meet current needs in policy management. As this is the first version, future updates are planned to expand its features and make policy management even easier.

Footnotes

  1. This requires the application to publish a metric with the current number of connected users.
Funded by European Union. Part of EUCloudEdgeIoT.eu.

This project has received funding from the European Union’s HORIZON research and innovation programme under grant agreement No 101070177.

©2024 ICOS Project