Support for policy definitions in the ICOS App Descriptor
Polman can automatically parse policies from an ICOS App Descriptor. Policies can be defined in two locations:
- per-component policies section. Each policy defined here is applied only to the component where it is defined.
- top-level policies section. Policies defined here are applied to all components by default. If the apply-to parameter is specified (as a string or a list of strings), the policy is applied only to the specified component(s).
The following example shows where policies are allowed:
name: test-policies
description: a sample manifest to describe the usage of policies
components:
  - name: comp1
    type: kubernetes
    manifests:
      - name: c1
    ##
    ## PER-COMPONENT policies section
    ##
    policies:
      - name: ...
        type: ...
  - name: comp2
    type: kubernetes
    manifests:
      - name: c2
##
## TOP-LEVEL policies section
##
policies:
  - name: ...
    type: ...
    # Optional. If not set the policy is applied to ALL components
    apply-to:
      - c1
      - c2
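The scoping rules described above can be sketched in a few lines of Python. This is only an illustration, not Polman's actual implementation: the function name `policies_for_component` and the sample descriptor are hypothetical, and matching `apply-to` entries against the component `name` field is an assumption.

```python
from typing import Any


def policies_for_component(descriptor: dict[str, Any], component: str) -> list[dict[str, Any]]:
    """Return the policies that apply to `component` (illustrative sketch only)."""
    selected: list[dict[str, Any]] = []

    # Per-component policies apply only to the component that declares them.
    for comp in descriptor.get("components", []):
        if comp.get("name") == component:
            selected.extend(comp.get("policies", []))

    # Top-level policies apply to every component unless apply-to narrows them.
    for policy in descriptor.get("policies", []):
        targets = policy.get("apply-to")
        if targets is None:
            selected.append(policy)
            continue
        if isinstance(targets, str):  # apply-to may be a single string
            targets = [targets]
        # Assumption: apply-to entries are compared with the component name.
        if component in targets:
            selected.append(policy)
    return selected


# Hypothetical descriptor, already loaded into a dict (e.g. by a YAML parser).
descriptor = {
    "components": [
        {"name": "frontend", "policies": [{"name": "only-frontend"}]},
        {"name": "backend"},
    ],
    "policies": [
        {"name": "everywhere"},
        {"name": "backend-only", "apply-to": "backend"},
        {"name": "both", "apply-to": ["frontend", "backend"]},
    ],
}
print([p["name"] for p in policies_for_component(descriptor, "frontend")])
```

In this sketch, per-component policies are listed first, followed by the top-level policies that are either unrestricted or explicitly target the component.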
There are two types of policies:
1. predefined: a set of generic policies defined and tested by the ICOS team. Because they can be reused across multiple domains, they are supported with a special syntax in the App Descriptor.
2. custom: policies defined by the user with an expression that is parsed directly by the Policy Manager.
Predefined
Node Security Level
This policy defines a security level (based on the Wazuh SCA score) for the nodes where the application should run. ICOS ensures that the application is deployed on a node that satisfies the defined security level. If the node's level changes at runtime, ICOS moves (redeploys) the application to a node that satisfies the policy.
policies:
  # short form
  - security: high # can be "high", "medium" or "low"
  # alternative syntax
  - name: my-security-policy
    type: security
    level: high # can be "high", "medium" or "low"
    remediation: redeploy # "redeploy" is the default and can be omitted
Levels:
- high: SCA score > 67
- medium: SCA score > 34 and < 66
- low: SCA score > 0 and < 33
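As a rough illustration of the ranges listed above, the following Python sketch maps an SCA score to a level. The function name `security_level` is hypothetical and this is not how ICOS necessarily computes the level. The listed ranges do not say how boundary values (for example exactly 33, 34, 66 or 67) are classified, so the sketch returns `None` for them instead of guessing.

```python
from typing import Optional


def security_level(sca_score: float) -> Optional[str]:
    """Map a Wazuh SCA score to a level, following the listed ranges literally."""
    if sca_score > 67:
        return "high"
    if 34 < sca_score < 66:
        return "medium"
    if 0 < sca_score < 33:
        return "low"
    # The score falls on a boundary or in a gap that the listed ranges do not cover.
    return None


for score in (80, 50, 10, 66.5):
    print(score, security_level(score))
```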
Component Reachability
This policy is based on the ICOS telemetry that periodically checks the workloads running in the continuum. The policy can be used to trigger an action (e.g. redeploy) when no telemetry is received from a component for a given amount of time.
policies:
  # short form
  - redeployOnLostTelemetry: 5m
  # alternative syntax
  - name: my-lost-telemetry-policy
    type: redeployOnLostTelemetry
    timeout: 5m # "5m" is the default and can be omitted
    remediation: redeploy # "redeploy" is the default and can be omitted
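The timeout check behind this policy can be pictured with a small Python sketch. It is only a conceptual model: the helper names `parse_timeout` and `telemetry_lost` are hypothetical, and the accepted duration syntax (an integer followed by `s`, `m` or `h`) is an assumption inferred from the `5m` example rather than something the Policy Manager is documented to enforce.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed unit suffixes; the real parser may accept more forms.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours"}


def parse_timeout(value: str) -> timedelta:
    """Parse a duration such as "5m" into a timedelta."""
    match = re.fullmatch(r"(\d+)([smh])", value)
    if match is None:
        raise ValueError(f"unsupported timeout: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def telemetry_lost(last_seen: datetime, now: datetime, timeout: str = "5m") -> bool:
    """Return True when no telemetry has arrived for longer than `timeout`."""
    return now - last_seen > parse_timeout(timeout)


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(telemetry_lost(now - timedelta(minutes=6), now))  # silent for 6 minutes
print(telemetry_lost(now - timedelta(minutes=4), now))  # silent for 4 minutes
```

When the check is true for a component, the configured remediation (`redeploy` by default) would be the action to trigger.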
Node CPU Usage
This policy ensures that the CPU usage of the node where the component is running stays below a given threshold.
policies:
  - name: my-policy
    type: custom
    fromTemplate: app-host-cpu-usage
    remediation: redeploy
    variables:
      maxCpu: 0.8 # 80%
COMPSS Under Allocation
This policy works with COMPSS applications. It scales up the application replicas when the estimated time to finish (ETA) is above a given threshold.
policies:
  - name: my-policy
    type: custom
    fromTemplate: compss-under-allocation
    remediation: scale-up
    variables:
      thresholdTimeSeconds: 120
      compssTask: provesOtel.example_task
Custom
Custom policies must provide a valid expression based on metrics collected by the ICOS telemetry system. They also have to provide a valid remediation action.
policies:
  - name: lost-telemetry-as-custom
    type: custom
    spec:
      expr: "... custom PROMQL expression ..."
    remediation: redeploy
    variables:
      myVar: myVal
    properties:
      pendingInterval: 5m
The expr must be a valid PromQL expression. It can use "{{variable_name}}" tokens, which are replaced at policy creation time with the value of variables, properties, or a few predefined tokens (TODO: provide additional documentation).
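To make the token replacement more concrete, here is a minimal Python sketch of the substitution step. It is not the Policy Manager's code: the function name `render_expr` is hypothetical, the predefined tokens are not modelled, and two behaviours are assumptions made for the example, namely that variables take precedence over properties on a name clash and that unknown tokens are left in place.

```python
import re

# Matches tokens of the form {{variable_name}}.
_TOKEN = re.compile(r"\{\{(\w+)\}\}")


def render_expr(expr: str, variables: dict[str, str], properties: dict[str, str]) -> str:
    """Replace {{name}} tokens in `expr` with values from variables and properties."""
    # Assumption: on a name clash, the value from `variables` wins.
    values = {**properties, **variables}

    def replace(match: re.Match) -> str:
        name = match.group(1)
        # Assumption: unknown tokens are left untouched so they are easy to spot.
        return str(values.get(name, match.group(0)))

    return _TOKEN.sub(replace, expr)


# Hypothetical expression; only the token syntax comes from the text above.
expr = 'up{job="{{myVar}}"} == 0'
print(render_expr(expr, {"myVar": "myVal"}, {"pendingInterval": "5m"}))
```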
Common
All policies can specify some fields to override/enrich the default behaviour.