Support for policy definitions in the ICOS App Descriptor¶
Polman can automatically parse policies from an ICOS App Descriptor. Policies can be defined in two locations:
- per-component policies section. Each policy defined here is applied only to the component where it is defined.
- top-level policies section. Policies defined here are applied to all components by default. If the apply-to parameter is specified (as a string or a list of strings), the policy is applied only to the specified component(s).
The following example shows where policies are allowed:
name: test-policies
description: a sample manifest to describe the usage of policies
components:
  - name: comp1
    type: kubernetes
    manifests:
      - name: c1
    ##
    ## PER-COMPONENT policies section
    ##
    policies:
      - name: ...
        type: ...
  - name: comp2
    type: kubernetes
    manifests:
      - name: c2
##
## TOP-LEVEL policies section
##
policies:
  - name: ...
    type: ...
    # Optional. If not set the policy is applied to ALL components
    apply-to:
      - c1
      - c2
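The default rule for top-level policies can be illustrated with a short Python sketch. It only mirrors the behaviour described above and is not Polman's actual code: the helper name `resolve_targets` and the plain-dict representation of a policy are assumptions made for the example.

```python
from typing import Any


def resolve_targets(policy: dict[str, Any], component_names: list[str]) -> list[str]:
    """Return the components a top-level policy applies to (hypothetical helper)."""
    apply_to = policy.get("apply-to")
    if apply_to is None:
        # No apply-to: the policy applies to all components.
        return list(component_names)
    if isinstance(apply_to, str):
        # A single string is treated like a one-element list.
        return [apply_to]
    # Otherwise apply-to is expected to be a list of strings.
    return list(apply_to)


components = ["c1", "c2", "c3"]
print(resolve_targets({"name": "p1", "type": "security"}, components))
print(resolve_targets({"name": "p2", "type": "security", "apply-to": "c1"}, components))
print(resolve_targets({"name": "p3", "type": "security", "apply-to": ["c1", "c2"]}, components))
```

The sketch does not check that the names in apply-to match existing components; whether and how Polman validates them is not covered here.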
There are two types of policies:
1. predefined: a set of generic policies defined and tested by the ICOS team. They can be reused in multiple domains, so they are supported with a special syntax in the App Descriptor.
2. custom: policies defined by the user using an expression directly parsed by the Policy Manager.
Predefined¶
Node Security Level¶
This policy defines a security level (based on the Wazuh SCA score) for the nodes where the application should run. ICOS ensures that the application is deployed on a node that satisfies the defined security level. If the node's level changes at runtime, ICOS moves (redeploys) the application to a node that satisfies the policy.
policies:
  # short form
  - security: high # can be "high", "medium" or "low"
  # alternative syntax
  - name: my-security-policy
    type: security
    level: high # can be "high", "medium" or "low"
    remediation: redeploy # "redeploy" is the default and can be omitted
Levels:
- high: SCA score > 67
- medium: SCA score > 34 and < 66
- low: SCA score > 0 and < 33
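As a rough illustration, the thresholds listed above can be written as a small Python function. This is a sketch, not Polman's implementation: the name `security_level` is hypothetical, and the function applies the inequalities exactly as listed, so scores that fall outside every listed range (for example 0, 33 or 66) get no level rather than a guessed one.

```python
from typing import Optional


def security_level(sca_score: float) -> Optional[str]:
    """Map a Wazuh SCA score to a level using the thresholds listed above."""
    if sca_score > 67:
        return "high"
    if 34 < sca_score < 66:
        return "medium"
    if 0 < sca_score < 33:
        return "low"
    # The score is outside all listed ranges; this sketch assigns no level.
    return None


for score in (80, 50, 10, 66):
    print(score, security_level(score))
```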
Component Reachability¶
This policy is based on the ICOS telemetry that periodically checks the workloads running in the continuum. The policy can be used to trigger an action (e.g. redeploy) when no telemetry is received from a component for a given amount of time.
policies:
  # short form
  - redeployOnLostTelemetry: 5m
  # alternative syntax
  - name: my-lost-telemetry-policy
    type: redeployOnLostTelemetry
    timeout: 5m # "5m" is the default and can be omitted
    remediation: redeploy # "redeploy" is the default and can be omitted
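The relation between the short form, the alternative syntax and the documented defaults can be sketched in Python. This is an illustration under assumptions, not Polman's parser: the helper `normalize_lost_telemetry` is hypothetical, policies are modelled as plain dicts, and the type name `redeployOnLostTelemetry` is taken from the alternative syntax above. No name is generated for the short form because the naming of such policies is not documented here.

```python
POLICY_TYPE = "redeployOnLostTelemetry"
# Defaults as stated in the comments of the example above.
DEFAULTS = {"timeout": "5m", "remediation": "redeploy"}


def normalize_lost_telemetry(entry: dict) -> dict:
    """Expand a lost-telemetry policy entry into its explicit form (hypothetical helper)."""
    if POLICY_TYPE in entry:
        # Short form: the key is the policy type and the value is the timeout.
        return {
            "type": POLICY_TYPE,
            "timeout": entry[POLICY_TYPE],
            "remediation": DEFAULTS["remediation"],
        }
    # Alternative syntax: fill in any omitted field with its default.
    return {**DEFAULTS, **entry}


print(normalize_lost_telemetry({"redeployOnLostTelemetry": "10m"}))
print(normalize_lost_telemetry({"name": "my-lost-telemetry-policy", "type": POLICY_TYPE}))
```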
Node CPU Usage¶
This policy can be used to ensure that the CPU usage of the node where the component is running stays below a given threshold.
policies:
  - name: my-policy
    type: custom
    fromTemplate: app-host-cpu-usage
    remediation: redeploy
    variables:
      maxCpu: 0.8 # 80%
COMPSS Under Allocation¶
This policy works with COMPSS applications and scales up the application replicas when the estimated time to finish (ETA) is above a given threshold.
policies:
  - name: my-policy
    type: custom
    fromTemplate: compss-under-allocation
    remediation: scale-up
    variables:
      thresholdTimeSeconds: 120
      compssTask: provesOtel.example_task
Custom¶
Custom policies must provide a valid expression based on metrics collected by the ICOS telemetry system. They also have to provide a valid remediation action.
policies:
  - name: lost-telemetry-as-custom
    type: custom
    spec:
      expr: "... custom PROMQL expression ..."
    remediation: redeploy
    variables:
      myVar: myVal
    properties:
      pendingInterval: 5m
The expr must be a valid PromQL expression. It can use "{{variable_name}}" tokens, which are replaced at policy creation time with the values of variables, properties or a few predefined tokens (TODO: provide additional documentation).
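The token replacement can be pictured with the following Python sketch. It is not Polman's implementation: the function `render_expr`, the metric name `my_metric` and its label are made up for the example, and two behaviours are assumptions, namely that variables take precedence over properties when names clash and that unknown tokens are left untouched. The predefined tokens are not modelled.

```python
import re

# Matches tokens of the form {{variable_name}}.
TOKEN = re.compile(r"\{\{(\w+)\}\}")


def render_expr(expr: str, variables: dict, properties: dict) -> str:
    """Replace {{name}} tokens in expr with values from variables and properties."""
    # Assumption: on a name clash, variables override properties.
    values = {**properties, **variables}

    def replace(match: re.Match) -> str:
        name = match.group(1)
        # Assumption: a token with no known value is kept as written.
        return str(values.get(name, match.group(0)))

    return TOKEN.sub(replace, expr)


template = 'my_metric{label="{{myVar}}"}[{{pendingInterval}}]'
print(render_expr(template, {"myVar": "myVal"}, {"pendingInterval": "5m"}))
```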
Common¶
All policies can specify some fields to override/enrich the default behaviour.