Unplanned maintenance is a major issue for asset-intensive industries, because extended production downtime can seriously increase costs. According to a recent report from Senseye [1], up to $20 billion is lost annually due to unplanned repairs.
Predictive maintenance tackles this issue by using analytics and Machine Learning (ML) to estimate a machine's remaining lifespan and its likelihood of failure, so that maintenance can be scheduled before equipment fails, balancing the cost of anticipated maintenance against the risk of production downtime. This requires a new approach that leverages the Internet of Things (IoT) and the Industry 4.0 paradigm, where advances in artificial intelligence and data mining allow large amounts of data to be processed, providing more accurate and advanced predictive models [2] and improving forecast reliability [3].
Similarly, in the application and infrastructure domain, hardware and software components need to be maintained and monitored to avoid costly failures, downtime, and degradation of performance and quality of service/experience (QoS/QoE).
Predictive monitoring applies ML techniques to monitoring data, configuration data, application logs and error reports to provide useful insights into the current and future health of an application, potential performance degradation (e.g., response time, latency) or imminent violations of Service Level Agreements (SLAs), and the future load on applications and infrastructures. This new knowledge can support smart and timely decisions that save time and money and ensure a more reliable and resilient infrastructure.
Figure 1: Predictive forecasting graph
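As a minimal illustration of the predictive monitoring idea described above, the following Python sketch fits a straight line to recent latency samples and estimates how many sampling steps remain before a threshold would be crossed. It is not the ICOS implementation: the function names, the linear-trend assumption, and the sample values are all hypothetical and chosen only for illustration.

```python
from typing import Optional, Sequence, Tuple


def fit_linear_trend(values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares line through (0, v0), (1, v1), ...; returns (slope, intercept)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def steps_until_violation(
    values: Sequence[float], threshold: float, max_horizon: int
) -> Optional[int]:
    """Return the first future step at which the extrapolated trend exceeds
    the threshold, or None if that does not happen within max_horizon steps.

    Assumption: the metric keeps following the fitted linear trend.
    """
    slope, intercept = fit_linear_trend(values)
    last_index = len(values) - 1
    for h in range(1, max_horizon + 1):
        if slope * (last_index + h) + intercept > threshold:
            return h
    return None


# Hypothetical latency samples (ms) and a hypothetical SLA limit of 165 ms.
latency_ms = [100, 110, 120, 130, 140]
print(steps_until_violation(latency_ms, threshold=165, max_horizon=10))
```

A linear fit is the simplest possible model; it ignores cycles and seasonality, so in practice it would only serve as a baseline against which richer models are compared.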
In ICOS we tackle both challenges: predictive maintenance and predictive monitoring. On the one hand, predictive monitoring is of interest to the ICOS Software Platform, and we will investigate and implement mechanisms, tools and models that exploit past monitoring data to predict future behaviours. This will enable optimised orchestration of applications and infrastructures and support smart decisions, such as re-deploying and reconfiguring components, to maintain a given QoS/QoE.
On the other hand, the findings and outcomes of the predictive monitoring activity will also be used to provide tools for building predictive maintenance systems that support the project's use cases (such as the Agriculture Operational Robotic Platform and the Railway Structural Alert Monitoring System). The goal is to provide generic predictive maintenance tools that can be reused in a variety of scenarios, including ones beyond those described in the project.
Similar ML approaches can be useful in both cases, monitoring and maintenance, to identify trends and anomalies, forecast key features, and enable smart decision-making. The "time" dimension plays an important role and may affect the validity of a forecast. Indeed, a forecast is a prediction of the values that a variable could take, with different probabilities, at a certain time, and the value at one time constrains those at later times. In other words, the further ahead a forecast looks, the more uncertain it is, so time sensitivity, cycles, and seasonality heavily impact forecasts and need to be properly addressed.
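The growth of uncertainty with the forecast horizon can be made concrete with the naive forecasting method, where every future value is predicted to equal the last observation. Under the assumption that one-step errors are uncorrelated and roughly normally distributed, the prediction interval widens with the square root of the horizon. The sketch below is only an illustration under those assumptions; the function name and the sample series are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def naive_forecast_intervals(
    series: Sequence[float], horizon: int, z: float = 1.96
) -> List[Tuple[float, float]]:
    """Approximate prediction intervals for the naive forecast.

    The point forecast for every step ahead is the last observation.
    The interval half-width at step h is z * sigma * sqrt(h), where sigma
    is the standard deviation of the one-step naive forecast errors
    (the differences between consecutive observations).
    """
    if len(series) < 3:
        raise ValueError("need at least three observations")
    diffs = [b - a for a, b in zip(series, series[1:])]
    sigma = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    last = series[-1]
    return [
        (last - z * sigma * math.sqrt(h), last + z * sigma * math.sqrt(h))
        for h in range(1, horizon + 1)
    ]


# Hypothetical observations; each interval is wider than the one before it.
for h, (low, high) in enumerate(naive_forecast_intervals([10, 12, 11, 13, 12], 4), 1):
    print(f"h={h}: [{low:.2f}, {high:.2f}]")
```

With z = 1.96 the intervals correspond to roughly 95% coverage under the normality assumption; the interval four steps ahead is twice as wide as the interval one step ahead.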
We will explore multiple inference models, as well as the models most commonly used for time-series forecasting [4], such as exponential smoothing and Autoregressive Integrated Moving Average (ARIMA) models, to identify trends and seasonality in the data. We will also investigate Machine Learning Operations (MLOps) tools, such as Kubeflow [5], to speed up data collection, labelling, model training, and operation.
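To give a flavour of the exponential smoothing family mentioned above, here is a minimal pure-Python sketch of Holt's linear trend method (double exponential smoothing), which tracks a level and a trend. It does not model seasonality and is not ARIMA; the function name, the default smoothing parameters, and the initialisation from the first two observations are illustrative assumptions rather than choices made by the project.

```python
from typing import List, Sequence


def holt_forecast(
    series: Sequence[float], horizon: int, alpha: float = 0.5, beta: float = 0.5
) -> List[float]:
    """Forecast `horizon` steps ahead with Holt's linear trend method.

    alpha smooths the level and beta smooths the trend; both must lie in (0, 1].
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    if not (0 < alpha <= 1 and 0 < beta <= 1):
        raise ValueError("alpha and beta must be in (0, 1]")
    # Simple initialisation (an assumption): first value and first difference.
    level = series[0]
    trend = series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        # Blend the new observation with the previous one-step forecast.
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        # Blend the latest change in level with the previous trend estimate.
        trend = beta * (level - prev_level) + (1 - beta) * trend
    # Project the final level forward along the final trend.
    return [level + h * trend for h in range(1, horizon + 1)]


# A perfectly linear hypothetical series is extrapolated along the same line.
print(holt_forecast([1, 3, 5, 7, 9], horizon=2))
```

In practice the smoothing parameters would be fitted to the data rather than fixed, and a seasonal variant or an ARIMA model would be needed when the series shows recurring cycles.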
This project has received funding from the European Union’s HORIZON research and innovation programme under grant agreement No 101070177.