Closing the loop: from Observation to Action

Performance monitoring and observation are a requirement in the complex IT systems we build today. Exascale systems are digital factories operating with millions of cores and discrete components. Like any factory, these systems are instrumented and monitored. Performance observation faces three main challenges:

Operating at scale

The ADMIRE monitoring infrastructure uses Prometheus as …