Prometheus Alerts

The Prometheus Operator introduces an Alertmanager resource that sends alerts about the cluster.

To configure Prometheus to send alerts, see Setting up Alert Monitoring for Run:ai Using Alertmanager in Prometheus.

List of Alerts

Once Prometheus is configured to send alerts, you can receive any of the following alerts.

Alert Name
RunaiAgentClusterInfoPushRateLow
RunaiAgentPullRateLow
RunaiContainerMemoryUsageCritical
RunaiContainerMemoryUsageWarning
RunaiContainerRestarting
RunaiCpuUsageWarning
RunaiCriticalProblem
RunaiDaemonSetRolloutStuck
RunaiDaemonSetUnavailableOnNodes
RunaiDeploymentInsufficientReplicas
RunaiDeploymentNoAvailableReplicas
RunaiDeploymentUnavailableReplicas
RunaiProjectControllerReconcileFailure
RunaiStatefulSetInsufficientReplicas
RunaiStatefulSetNoAvailableReplicas
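
To check which of these alerts are currently present, one option is to query Alertmanager's HTTP API and filter by alert name. The following is a minimal Python sketch, not an official Run:ai tool. It assumes Alertmanager's v2 API is reachable at a base URL you supply (for example through a port-forward), and the function names `select_runai_alerts` and `fetch_alerts` are hypothetical helpers introduced only for illustration.

```python
import json
import urllib.request

# Alert names copied from the list above.
RUNAI_ALERTS = {
    "RunaiAgentClusterInfoPushRateLow",
    "RunaiAgentPullRateLow",
    "RunaiContainerMemoryUsageCritical",
    "RunaiContainerMemoryUsageWarning",
    "RunaiContainerRestarting",
    "RunaiCpuUsageWarning",
    "RunaiCriticalProblem",
    "RunaiDaemonSetRolloutStuck",
    "RunaiDaemonSetUnavailableOnNodes",
    "RunaiDeploymentInsufficientReplicas",
    "RunaiDeploymentNoAvailableReplicas",
    "RunaiDeploymentUnavailableReplicas",
    "RunaiProjectControllerReconcileFailure",
    "RunaiStatefulSetInsufficientReplicas",
    "RunaiStatefulSetNoAvailableReplicas",
}


def select_runai_alerts(alerts):
    """Keep only alerts whose `alertname` label appears in the list above."""
    return [
        alert
        for alert in alerts
        if alert.get("labels", {}).get("alertname") in RUNAI_ALERTS
    ]


def fetch_alerts(base_url):
    """Fetch current alerts from Alertmanager's v2 API.

    `base_url` is an assumption: pass whatever address your Alertmanager
    is exposed on, e.g. "http://localhost:9093" behind a port-forward.
    """
    with urllib.request.urlopen(f"{base_url}/api/v2/alerts", timeout=10) as resp:
        return json.load(resp)
```

With a reachable Alertmanager, `select_runai_alerts(fetch_alerts("http://localhost:9093"))` would return the subset of current alerts that belong to the list above. The address is a placeholder and should be replaced with the one used in your cluster.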