Runai Agent Pull Rate Low
Meaning
The runai-agent pod may be overloaded or slow to process data (possible in very large clusters), or the runai-agent pod in the runai namespace may not be functioning properly.
Impact
Possibly no information, or only partial information, from the control-plane is being synced to the cluster.
Severity
Critical
Diagnosis
Run:
kubectl get pod -n runai
Then check whether the runai-agent pod is running.
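To narrow the listing above to the agent alone, a small helper along these lines may help. It is only a sketch: it assumes the pod's name starts with runai-agent, and the function name agent_pod_status is made up for illustration.

```shell
# Print the name and phase of pods in the runai namespace whose name
# starts with "runai-agent" (assumed naming; adjust if yours differs).
agent_pod_status() {
  kubectl get pod -n runai --no-headers \
    -o custom-columns=NAME:.metadata.name,PHASE:.status.phase |
    awk '$1 ~ /^runai-agent/ { print $1, $2 }'
}
```

Calling agent_pod_status with no arguments prints one line per matching pod. No output, or a phase other than Running, suggests the problem lies with the pod itself; a Running phase alone does not prove the agent is healthy.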
Mitigation
Inspect the logs and details of the runai-agent pod and try to determine why it is not functioning properly.
One possible cause is a connectivity issue with the control-plane.
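One way to gather the logs and pod details mentioned above is sketched below. It assumes the agent pod can be found by the name prefix runai-agent; the helper name agent_debug and the 200-line log tail are illustrative choices, not taken from the product documentation.

```shell
# Collect pod details and recent logs for the agent.
# Assumption: the pod is identified by the name prefix "runai-agent".
agent_debug() {
  # "kubectl get pod -o name" prints entries such as pod/<name>.
  pod="$(kubectl get pod -n runai -o name | grep '/runai-agent' | head -n 1)"
  if [ -z "$pod" ]; then
    echo "no runai-agent pod found in namespace runai" >&2
    return 1
  fi
  # Events at the end of the describe output often explain restarts.
  kubectl describe -n runai "$pod"
  # Last 200 log lines; look for errors when reaching the control-plane.
  kubectl logs -n runai "$pod" --tail=200
}
```

Running agent_debug prints the describe output followed by the log tail. Repeated timeout or connection errors in the logs would be consistent with a connectivity issue with the control-plane.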
If the agent appears to be functioning properly but the cluster is very large and heavily loaded, the agent may simply need more time to process the data coming from the control-plane.
In that case, to stop the alert from firing, you can lower the threshold below which the alert fires.
The default is 0.05; try a lower value (for example, 0.045 or 0.04).
kubectl edit runaiconfig -n runai
Set the value under spec -> prometheus -> agentPullPushRateMinForAlert (if the field doesn't exist, add it).
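As a non-interactive alternative to the edit above, the same field could be set with a merge patch. This is a sketch under two assumptions that are not confirmed by this page: that the runaiconfig object is named runai, and that the field accepts a bare number. The function names threshold_patch and set_agent_alert_threshold are hypothetical.

```shell
# Build a JSON merge patch that sets the alert threshold to "$1".
# Assumption: the field takes a plain number such as 0.045.
threshold_patch() {
  printf '{"spec":{"prometheus":{"agentPullPushRateMinForAlert":%s}}}' "$1"
}

# Apply the patch. Assumption: the runaiconfig object is named "runai";
# verify with: kubectl get runaiconfig -n runai
set_agent_alert_threshold() {
  kubectl patch runaiconfig runai -n runai --type merge \
    -p "$(threshold_patch "$1")"
}
```

For example, set_agent_alert_threshold 0.045 would lower the threshold from the default 0.05. A merge patch creates the field if it is missing, which matches the note above about adding the value when it does not exist.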