Runai Agent Pull Rate Low

Meaning

The runai-agent pod in the runai namespace may be overloaded, may be slow to process data (possible in very large clusters), or may not be functioning properly.

Impact

Possible impact: no information, or only partial information, from the control-plane is synced to the cluster.

Severity

Critical

Diagnosis

Run:

kubectl get pod -n runai

Then check whether the runai-agent pod is running.
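
In a namespace with many pods, it can help to filter the listing down to agent pods that look unhealthy. The following is a minimal sketch, not an official Run:ai tool: the helper name unhealthy_agent_pods is hypothetical, and it assumes the default kubectl column order (NAME, READY, STATUS, ...) and that agent pod names start with runai-agent.

```shell
# Hypothetical helper: read `kubectl get pod --no-headers` output on stdin
# and print runai-agent pods that are not Running or not fully ready.
# Assumes the default column order: NAME READY STATUS RESTARTS AGE.
unhealthy_agent_pods() {
  awk '$1 ~ /^runai-agent/ {
         split($2, ready, "/")   # READY looks like "1/1"
         if ($3 != "Running" || ready[1] != ready[2]) print $1, $2, $3
       }'
}

# Usage against a live cluster:
#   kubectl get pod -n runai --no-headers | unhealthy_agent_pods
```

Empty output means every runai-agent pod in the listing is Running and ready; any printed line is a pod worth inspecting in the Mitigation step.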

Mitigation

Run:

kubectl describe deployment runai-agent -n runai
kubectl logs deployment/runai-agent -n runai

From the logs and the deployment description, try to determine why the runai-agent pod is not functioning properly.

For example, check whether there is a connectivity issue with the control-plane.
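
One way to look for connectivity problems is to scan the agent logs for common network and authentication error strings. The sketch below is only an illustration: the helper name connectivity_hints is hypothetical, and the patterns are generic guesses rather than messages the agent is known to emit, so adjust them to what the logs actually contain.

```shell
# Hypothetical helper: keep only log lines that hint at a connectivity
# or authentication problem. The patterns are generic assumptions.
connectivity_hints() {
  grep -iE 'connection refused|timed out|timeout|no such host|x509|unauthorized' || true
}

# Usage against a live cluster (last hour of agent logs):
#   kubectl logs deployment/runai-agent -n runai --since=1h | connectivity_hints
```

The trailing `|| true` keeps the helper from returning a failure status when no line matches, which is the healthy case.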

If the agent appears to be functioning properly but the cluster is very large and heavily loaded, it is expected that the agent takes time to process the data coming from the control-plane.

In that case, to stop the alert from firing, you can lower the threshold below which the alert fires.

The default is 0.05; try a lower value (for example, 0.045 or 0.04) by running:

kubectl edit runaiconfig -n runai

Set the value under spec -> prometheus -> agentPullPushRateMinForAlert (if the field does not exist, add it).
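
As a non-interactive alternative to kubectl edit, the same field can probably be set with a JSON merge patch. This is a sketch under assumptions: the helper name threshold_patch is hypothetical, and the runaiconfig object name runai used in the usage line is an assumption, so confirm it with kubectl get runaiconfig -n runai first.

```shell
# Hypothetical helper: print a JSON merge patch that sets
# spec.prometheus.agentPullPushRateMinForAlert to the given value.
threshold_patch() {
  printf '{"spec":{"prometheus":{"agentPullPushRateMinForAlert":%s}}}' "$1"
}

# Usage (assumes the runaiconfig object is named "runai"):
#   kubectl patch runaiconfig runai -n runai --type merge -p "$(threshold_patch 0.04)"
```

A merge patch creates the field if it is missing and leaves the rest of the spec untouched, which matches the "add it if it does not exist" instruction.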
