
Runai DaemonSet Rollout Stuck

Meaning

A Run:ai daemonset has 0 available pods on one or more of the nodes it should run on.

Impact

Fractional GPU workloads are not supported.

Severity

Critical

Diagnosis

Run:

kubectl get daemonset -n runai-backend

In the output, identify each daemonset that has no running pods on one or more nodes, for example one whose AVAILABLE count is lower than its DESIRED count.
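The identification step above can be sketched as a small filter over the table that kubectl prints. This is a minimal sketch, not part of the Run:ai tooling: the function name find_unavailable_daemonsets is hypothetical, and it assumes the default column layout of kubectl get daemonset for a single namespace (NAME, DESIRED, CURRENT, READY, UP-TO-DATE, AVAILABLE, ...).

```shell
# Hypothetical helper: print the names of daemonsets whose AVAILABLE
# count is below their DESIRED count.
# Reads the default `kubectl get daemonset -n <namespace>` table on stdin.
# Assumed columns: $1 NAME, $2 DESIRED, $6 AVAILABLE.
find_unavailable_daemonsets() {
  awk 'NR > 1 && $6 + 0 < $2 + 0 { print $1 }'
}

# Usage against a live cluster (namespace as in the command above):
# kubectl get daemonset -n runai-backend | find_unavailable_daemonsets
```

An empty result means every listed daemonset reports as many available pods as desired.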

Mitigation

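Run kubectl describe daemonset X -n runai on each affected daemonset, replacing X with the daemonset name, to find out why it cannot create pods. If the daemonset was listed under a different namespace, pass that namespace to -n instead.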
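As a rough sketch of that step, the describe call can be paired with a query for events that reference the daemonset, since these often show why pod creation is failing. The helper name inspect_daemonset and the KUBECTL override are assumptions made for illustration, and the namespace is passed as an argument rather than fixed.

```shell
# Hypothetical helper: collect details for one daemonset.
# Arguments: <namespace> <daemonset-name>
# KUBECTL may be overridden (e.g. KUBECTL=echo) to preview the commands
# without contacting a cluster.
inspect_daemonset() {
  ns="$1"
  name="$2"
  kubectl_bin="${KUBECTL:-kubectl}"

  # Spec, status counters, and recent events for the daemonset itself.
  "$kubectl_bin" describe daemonset "$name" -n "$ns"

  # Events whose involved object is this daemonset, e.g. failed pod creation.
  "$kubectl_bin" get events -n "$ns" \
    --field-selector "involvedObject.kind=DaemonSet,involvedObject.name=$name"
}

# Example (X is the daemonset name placeholder used above):
# inspect_daemonset runai X
```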

If you cannot correct the issue, contact Run:ai support.