Kube Node Readiness Flapping

KubeNodeReadinessFlapping

Meaning

The readiness status of a node has changed a few times in the last 15 minutes.

Workloads deployed in the cluster may see degraded performance; how much depends on the overall workload and on the type of node affected.

The notification details should name the node whose readiness is flapping. For example:

 - alertname = KubeNodeReadinessFlapping
...
 - node = node1.example.com
...

Inspect the node object, where $NODE is the node name from the alert:

$ kubectl get node "$NODE" -o yaml

The status.conditions section of the output should indicate why the node's readiness keeps changing.
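The full YAML can be long, so it may help to narrow the view to the conditions and to recent node events. The following is a minimal sketch, not part of the original runbook: the function names node_conditions and node_events are hypothetical helpers, and the sketch assumes kubectl is already configured for the affected cluster.

```shell
# Hypothetical helper: print one line per node condition
# (type, status, last transition time, reason), tab-separated.
node_conditions() {
  kubectl get node "$1" -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.lastTransitionTime}{"\t"}{.reason}{"\n"}{end}'
}

# Hypothetical helper: list events recorded against the node,
# oldest first, to see how often readiness has changed.
node_events() {
  kubectl get events --all-namespaces \
    --field-selector "involvedObject.kind=Node,involvedObject.name=$1" \
    --sort-by=.lastTimestamp
}
```

With the functions loaded in the current shell, run node_conditions "$NODE" followed by node_events "$NODE". A Ready condition with a very recent last transition time, together with repeated readiness events, is consistent with flapping.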

Common failure scenarios:

If the node is under maintenance, make sure it has been cordoned and drained.

In other cases, ensure storage and networking redundancy where applicable.
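For the maintenance case, cordoning and draining could be sketched as below. The function names start_node_maintenance and end_node_maintenance are hypothetical, and the drain flags shown are an assumption about what a typical cluster needs; review them against your own workloads before use, since --delete-emptydir-data discards data held in emptyDir volumes.

```shell
# Hypothetical helper: take a node out of service before planned maintenance.
# kubectl drain also cordons the node; the explicit cordon just makes the step visible.
# --ignore-daemonsets: DaemonSet-managed pods are skipped, since they cannot move elsewhere.
# --delete-emptydir-data: permits evicting pods that use emptyDir volumes (their data is lost).
start_node_maintenance() {
  kubectl cordon "$1" &&
    kubectl drain "$1" --ignore-daemonsets --delete-emptydir-data
}

# Hypothetical helper: make the node schedulable again after maintenance.
end_node_maintenance() {
  kubectl uncordon "$1"
}
```

With the functions loaded, run start_node_maintenance "$NODE" before the work begins and end_node_maintenance "$NODE" once the node is healthy again.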