Kube CPU Overcommit

KubeCPUOvercommit

Meaning

The cluster has overcommitted CPU resource requests for Pods and cannot tolerate a node failure.

Full context

The sum of CPU requests for Pods exceeds the capacity the cluster would have left after losing a node. If a node fails, some Pods will not fit on the remaining nodes.
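The condition described above can be sketched as a simple aggregate check. This is a minimal illustration, not the alert's actual expression: the function name `tolerates_node_failure` is hypothetical, and it assumes the worst case is the loss of the node with the most allocatable CPU.

```python
def tolerates_node_failure(cpu_requests, node_allocatable):
    """Return True if all CPU requests still fit, in aggregate, after one node is lost.

    cpu_requests: CPU requests of all Pods, in cores.
    node_allocatable: allocatable CPU of each node, in cores (must not be empty).
    """
    total_requests = sum(cpu_requests)
    # Assumption: the worst case is losing the node with the most allocatable CPU.
    remaining = sum(node_allocatable) - max(node_allocatable)
    return total_requests <= remaining


# Three 4-core nodes with 9 cores requested: fits today (9 <= 12),
# but not once a node is gone (9 > 8).
print(tolerates_node_failure([2.0, 3.0, 4.0], [4.0, 4.0, 4.0]))

# With 8 cores requested, the two remaining nodes are enough in aggregate.
print(tolerates_node_failure([2.0, 2.0, 4.0], [4.0, 4.0, 4.0]))
```

The check compares totals only. Individual Pods may still fail to fit on any single node even when the aggregate numbers look fine.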

The cluster cannot tolerate a node failure. If a node fails, some Pods will stay in the Pending state because the remaining nodes do not have enough unrequested CPU to schedule them.
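To make the impact concrete, the sketch below simulates one node failing and tries to place its Pods on the remaining nodes. It uses a simplified first-fit rule based on CPU requests only; the real scheduler considers much more (memory, affinity, taints, priorities). The function `pending_after_failure` and the node names are hypothetical.

```python
def pending_after_failure(nodes, failed):
    """Return the CPU requests of Pods that cannot be rescheduled.

    nodes: dict mapping node name -> (allocatable_cpu, [pod CPU requests]).
    failed: name of the node that is lost.
    """
    # Unrequested CPU on every surviving node.
    free = {
        name: allocatable - sum(pods)
        for name, (allocatable, pods) in nodes.items()
        if name != failed
    }
    pending = []
    # Place the largest displaced Pods first (simplified first-fit).
    for request in sorted(nodes[failed][1], reverse=True):
        target = next((name for name, cpu in free.items() if cpu >= request), None)
        if target is None:
            pending.append(request)
        else:
            free[target] -= request
    return pending


cluster = {
    "node-a": (4, [2, 1]),
    "node-b": (4, [3]),
    "node-c": (4, [2, 1]),
}
print(pending_after_failure(cluster, "node-c"))
```

In this example each surviving node has one free core, so the 1-core Pod from `node-c` is rescheduled while the 2-core Pod stays pending.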

Add more nodes to the cluster. It is usually better to have many smaller nodes than a few large ones.
Add node pools with different instance types so that the cluster does not depend on the availability of a single instance type in the cloud.
Use Pod priorities to keep important services from losing performance; see pod priority and preemption.
Fine-tune the settings for special Pods used with cluster-autoscaler.
Prepare performance tests for the expected workload and plan cluster capacity accordingly.
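The advice about node size and capacity planning can be illustrated with a little arithmetic. The sketch below assumes equal-sized nodes and looks only at aggregate CPU requests; the helper names `nodes_needed` and `spare_fraction` are hypothetical, and per-node system overhead is ignored.

```python
import math


def nodes_needed(total_requests, node_cpu):
    """Smallest number of equal-sized nodes that still fits all requests after one node fails."""
    # ceil(total / node_cpu) nodes hold the requests; one extra node covers a single failure.
    return math.ceil(total_requests / node_cpu) + 1


def spare_fraction(total_requests, node_cpu):
    """Fraction of the cluster's allocatable CPU that is left unrequested as failure headroom."""
    capacity = nodes_needed(total_requests, node_cpu) * node_cpu
    return (capacity - total_requests) / capacity


# Compare large and small nodes for a workload requesting 64 cores in total.
for node_cpu in (32, 8):
    print(node_cpu, nodes_needed(64, node_cpu), round(spare_fraction(64, node_cpu), 2))
```

For 64 requested cores, 32-core nodes need 3 nodes with about a third of the capacity held in reserve, while 8-core nodes need 9 nodes with roughly a ninth in reserve. Under these assumptions, smaller nodes make single-node fault tolerance cheaper, although very small nodes carry proportionally more system overhead.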