NodeFileDescriptorLimit #
Meaning #
This alert is triggered when a node’s kernel is found to be running out of
available file descriptors – a warning
level alert at greater than 70% usage
and a critical
level alert at greater than 90% usage.
Impact #
Applications on the node may no longer be able to open and operate on files. This is likely to have severe consequences for anything scheduled on this node.
Diagnosis #
You can open a shell on the node and use the standard Linux utilities to diagnose the issue:
$ NODE_NAME='<value of instance label from alert>'
$ oc debug "node/$NODE_NAME"
# sysctl -a | grep 'fs.file-'
fs.file-max = 1597016
fs.file-nr = 7104 0 1597016
# lsof -n
Mitigation #
Reduce the number of files opened simultaneously by either adjusting application configuration or by moving some applications to other nodes.