Network interface is reporting many receive errors.
Applications on the node may no longer be able to communicate with other services. Network-attached storage may suffer performance issues or even data loss.
Investigate networking issues on the node and on connected hardware. Check physical cables, networking firewall rules, and so on.
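As a starting point for the investigation, the per-interface receive counters can be read directly on the node. The following is a minimal sketch, assuming a Linux node that exposes the standard `/proc/net/dev` table. The helper names `parse_net_dev`, `read_net_dev` and `rx_error_ratio` are hypothetical and exist only for this illustration.

```python
def parse_net_dev(text):
    """Parse the contents of /proc/net/dev into per-interface receive counters."""
    stats = {}
    # The first two lines of /proc/net/dev are column headers.
    for line in text.splitlines()[2:]:
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue  # unexpected layout, skip rather than guess
        # Receive columns: bytes packets errs drop fifo frame compressed multicast
        stats[name.strip()] = {
            "rx_packets": int(fields[1]),
            "rx_errs": int(fields[2]),
            "rx_drop": int(fields[3]),
        }
    return stats


def read_net_dev(path="/proc/net/dev"):
    """Read and parse the counters from the running node."""
    with open(path) as f:
        return parse_net_dev(f.read())


def rx_error_ratio(counters):
    """Receive errors as a fraction of received packets (0.0 if none received)."""
    if counters["rx_packets"] == 0:
        return 0.0
    return counters["rx_errs"] / counters["rx_packets"]
```

Calling `read_net_dev()` twice a few seconds apart and comparing `rx_errs` shows whether the errors are still increasing and which interface is affected. The counters are cumulative since the interface came up, so a single reading only gives the total.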
The range of possible mitigations is wide. Some suggestions:
- Ensure some node capacity is left unallocated (cpu/memory) for handling networking.
- Increase TX queue length.
- Spread services to other nodes/pods.
- Replace physical cables, change ports.
- Look into introducing Quality of Service (QoS) policies or a different TCP congestion avoidance algorithm.
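
Before changing the TX queue length, it helps to know the current value for each interface. The sketch below assumes a Linux node where sysfs exposes `/sys/class/net/<interface>/tx_queue_len`. The function name `list_tx_queue_lens` and the `sysfs_root` parameter are hypothetical, introduced only for this example.

```python
from pathlib import Path


def list_tx_queue_lens(sysfs_root="/sys/class/net"):
    """Return {interface name: current TX queue length} read from sysfs."""
    result = {}
    for entry in sorted(Path(sysfs_root).iterdir()):
        queue_file = entry / "tx_queue_len"
        # Skip entries that are not network interfaces.
        if queue_file.is_file():
            result[entry.name] = int(queue_file.read_text().strip())
    return result
```

The value itself is typically changed with `ip link set dev <interface> txqueuelen <length>`. What length is appropriate depends on the workload and hardware, so treat any new value as something to verify against the error counters rather than a fixed recommendation.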