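Background
Our production Kubernetes cluster uses flannel as its CNI plugin, and the apiserver restarts from time to time.
Troubleshooting
Check the apiserver logs from before the restart: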
kubectl logs -f -n kube-system {apiserver-pod-name}
The logs show a large number of timeouts just before the apiserver shuts down.
Check syslog:
It contains many occurrences of neighbour: arp_cache: neighbor table overflow!
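To get a rough idea of how often the overflow occurs, the matching log lines can be counted. This is a minimal sketch: the helper name count_overflow is hypothetical, and the log path in the usage comment is an assumption that varies by distribution.

```shell
# Count lines that report a neighbour table overflow.
# Reads log text from stdin, so it works with any log source.
count_overflow() {
    # grep -c prints 0 and exits non-zero when nothing matches;
    # "|| true" keeps the exit status clean in that case.
    grep -c 'neighbor table overflow' || true
}

# Example usage (the syslog path is an assumption):
#   count_overflow < /var/log/syslog
#   dmesg | count_overflow
```

Reading from stdin keeps the helper independent of where a given node stores its kernel messages.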
Two important pieces of information from https://man7.org/linux/man-pages/man7/arp.7.html:
Entries which are marked as permanent are never deleted by the
garbage-collector
gc_thresh1 (since Linux 2.2)
The minimum number of entries to keep in the ARP cache.
The garbage collector will not run if there are fewer than
this number of entries in the cache. Defaults to 128.
gc_thresh2 (since Linux 2.2)
The soft maximum number of entries to keep in the ARP
cache. The garbage collector will allow the number of
entries to exceed this for 5 seconds before collection
will be performed. Defaults to 512.
gc_thresh3 (since Linux 2.2)
The hard maximum number of entries to keep in the ARP
cache. The garbage collector will always run if there are
more than this number of entries in the cache. Defaults
to 1024
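When the number of ARP cache entries stays above gc_thresh2 for more than 5 seconds, garbage collection is triggered.
When the number of entries exceeds gc_thresh3, garbage collection runs as well; if the count is still above gc_thresh3 afterwards, the kernel reports arp_cache: neighbor table overflow!
However, flannel automatically installs its ARP entries as permanent, so the garbage collector cannot remove them. Once the table grows past gc_thresh3, no new ARP cache entries can be added.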
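The threshold rules quoted from the man page can be sketched as a small shell helper that classifies a table size. The function name neigh_state and its output strings are hypothetical; the commands in the usage comment assume iproute2 and sysctl are installed on the node.

```shell
# Classify a neighbour-table size against the three GC thresholds.
# Usage: neigh_state ENTRIES THRESH1 THRESH2 THRESH3
neigh_state() {
    entries=$1; t1=$2; t2=$3; t3=$4
    if [ "$entries" -lt "$t1" ]; then
        echo "below gc_thresh1"      # GC does not run
    elif [ "$entries" -le "$t2" ]; then
        echo "within soft limit"     # no pressure yet
    elif [ "$entries" -le "$t3" ]; then
        echo "above gc_thresh2"      # GC runs after 5 seconds
    else
        echo "above gc_thresh3"      # GC always runs; overflow if nothing can be freed
    fi
}

# Example usage on a live node:
#   total=$(ip -4 neigh show | wc -l)
#   permanent=$(ip -4 neigh show nud permanent | wc -l)
#   echo "total=$total permanent=$permanent"
#   neigh_state "$total" \
#       "$(sysctl -n net.ipv4.neigh.default.gc_thresh1)" \
#       "$(sysctl -n net.ipv4.neigh.default.gc_thresh2)" \
#       "$(sysctl -n net.ipv4.neigh.default.gc_thresh3)"
```

If the permanent count alone is close to gc_thresh3, garbage collection has almost nothing it can reclaim, which matches the overflow described above.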
Solution
Raise the ARP cache GC thresholds by adding lines to /etc/sysctl.conf, for example:
net.ipv4.neigh.default.gc_thresh1 = 8192
net.ipv4.neigh.default.gc_thresh2 = 16384
net.ipv4.neigh.default.gc_thresh3 = 32768
Then load the new settings:
sysctl -p
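After loading, it may be worth confirming that the values took effect. The sketch below parses "key = value" lines, which is the format used both by sysctl output and by /etc/sysctl.conf. The helper name check_gc_thresholds is hypothetical, and the ascending-order check is my own sanity check, not something the source states the kernel enforces.

```shell
# Read "key = value" lines from stdin and confirm that
# gc_thresh1 < gc_thresh2 < gc_thresh3.
check_gc_thresholds() {
    awk -F'[ \t]*=[ \t]*' '
        $1 ~ /gc_thresh1$/ { t1 = $2 + 0 }
        $1 ~ /gc_thresh2$/ { t2 = $2 + 0 }
        $1 ~ /gc_thresh3$/ { t3 = $2 + 0 }
        END {
            if (t1 > 0 && t1 < t2 && t2 < t3) {
                print "ok " t1 " " t2 " " t3
            } else {
                print "bad " t1 " " t2 " " t3
                exit 1
            }
        }'
}

# Example usage on a live node:
#   sysctl net.ipv4.neigh.default.gc_thresh1 \
#          net.ipv4.neigh.default.gc_thresh2 \
#          net.ipv4.neigh.default.gc_thresh3 | check_gc_thresholds
```

The same helper can be pointed at /etc/sysctl.conf before running sysctl -p to catch a mistyped value early.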