arp_cache: neighbor table overfl

Author: wwq2020 | Published: 2023-11-29 11:24

    Background

    The production k8s cluster uses flannel as its CNI, and the apiserver restarts from time to time.

    Investigation

    Check the apiserver logs from before the restart

    kubectl logs -f -n kube-system {apiserver-pod-name}
    

    The logs show a large number of timeouts just before the shutdown

    Check syslog
    It contains many occurrences of neighbour: arp_cache: neighbor table overflow!
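
    A minimal sketch of how the message could be searched for. The helper name count_overflow is hypothetical, and the log path /var/log/syslog is an assumption (Debian/Ubuntu style); on other systems the kernel log may be read with dmesg or journalctl -k instead.

```shell
# count_overflow: hypothetical helper that counts the lines on stdin
# reporting a neighbour table overflow. grep -c exits non-zero when
# nothing matches, so "|| true" keeps the helper's exit status at 0.
count_overflow() {
    grep -c 'neighbor table overflow' || true
}

# Usage (the log path is an assumption; adjust for your distribution):
#   count_overflow < /var/log/syslog
#   dmesg | count_overflow
```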

    Two important pieces of information from https://man7.org/linux/man-pages/man7/arp.7.html

           Entries which are marked as permanent are never deleted by the
           garbage-collector
    
           gc_thresh1 (since Linux 2.2)
                  The minimum number of entries to keep in the ARP cache.
                  The garbage collector will not run if there are fewer than
                  this number of entries in the cache.  Defaults to 128.
    
           gc_thresh2 (since Linux 2.2)
                  The soft maximum number of entries to keep in the ARP
                  cache.  The garbage collector will allow the number of
                  entries to exceed this for 5 seconds before collection
                  will be performed.  Defaults to 512.
    
           gc_thresh3 (since Linux 2.2)
                  The hard maximum number of entries to keep in the ARP
                  cache.  The garbage collector will always run if there are
                  more than this number of entries in the cache.  Defaults
                  to 1024.
    

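    When the number of ARP cache entries stays above gc_thresh2 for more than 5 seconds, GC is triggered
    When the number of ARP cache entries exceeds gc_thresh3, GC is triggered; if the count is still above gc_thresh3 after GC, the kernel reports arp_cache: neighbor table overflow!
    However, flannel automatically configures its ARP entries as permanent, so they cannot be garbage-collected; once the count exceeds gc_thresh3, no new ARP cache entries can be added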
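
    The situation described here can be checked roughly as follows. This is a sketch, assuming the iproute2 ip tool is installed and that each line of ip -4 neigh show ends with the entry's state (for example PERMANENT); the helper name count_state is hypothetical.

```shell
# count_state STATE: hypothetical helper; counts the lines on stdin whose
# last field equals STATE (the neighbour state in `ip neigh show` output).
count_state() {
    awk -v s="$1" '$NF == s { n++ } END { print n + 0 }'
}

# Usage on a node (assumes iproute2 and a Linux /proc filesystem):
#   total=$(ip -4 neigh show | wc -l)
#   permanent=$(ip -4 neigh show | count_state PERMANENT)
#   limit=$(cat /proc/sys/net/ipv4/neigh/default/gc_thresh3)
#   echo "total=$total permanent=$permanent gc_thresh3=$limit"
```

    A permanent count close to gc_thresh3 would be consistent with the explanation given here, since GC cannot reclaim those entries.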

    Solution

    Raise the arp_cache GC thresholds by adding entries to /etc/sysctl.conf, for example

    net.ipv4.neigh.default.gc_thresh1 = 8192
    net.ipv4.neigh.default.gc_thresh2 = 16384
    net.ipv4.neigh.default.gc_thresh3 = 32768
    

    Then load the new settings

    sysctl -p
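
    To confirm the new values are active, the thresholds can be read back. A small sketch: the helper name check_order is hypothetical, and the strictly increasing ordering it checks is only a sanity check that matches the example values, not a documented kernel requirement.

```shell
# check_order T1 T2 T3: hypothetical helper; succeeds only when the three
# thresholds are strictly increasing (gc_thresh1 < gc_thresh2 < gc_thresh3).
check_order() {
    [ "$1" -lt "$2" ] && [ "$2" -lt "$3" ]
}

# Usage on a node (reads the live values from /proc):
#   d=/proc/sys/net/ipv4/neigh/default
#   check_order "$(cat $d/gc_thresh1)" "$(cat $d/gc_thresh2)" "$(cat $d/gc_thresh3)" \
#       && echo "thresholds look consistent"
#   sysctl net.ipv4.neigh.default.gc_thresh3   # prints the current value
```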
    
