K8S Performance Tuning - OS sysctl Tuning

Author: 东风微鸣 | Published: 2023-03-11 11:28

    Preface

    This is the first article in the K8S performance tuning series: best practices for OS sysctl performance tuning parameters.

    Parameter Overview

    The sysctl tuning parameters at a glance:

    # Kubernetes Settings
    vm.max_map_count = 262144
    kernel.softlockup_panic = 1
    kernel.softlockup_all_cpu_backtrace = 1
    net.ipv4.ip_local_reserved_ports = 30000-32767
    
    # Increase the number of connections
    net.core.somaxconn = 32768
    
    # Maximum Socket Receive Buffer
    net.core.rmem_max = 16777216
    
    # Maximum Socket Send Buffer
    net.core.wmem_max = 16777216
    
    # Per-socket TCP send/receive buffer sizes in bytes: min, default, max
    net.ipv4.tcp_wmem = 4096 87380 16777216
    net.ipv4.tcp_rmem = 4096 87380 16777216
    
    # Increase the number of outstanding syn requests allowed
    net.ipv4.tcp_max_syn_backlog = 8096
    
    
    # For persistent HTTP connections
    net.ipv4.tcp_slow_start_after_idle = 0
    
    # Allow to reuse TIME_WAIT sockets for new connections
    # when it is safe from protocol viewpoint
    net.ipv4.tcp_tw_reuse = 1
    
    # Max number of packets that can be queued on interface input
    # If kernel is receiving packets faster than can be processed
    # this queue increases
    net.core.netdev_max_backlog = 16384
    
    # Increase size of file handles and inode cache
    fs.file-max = 2097152
    
    # Max number of inotify instances and watches for a user
    # Since dockerd runs as a single user, the default instances value of 128 per user is too low
    # e.g. uses of inotify: nginx ingress controller, kubectl logs -f
    fs.inotify.max_user_instances = 8192
    fs.inotify.max_user_watches = 524288
    
    # Additional sysctl flags that kubelet expects
    vm.overcommit_memory = 1
    kernel.panic = 10
    kernel.panic_on_oops = 1
    
    # Prevent docker from changing iptables: https://github.com/kubernetes/kubernetes/issues/40182
    net.ipv4.ip_forward=1
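
Once settings like these are applied, it helps to confirm that the running kernel actually holds the intended values. The following is a minimal sketch, not part of the original article: it parses sysctl.conf-style text and compares each key with the file of the same name under /proc/sys. The function names are hypothetical, and the key-to-path mapping assumes no key component contains a dot of its own (true for every parameter listed above, but not for interface names such as VLAN sub-interfaces).

```python
from pathlib import Path
from typing import Dict, Optional, Tuple


def parse_sysctl_conf(text: str) -> Dict[str, str]:
    """Parse sysctl.conf-style text into {key: value}, skipping comments and blanks."""
    settings: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a "key = value" line
        # Collapse whitespace so "4096\t87380" and "4096 87380" compare equal.
        settings[key.strip()] = " ".join(value.split())
    return settings


def proc_path(key: str, root: str = "/proc/sys") -> Path:
    """Map a dotted sysctl key to its file under /proc/sys (assumes dot-free components)."""
    return Path(root, *key.split("."))


def diff_against_live(
    wanted: Dict[str, str], root: str = "/proc/sys"
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {key: (wanted, live)} for keys whose live value differs.

    live is None when the file cannot be read (missing key or no permission).
    """
    mismatches: Dict[str, Tuple[str, Optional[str]]] = {}
    for key, want in wanted.items():
        try:
            live: Optional[str] = " ".join(proc_path(key, root).read_text().split())
        except OSError:
            live = None
        if live != want:
            mismatches[key] = (want, live)
    return mismatches
```

A typical use would be to read the deployed configuration file (its path depends on the distribution), pass the text to `parse_sysctl_conf`, and print whatever `diff_against_live` returns; an empty result means every listed key matches.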
    

    On AWS, additionally add the following:

    # AWS settings
    # Issue #23395
    net.ipv4.neigh.default.gc_thresh1=0
    

    If IPv6 is enabled, additionally add the following:

    # Enable IPv6 forwarding for network plugins that don't do it themselves
    net.ipv6.conf.all.forwarding=1
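
The base settings plus the two optional fragments can be assembled into one file by whatever provisioning tool is in use. As a rough sketch only (the function name and the `on_aws` / `ipv6` flags are hypothetical, not from the article), the conditional logic looks like this:

```python
# Optional fragments, copied from the settings listed in this article.
AWS_FRAGMENT = (
    "# AWS settings\n"
    "net.ipv4.neigh.default.gc_thresh1=0\n"
)
IPV6_FRAGMENT = (
    "# Enable IPv6 forwarding for network plugins that don't do it themselves\n"
    "net.ipv6.conf.all.forwarding=1\n"
)


def build_sysctl_conf(base: str, *, on_aws: bool = False, ipv6: bool = False) -> str:
    """Return the base settings followed by the fragments that apply to this node."""
    parts = [base.rstrip("\n") + "\n"]
    if on_aws:
        parts.append(AWS_FRAGMENT)
    if ipv6:
        parts.append(IPV6_FRAGMENT)
    # Each part ends with a newline, so joining on "\n" leaves one blank line between parts.
    return "\n".join(parts)
```

Keeping the fragments separate from the base text means a node that is neither on AWS nor IPv6-enabled gets exactly the base settings and nothing else.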
    

    Parameter Explanations

    | Category | Kernel parameter | Description | Reference |
    | --- | --- | --- | --- |
    | Kubernetes | vm.max_map_count = 262144 | Limits the number of VMAs (virtual memory areas) a process may own. A larger value is very useful for Elasticsearch, MongoDB, and other mmap users. | ES Configuration |
    | Kubernetes | kernel.softlockup_panic = 1 | Panics the kernel when a soft lockup is detected; used to deal with kernel soft-lockup bugs seen with K8S. | root cause kernel soft lockups · Issue #37853 · kubernetes/kubernetes (github.com) |
    | Kubernetes | kernel.softlockup_all_cpu_backtrace = 1 | Dumps a backtrace of all CPUs when a soft lockup is detected; used to deal with kernel soft-lockup bugs seen with K8S. | root cause kernel soft lockups · Issue #37853 · kubernetes/kubernetes (github.com) |
    | Kubernetes | net.ipv4.ip_local_reserved_ports = 30000-32767 | The default K8S NodePort range. Reserving it keeps the kernel from handing these ports out as ephemeral local ports. | service-node-port-range and ip_local_port_range collision · Issue #6342 · kubernetes/kops (github.com) |
    | Network | net.core.somaxconn = 32768 | Upper limit of the socket listen() backlog. The backlog is the socket's listen queue: a request that has not yet been handled or established waits there. Raising it increases the number of pending connections allowed. | Image: We should tweak our sysctls · Issue #261 · kubernetes-retired/kube-deploy (github.com) |
    | Network | net.core.rmem_max = 16777216 | Maximum socket receive buffer size, in bytes. | Image: We should tweak our sysctls · Issue #261 · kubernetes-retired/kube-deploy (github.com) |
    | Network | net.core.wmem_max = 16777216 | Maximum socket send buffer size, in bytes. | Image: We should tweak our sysctls · Issue #261 · kubernetes-retired/kube-deploy (github.com) |
    | Network | net.ipv4.tcp_wmem = 4096 87380 16777216<br />net.ipv4.tcp_rmem = 4096 87380 16777216 | Per-socket TCP send and receive buffer sizes (min, default, max, in bytes); raises the maximum buffer space that can be allocated. | Image: We should tweak our sysctls · Issue #261 · kubernetes-retired/kube-deploy (github.com) |
    | Network | net.ipv4.tcp_max_syn_backlog = 8096 | Length of the queue of connections (SYN requests) that have not yet been acknowledged by the client; 1024 by default on many systems. Raises the number of outstanding SYN requests allowed. | Image: We should tweak our sysctls · Issue #261 · kubernetes-retired/kube-deploy (github.com) |
    | Network | net.ipv4.tcp_slow_start_after_idle = 0 | For persistent HTTP connections: keeps the congestion window from being reset after a connection has been idle. | Image: We should tweak our sysctls · Issue #261 · kubernetes-retired/kube-deploy (github.com) |
    | Network | net.ipv4.tcp_tw_reuse = 1 | Allows sockets in TIME_WAIT state to be reused for new TCP connections when it is safe from a protocol viewpoint; 0 means disabled (the default on older kernels). | Image: We should tweak our sysctls · Issue #261 · kubernetes-retired/kube-deploy (github.com) |
    | Network | net.core.netdev_max_backlog = 16384 | Maximum length of the queue that holds packets when the NIC receives them faster than the kernel can process them; the queue grows in that situation. | Image: We should tweak our sysctls · Issue #261 · kubernetes-retired/kube-deploy (github.com) |
    | File system | fs.file-max = 2097152 | Maximum number of file handles allowed system-wide, that is, how many files can be open on the Linux system. Increases the size of the file handle and inode cache. | Image: We should tweak our sysctls · Issue #261 · kubernetes-retired/kube-deploy (github.com) |
    | File system | fs.inotify.max_user_instances = 8192<br />fs.inotify.max_user_watches = 524288 | Maximum number of inotify instances and watches per user. Since dockerd runs as a single user, the default of 128 instances per user is too low. Examples of inotify users: nginx ingress controller, kubectl logs -f. | Image: We should tweak our sysctls · Issue #261 · kubernetes-retired/kube-deploy (github.com) |
    | kubelet | vm.overcommit_memory = 1 | Memory allocation policy. 1 means the kernel always allows allocations, regardless of the current memory state. | Image: We should tweak our sysctls · Issue #261 · kubernetes-retired/kube-deploy (github.com) |
    | kubelet | kernel.panic = 10 | Reboots automatically after a panic, waiting 10 seconds. | Image: We should tweak our sysctls · Issue #261 · kubernetes-retired/kube-deploy (github.com) |
    | kubelet | kernel.panic_on_oops = 1 | Calls panic() when an Oops occurs. | Image: We should tweak our sysctls · Issue #261 · kubernetes-retired/kube-deploy (github.com) |
    | Network | net.ipv4.ip_forward=1 | Enables IP forwarding; also prevents docker from changing iptables. | Upgrading docker 1.13 on nodes causes outbound container traffic to stop working · Issue #40182 · kubernetes/kubernetes (github.com) |
    | Network | net.ipv4.neigh.default.gc_thresh1=0 | Fixes the `arp_cache: neighbor table overflow!` error on AWS. | arp_cache: neighbor table overflow! · Issue #4533 · kubernetes/kops (github.com) |
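
The reason for reserving 30000-32767 is easiest to see with a small calculation. The sketch below is an illustration under stated assumptions, not the author's tooling: the function names are hypothetical, and the port ranges in the usage note are example values. It lists the ports that fall in both the ephemeral local port range and the NodePort range without being covered by `net.ipv4.ip_local_reserved_ports`.

```python
from typing import List, Tuple

PortRange = Tuple[int, int]  # inclusive (low, high)


def parse_port_ranges(spec: str) -> List[PortRange]:
    """Parse a comma-separated list such as '30000-32767,8080' into inclusive ranges."""
    ranges: List[PortRange] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        low, sep, high = part.partition("-")
        ranges.append((int(low), int(high) if sep else int(low)))
    return ranges


def unreserved_overlap(
    local_range: PortRange, nodeport_range: PortRange, reserved: List[PortRange]
) -> List[int]:
    """Ports in both the ephemeral range and the NodePort range that are not reserved.

    Any port returned here could be picked by the kernel for an outgoing
    connection and thereby collide with a NodePort service.
    """
    low = max(local_range[0], nodeport_range[0])
    high = min(local_range[1], nodeport_range[1])
    return [
        port
        for port in range(low, high + 1)
        if not any(a <= port <= b for a, b in reserved)
    ]
```

For example, if a host's `net.ipv4.ip_local_port_range` had been widened to 1024-65535 (an illustrative value), `unreserved_overlap((1024, 65535), (30000, 32767), [])` would list every NodePort as at risk, while passing `parse_port_ranges("30000-32767")` as the reserved list returns an empty list.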

    EOF

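    Among any three people walking together, one can be my teacher; knowledge shared belongs to all. This article was written by the 东风微鸣 tech blog, EWhisper.cn.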
