Container Kernel Parameters

Author: 陈sir的知识图谱 | Published: 2020-08-14 10:15

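    Container security: kernel parameters
    Containers share the host's kernel, so changing a kernel parameter may affect the host and other containers. An unprivileged container cannot modify most kernel parameters; the ones it can modify mainly belong to the IPC namespace and the net namespace. A parameter must satisfy three conditions to be modifiable: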

    • It is on the docker whitelist
    • It is namespaced
    • It is visible inside the container
    1. The docker whitelist
      IPC namespace: kernel.msgmax kernel.msgmnb kernel.msgmni kernel.sem kernel.shmall kernel.shmmax kernel.shmmni kernel.shm_rmid_forced fs.mqueue.msg_default fs.mqueue.msg_max fs.mqueue.msgsize_default fs.mqueue.msgsize_max fs.mqueue.queues_max
      Net namespace: net.*
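    The whitelist rule above can be sketched as a small shell check. This is only an illustration built from the names listed in this article; is_whitelisted is a hypothetical helper name, not a docker command, and docker's own validation may differ in detail.

```shell
#!/bin/sh
# Sketch: report whether a sysctl key matches the whitelist quoted above.
# is_whitelisted is a hypothetical helper, not part of docker.
is_whitelisted() {
  case "$1" in
    # IPC namespace entries, copied from the list above
    kernel.msgmax | kernel.msgmnb | kernel.msgmni | kernel.sem) return 0 ;;
    kernel.shmall | kernel.shmmax | kernel.shmmni | kernel.shm_rmid_forced) return 0 ;;
    fs.mqueue.msg_default | fs.mqueue.msg_max | fs.mqueue.queues_max) return 0 ;;
    fs.mqueue.msgsize_default | fs.mqueue.msgsize_max) return 0 ;;
    # Net namespace: every key under net.
    net.*) return 0 ;;
    # Anything else is rejected
    *) return 1 ;;
  esac
}

# Usage: classify a few example keys
for key in net.core.somaxconn kernel.shmmax vm.swappiness; do
  if is_whitelisted "$key"; then
    echo "$key: whitelisted"
  else
    echo "$key: not whitelisted"
  fi
done
```

    Passing the whitelist is only the first of the three conditions: a key such as net.core.busy_poll matches net.* and can still be missing inside the container.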
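    2. Namespaced parameters
      A privileged container can modify every parameter that is visible to it. The parameters whose modification does not affect the host are the namespaced ones, and they largely coincide with the whitelist.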
    # sysctl -a | grep fs.mqueue.msg_max
    fs.mqueue.msg_max = 10
    # docker run   --privileged --rm centos /bin/bash -c "sysctl -w fs.mqueue.msg_max=5"
    fs.mqueue.msg_max = 5
    # sysctl -a | grep fs.mqueue.msg_max                            
    fs.mqueue.msg_max = 10
    

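    3. Whitelisted parameters that are not visible in the container
    The following parameters match the whitelist but are not visible inside the container: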

    net.bridge.bridge-nf-call-arptables
    net.bridge.bridge-nf-call-ip6tables
    net.bridge.bridge-nf-call-iptables
    net.bridge.bridge-nf-filter-pppoe-tagged
    net.bridge.bridge-nf-filter-vlan-tagged
    net.bridge.bridge-nf-pass-vlan-input-dev
    net.core.bpf_jit_enable
    net.core.bpf_jit_harden
    net.core.bpf_jit_kallsyms
    net.core.bpf_jit_limit
    net.core.busy_poll
    net.core.busy_read
    net.core.default_qdisc
    net.core.dev_weight
    net.core.dev_weight_rx_bias
    net.core.dev_weight_tx_bias
    net.core.fb_tunnels_only_for_init_net
    net.core.flow_limit_cpu_bitmap
    net.core.flow_limit_table_len
    net.core.max_skb_frags
    net.core.message_burst
    net.core.message_cost
    net.core.netdev_budget
    net.core.netdev_budget_usecs
    net.core.netdev_max_backlog
    net.core.netdev_rss_key
    net.core.netdev_tstamp_prequeue
    net.core.optmem_max
    net.core.rmem_default
    net.core.rmem_max
    net.core.rps_sock_flow_entries
    net.core.tstamp_allow_data
    net.core.warnings
    net.core.wmem_default
    net.core.wmem_max
    net.ipv4.icmp_msgs_burst
    net.ipv4.icmp_msgs_per_sec
    net.ipv4.inet_peer_maxttl
    net.ipv4.inet_peer_minttl
    net.ipv4.inet_peer_threshold
    net.ipv4.ipfrag_secret_interval
    net.ipv4.route.error_burst
    net.ipv4.route.error_cost
    net.ipv4.route.gc_elasticity
    net.ipv4.route.gc_interval
    net.ipv4.route.gc_min_interval
    net.ipv4.route.gc_min_interval_ms
    net.ipv4.route.gc_thresh
    net.ipv4.route.gc_timeout
    net.ipv4.route.max_size
    net.ipv4.route.min_adv_mss
    net.ipv4.route.min_pmtu
    net.ipv4.route.mtu_expires
    net.ipv4.route.redirect_load
    net.ipv4.route.redirect_number
    net.ipv4.route.redirect_silence
    net.ipv4.tcp_allowed_congestion_control
    net.ipv4.tcp_available_congestion_control
    net.ipv4.tcp_available_ulp
    net.ipv4.tcp_low_latency
    net.ipv4.tcp_max_orphans
    net.ipv4.tcp_mem
    net.ipv4.udp_mem
    net.netfilter.nf_log_all_netns
    net.nf_conntrack_max
    net.unix.max_dgram_qlen
    net.netfilter.nf_conntrack_tcp_timeout_close_wait
    net.netfilter.nf_conntrack_tcp_timeout_established
    net.ipv6.conf.default.disable_ipv6
    net.ipv6.conf.all.disable_ipv6
    net.netfilter.nf_conntrack_count
    

    4. Common modifiable kernel parameters

    net.core.somaxconn 
    net.ipv4.conf.xxx.proxy_arp_pvlan 
    net.ipv4.ip_default_ttl 
    net.ipv4.ip_forward 
    net.ipv4.tcp_base_mss 
    net.ipv4.tcp_sack 
    net.ipv4.tcp_syncookies 
    net.ipv4.tcp_timestamps 
    net.ipv4.tcp_tw_reuse
    net.ipv4.tcp_window_scaling 
    net.ipv4.tcp_wmem
    net.ipv4.udp_rmem_min 
    net.ipv4.udp_wmem_min 
    

    5. Kernel parameters that containers modify by default
    Docker changes some kernel parameters by default, so take note of them:

    net.unix.max_dgram_qlen = 10
    net.netfilter.nf_conntrack_tcp_timeout_close_wait = 60
    net.netfilter.nf_conntrack_tcp_timeout_established = 432000
    net.ipv6.conf.default.disable_ipv6 = 1
    net.ipv6.conf.all.disable_ipv6 = 1
    net.netfilter.nf_conntrack_count = 0
    

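    6. Modifying kernel parameters in docker
    If the container's ipc or net mode is set to host, the parameters of that namespace cannot be modified.
    A privileged container can modify every visible parameter, which may affect the host and other containers.
    In an unprivileged container the kernel parameters sit on a read-only file system, so any attempt to modify one from inside the container fails: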

    # sysctl -w net.ipv6.icmp.ratelimit=500
    sysctl: setting key "net.ipv6.icmp.ratelimit": Read-only file system
    

    Passing a parameter that is not on the whitelist to docker run fails with a "not whitelisted" error:

    #  docker run -it --sysctl vm.swappiness=10 centos /bin/bash
    invalid argument "vm.swappiness=10" for "--sysctl" flag: sysctl 'vm.swappiness=10' is not whitelisted
    See 'docker run --help'.
    

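    Passing a parameter that is not visible in the container to docker run makes container creation fail with a "no such file or directory" error: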

    #  docker run -it --sysctl net.core.busy_poll=1 centos /bin/bash
    docker: Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "process_linux.go:430: container init caused \"write sysctl key net.core.busy_poll: open /proc/sys/net/core/busy_poll: no such file or directory\"": unknown.
    ERRO[0000] error waiting for container: context canceled
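
    The two failure modes seen so far ("Read-only file system" and "no such file or directory") both come from the file under /proc/sys that backs each key, as the path in the error above shows. A minimal sketch for checking a key from inside a container follows. The helper names sysctl_path and sysctl_state are hypothetical, and the dot-to-slash mapping assumes that no component of the key contains a dot itself (an interface name such as eth0.100 would break it).

```shell
#!/bin/sh
# Sketch: map a sysctl key to its /proc/sys file and report its state.
# sysctl_path and sysctl_state are hypothetical helper names.

# Assumption: every "." in the key is a path separator.
sysctl_path() {
  printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

sysctl_state() {
  _path=$(sysctl_path "$1")
  if [ ! -e "$_path" ]; then
    # The key is not exposed in this namespace (or does not exist at all)
    echo "not visible"
  elif [ -w "$_path" ]; then
    echo "writable"
  else
    # Typically the case when /proc/sys is mounted read-only
    echo "read-only"
  fi
}

# Usage: run inside the container to see which category a key falls into
sysctl_path net.core.busy_poll
sysctl_state net.core.busy_poll
```

    Running the same check in a privileged and in an unprivileged container is a quick way to tell which of the three conditions a given key fails.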
    

    Setting a modifiable parameter in docker run, here net.ipv4.ip_default_ttl=32:

    # sysctl -a | grep net.ipv4.ip_default_ttl
    net.ipv4.ip_default_ttl = 64
    # docker run --rm centos /bin/bash -c "ping -c 1 172.17.0.2"
    PING 172.17.0.2 (172.17.0.2) 56(84) bytes of data.
    64 bytes from 172.17.0.2: icmp_seq=1 ttl=64 time=0.279 ms
    --- 172.17.0.2 ping statistics ---
    1 packets transmitted, 1 received, 0% packet loss, time 0ms
    rtt min/avg/max/mdev = 0.279/0.279/0.279/0.000 ms
    # docker run --sysctl net.ipv4.ip_default_ttl=32 --rm centos /bin/bash -c "ping -c 1 172.17.0.2"
    PING 172.17.0.2 (172.17.0.2) 56(84) bytes of data.
    64 bytes from 172.17.0.2: icmp_seq=1 ttl=32 time=0.033 ms
    --- 172.17.0.2 ping statistics ---
    1 packets transmitted, 1 received, 0% packet loss, time 0ms
    rtt min/avg/max/mdev = 0.033/0.033/0.033/0.000 ms
    

    Enabling IPv6 by adjusting kernel parameters:

    # docker run --rm  centos ip a
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
    2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
        link/ipip 0.0.0.0 brd 0.0.0.0
    148: eth0@if149: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
        link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0
        inet 172.17.0.2/16 brd 172.17.255.255 scope global eth0
           valid_lft forever preferred_lft forever
    # docker run --sysctl net.ipv6.conf.default.disable_ipv6=0 --sysctl net.ipv6.conf.all.disable_ipv6=0 --rm  centos ip a
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
        link/ipip 0.0.0.0 brd 0.0.0.0
    150: eth0@if151: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
        link/ether 02:42:ac:11:00:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0
        inet 172.17.0.2/16 brd 172.17.255.255 scope global eth0
           valid_lft forever preferred_lft forever
        inet6 fe80::42:acff:fe11:2/64 scope link tentative
           valid_lft forever preferred_lft forever
    

    7. Modifying kernel parameters in K8S
    View the relevant field:

    # kubectl explain pod.spec.securityContext.sysctls
    KIND:     Pod
    VERSION:  v1
    RESOURCE: sysctls <[]Object>
    DESCRIPTION:
         Sysctls hold a list of namespaced sysctls used for the pod. Pods with
         unsupported sysctls (by the container runtime) might fail to launch.
         Sysctl defines a kernel parameter to be set
    FIELDS:
       name <string> -required-
       Name of a property to set
       value        <string> -required-
       Value of a property to set
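    Not every kernel parameter that can be modified in docker can be modified in K8S. By default K8S considers only the following three parameters safe: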
    kernel.shm_rmid_forced
    net.ipv4.ip_local_port_range
    net.ipv4.tcp_syncookies
    

    To use other kernel parameters, they must be explicitly allowed in the kubelet configuration:

    # cat /etc/sysconfig/kubelet
    KUBELET_EXTRA_ARGS="--allowed-unsafe-sysctls=net.ipv6.conf.*"
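
    The combined effect of the safe list and --allowed-unsafe-sysctls can be sketched as follows. This is a simplified illustration, not kubelet's implementation: k8s_sysctl_allowed is a hypothetical helper, the safe list is the one given in this article, and the patterns are matched as plain shell globs.

```shell
#!/bin/sh
# Sketch: would a pod-level sysctl be accepted, given the safe list from this
# article plus the patterns passed to --allowed-unsafe-sysctls?
# k8s_sysctl_allowed is a hypothetical helper, not a kubectl/kubelet command.
k8s_sysctl_allowed() {
  _name=$1
  _allowed=$2   # comma-separated patterns, e.g. "net.ipv6.conf.*"

  # Safe sysctls need no extra kubelet configuration
  case "$_name" in
    kernel.shm_rmid_forced | net.ipv4.ip_local_port_range | net.ipv4.tcp_syncookies)
      return 0 ;;
  esac

  # Split the pattern list on commas; disable filename globbing while doing so
  _old_ifs=$IFS
  IFS=,
  set -f
  for _pattern in $_allowed; do
    case "$_name" in
      $_pattern) IFS=$_old_ifs; set +f; return 0 ;;
    esac
  done
  IFS=$_old_ifs
  set +f
  return 1
}

# Usage: check two keys against the setting shown above
for key in net.ipv6.conf.all.disable_ipv6 net.core.somaxconn; do
  if k8s_sysctl_allowed "$key" "net.ipv6.conf.*"; then
    echo "$key: allowed"
  else
    echo "$key: rejected"
  fi
done
```

    A key accepted here must still be namespaced and visible in the pod, otherwise the pod may fail to launch, as the field description above warns.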
    

    Enabling the IPv6 kernel parameters in a pod:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        run: test1
      name: test1
      namespace: default
    spec:
      selector:
        matchLabels:
          run: test1
      template:
        metadata:
          labels:
            run: test1
        spec:
          containers:
          - args:
            - "/bin/sh"
            - "-c"
            - "sleep 120"
            image: centos:latest
            name: test1
          securityContext:
            sysctls:
              - name: net.ipv6.conf.default.disable_ipv6
                value: '0'
              - name: net.ipv6.conf.all.disable_ipv6
                value: '0'
    

    Checking the IPv6 address:

    # kubectl exec test1-66ccc967c5-7xrpc -- ip a
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
    2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
        link/ipip 0.0.0.0 brd 0.0.0.0
    4: eth0@if154: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1440 qdisc noqueue state UP group default
        link/ether 66:cf:30:b8:22:71 brd ff:ff:ff:ff:ff:ff link-netnsid 0
        inet 172.16.137.96/32 scope global eth0
           valid_lft forever preferred_lft forever
        inet6 fe80::64cf:30ff:feb8:2271/64 scope link
           valid_lft forever preferred_lft forever
    

    Article link: https://www.haomeiwen.com/subject/oatgdktx.html