[K8s Selections] CKA - How to Troubleshoot Network Failures


Author: 熊本极客 | Published: 2022-03-29 14:15

    1. Check the DNS configuration of the application Pod

    # Find the application Pods
    $ kubectl get pod | grep flink
    deployment-flink-jobmanager-c695cf9d-rgtbh             1/1     Running   0          2d23h
    deployment-flink-taskmanager-7c7bbcd4db-5qv9k          1/1     Running   0          2d23h
    deployment-flink-taskmanager-7c7bbcd4db-hnpm5          1/1     Running   0          2d23h
    
    # Inspect the DNS configuration inside the application Pod
    $ kubectl exec -it deployment-flink-jobmanager-c695cf9d-rgtbh -- cat /etc/resolv.conf
    nameserver 10.96.0.10
    search default.svc.cluster.local svc.cluster.local cluster.local
    options ndots:5
    
    # Verify that the nameserver address is the ClusterIP of the cluster DNS Service
    $ kubectl get svc -n kube-system | grep 10.96.0.10
    kube-dns   ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   32d
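
The comparison above can also be scripted. The following is a minimal sketch, not part of the original walkthrough: the helper names `resolv_nameservers` and `nameserver_matches` are hypothetical, and the helpers only parse text from stdin, so the kubectl calls appear in the usage comment.

```shell
# Print every nameserver address found in resolv.conf content read from stdin.
resolv_nameservers() {
  awk '$1 == "nameserver" { print $2 }'
}

# Succeed only if the expected DNS address ($1) is one of the nameservers
# listed in the resolv.conf content read from stdin.
nameserver_matches() {
  local expected="$1"
  resolv_nameservers | grep -Fxq -- "$expected"
}

# Usage against a live cluster (Pod name taken from the example above):
#   dns_ip=$(kubectl get svc -n kube-system kube-dns -o jsonpath='{.spec.clusterIP}')
#   kubectl exec deployment-flink-jobmanager-c695cf9d-rgtbh -- cat /etc/resolv.conf \
#     | nameserver_matches "$dns_ip" && echo "DNS config OK"
```

The match is exact and literal (`grep -Fx`), so 10.96.0.1 does not pass for 10.96.0.10, which a plain substring grep would allow.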
    

    2. Check that the DNS Service exists and that its backend Pods are running

    # Find the DNS Service
    $ kubectl get svc -n kube-system -o wide
    NAME       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE   SELECTOR
    kube-dns   ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   32d   k8s-app=kube-dns
    
    # Show the Service details
    $ kubectl describe svc -n kube-system kube-dns
    Name:              kube-dns
    Namespace:         kube-system
    Labels:            k8s-app=kube-dns
                       kubernetes.io/cluster-service=true
                       kubernetes.io/name=KubeDNS
    Annotations:       prometheus.io/port: 9153
                       prometheus.io/scrape: true
    Selector:          k8s-app=kube-dns
    Type:              ClusterIP
    IP Families:       <none>
    IP:                10.96.0.10
    IPs:               10.96.0.10
    Port:              dns  53/UDP
    TargetPort:        53/UDP
    Endpoints:         10.244.0.34:53,10.244.0.66:53,10.244.0.8:53
    Port:              dns-tcp  53/TCP
    TargetPort:        53/TCP
    Endpoints:         10.244.0.34:53,10.244.0.66:53,10.244.0.8:53
    Port:              metrics  9153/TCP
    TargetPort:        9153/TCP
    Endpoints:         10.244.0.34:9153,10.244.0.66:9153,10.244.0.8:9153
    Session Affinity:  None
    Events:            <none>
    
    # Check the status of the Pod behind each endpoint
    $ kubectl get pod -n kube-system -o wide | grep 10.244.0.34
    coredns-659f5bbffd-w5vzw                   1/1     Running   0          2d   10.244.0.34     master-0002    <none>           <none>
    $ kubectl get pod -n kube-system -o wide | grep 10.244.0.66
    coredns-659f5bbffd-qrzl8                   1/1     Running   0         2d   10.244.0.66     master-0003    <none>           <none>
    $ kubectl get pod -n kube-system -o wide | grep 10.244.0.8
    coredns-659f5bbffd-rfr79                   1/1     Running   0          2d   10.244.0.8      master-0001    <none>           <none>
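
Looking up each endpoint IP by hand gets tedious with more replicas. As a rough sketch (the helper name `endpoint_ips` is hypothetical), the endpoint addresses can be pulled out of the `kubectl describe svc` output and checked in a loop:

```shell
# Read `kubectl describe svc` output on stdin and print each distinct endpoint
# IP once, with the port stripped. Services without endpoints print nothing.
endpoint_ips() {
  awk '$1 == "Endpoints:" && $2 != "<none>" {
         n = split($2, parts, ",")
         for (i = 1; i <= n; i++) {
           sub(/:[0-9]+$/, "", parts[i])   # drop the trailing :port
           print parts[i]
         }
       }' | LC_ALL=C sort -u
}

# Usage against a live cluster:
#   kubectl describe svc -n kube-system kube-dns | endpoint_ips | while read -r ip; do
#     kubectl get pod -n kube-system -o wide | grep -Fw -- "$ip"
#   done
#
# Shortcut that reuses the Service selector shown in the output above:
#   kubectl get pod -n kube-system -l k8s-app=kube-dns -o wide
```

`grep -Fw` treats the IP as a literal whole word, so 10.244.0.8 does not also match an address such as 10.244.0.80.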
    

    3. Enable query logging in the CoreDNS configuration

    CoreDNS is configured through its Corefile, which is stored in a ConfigMap. Adding the log plugin to the Corefile makes CoreDNS print an access log entry for each query.

    $ kubectl edit configmap -n kube-system coredns
    

    Edit the ConfigMap as follows, adding the log line:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: coredns
      namespace: kube-system
    data:
      Corefile: |
        .:53 {
            log
            errors
            health
            kubernetes cluster.local in-addr.arpa ip6.arpa {
              pods insecure
              upstream    # deprecated option; removed in CoreDNS 1.7.0, omit it on newer versions
              fallthrough in-addr.arpa ip6.arpa
            }
            prometheus :9153
            forward . /etc/resolv.conf
            cache 30
            loop
            reload
            loadbalance
        }    
    

    After the ConfigMap is saved, it takes about 1 to 2 minutes for the change to reach the CoreDNS Pods. Once the new configuration has taken effect, the CoreDNS log shows:

    .:53
    [INFO] plugin/reload: Running configuration MD5 = db32ca3650231d74073ff4cf814959a7
    [INFO] Reloading complete
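
Rather than reading the log by eye, the reload line can be extracted with a small filter. This is a sketch under assumptions: `last_reload_hash` is a hypothetical helper, and the pattern accepts any hash label before the `=` in case a CoreDNS version prints something other than MD5.

```shell
# Read CoreDNS log lines on stdin and print the configuration hash from the
# most recent plugin/reload line. Prints nothing if no reload was logged.
last_reload_hash() {
  sed -n 's|.*plugin/reload: Running configuration [A-Za-z0-9]* = \([0-9a-f]*\).*|\1|p' \
    | tail -n 1
}

# Usage against a live cluster (label taken from the kube-dns Service selector):
#   kubectl logs -n kube-system -l k8s-app=kube-dns --tail=50 | last_reload_hash
```

If the printed hash changes after the edit, the Pod has picked up the new Corefile.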
    

    Test connectivity to the target address from inside the application Pod:

    # Look up the target Service
    $ kubectl get svc | grep datatest
    datatest                               ClusterIP   10.96.0.37    <none>        8080/TCP                                         29d
    
    # Test connectivity
    $ kubectl exec -it deployment-flink-jobmanager-c695cf9d-rgtbh -- curl -vi datatest:8080
    *   Trying 10.96.0.37:8080...
    * Connected to datatest (10.96.0.37) port 8080 (#0)
    > GET / HTTP/1.1
    > Host: datatest:8080
    > User-Agent: curl/7.69.1
    > Accept: */*
    >
    * Mark bundle as not supporting multiuse
    < HTTP/1.1 400
    HTTP/1.1 400
    < Content-Type: text/plain;charset=UTF-8
    Content-Type: text/plain;charset=UTF-8
    < Connection: close
    Connection: close
    
    <
    Bad Request
    This combination of host and port requires TLS.
    * Closing connection 0
    
    # Key lines in the DNS log
    $ kubectl logs -n kube-system coredns-659f5bbffd-rfr79 | grep datatest
    [INFO] 10.244.2.252:36293 - 17111 "AAAA IN datatest.default.svc.cluster.local. udp 51 false 512" NOERROR qr,aa,rd 144 0.000184943s
    [INFO] 10.244.2.252:36293 - 32461 "A IN datatest.default.svc.cluster.local. udp 51 false 512" NOERROR qr,aa,rd 100 0.000144745s
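
Each access-log line carries the client address, the query type and name, and the response code. The sketch below (with a hypothetical helper name, `dns_query_summary`) reduces the log to those three fields, assuming the field layout shown in the two lines above:

```shell
# Read CoreDNS access-log lines on stdin and print "TYPE NAME RCODE" per query.
# Field positions follow the log format shown above: field 5 is the quoted
# query type, field 7 the queried name, field 12 the response code.
dns_query_summary() {
  awk '$1 == "[INFO]" && $5 ~ /^"/ {
         qtype = $5
         gsub(/"/, "", qtype)   # strip the opening quote from the type
         print qtype, $7, $12
       }'
}

# Usage against a live cluster (Pod name taken from the example above):
#   kubectl logs -n kube-system coredns-659f5bbffd-rfr79 | dns_query_summary | grep datatest
```

A response code other than NOERROR for the fully qualified Service name (for example NXDOMAIN) would point to a resolution problem rather than a connectivity problem.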
    
    
