美文网首页
使用bpftrace分析网络包在内核协议栈各层的耗时

使用bpftrace分析网络包在内核协议栈各层的耗时

作者: zhangzhifei | 来源:发表于2021-03-24 11:16 被阅读0次

    为什么使用bpf分析耗时问题

    网络性能时分析抓包工具是一大利器,tcpdump/wireshark等抓包工具底层都是使用bpf提供的库,抓包位置在软中断从网卡队列(ring buffer)中读取数据后发送给协议栈时同时发给抓包程序一份,也就是只能知道同一seq数据包在网卡接口处一来一回的时间,比较难分析的延时问题都是偶尔的一次抖动(一天一次,几个小时一次这样),所以抖动的原因是发生在内核还是用户太就说不清了(在tcp各监控指标都看不出异常的情况下)。使用bpf可通过kprobe或者tracepoint在协议栈各层的关键函数中添加hook点,当数据包经过该函数时,打印出seq、network namespace、时间戳等关键信息,帮助我们快速定位或者缩小问题范围。

    ebpf工作原理

    1. 这里简单介绍下ebpf的原理,用户写好的ebpf程序,通过LLVM或者clang编译好后,通过bpf校验器校验(如防止程序导致内核崩溃),插入到内核中的bpf虚拟机中,例如图中的filter位置。通过网络处理函的静态探针(也就是tracepoint内核代码中预先定义好的hook点)或者动态探针(内核函数中只有少数函数有tracepoint,没有tracepoint的函数就需要静态探针如:kprobe,kprobe通过注册函数地址,然后在函数被调用时调用我们的bpf程序)调用bpf程序,过滤结果存到buffer,用户层通过系统调用获取过滤结果。这个图来自tcpdump papers值给出了从网卡接口函数获数据包,对于protocal stack或者内核其他子系统的工作原理都类似。


      来自tcpdump papers
    2. 本文使用的bpftrace开发bpf程序,bpftrace(https://github.com/iovisor/bpftrace)实现了一套前端语言类似awk和c,不需要关心bpf程序的编译过程。同时作者也将bcc的工具用bpftrace实现了一份,安装即用。bpftrace开发的好处就是其语法简单,缺点就是太简单,不能自定义函数,不能调用内核函数,所以只适合开发一些简单的小工具。开发本文介绍的小工具时,已经有些吃力。

    场景

    容器环境中,用户反馈在流量没有突增的情况下,请求redis p99时而发生抖动,频率大概是一天一次。排除redis服务端,以及容器的cpu、内存等问题。数据包在协议栈的耗时无法得知,所以决定完善tcp高级指标的监控。同时我们也需要一个工具,能够知道从网卡队列到tcp状态机之前的耗时。

    协议栈关键函数

    1. 发送:
      ip_queue_xmit(tcp层处理后的段都要调用这个函数发送ip层的发送队列)------> ip_finish_output2(进入到链路层之前的处理函数,也就是netfilter和路由之后) -----> __dev_queue_xmit(将数据发送到驱动层)
    2. 接收:
      netif_rx(软中断调用的处理函数)-----> __netif_receive_skb(协议栈报文接收的入口)-----> ip_rcv(ip层入口)-----> tcp_v4_rcv(tcp层入口)-----> tcp_rcv_state_process(tcp状态机)
    3. 网桥(因为容器使用了docker):
      br_handle_frame

    大家可能看到发送过程没有4层的关键函数,因为调用ip_queue_xmit之前是没有填充完整的tcp协议头的,没办法在一个函数中获取到我们需要的完整系信息。

    1. 下面这张图描述了网络报从宿主机到容器的一些关键函数


      网络报从宿主机到容器

    使用bpftrace实现功能

    1. bpftrace的语法建议直接去github上看,文档比较新。如果要在centos上使用bpftrace,建议使用ubuntu打镜像直接使用ubuntu的安装包,可以参考上篇文章https://www.jianshu.com/p/54cb1f2b3614
    2. 下边直接给出__netif_receive_skb函数的kprobe打点代码,其他函数的代码基本一样。上边说了,由于bpftrace语法功能太简单,因为不能定义函数,所以代码重复很多。另外提一下,bpftrace还是强类型的,所以过滤ip功能,要自己把需要过滤的ip转化成网络字节序。
    3. 一共使用了3个命令行参数,参数一和二用于过滤IP,参数三是物理机network ns,这个ns的作用是在宿主机的二层数据处理函数__netif_receive_skb上打上时间戳,然后在容器的ns中的tcp_rcv_state_process函数处计算下时延。这样方便快速的判断是否有异常。我们程序中时延大于30ms就会打印一条日志。这是接收时,发送数据亦然。
    kprobe:__netif_receive_skb
    /
    $1 == ((struct iphdr *)(((struct sk_buff *) arg0)->head + ((struct sk_buff *) arg0)->network_header))->saddr ||
    $1 == ((struct iphdr *)(((struct sk_buff *) arg0)->head + ((struct sk_buff *) arg0)->network_header))->daddr ||
    $2 == ((struct iphdr *)(((struct sk_buff *) arg0)->head + ((struct sk_buff *) arg0)->network_header))->saddr ||
    $2 == ((struct iphdr *)(((struct sk_buff *) arg0)->head + ((struct sk_buff *) arg0)->network_header))->saddr
    /
    {
            $skb = ((struct sk_buff *) arg0);
            $net = $skb->dev->nd_net.net;
            $netif = $skb->dev->name;
            $nsid = $net->ns.inum;
    
            $iphd = ((struct iphdr *)($skb->head + $skb->network_header));
            $srcaddr = $iphd->saddr;
            $dstaddr = $iphd->daddr;
            $srcip = ntop($iphd->saddr);
            $dstip = ntop($iphd->daddr);
            if ($iphd->protocol == IPPROTO_TCP) {
                    $tcphd = ((struct tcphdr *)($skb->head + $skb->transport_header));
                    $sport = $tcphd->source;
                    $sport = ($sport >> 8) | (($sport << 8) & 0x00FF00);
    
                    $dport = $tcphd->dest;
                    $dport = ($dport >> 8) | (($dport << 8) & 0x00FF00);
    
                    // __u32   be32_to_cpu(const __be32);
                    $seq = $tcphd->seq;
                    $seq = ($seq >> 24) | (($seq & 0x00FF0000) >> 8) | (( $seq & 0x00FF00) << 8) | (($seq & 0xFF) << 24);
    
                    $ack = $tcphd->ack_seq;
                    $ack = ($ack >> 24) | (($ack & 0x00FF0000) >> 8) | (( $ack & 0x00FF00) << 8) | (($ack & 0xFF) << 24);
    
                    if($nsid == $3) {
                        @rcvpkg[$seq] = nsecs;
                    }
    
                    $win = $tcphd->window;
                    $win = ($win >> 8) | (($win << 8) & 0x00FF00);
    
                    $pkgflag = $tcphd->syn > 0 ? "syn" : ($tcphd->fin > 0 ? "fin" : ($tcphd->rst > 0 ? "rst" : ($tcphd->psh > 0 ? "psh" : ($tcphd->ack > 0 ? "ack" : ""))));
    
                    time("%H:%M:%S   ");
                    printf("%-19u %-15s %d,%s,%s,%-20d ", $nsid, $netif, pid, comm, func, cpu);
                    printf("flags:%s, seq:%-u, ack:%u, win:%-25u ", $pkgflag, $seq, $ack, $win);
                    printf("%s:%-15d  %s:%-15d %d ms\n", $srcip, $sport, $dstip, $dport, (nsecs / 1000000) % 1000);
            }
    }
    

    测试验证

    1. 用docker启一个nginx容器
    [root@ ~]# docker ps | grep 639267e5f60f
    639267e5f60f        nginx                                          "/docker-entrypoint.…"   13 days ago
    
    1. 获取docker pid
    [root@ ~]# docker inspect --format {{.State.Pid}} 639267e5f60f
    1946308
    
    1. 模拟容器网卡eth0发包延时100ms
    nsenter -n -t  1946308 tc qdisc add dev eth0 root netem delay 100ms
    
    1. 模拟veth peer,就是宿主机上的veth,200ms的发包延时
    tc qdisc add dev veth4458ea2  root netem delay 200ms
    
    1. 运行工具
    root@:/# ./pkgtool.bt 1933764362 1966204682 4026531992
    Attaching 11 probes...
    Tracing tcp pkg. Hit Ctrl-C to end.
    TIME       NETWORK NS          INTERFACE       PID,COMM,FUNC,CPU                   PKGFLAGS                  SADDR:SPORT           DADDR:DPORT            TIMESTAMP
    
    1. 请求容器中的nginx,在除容器的宿主机之外的机器上请求。
    [root@ ~]# curl 10.231.66.115(容器ip)
    
    1. 观察输出结果
    12:11:24   4026531992          eth0            0,swapper/5,__netif_receive_skb,5                    flags:syn, seq:3511028477, ack:0, win:0                         10.231.66.115:49847            10.46.242.54:80              787 ms
    12:11:24   4026531992          eth0            0,swapper/5,ip_rcv,5                    flags:syn, seq:3511028477, ack:0, win:0                         10.231.66.115:49847            10.46.242.54:80              787 ms
    12:11:24   4026531992          docker0         0,swapper/5,ip_finish_output2,5                    flags:syn, seq:3511028477, ack:0, win:0                         10.231.66.115:49847            10.46.242.54:80              787 ms
    12:11:24   4026531992          docker0         0,swapper/5,__dev_queue_xmit,5                    flags:syn, seq:3511028477, ack:0, win:0                         10.231.66.115:49847            10.46.242.54:80              787 ms
    12:11:24   4026531992          veth4458ea2     0,swapper/5,__dev_queue_xmit,5                    flags:syn, seq:3511028477, ack:0, win:0                         10.231.66.115:49847            10.46.242.54:80              787 ms
    12:11:24   4026538786          eth0            0,swapper/5,netif_rx,5                    flags:syn, seq:3511028477, ack:0, win:0                         10.231.66.115:49847            10.46.242.54:80              987 ms
    12:11:24   4026538786          eth0            0,swapper/5,__netif_receive_skb,5                    flags:syn, seq:3511028477, ack:0, win:0                         10.231.66.115:49847            10.46.242.54:80              987 ms
    12:11:24   4026538786          eth0            0,swapper/5,ip_rcv,5                    flags:syn, seq:3511028477, ack:0, win:0                         10.231.66.115:49847            10.46.242.54:80              988 ms
    12:11:24   4026538786          eth0            0,swapper/5,tcp_v4_rcv,5                    flags:syn, seq:3511028477, ack:0, win:0                         10.231.66.115:49847            10.46.242.54:80              988 ms
    WARN:RCV slow pkg: duration is 200  seq is 3511028477
    12:11:24   4026538786                          0,swapper/5,tcp_rcv_state_process,5                    flags:syn, seq:3511028477, ack:0, win:0                         10.231.66.115:49847            10.46.242.54:80              988 ms
    12:11:24   4026538786          eth0            0,swapper/5,ip_finish_output2,5                    flags:syn, seq:278643593, ack:3511028478, win:0                         10.46.242.54:80               10.231.66.115:49847           988 ms
    12:11:24   4026538786          eth0            0,swapper/5,__dev_queue_xmit,5                    flags:syn, seq:278643593, ack:3511028478, win:0                         10.46.242.54:80               10.231.66.115:49847           988 ms
    12:11:24   4026531992          veth4458ea2     0,swapper/5,netif_rx,5                    flags:syn, seq:278643593, ack:3511028478, win:0                         10.46.242.54:80               10.231.66.115:49847           88 ms
    12:11:24   4026531992          veth4458ea2     36,ksoftirqd/5,__netif_receive_skb,5                    flags:syn, seq:278643593, ack:3511028478, win:0                         10.46.242.54:80               10.231.66.115:49847           88 ms
    12:11:24   4026531992          veth4458ea2     36,ksoftirqd/5,br_handle_frame_finish,5                    flags:syn, seq:278643593, ack:3511028478, win:0                         10.46.242.54:80               10.231.66.115:49847           88 ms
    12:11:24   4026531992          docker0         36,ksoftirqd/5,__netif_receive_skb,5                    flags:syn, seq:278643593, ack:3511028478, win:0                         10.46.242.54:80               10.231.66.115:49847           88 ms
    12:11:24   4026531992          docker0         36,ksoftirqd/5,ip_rcv,5                    flags:syn, seq:278643593, ack:3511028478, win:0                         10.46.242.54:80               10.231.66.115:49847           88 ms
    12:11:24   4026531992          eth0            36,ksoftirqd/5,ip_finish_output2,5                    flags:syn, seq:278643593, ack:3511028478, win:0                         10.46.242.54:80               10.231.66.115:49847           88 ms
    12:11:24   4026531992          eth0            36,ksoftirqd/5,__dev_queue_xmit,5                    flags:syn, seq:278643593, ack:3511028478, win:0                         10.46.242.54:80               10.231.66.115:49847           88 ms
    12:11:24   4026531992          eth0            0,swapper/5,__netif_receive_skb,5                    flags:ack, seq:3511028478, ack:278643594, win:0                         10.231.66.115:49847            10.46.242.54:80              95 ms
    12:11:24   4026531992          eth0            0,swapper/5,ip_rcv,5                    flags:ack, seq:3511028478, ack:278643594, win:0                         10.231.66.115:49847            10.46.242.54:80              95 ms
    12:11:24   4026531992          docker0         0,swapper/5,ip_finish_output2,5                    flags:ack, seq:3511028478, ack:278643594, win:0                         10.231.66.115:49847            10.46.242.54:80              95 ms
    12:11:24   4026531992          docker0         0,swapper/5,__dev_queue_xmit,5                    flags:ack, seq:3511028478, ack:278643594, win:0                         10.231.66.115:49847            10.46.242.54:80              95 ms
    12:11:24   4026531992          veth4458ea2     0,swapper/5,__dev_queue_xmit,5                    flags:ack, seq:3511028478, ack:278643594, win:0                         10.231.66.115:49847            10.46.242.54:80              95 ms
    12:11:24   4026531992          eth0            0,swapper/5,__netif_receive_skb,5                    flags:psh, seq:3511028478, ack:278643594, win:0                         10.231.66.115:49847            10.46.242.54:80              95 ms
    12:11:24   4026531992          eth0            0,swapper/5,ip_rcv,5                    flags:psh, seq:3511028478, ack:278643594, win:0                         10.231.66.115:49847            10.46.242.54:80              95 ms
    12:11:24   4026531992          docker0         0,swapper/5,ip_finish_output2,5                    flags:psh, seq:3511028478, ack:278643594, win:0                         10.231.66.115:49847            10.46.242.54:80              95 ms
    12:11:24   4026531992          docker0         0,swapper/5,__dev_queue_xmit,5                    flags:psh, seq:3511028478, ack:278643594, win:0                         10.231.66.115:49847            10.46.242.54:80              95 ms
    12:11:24   4026531992          veth4458ea2     0,swapper/5,__dev_queue_xmit,5                    flags:psh, seq:3511028478, ack:278643594, win:0                         10.231.66.115:49847            10.46.242.54:80              95 ms
    12:11:24   4026538786          eth0            0,swapper/5,netif_rx,5                    flags:ack, seq:3511028478, ack:278643594, win:0                         10.231.66.115:49847            10.46.242.54:80              295 ms
    12:11:24   4026538786          eth0            36,ksoftirqd/5,__netif_receive_skb,5                    flags:ack, seq:3511028478, ack:278643594, win:0                         10.231.66.115:49847            10.46.242.54:80              295 ms
    12:11:24   4026538786          eth0            36,ksoftirqd/5,ip_rcv,5                    flags:ack, seq:3511028478, ack:278643594, win:0                         10.231.66.115:49847            10.46.242.54:80              295 ms
    12:11:24   4026538786          eth0            36,ksoftirqd/5,tcp_v4_rcv,5                    flags:ack, seq:3511028478, ack:278643594, win:0                         10.231.66.115:49847            10.46.242.54:80              295 ms
    WARN:RCV slow pkg: duration is 200  seq is 3511028478
    12:11:24   4026538786          eth0            36,ksoftirqd/5,tcp_rcv_state_process,5                    flags:ack, seq:3511028478, ack:278643594, win:0                         10.231.66.115:49847            10.46.242.54:80              295 ms
    12:11:24   4026538786          eth0            36,ksoftirqd/5,netif_rx,5                    flags:psh, seq:3511028478, ack:278643594, win:0                         10.231.66.115:49847            10.46.242.54:80              295 ms
    12:11:24   4026538786          eth0            36,ksoftirqd/5,__netif_receive_skb,5                    flags:psh, seq:3511028478, ack:278643594, win:0                         10.231.66.115:49847            10.46.242.54:80              295 ms
    12:11:24   4026538786          eth0            36,ksoftirqd/5,ip_rcv,5                    flags:psh, seq:3511028478, ack:278643594, win:0                         10.231.66.115:49847            10.46.242.54:80              295 ms
    12:11:24   4026538786          eth0            36,ksoftirqd/5,tcp_v4_rcv,5                    flags:psh, seq:3511028478, ack:278643594, win:0                         10.231.66.115:49847            10.46.242.54:80              295 ms
    12:11:24   4026538786                          36,ksoftirqd/5,__ip_queue_xmit,5                    flags:ack, seq:278643594, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           295 ms
    12:11:24   4026538786          eth0            36,ksoftirqd/5,ip_finish_output2,5                    flags:ack, seq:278643594, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           295 ms
    12:11:24   4026538786          eth0            36,ksoftirqd/5,__dev_queue_xmit,5                    flags:ack, seq:278643594, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           295 ms
    12:11:24   4026538786                          1946355,nginx,__ip_queue_xmit,10                   flags:psh, seq:278643594, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           296 ms
    12:11:24   4026538786          eth0            1946355,nginx,ip_finish_output2,10                   flags:psh, seq:278643594, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           296 ms
    12:11:24   4026538786          eth0            1946355,nginx,__dev_queue_xmit,10                   flags:psh, seq:278643594, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           296 ms
    12:11:24   4026538786                          1946355,nginx,__ip_queue_xmit,10                   flags:psh, seq:278643832, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           296 ms
    12:11:24   4026538786          eth0            1946355,nginx,ip_finish_output2,10                   flags:psh, seq:278643832, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           296 ms
    12:11:24   4026538786          eth0            1946355,nginx,__dev_queue_xmit,10                   flags:psh, seq:278643832, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           296 ms
    12:11:25   4026531992          veth4458ea2     0,swapper/5,netif_rx,5                    flags:ack, seq:278643594, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           395 ms
    12:11:25   4026531992          veth4458ea2     0,swapper/5,__netif_receive_skb,5                    flags:ack, seq:278643594, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           396 ms
    12:11:25   4026531992          veth4458ea2     0,swapper/5,br_handle_frame_finish,5                    flags:ack, seq:278643594, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           396 ms
    12:11:25   4026531992          docker0         0,swapper/5,__netif_receive_skb,5                    flags:ack, seq:278643594, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           396 ms
    12:11:25   4026531992          docker0         0,swapper/5,ip_rcv,5                    flags:ack, seq:278643594, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           396 ms
    12:11:25   4026531992          eth0            0,swapper/5,ip_finish_output2,5                    flags:ack, seq:278643594, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           396 ms
    WARN: SEND slow pkg: duration is 99  seq is 278643594
    12:11:25   4026531992          eth0            0,swapper/5,__dev_queue_xmit,5                    flags:ack, seq:278643594, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           396 ms
    12:11:25   4026531992          veth4458ea2     36,ksoftirqd/5,netif_rx,5                    flags:psh, seq:278643594, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           396 ms
    12:11:25   4026531992          veth4458ea2     36,ksoftirqd/5,__netif_receive_skb,5                    flags:psh, seq:278643594, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           396 ms
    12:11:25   4026531992          veth4458ea2     36,ksoftirqd/5,br_handle_frame_finish,5                    flags:psh, seq:278643594, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           396 ms
    12:11:25   4026531992          docker0         36,ksoftirqd/5,__netif_receive_skb,5                    flags:psh, seq:278643594, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           396 ms
    12:11:25   4026531992          docker0         36,ksoftirqd/5,ip_rcv,5                    flags:psh, seq:278643594, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           396 ms
    12:11:25   4026531992          eth0            36,ksoftirqd/5,ip_finish_output2,5                    flags:psh, seq:278643594, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           396 ms
    12:11:25   4026531992          eth0            36,ksoftirqd/5,__dev_queue_xmit,5                    flags:psh, seq:278643594, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           396 ms
    12:11:25   4026531992          veth4458ea2     36,ksoftirqd/5,netif_rx,5                    flags:psh, seq:278643832, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           396 ms
    12:11:25   4026531992          veth4458ea2     36,ksoftirqd/5,__netif_receive_skb,5                    flags:psh, seq:278643832, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           396 ms
    12:11:25   4026531992          veth4458ea2     36,ksoftirqd/5,br_handle_frame_finish,5                    flags:psh, seq:278643832, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           396 ms
    12:11:25   4026531992          docker0         36,ksoftirqd/5,__netif_receive_skb,5                    flags:psh, seq:278643832, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           396 ms
    12:11:25   4026531992          docker0         36,ksoftirqd/5,ip_rcv,5                    flags:psh, seq:278643832, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           396 ms
    12:11:25   4026531992          eth0            36,ksoftirqd/5,ip_finish_output2,5                    flags:psh, seq:278643832, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           396 ms
    WARN: SEND slow pkg: duration is 100  seq is 278643832
    12:11:25   4026531992          eth0            36,ksoftirqd/5,__dev_queue_xmit,5                    flags:psh, seq:278643832, ack:3511028554, win:0                         10.46.242.54:80               10.231.66.115:49847           396 ms
    12:11:25   4026531992          eth0            0,swapper/5,__netif_receive_skb,5                    flags:fin, seq:3511028554, ack:278644444, win:0                         10.231.66.115:49847            10.46.242.54:80              408 ms
    12:11:25   4026531992          eth0            0,swapper/5,ip_rcv,5                    flags:fin, seq:3511028554, ack:278644444, win:0                         10.231.66.115:49847            10.46.242.54:80              408 ms
    12:11:25   4026531992          docker0         0,swapper/5,ip_finish_output2,5                    flags:fin, seq:3511028554, ack:278644444, win:0                         10.231.66.115:49847            10.46.242.54:80              408 ms
    12:11:25   4026531992          docker0         0,swapper/5,__dev_queue_xmit,5                    flags:fin, seq:3511028554, ack:278644444, win:0                         10.231.66.115:49847            10.46.242.54:80              408 ms
    12:11:25   4026531992          veth4458ea2     0,swapper/5,__dev_queue_xmit,5                    flags:fin, seq:3511028554, ack:278644444, win:0                         10.231.66.115:49847            10.46.242.54:80              408 ms
    12:11:25   4026538786          eth0            0,swapper/5,netif_rx,5                    flags:fin, seq:3511028554, ack:278644444, win:0                         10.231.66.115:49847            10.46.242.54:80              608 ms
    12:11:25   4026538786          eth0            0,swapper/5,__netif_receive_skb,5                    flags:fin, seq:3511028554, ack:278644444, win:0                         10.231.66.115:49847            10.46.242.54:80              608 ms
    12:11:25   4026538786          eth0            0,swapper/5,ip_rcv,5                    flags:fin, seq:3511028554, ack:278644444, win:0                         10.231.66.115:49847            10.46.242.54:80              608 ms
    12:11:25   4026538786          eth0            0,swapper/5,tcp_v4_rcv,5                    flags:fin, seq:3511028554, ack:278644444, win:0                         10.231.66.115:49847            10.46.242.54:80              608 ms
    12:11:25   4026538786                          1946355,nginx,__ip_queue_xmit,10                   flags:fin, seq:278644444, ack:3511028555, win:0                         10.46.242.54:80               10.231.66.115:49847           608 ms
    12:11:25   4026538786          eth0            1946355,nginx,ip_finish_output2,10                   flags:fin, seq:278644444, ack:3511028555, win:0                         10.46.242.54:80               10.231.66.115:49847           608 ms
    12:11:25   4026538786          eth0            1946355,nginx,__dev_queue_xmit,10                   flags:fin, seq:278644444, ack:3511028555, win:0                         10.46.242.54:80               10.231.66.115:49847           608 ms
    12:11:25   4026531992          veth4458ea2     0,swapper/10,netif_rx,10                   flags:fin, seq:278644444, ack:3511028555, win:0                         10.46.242.54:80               10.231.66.115:49847           708 ms
    12:11:25   4026531992          veth4458ea2     0,swapper/10,__netif_receive_skb,10                   flags:fin, seq:278644444, ack:3511028555, win:0                         10.46.242.54:80               10.231.66.115:49847           708 ms
    12:11:25   4026531992          veth4458ea2     0,swapper/10,br_handle_frame_finish,10                   flags:fin, seq:278644444, ack:3511028555, win:0                         10.46.242.54:80               10.231.66.115:49847           708 ms
    12:11:25   4026531992          docker0         0,swapper/10,__netif_receive_skb,10                   flags:fin, seq:278644444, ack:3511028555, win:0                         10.46.242.54:80               10.231.66.115:49847           708 ms
    12:11:25   4026531992          docker0         0,swapper/10,ip_rcv,10                   flags:fin, seq:278644444, ack:3511028555, win:0                         10.46.242.54:80               10.231.66.115:49847           708 ms
    12:11:25   4026531992          eth0            0,swapper/10,ip_finish_output2,10                   flags:fin, seq:278644444, ack:3511028555, win:0                         10.46.242.54:80               10.231.66.115:49847           708 ms
    WARN: SEND slow pkg: duration is 100  seq is 278644444
    12:11:25   4026531992          eth0            0,swapper/10,__dev_queue_xmit,10                   flags:fin, seq:278644444, ack:3511028555, win:0                         10.46.242.54:80               10.231.66.115:49847           708 ms
    12:11:25   4026531992          eth0            0,swapper/5,__netif_receive_skb,5                    flags:ack, seq:3511028555, ack:278644445, win:0                         10.231.66.115:49847            10.46.242.54:80              720 ms
    12:11:25   4026531992          eth0            0,swapper/5,ip_rcv,5                    flags:ack, seq:3511028555, ack:278644445, win:0                         10.231.66.115:49847            10.46.242.54:80              720 ms
    12:11:25   4026531992          docker0         0,swapper/5,ip_finish_output2,5                    flags:ack, seq:3511028555, ack:278644445, win:0                         10.231.66.115:49847            10.46.242.54:80              720 ms
    12:11:25   4026531992          docker0         0,swapper/5,__dev_queue_xmit,5                    flags:ack, seq:3511028555, ack:278644445, win:0                         10.231.66.115:49847            10.46.242.54:80              720 ms
    12:11:25   4026531992          veth4458ea2     0,swapper/5,__dev_queue_xmit,5                    flags:ack, seq:3511028555, ack:278644445, win:0                         10.231.66.115:49847            10.46.242.54:80              720 ms
    12:11:25   4026538786          eth0            0,swapper/5,netif_rx,5                    flags:ack, seq:3511028555, ack:278644445, win:0                         10.231.66.115:49847            10.46.242.54:80              920 ms
    12:11:25   4026538786          eth0            0,swapper/5,__netif_receive_skb,5                    flags:ack, seq:3511028555, ack:278644445, win:0                         10.231.66.115:49847            10.46.242.54:80              921 ms
    12:11:25   4026538786          eth0            0,swapper/5,ip_rcv,5                    flags:ack, seq:3511028555, ack:278644445, win:0                         10.231.66.115:49847            10.46.242.54:80              921 ms
    12:11:25   4026538786          eth0            0,swapper/5,tcp_v4_rcv,5                    flags:ack, seq:3511028555, ack:278644445, win:0                         10.231.66.115:49847            10.46.242.54:80              921 ms
    WARN:RCV slow pkg: duration is 200  seq is 3511028555
    12:11:25   4026538786                          0,swapper/5,tcp_rcv_state_process,5                    flags:ack, seq:3511028555, ack:278644445, win:0                         10.231.66.115:49847            10.46.242.54:80              921 ms
    
    1. 分析结果,发送包的100ms和接收包的200ms都已经打印出来了。拿“WARN: SEND slow pkg: duration is 100 seq is 278644444”这条日志看看,序列号是“278644444”的包从容器到宿主机上延时超过了100ms(正常的情况下不会超过5ms,如果超过5ms物理机压力可能比较大,长时间这样的话需要额外关注了)。拿着序列号“seq:278644444”,可以过滤出下面的信息:
    12:11:25   4026538786                          1946355,nginx,__ip_queue_xmit,10                   flags:fin, seq:278644444, ack:3511028555, win:0                         10.46.242.54:80               10.231.66.115:49847           608 ms
    12:11:25   4026538786          eth0            1946355,nginx,ip_finish_output2,10                   flags:fin, seq:278644444, ack:3511028555, win:0                         10.46.242.54:80               10.231.66.115:49847           608 ms
    12:11:25   4026538786          eth0            1946355,nginx,__dev_queue_xmit,10                   flags:fin, seq:278644444, ack:3511028555, win:0                         10.46.242.54:80               10.231.66.115:49847           608 ms
    12:11:25   4026531992          veth4458ea2     0,swapper/10,netif_rx,10                   flags:fin, seq:278644444, ack:3511028555, win:0                         10.46.242.54:80               10.231.66.115:49847           708 ms
    12:11:25   4026531992          veth4458ea2     0,swapper/10,__netif_receive_skb,10                   flags:fin, seq:278644444, ack:3511028555, win:0                         10.46.242.54:80               10.231.66.115:49847           708 ms
    12:11:25   4026531992          veth4458ea2     0,swapper/10,br_handle_frame_finish,10                   flags:fin, seq:278644444, ack:3511028555, win:0                         10.46.242.54:80               10.231.66.115:49847           708 ms
    12:11:25   4026531992          docker0         0,swapper/10,__netif_receive_skb,10                   flags:fin, seq:278644444, ack:3511028555, win:0                         10.46.242.54:80               10.231.66.115:49847           708 ms
    12:11:25   4026531992          docker0         0,swapper/10,ip_rcv,10                   flags:fin, seq:278644444, ack:3511028555, win:0                         10.46.242.54:80               10.231.66.115:49847           708 ms
    12:11:25   4026531992          eth0            0,swapper/10,ip_finish_output2,10                   flags:fin, seq:278644444, ack:3511028555, win:0                         10.46.242.54:80               10.231.66.115:49847           708 ms
    WARN: SEND slow pkg: duration is 100  seq is 278644444
    12:11:25   4026531992          eth0            0,swapper/10,__dev_queue_xmit,10                   flags:fin, seq:278644444, ack:3511028555, win:0                         10.46.242.54:80               10.231.66.115:49847           708 ms
    

    可以看到容器网卡发包后(__dev_queue_xmit)到veth收包(netif_rx),时间戳由608 ms 到了708 ms(为了直观,我们把时间戳由ns转化为了ms)。这样很快就能把问题缩小到网卡接口层。

    1. 上边的测试只是为了说明工具,定位问题便捷。我们目前还没遇见在这一链路发生延时的问题,即使这样我们能快速的排除这段链路的问题,缩小问题范围。

    最后

    1. bpftrace作者提供了很多实用工具,也可以根据其提供的编程语言实现自己的工具。在传统工具得不到什么信息时,bpftrace可能为你开荒破土。
    2. 由于bpftrace语法功能的原因,这个工具还是打印太多的数据,如果有需要像这样稍微复杂一点的工具,应该直接用c写,只打印出时延超过预期的数据。
    3. 网络性能分析时,一直想要确定物理机二层到容器tcp层之间是否有问题,也一直想有个这样的工具。得益于“参考”的第一个连接ebay的一边性能分析文章,否则不会这么顺畅。

    参考

    https://mp.weixin.qq.com/s/ZUS94PMCKsqgZFHX9b99-g
    https://github.com/iovisor/bpftrace/blob/master/docs/reference_guide.md
    https://www.tcpdump.org/papers/bpf-usenix93.pdf

    相关文章

      网友评论

          本文标题:使用bpftrace分析网络包在内核协议栈各层的耗时

          本文链接:https://www.haomeiwen.com/subject/mwaocltx.html