2020-03-19

Author: northking | Published 2020-03-21 21:30

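1. Overview

In Go, the runtime behavior of an application is usually profiled along the following dimensions:

CPU Profiling: samples the application's CPU usage (including registers) at a fixed frequency, showing where the program spends time while it is actively consuming CPU cycles
Memory Profiling: records a stack trace when the application makes heap allocations; used to monitor current and historical memory usage and to check for memory leaks
Goroutine Profiling: reports goroutine usage: which goroutines exist and what their call relationships are
Block Profiling: records where goroutines block waiting on synchronization (including timer channels); useful for analyzing and locating deadlocks and other bottlenecks
Mutex Profiling: reports contention on mutexes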
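
As a rough illustration of the profile types above, the sketch below asks the runtime which named profiles it knows about. The helper name profileNames is made up for this example. The CPU profile does not appear in the list because it is collected through a separate start/stop API rather than as a named profile.

```go
package main

import (
	"fmt"
	"runtime/pprof"
)

// profileNames (a hypothetical helper) returns the names of all profiles
// currently known to the runtime, such as goroutine, heap, block, and mutex.
func profileNames() []string {
	var names []string
	for _, p := range pprof.Profiles() {
		names = append(names, p.Name())
	}
	return names
}

func main() {
	for _, name := range profileNames() {
		// Count reports how many execution stacks the profile currently holds.
		fmt.Printf("%-12s %d\n", name, pprof.Lookup(name).Count())
	}
}
```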

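1.1 pprof

pprof is a tool for visualizing and analyzing profiling data. It originated as the CPU profiler in gperftools, which Google engineers developed for analyzing multi-threaded programs.

pprof reads a collection of profiling samples in profile.proto format and generates reports (both text and graphical) that help visualize and analyze the data.

profile.proto is a Protocol Buffers v3 description file. It describes a set of call stacks and symbolization information, representing the sampled call stacks of a statistical profile, and is a common format for stack-trace profiles.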

1.2 Supported usage modes

Report generation
Interactive terminal use
Web interface

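2 Installing the tools

1) Install pprof

$ go get -u github.com/google/pprof

On Go 1.18 or later, go get no longer installs binaries; use go install github.com/google/pprof@latest instead.

2) Install FlameGraph

cd $WORK_PATH
git clone https://github.com/brendangregg/FlameGraph.git
export PATH=$PATH:$WORK_PATH/FlameGraph

3) Install graphviz

The profile files that pprof works with are binary and not human-readable. pprof uses graphviz to render them as images.

brew install graphviz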

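3 Collecting and analyzing data

The first step in profiling is obtaining data about how the application runs. Go provides two packages for this:

runtime/pprof: collects runtime data from a (non-server) program for analysis
net/http/pprof: a thin wrapper around runtime/pprof that exposes the data on an HTTP port, for collecting runtime data from an HTTP server

The core tool is the go tool pprof command, which fetches and analyzes profiling data.

For more detail on using pprof, see pprof --help or the pprof documentation.

Different kinds of applications call for different ways of collecting and analyzing data.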

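3.1 Tool-style applications

3.1.1 Scripts

If the application is a one-off that runs for a while and then exits, the best approach is to save the profiling data to a file before it exits and analyze it afterwards. The runtime/pprof package covers this case.

pprof offers a convenient interface. For CPU profiling, call pprof.StartCPUProfile(), which profiles the current application's CPU usage and writes the data to the supplied argument (w io.Writer); call pprof.StopCPUProfile() to stop.

Example: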

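package main

import (
    "log"
    "os"
    "runtime/pprof"
)

func main() {
    f, err := os.Create("cpu.prof")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()

    // Start sampling; everything until StopCPUProfile is recorded.
    if err := pprof.StartCPUProfile(f); err != nil {
        log.Fatal(err)
    }
    defer pprof.StopCPUProfile()

    // do something...
}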

When the application finishes, it leaves behind a file (cpu.prof) containing the CPU profiling data.

For memory data, call WriteHeapProfile directly; no start and stop steps are needed:

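f, err := os.Create(*memprofile) // memprofile is assumed to be a *string flag holding the output path
if err != nil {
    log.Fatal(err)
}
pprof.WriteHeapProfile(f)
f.Close()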

3.1.2 Profiling through test cases

Write the test cases

(1) Create data/d_test.go with the following content:

package data
 
import "testing"
 
const url = "https://github.com/EDDYCJY"
 
func TestAdd(t *testing.T) {
    s := Add(url)
    if s == "" {
        t.Errorf("Test.Add error!")
    }
}
 
func BenchmarkAdd(b *testing.B) {
    for i := 0; i < b.N; i++ {
        Add(url)
    }
}

(2) Run the tests and generate the profile file

$ go test -bench=. -cpuprofile=cpu.prof
pkg: github.com/EDDYCJY/go-pprof-example/data
BenchmarkAdd-4      10000000           187 ns/op
PASS
ok      github.com/EDDYCJY/go-pprof-example/data    2.300s

The -memprofile flag writes a memory profile in the same way.

Start the pprof web UI

$ go tool pprof -http=:8080 cpu.prof

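3.2 Service-style applications (web services)

3.2.1 Setup

If the application runs continuously, as a web application does, use the net/http/pprof package, which serves profiling data over HTTP.

Add a single line to the imports: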

import _ "net/http/pprof"

Start a listener in main:

go func() {
    if err := http.ListenAndServe(fmt.Sprintf("%s:%s", "0.0.0.0", "6060"), nil); err != nil {
        panic(err)
    }
}()

3.2.2 Analyzing the data

3.2.2.1 Through the web interface

For an overview of the current state, visit http://127.0.0.1:6060/debug/pprof/

/debug/pprof/
 
profiles:
0    block
5    goroutine
3    heap
0    mutex
9    threadcreate
 
full goroutine stack dump

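This page links to several subpages. Here is what each one provides:

cpu (CPU Profiling): $HOST/debug/pprof/profile runs a CPU profile for 30s by default and returns a profile file for analysis
block (Block Profiling): $HOST/debug/pprof/block shows stack traces that led to blocking on synchronization primitives
goroutine: $HOST/debug/pprof/goroutine shows stack traces of all current goroutines
heap (Memory Profiling): $HOST/debug/pprof/heap shows memory allocations of live objects
mutex (Mutex Profiling): $HOST/debug/pprof/mutex shows stack traces of holders of contended mutexes
threadcreate: $HOST/debug/pprof/threadcreate shows stack traces that led to the creation of new OS threads

The list above summarizes what pprof monitors.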

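3.2.2.2 Through the interactive terminal

Once the profiling data is available from the relevant package (whether as a file or over HTTP), the next step is to save and analyze it with the go tool pprof command-line tool.

(1) go tool pprof http://localhost:6060/debug/pprof/profile?seconds=60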

$ go tool pprof http://localhost:6060/debug/pprof/profile\?seconds\=60
Fetching profile over HTTP from http://localhost:6060/debug/pprof/profile?seconds=60
Saved profile in /Users/eddycjy/pprof/pprof.samples.cpu.007.pb.gz
Type: cpu
Duration: 1mins, Total samples = 26.55s (44.15%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof)

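After running this command, wait 60 seconds (adjust with the seconds parameter) while pprof performs CPU profiling. When it finishes, pprof enters interactive mode by default, where the results can be viewed or exported. Type help at the (pprof) prompt for a description of the commands.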

(pprof) top10
Showing nodes accounting for 25.92s, 97.63% of 26.55s total
Dropped 85 nodes (cum <= 0.13s)
Showing top 10 nodes out of 21
      flat  flat%   sum%        cum   cum%
    23.28s 87.68% 87.68%     23.29s 87.72%  syscall.Syscall
     0.77s  2.90% 90.58%      0.77s  2.90%  runtime.memmove
     0.58s  2.18% 92.77%      0.58s  2.18%  runtime.freedefer
     0.53s  2.00% 94.76%      1.42s  5.35%  runtime.scanobject
     0.36s  1.36% 96.12%      0.39s  1.47%  runtime.heapBitsForObject
     0.35s  1.32% 97.44%      0.45s  1.69%  runtime.greyobject
     0.02s 0.075% 97.51%     24.96s 94.01%  main.main.func1
     0.01s 0.038% 97.55%     23.91s 90.06%  os.(*File).Write
     0.01s 0.038% 97.59%      0.19s  0.72%  runtime.mallocgc
     0.01s 0.038% 97.63%     23.30s 87.76%  syscall.Write

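flat: time spent in the function itself
flat%: flat as a percentage of the total CPU time sampled
sum%: running total of flat% for this row and all rows above it
cum: time spent in the function plus the functions it calls
cum%: cum as a percentage of the total CPU time sampled

The last column is the function name. In most cases these five columns are enough to understand how the application behaves and where to optimize.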
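
To make the difference between flat and cum concrete, here is a small sketch with invented function names (inner, outer) and an arbitrary output file flatcum.prof. All of the actual work happens in inner, while outer only calls it. When the resulting profile is opened with go tool pprof, inner should typically show high flat and cum values, while outer shows a flat value near zero and a cum value close to the total.

```go
package main

import (
	"log"
	"os"
	"runtime/pprof"
)

// inner does all of the CPU work itself, so its flat time is high.
//
//go:noinline
func inner(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		sum += i % 7
	}
	return sum
}

// outer only calls inner, so nearly all of its time is cumulative.
//
//go:noinline
func outer(n int) int {
	return inner(n) + inner(n)
}

func main() {
	f, err := os.Create("flatcum.prof") // arbitrary file name for this example
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	if err := pprof.StartCPUProfile(f); err != nil {
		log.Fatal(err)
	}
	defer pprof.StopCPUProfile()

	total := 0
	for i := 0; i < 100; i++ {
		total += outer(10000000)
	}
	log.Println("result:", total)
}
```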

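Besides printing the most expensive spots (top), pprof can list a function's source code with its sample data (list) and its assembly with sample data (disasm), and it can export in many formats such as svg, gv, callgrind, png, and gif.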

(2) go tool pprof http://localhost:6060/debug/pprof/heap

$ go tool pprof http://localhost:6060/debug/pprof/heap
Fetching profile over HTTP from http://localhost:6060/debug/pprof/heap
Saved profile in /Users/eddycjy/pprof/pprof.alloc_objects.alloc_space.inuse_objects.inuse_space.008.pb.gz
Type: inuse_space
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 837.48MB, 100% of 837.48MB total
      flat  flat%   sum%        cum   cum%
  837.48MB   100%   100%   837.48MB   100%  main.main.func1

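-inuse_space: analyzes the memory the application currently holds (live allocations)
-alloc_objects: analyzes the objects the application has allocated over time, including temporary ones that have since been freed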
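
The distinction between in-use and allocated memory can be seen outside of pprof as well. The sketch below is only an analogy built on runtime.MemStats, not what pprof does internally, and the names allocate, allocatedBytes, sink, and retained are invented. It allocates many temporary buffers, keeps a few, and compares the cumulative bytes allocated with the bytes still live after a garbage collection.

```go
package main

import (
	"fmt"
	"runtime"
)

var sink []byte       // assigning to a global forces each buffer onto the heap
var retained [][]byte // the few buffers that stay alive

// allocate creates n buffers of 1 KiB each and keeps every keepEvery-th one.
func allocate(n, keepEvery int) {
	for i := 0; i < n; i++ {
		sink = make([]byte, 1024)
		if i%keepEvery == 0 {
			retained = append(retained, sink)
		}
	}
	sink = nil
}

// allocatedBytes returns the cumulative number of heap bytes allocated so far.
func allocatedBytes() uint64 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.TotalAlloc
}

func main() {
	before := allocatedBytes()
	allocate(10000, 100)
	runtime.GC() // free the temporary buffers

	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	// Cumulative allocations: the kind of figure the alloc_* views report.
	fmt.Printf("allocated during run: %d KiB\n", (m.TotalAlloc-before)/1024)
	// Live heap after collection: the kind of figure the inuse_* views report.
	fmt.Printf("still in use:         %d KiB\n", m.HeapAlloc/1024)
}
```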

(3) go tool pprof http://localhost:6060/debug/pprof/block

(4) go tool pprof http://localhost:6060/debug/pprof/mutex

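3.2.2.3 Direct visualization (flame graph)

Add the -http flag with a port to view the profile directly in a web page:

go tool pprof -http=:8099 http://localhost:6060/debug/pprof/heap

The standalone pprof binary gives the same result:

$ pprof -http=:8099 http://localhost:6060/debug/pprof/heap

Flame Graph

Its biggest advantage is that it is interactive. Calls run from top to bottom (A -> B -> C -> D), each block represents a function, and a larger block means more CPU time. Clicking a block drills into it for closer analysis.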

Top

Graph

Larger boxes and thicker edges indicate more time spent.

Peek

Source

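The pprof web UI makes the call chains and resource usage of a Go application easier and more intuitive to read, and the View menu switches among the views listed above.

When a problem is hard to pin down, tools like these make tracking it down far more efficient.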

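Summary

This article gave a brief introduction to pprof, Go's performance tool. In the right scenarios, pprof is a great help in locating and dissecting problems.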

