Microservice Monitoring - Monitoring Your Own Services

Author: CatchZeng | Published 2021-05-17 13:57

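    Original: https://makeoptim.com/service-mesh/prometheus-client

    The previous article explained how to use Exporters to monitor applications in a Kubernetes cluster. This article shows how to monitor your own services.

    For a service to be monitored, it must expose its runtime metrics so that Prometheus can scrape them. We can use the client libraries provided by Prometheus to expose this runtime information.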

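    Client libraries

    Prometheus officially provides client libraries for Go, Java or Scala, Python, and Ruby. Most other languages are covered by third-party libraries; see the client libraries documentation for details.

    Before looking at how to expose metrics from a service with a client library, let's first go over the metric types these libraries provide.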

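    Metric types

    The Prometheus client libraries offer four core metric types.

    Counter

    A counter is a cumulative metric that represents a single monotonically increasing value: it can only increase, or be reset to zero on restart. For example, you can use a counter to represent the number of requests served, tasks completed, or errors.

    Note: do not use a counter to expose a value that can decrease. For example, do not use a counter for the number of currently running processes; use a gauge instead.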
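
    Because a counter only goes up or restarts from zero, a consumer can treat any drop in value as a reset. The following is a minimal sketch of that idea, not the Prometheus implementation: the function name increase and the sample values are hypothetical, and the extrapolation a real query engine performs is left out.

```go
package main

import "fmt"

// increase returns how much a counter grew across a series of scraped
// samples. A sample that is lower than its predecessor is treated as a
// counter reset (for example a process restart), so the new value is
// counted in full instead of producing a negative delta.
func increase(samples []float64) float64 {
	total := 0.0
	for i := 1; i < len(samples); i++ {
		delta := samples[i] - samples[i-1]
		if delta < 0 {
			// Reset detected: the counter restarted from zero.
			delta = samples[i]
		}
		total += delta
	}
	return total
}

func main() {
	// Hypothetical scrape values: the service restarts after the third scrape.
	samples := []float64{10, 15, 21, 2, 6}
	fmt.Println(increase(samples)) // 17
}
```

    This only works because the value never decreases during normal operation, which is why values that can go down belong in a gauge.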

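    Gauge

    A gauge is a metric that represents a single numerical value that can go up and down. Gauges are typically used for measured values such as temperature or current memory usage, but also for "counts" that can rise and fall, such as the number of running goroutines.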
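
    To illustrate a value that rises and falls, here is a small standard-library sketch of a gauge that tracks jobs in flight. The names inFlight, runJob, and current are made up for this example; a real service would use the gauge type of its client library instead.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// inFlight is a hypothetical gauge: the number of jobs currently running.
var inFlight int64

// runJob increments the gauge when work starts and decrements it when
// the work returns, so the value always reflects the current state.
func runJob(work func()) {
	atomic.AddInt64(&inFlight, 1)
	defer atomic.AddInt64(&inFlight, -1)
	work()
}

// current reads the gauge value.
func current() int64 { return atomic.LoadInt64(&inFlight) }

func main() {
	var wg sync.WaitGroup
	started := make(chan struct{})
	release := make(chan struct{})

	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			runJob(func() {
				started <- struct{}{} // signal that the job is running
				<-release             // block until main lets the job finish
			})
		}()
	}

	for i := 0; i < 3; i++ {
		<-started
	}
	fmt.Println(current()) // 3

	close(release)
	wg.Wait()
	fmt.Println(current()) // 0
}
```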

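    Histogram

    A histogram samples observations (such as request latencies or response sizes) and counts them in configurable buckets. It also provides the sum of all observed values.

    A histogram with a base metric name of <basename> exposes multiple time series during a scrape:

    • cumulative counters for the observation buckets, exposed as <basename>_bucket{le="<upper inclusive bound>"}
    • the sum of all observed values, exposed as <basename>_sum
    • the count of events that have been observed, exposed as <basename>_count (identical to <basename>_bucket{le="+Inf"} above)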
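
    The three kinds of series listed above can be modelled in a few lines. This is a simplified sketch with hypothetical names (histogram, newHistogram, observe); it is meant to show why the buckets are cumulative and is not how a client library stores its data internally.

```go
package main

import (
	"fmt"
	"math"
)

// histogram is a simplified model of the series a histogram exposes.
type histogram struct {
	bounds []float64 // upper inclusive bounds, ascending, ending with +Inf
	counts []uint64  // cumulative count per bound (<basename>_bucket)
	sum    float64   // <basename>_sum
	count  uint64    // <basename>_count
}

// newHistogram builds a histogram and appends the implicit +Inf bucket.
func newHistogram(bounds ...float64) *histogram {
	b := append(append([]float64{}, bounds...), math.Inf(1))
	return &histogram{bounds: b, counts: make([]uint64, len(b))}
}

// observe records one value. Because buckets are cumulative, every bucket
// whose upper bound is >= v is incremented, not just the first match.
func (h *histogram) observe(v float64) {
	for i, le := range h.bounds {
		if v <= le {
			h.counts[i]++
		}
	}
	h.sum += v
	h.count++
}

func main() {
	h := newHistogram(0.1, 0.5, 1)
	for _, v := range []float64{0.05, 0.3, 0.3, 0.7, 2} {
		h.observe(v)
	}
	for i, le := range h.bounds {
		fmt.Printf("le=%v count=%d\n", le, h.counts[i])
	}
	fmt.Println("count", h.count)
}
```

    With these five observations the bucket counts come out as 1, 3, 4, and 5, and the +Inf bucket matches the total count.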

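    The histogram_quantile() function calculates quantiles from histograms, or even from aggregations of histograms. Histograms are also suitable for calculating an Apdex score. When operating on buckets, remember that the histogram is cumulative. For details on histogram usage and how histograms differ from summaries, see the "Histograms and summaries" documentation.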
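
    To give an intuition for how a quantile can be estimated from cumulative buckets, the sketch below finds the bucket that contains the target rank and interpolates linearly inside it. It is a simplified approximation written for this article, assuming non-negative observations; the names bucket, quantile, and demoBuckets are hypothetical, and the real histogram_quantile() handles more edge cases.

```go
package main

import (
	"fmt"
	"math"
)

// bucket is one cumulative histogram bucket.
type bucket struct {
	le    float64 // upper inclusive bound
	count float64 // cumulative number of observations <= le
}

// demoBuckets is made-up latency data in seconds.
var demoBuckets = []bucket{{1, 2}, {2, 6}, {4, 8}, {math.Inf(1), 10}}

// quantile estimates the q-quantile (0 <= q <= 1) from cumulative buckets
// sorted by ascending le, with +Inf as the last bucket. It assumes the
// observations are spread evenly inside the bucket the rank falls into.
func quantile(q float64, buckets []bucket) float64 {
	if len(buckets) < 2 {
		return math.NaN()
	}
	last := len(buckets) - 1
	total := buckets[last].count
	if total == 0 {
		return math.NaN()
	}
	rank := q * total

	// Find the first bucket whose cumulative count reaches the rank.
	i := 0
	for i < last && buckets[i].count < rank {
		i++
	}
	if i == last {
		// The rank lies in the +Inf bucket, which has no upper bound to
		// interpolate towards, so fall back to the highest finite bound.
		return buckets[last-1].le
	}

	// The first bucket is assumed to start at 0.
	lower, below := 0.0, 0.0
	if i > 0 {
		lower, below = buckets[i-1].le, buckets[i-1].count
	}
	inBucket := buckets[i].count - below
	if inBucket == 0 {
		// Empty bucket (only possible for q = 0): nothing to interpolate.
		return lower
	}
	return lower + (buckets[i].le-lower)*(rank-below)/inBucket
}

func main() {
	fmt.Println(quantile(0.5, demoBuckets))  // 1.75
	fmt.Println(quantile(0.95, demoBuckets)) // 4
}
```

    The result is only as precise as the bucket layout: the 0.95 estimate is capped at 4 because every slower observation is hidden in the +Inf bucket.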

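    Summary

    Like a histogram, a summary samples observations (such as request latencies or response sizes). It also provides a total count of observations and the sum of all observed values, and it calculates configurable quantiles over a sliding time window.

    A summary with a base metric name of <basename> exposes multiple time series during a scrape:

    • streaming φ-quantiles (0 ≤ φ ≤ 1) of observed events, exposed as <basename>{quantile="<φ>"}
    • the sum of all observed values, exposed as <basename>_sum
    • the count of events that have been observed, exposed as <basename>_count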
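
    The difference between the windowed quantiles and the running sum and count can be shown with a deliberately naive model. The sketch below keeps raw observations and sorts them on demand, which a real client library avoids by using a streaming algorithm; the type windowSummary and its methods are hypothetical, and timestamps are plain Unix seconds to keep the example short.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// observation is one recorded value with its timestamp.
type observation struct {
	at    int64 // Unix time in seconds
	value float64
}

// windowSummary is a naive stand-in for a summary. Quantiles only look at
// observations inside the sliding window; sum and count cover everything.
type windowSummary struct {
	window int64 // window length in seconds
	recent []observation
	sum    float64 // <basename>_sum
	count  uint64  // <basename>_count
}

// observe records a value at time now.
func (s *windowSummary) observe(now int64, v float64) {
	s.recent = append(s.recent, observation{now, v})
	s.sum += v
	s.count++
}

// quantile drops observations older than the window and returns the
// phi-quantile of the rest. The bool is false when the window is empty.
func (s *windowSummary) quantile(now int64, phi float64) (float64, bool) {
	cutoff := now - s.window
	kept := s.recent[:0]
	for _, o := range s.recent {
		if o.at > cutoff {
			kept = append(kept, o)
		}
	}
	s.recent = kept
	if len(kept) == 0 {
		return 0, false
	}

	vals := make([]float64, len(kept))
	for i, o := range kept {
		vals[i] = o.value
	}
	sort.Float64s(vals)

	// Smallest value such that at least phi of the samples are <= it.
	idx := int(math.Ceil(phi*float64(len(vals)))) - 1
	if idx < 0 {
		idx = 0
	}
	return vals[idx], true
}

func main() {
	s := &windowSummary{window: 60}
	s.observe(0, 100) // falls out of the window by t=80
	s.observe(30, 10)
	s.observe(50, 20)
	s.observe(70, 30)

	median, _ := s.quantile(80, 0.5)
	fmt.Println(median, s.count, s.sum) // 20 4 160
}
```

    The old outlier of 100 no longer influences the median, yet it is still part of the sum and count.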

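    For a detailed explanation of φ-quantiles, summary usage, and how summaries differ from histograms, see the "Histograms and summaries" documentation.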

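    Demo

    The following uses Go as an example to show how to monitor your own service with a Prometheus client library.

    Providing a metrics endpoint

    The first step in integrating Prometheus into a service is to provide a /metrics endpoint. The service should listen on an internal port that is only reachable within your infrastructure, typically in the 9xxx range. The Prometheus team maintains a list of default port allocations that you can consult when choosing a port.

    The following code creates a new HTTP service (demo1) that exposes the default metrics of a Go application using the Prometheus client at http://localhost:9001/metrics.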

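    // demo1.go
    package main
    
    import (
        "log"
        "net/http"
    
        "github.com/prometheus/client_golang/prometheus/promhttp"
    )
    
    func main() {
        // promhttp.Handler() serves every metric in the default registry,
        // which includes the Go runtime metrics.
        http.Handle("/metrics", promhttp.Handler())
    
        // ListenAndServe only returns on failure, so surface the error.
        log.Fatal(http.ListenAndServe(":9001", nil))
    }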
    

    Start the service

    go run demo1.go
    

    View the metrics

    ❯ curl http://localhost:9001/metrics
    # HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
    # TYPE go_gc_duration_seconds summary
    go_gc_duration_seconds{quantile="0"} 0
    go_gc_duration_seconds{quantile="0.25"} 0
    go_gc_duration_seconds{quantile="0.5"} 0
    go_gc_duration_seconds{quantile="0.75"} 0
    go_gc_duration_seconds{quantile="1"} 0
    go_gc_duration_seconds_sum 0
    go_gc_duration_seconds_count 0
    # HELP go_goroutines Number of goroutines that currently exist.
    # TYPE go_goroutines gauge
    go_goroutines 9
    # HELP go_info Information about the Go environment.
    # TYPE go_info gauge
    go_info{version="go1.13.1"} 1
    # HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
    # TYPE go_memstats_alloc_bytes gauge
    go_memstats_alloc_bytes 1.499288e+06
    # HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
    # TYPE go_memstats_alloc_bytes_total counter
    go_memstats_alloc_bytes_total 1.499288e+06
    # HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
    # TYPE go_memstats_buck_hash_sys_bytes gauge
    go_memstats_buck_hash_sys_bytes 1.443808e+06
    # HELP go_memstats_frees_total Total number of frees.
    # TYPE go_memstats_frees_total counter
    go_memstats_frees_total 151
    # HELP go_memstats_gc_cpu_fraction The fraction of this program's available CPU time used by the GC since the program started.
    # TYPE go_memstats_gc_cpu_fraction gauge
    go_memstats_gc_cpu_fraction 0
    # HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
    # TYPE go_memstats_gc_sys_bytes gauge
    go_memstats_gc_sys_bytes 2.240512e+06
    # HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use.
    # TYPE go_memstats_heap_alloc_bytes gauge
    go_memstats_heap_alloc_bytes 1.499288e+06
    # HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used.
    # TYPE go_memstats_heap_idle_bytes gauge
    go_memstats_heap_idle_bytes 6.4118784e+07
    # HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use.
    # TYPE go_memstats_heap_inuse_bytes gauge
    go_memstats_heap_inuse_bytes 2.531328e+06
    # HELP go_memstats_heap_objects Number of allocated objects.
    # TYPE go_memstats_heap_objects gauge
    go_memstats_heap_objects 2806
    # HELP go_memstats_heap_released_bytes Number of heap bytes released to OS.
    # TYPE go_memstats_heap_released_bytes gauge
    go_memstats_heap_released_bytes 6.4118784e+07
    # HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system.
    # TYPE go_memstats_heap_sys_bytes gauge
    go_memstats_heap_sys_bytes 6.6650112e+07
    # HELP go_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection.
    # TYPE go_memstats_last_gc_time_seconds gauge
    go_memstats_last_gc_time_seconds 0
    # HELP go_memstats_lookups_total Total number of pointer lookups.
    # TYPE go_memstats_lookups_total counter
    go_memstats_lookups_total 0
    # HELP go_memstats_mallocs_total Total number of mallocs.
    # TYPE go_memstats_mallocs_total counter
    go_memstats_mallocs_total 2957
    # HELP go_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures.
    # TYPE go_memstats_mcache_inuse_bytes gauge
    go_memstats_mcache_inuse_bytes 13888
    # HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system.
    # TYPE go_memstats_mcache_sys_bytes gauge
    go_memstats_mcache_sys_bytes 16384
    # HELP go_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures.
    # TYPE go_memstats_mspan_inuse_bytes gauge
    go_memstats_mspan_inuse_bytes 23936
    # HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system.
    # TYPE go_memstats_mspan_sys_bytes gauge
    go_memstats_mspan_sys_bytes 32768
    # HELP go_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place.
    # TYPE go_memstats_next_gc_bytes gauge
    go_memstats_next_gc_bytes 4.473924e+06
    # HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations.
    # TYPE go_memstats_other_sys_bytes gauge
    go_memstats_other_sys_bytes 1.050904e+06
    # HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
    # TYPE go_memstats_stack_inuse_bytes gauge
    go_memstats_stack_inuse_bytes 458752
    # HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
    # TYPE go_memstats_stack_sys_bytes gauge
    go_memstats_stack_sys_bytes 458752
    # HELP go_memstats_sys_bytes Number of bytes obtained from system.
    # TYPE go_memstats_sys_bytes gauge
    go_memstats_sys_bytes 7.189324e+07
    # HELP go_threads Number of OS threads created.
    # TYPE go_threads gauge
    go_threads 9
    # HELP promhttp_metric_handler_requests_in_flight Current number of scrapes being served.
    # TYPE promhttp_metric_handler_requests_in_flight gauge
    promhttp_metric_handler_requests_in_flight 1
    # HELP promhttp_metric_handler_requests_total Total number of scrapes by HTTP status code.
    # TYPE promhttp_metric_handler_requests_total counter
    promhttp_metric_handler_requests_total{code="200"} 1
    promhttp_metric_handler_requests_total{code="500"} 0
    promhttp_metric_handler_requests_total{code="503"} 0
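
    Each metric in the output above follows the same pattern: a HELP line, a TYPE line, then one or more samples. As a rough illustration of that text layout, here is a standard-library sketch that renders a single unlabelled metric. The function formatMetric is hypothetical and exists only for this example; in a real service promhttp.Handler() produces this output for you.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatMetric renders one unlabelled metric as three lines of text:
// a HELP line, a TYPE line, and the sample itself.
func formatMetric(name, help, typ string, value float64) string {
	// The 'g' format with shortest precision prints small integers as-is
	// and switches to exponent notation for large values.
	v := strconv.FormatFloat(value, 'g', -1, 64)
	return fmt.Sprintf("# HELP %s %s\n# TYPE %s %s\n%s %s\n",
		name, help, name, typ, name, v)
}

func main() {
	fmt.Print(formatMetric(
		"go_goroutines",
		"Number of goroutines that currently exist.",
		"gauge",
		9,
	))
}
```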
    

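    Adding your own metrics

    demo1 only exposes the default metrics. Next, we add a counter metric named myapp_processed_ops_total that counts the number of operations processed so far. The counter is incremented by 1 every 2 seconds.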

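    // demo2.go
    package main
    
    import (
        "log"
        "net/http"
        "time"
    
        "github.com/prometheus/client_golang/prometheus"
        "github.com/prometheus/client_golang/prometheus/promauto"
        "github.com/prometheus/client_golang/prometheus/promhttp"
    )
    
    // opsProcessed counts processed operations. promauto registers it with
    // the default registry, so promhttp.Handler() exposes it automatically.
    var opsProcessed = promauto.NewCounter(prometheus.CounterOpts{
        Name: "myapp_processed_ops_total",
        Help: "The total number of processed events",
    })
    
    // recordMetrics simulates work by incrementing the counter every 2 seconds.
    func recordMetrics() {
        go func() {
            for {
                opsProcessed.Inc()
                time.Sleep(2 * time.Second)
            }
        }()
    }
    
    func main() {
        recordMetrics()
    
        http.Handle("/metrics", promhttp.Handler())
        // ListenAndServe only returns on failure, so surface the error.
        log.Fatal(http.ListenAndServe(":9001", nil))
    }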
    

    Start the service

    go run demo2.go
    

    View the metrics

    ❯ curl http://localhost:9001/metrics
    ...
    # HELP myapp_processed_ops_total The total number of processed events
    # TYPE myapp_processed_ops_total counter
    myapp_processed_ops_total 5
    # HELP promhttp_metric_handler_requests_in_flight Current number of scrapes being served.
    # TYPE promhttp_metric_handler_requests_in_flight gauge
    promhttp_metric_handler_requests_in_flight 1
    ...
    

    Querying repeatedly shows that the value of myapp_processed_ops_total keeps increasing.

    ❯ curl http://localhost:9001/metrics
    ...
    # HELP myapp_processed_ops_total The total number of processed events
    # TYPE myapp_processed_ops_total counter
    myapp_processed_ops_total 26
    # HELP promhttp_metric_handler_requests_in_flight Current number of scrapes being served.
    # TYPE promhttp_metric_handler_requests_in_flight gauge
    promhttp_metric_handler_requests_in_flight 1
    ...
    

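    Conclusion

    This article used a counter as an example to show how to add metrics to your own service. You can expose the other metric types as well; see client_golang for usage details.

    The next article will be a tutorial on using Grafana.

    Note: see https://github.com/MakeOptim/service-mesh/prometheus for the yaml files referenced in this chapter.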
