Distributed Tracing with SkyWalking

Author: 王睿同学 | Published: 2018-10-11 13:46

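Introduction to SkyWalking

SkyWalking is an APM (application performance monitoring) system for distributed systems, aimed in particular at microservice, Cloud Native, and container-based (Docker, Kubernetes, Mesos) architectures. At its core it is a distributed tracing system.
It instruments applications automatically, with no changes to the target application's source code, and it includes a collector built on an efficient streaming module.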

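Steps

Setting up the monitoring service

Required third-party software

Elasticsearch 5.x (in cluster mode, or not used at all)
Zookeeper 3.4.10
The system time (including the time zone) on the hosts of the monitored applications must be set correctly and must match the time on the hosts where the collectors and UIs are deployed.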

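Deploying Zookeeper

Zookeeper coordinates the collectors and is needed only when you run more than one collector instance.
Add the Zookeeper cluster configuration to application.yml on every collector instance:

cluster:
    # Zookeeper manages coordination of the collector cluster.
    zookeeper:
        # Separate multiple Zookeeper addresses with commas.
        hostPort: localhost:2181
        sessionTimeout: 100000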

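Deploying Elasticsearch

Edit elasticsearch.yml:

Set cluster.name, for example CollectorDBCluster. Whatever name you choose must match clusterName in the collector configuration; the settings in this article use elasticsearch.
Set node.name to any name you like. If Elasticsearch runs in cluster mode, every node needs a different name.
The modified or added settings are:

cluster.name: elasticsearch

node.name: node-1

# IP address that ES listens on
network.host: 172.20.15.52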

thread_pool.bulk.queue_size: 1000

Downloading a release

Download it from the releases page.

Configuring the SkyWalking Collector

Edit config/application.yml:

#cluster:
#  zookeeper:
#    hostPort: localhost:2181
#    sessionTimeout: 100000
naming:
  jetty:
    #OS real network IP(binding required), for agent to find collector cluster
    host: 172.20.15.53
    port: 10800
    contextPath: /
cache:
#  guava:
  caffeine:
remote:
  gRPC:
    # OS real network IP(binding required), for collector nodes communicate with each other in cluster. collectorN --(gRPC) --> collectorM
    host: 172.20.15.53
    port: 11800
agent_gRPC:
  gRPC:
    #OS real network IP(binding required), for agent to uplink data(trace/metrics) to collector. agent--(gRPC)--> collector
    host: 172.20.15.53
    port: 11800
    # Set these two setting to open ssl
    #sslCertChainFile: $path
    #sslPrivateKeyFile: $path

    # Set your own token to active auth
    #authentication: xxxxxx
agent_jetty:
  jetty:
    # OS real network IP(binding required), for agent to uplink data(trace/metrics) to collector through HTTP. agent--(HTTP)--> collector
    # SkyWalking native Java/.Net/node.js agents don't use this.
    # Open this for other implementor.
    host: localhost
    port: 12800
    contextPath: /
analysis_register:
  default:
analysis_jvm:
  default:
analysis_segment_parser:
  default:
    bufferFilePath: ../buffer/
    bufferOffsetMaxFileSize: 10M
    bufferSegmentMaxFileSize: 500M
    bufferFileCleanWhenRestart: true
ui:
  jetty:
    # Stay in `localhost` if UI starts up in default mode.
    # Change it to OS real network IP(binding required), if deploy collector in different machine.
    host: localhost
    port: 12800
    contextPath: /
storage:
  elasticsearch:
    clusterName: elasticsearch
    clusterTransportSniffer: true
    clusterNodes: 172.20.15.53:9300
    indexShardsNumber: 2
    indexReplicasNumber: 0
    highPerformanceMode: true
    # Batch process setting, refer to https://www.elastic.co/guide/en/elasticsearch/client/java-api/5.5/java-docs-bulk-processor.html
    bulkActions: 2000 # Execute the bulk every 2000 requests
    bulkSize: 20 # flush the bulk every 20mb
    flushInterval: 10 # flush the bulk every 10 seconds whatever the number of requests
    concurrentRequests: 2 # the number of concurrent requests
    # Set a timeout on metric data. After the timeout has expired, the metric data will automatically be deleted.
    traceDataTTL: 90 # Unit is minute
    minuteMetricDataTTL: 90 # Unit is minute
    hourMetricDataTTL: 36 # Unit is hour
    dayMetricDataTTL: 45 # Unit is day
    monthMetricDataTTL: 18 # Unit is month
#storage:
#  h2:
#    url: jdbc:h2:~/memorydb
#    userName: sa
configuration:
  default:
    #namespace: xxxxx
    # alarm threshold
    applicationApdexThreshold: 2000
    serviceErrorRateThreshold: 10.00
    serviceAverageResponseTimeThreshold: 2000
    instanceErrorRateThreshold: 10.00
    instanceAverageResponseTimeThreshold: 2000
    applicationErrorRateThreshold: 10.00
    applicationAverageResponseTimeThreshold: 2000
    # thermodynamic
    thermodynamicResponseTimeStep: 50
    thermodynamicCountOfResponseTimeSteps: 40
    # max collection's size of worker cache collection, setting it smaller when collector OutOfMemory crashed.
    workerCacheMaxSize: 10000
#receiver_zipkin:
#  default:
#    host: localhost
#    port: 9411
#    contextPath: /

Configuring the SkyWalking UI

Edit webapp/webapp.yml:

server:
  port: 8080

collector:
  path: /graphql
  ribbon:
    ReadTimeout: 10000
    listOfServers: 172.20.15.53:10800

security:
  user:
    admin:
      password: admin

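Starting the SkyWalking Collector node

1. Run bin/startup.sh to start the collector and the UI together. If you do not use option 1, start them separately as described in 2 and 3.
2. To start only the collector, run bin/collectorService.sh.
3. To start only the UI, run bin/webappService.sh.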
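After startup it can help to confirm that the collector and UI are actually listening. The following is a minimal sketch, not part of SkyWalking: the helper names is_port_open and check_endpoints are hypothetical, and the hosts and ports are simply the ones from the configuration files shown in this article, so adjust them to your own setup.

```python
import socket
from typing import Dict, Tuple

# Hosts and ports as configured in this article (assumption: yours may differ).
ENDPOINTS: Dict[str, Tuple[str, int]] = {
    "naming (jetty)": ("172.20.15.53", 10800),
    "agent gRPC": ("172.20.15.53", 11800),
    "ui backend (jetty)": ("localhost", 12800),
    "webapp": ("localhost", 8080),
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


def check_endpoints(endpoints: Dict[str, Tuple[str, int]]) -> Dict[str, bool]:
    """Map each endpoint name to whether its TCP port accepts connections."""
    return {name: is_port_open(host, port) for name, (host, port) in endpoints.items()}
```

Calling check_endpoints(ENDPOINTS) on the collector host returns a dictionary of name to True/False. A successful TCP connection only shows that something is listening on the port, not that the service behind it is healthy.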

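Deploying the SkyWalking Java Agent

Copy the agent directory to wherever you need it. The logs, plugins, and configuration are all included in the package, so do not change the directory structure.
Add the JVM startup argument -javaagent:/path/to/skywalking-agent/skywalking-agent.jar. The value must be the absolute path to skywalking-agent.jar.
Start the monitored application.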

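Advanced features

All plugins live in the /plugins directory. A new plugin takes effect as long as it is in that directory at startup, and removing it disables it.
By default, logs are written to files in the /logs directory.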

Java Agent deployment FAQs

Linux Tomcat 7, Tomcat 8
Edit tomcat/bin/catalina.sh and add the following at the top of the file:

CATALINA_OPTS="$CATALINA_OPTS -javaagent:/path/to/skywalking-agent/skywalking-agent.jar"; export CATALINA_OPTS

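Windows Tomcat 7, Tomcat 8
Edit tomcat/bin/catalina.bat and add the following at the top of the file:

set "CATALINA_OPTS=-javaagent:/path/to/skywalking-agent/skywalking-agent.jar"

JAR file

Add the -javaagent argument to the command line that starts your application, and make sure it comes before the -jar argument. For example: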

java -javaagent:/path/to/skywalking-agent/skywalking-agent.jar -jar yourApp.jar
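The ordering matters because the JVM treats everything after the JAR file name as arguments to the application itself. If launch commands are assembled by a script, the rule can be enforced in code. This is a small sketch under that assumption; with_javaagent is a hypothetical helper, not a SkyWalking tool.

```python
from typing import List


def with_javaagent(command: List[str], agent_jar: str) -> List[str]:
    """Return a copy of a `java ... -jar app.jar` command with -javaagent before -jar."""
    flag = "-javaagent:" + agent_jar
    if flag in command:
        # Already present; leave the command unchanged.
        return list(command)
    if "-jar" not in command:
        raise ValueError("expected a command that contains -jar")
    jar_index = command.index("-jar")
    # Insert the agent flag immediately before -jar so the JVM sees it as a JVM option.
    return command[:jar_index] + [flag] + command[jar_index:]
```

For example, with_javaagent(["java", "-jar", "yourApp.jar"], "/path/to/skywalking-agent/skywalking-agent.jar") yields the same argument order as the command shown above.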

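Custom trace filtering

An optional plugin, apm-trace-ignore-plugin, is provided for this.

Introduction

The plugin filters traces according to rules you define.
You can configure multiple URL paths to ignore, which means that traces for requests matching those paths are not sent from the agent to the collector.
Paths are currently matched in the Ant path style, for example /path/*, /path/**, /path/?.
Copy apm-trace-ignore-plugin-x.jar into agent/plugins and restart the agent for it to take effect.
The article "Skywalking-使用可选插件 apm-trace-ignore-plugin" describes its usage in detail.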
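To make the three wildcard forms concrete, here is a simplified sketch of Ant-style matching. It is an approximation written for illustration, not the plugin's own matcher, and the function names ant_to_regex and is_ignored are hypothetical. It assumes ? matches one character within a path segment, * matches any characters within a segment, and /** matches zero or more whole segments.

```python
import re
from typing import Iterable, Pattern


def ant_to_regex(pattern: str) -> Pattern[str]:
    """Translate a simplified Ant-style path pattern into a compiled regex."""
    parts = []
    i = 0
    while i < len(pattern):
        at_segment_end = i + 3 == len(pattern) or pattern[i + 3:i + 4] == "/"
        if pattern.startswith("/**", i) and at_segment_end:
            parts.append("(?:/[^/]*)*")  # zero or more path segments
            i += 3
        elif pattern.startswith("**", i):
            parts.append(".*")  # ** elsewhere: any characters, including /
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")  # any characters within one segment
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")  # exactly one character within a segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def is_ignored(path: str, patterns: Iterable[str]) -> bool:
    """Return True if the path fully matches any of the ignore patterns."""
    return any(ant_to_regex(p).fullmatch(path) for p in patterns)
```

Under these assumptions, /path/* covers /path/a but not /path/a/b, /path/** covers both, and /path/? covers /path/a but not /path/ab.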

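How to configure the paths

There are two ways to configure the paths, and you can use either one. In order of precedence, from highest to lowest:

Set it in the system environment variables: add skywalking.trace.ignore_path to the system variables, with the paths to ignore as its value, separated by commas.
Copy or move /agent/optional-plugins/apm-trace-ignore-plugin/apm-trace-ignore-plugin.config into the /agent/config/ directory and add the setting: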

trace.ignore_path=/your/path/1/**,/your/path/2/**
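The precedence described above can be sketched as follows. This only illustrates the rule (the system variable wins over the config file) and is not how the plugin itself loads its settings. The function names are hypothetical, and the environment is passed in as a plain mapping such as os.environ.

```python
from typing import List, Mapping, Optional


def parse_ignore_paths(value: str) -> List[str]:
    """Split a comma-separated path list, dropping blanks and surrounding spaces."""
    return [p.strip() for p in value.split(",") if p.strip()]


def read_config_value(config_text: str, key: str = "trace.ignore_path") -> Optional[str]:
    """Return the value for key from key=value config text, or None if absent."""
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None


def resolve_ignore_paths(env: Mapping[str, str], config_text: str) -> List[str]:
    """Prefer the system variable; fall back to the config file contents."""
    override = env.get("skywalking.trace.ignore_path")
    if override:
        return parse_ignore_paths(override)
    return parse_ignore_paths(read_config_value(config_text) or "")
```

With an empty environment and the config line above, resolve_ignore_paths returns the two paths from the file. If the variable is set, the file's value is ignored entirely rather than merged.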

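Issues found in use

In practice, the apm-springmvc-annotation-3/4.x-plugin-5.0.0-RC2 plugins caused every property of @RequestBody objects in non-GET Spring MVC requests to be null. Moving the plugins out of the plugins directory restored normal behavior. The root cause has not yet been identified.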

Article link: https://www.haomeiwen.com/subject/uxfbaftx.html
