DB2 Tuning (2): Resource Monitoring


Author: 吉米曲 | Source: published 2018-01-30 15:29

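    This tuning project involves many components, so ideally everything in the production environment would be monitored, at the lowest possible overhead. Monitoring therefore runs on two kinds of server, the application server and the database server: nmon provides the baseline data, and the JVM, the database alert log, and database snapshots are monitored alongside it.
    Monitoring is only a means. Performance optimization rests on extracting regular, meaningful data from the mass of monitoring logs, and once the data is available the work becomes analysis. This article first introduces the data that needs to be collected; the content comes from my experience on the project.
    Base environment:

    Two database servers, configured as a database cluster.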
    

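    Application server - JVM threads

    The project mainly uses tongweb (the legacy system runs a very old version). The monitoring output looks like this:

    Monitored content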

    ...
    "2018-01-11T02:25:55.663+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","NumConnCreated","10",
    "2018-01-11T02:25:55.663+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","NumConnAcquired","111292",
    "2018-01-11T02:25:55.663+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","NumConnNotSuccessfullyMatched","0",
    "2018-01-11T02:26:25.670+0800","com.tongtech.tongweb:type=jvm,category=monitor,server=server","UpTime","222520621",
    "2018-01-11T02:26:25.670+0800","com.tongtech.tongweb:type=jvm,category=monitor,server=server","HeapSize","2143485952",
    "2018-01-11T02:26:25.671+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","NumConnUsed","0",
    "2018-01-11T02:26:25.671+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","NumConnSuccessfullyMatched","0",
    "2018-01-11T02:26:25.671+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","WaitQueueLength","0",
    "2018-01-11T02:26:25.671+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","NumConnDestroyed","0",
    "2018-01-11T02:26:25.671+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","ConnRequestWaitTime","4",
    "2018-01-11T02:26:25.672+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","NumConnFailedValidation","0",
    "2018-01-11T02:26:25.672+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","NumConnReleased","111292",
    "2018-01-11T02:26:25.672+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","NumConnFree","10",
    ...
    

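    What to watch

    The tongweb monitoring data provides the connection pool status and related information. Our approach is to convert the log content into readable data with an Excel macro and then chart it. The details will be covered separately.
    JVM thread monitoring notes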
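As a rough alternative to the Excel macro, the conversion step can be sketched in Python. This is only an illustration under assumptions: the function name `parse_monitor_log` is made up here, every record is assumed to be a quoted CSV line with a trailing comma as in the excerpt above, and every value is assumed to be an integer.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Three records copied from the excerpt above, plus an elision marker.
SAMPLE = '''...
"2018-01-11T02:25:55.663+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","NumConnCreated","10",
"2018-01-11T02:26:25.670+0800","com.tongtech.tongweb:type=jvm,category=monitor,server=server","HeapSize","2143485952",
"2018-01-11T02:26:25.671+0800","com.tongtech.tongweb:name=***,type=jdbc-connection-pool,category=monitor,server=server","NumConnFree","10",
'''


def parse_monitor_log(text):
    """Group records by metric name: {metric: [(timestamp, value), ...]}."""
    series = defaultdict(list)
    for row in csv.reader(io.StringIO(text)):
        # Skip "..." markers and anything that is not a full record.
        if len(row) < 4:
            continue
        stamp, _mbean, metric, value = row[:4]
        when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%f%z")
        series[metric].append((when, int(value)))  # assumes integer values
    return dict(series)


if __name__ == "__main__":
    for metric, points in parse_monitor_log(SAMPLE).items():
        for when, value in points:
            print(when.isoformat(), metric, value)
```

Grouping by metric name gives one time series per counter, which is the shape a charting tool usually expects.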

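    Why monitor

    Monitoring the tongweb JVM gives a first indication of when the performance peaks occur and whether the connection pool is full. It also helps determine whether the bottleneck during connection peaks lies in the application. This matters for later performance analysis: the main performance problems can be categorized, which avoids unnecessary work.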
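One way to check whether the pool is full is to flag sampling instants where no connection is free or where requests are queuing. The sketch below is hypothetical: the function `pool_pressure` and the second sample are invented for illustration, and the reading of `NumConnFree` and `WaitQueueLength` as "free connections" and "waiting requests" is inferred from the metric names, not from tongweb documentation.

```python
def pool_pressure(samples):
    """Return the timestamps at which the pool looks exhausted.

    samples: list of dicts, one per sampling instant, holding a "time"
    key plus integer metrics keyed by name.
    """
    flagged = []
    for sample in samples:
        waiting = sample.get("WaitQueueLength", 0) > 0
        exhausted = sample.get("NumConnFree", 1) == 0
        if waiting or exhausted:
            flagged.append(sample["time"])
    return flagged


SAMPLES = [
    # Values taken from the excerpt above: idle pool, nothing waiting.
    {"time": "2018-01-11T02:26:25", "NumConnFree": 10,
     "NumConnUsed": 0, "WaitQueueLength": 0},
    # Hypothetical peak-hour sample, invented for this example.
    {"time": "2018-01-11T10:00:25", "NumConnFree": 0,
     "NumConnUsed": 10, "WaitQueueLength": 3},
]

if __name__ == "__main__":
    print(pool_pressure(SAMPLES))
```

Timestamps flagged this way can then be compared against the nmon and database data for the same period to see whether the application or the database is the more likely bottleneck.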

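    Application server - netstat

    netstat is usually defined as a program that accesses network connection state and related information in the kernel. It reports on TCP connections, TCP and UDP listeners, and process memory management.

    Monitored content

    The following is an excerpt from the logs collected in the project: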

    ...
    Active Internet connections (servers and established)
    Proto Recv-Q Send-Q Local Address           Foreign Address         State      
    tcp        0      0 0.0.0.0:2049            0.0.0.0:*               LISTEN      
    tcp        0      0 0.0.0.0:139             0.0.0.0:*               LISTEN      
    tcp        0      0 0.0.0.0:427             0.0.0.0:*               LISTEN
    tcp        0      0 127.0.0.1:427           0.0.0.0:*               LISTEN      
    tcp        0      0 0.0.0.0:58862           0.0.0.0:*               LISTEN      
    tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      
    tcp        0      0 0.0.0.0:2544            0.0.0.0:*               LISTEN      
    tcp        0      0 0.0.0.0:21              0.0.0.0:*               LISTEN      
    tcp        0      0 0.0.0.0:631             0.0.0.0:*               LISTEN      
    tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      
    tcp        0      0 0.0.0.0:445             0.0.0.0:*               LISTEN      
    tcp        0      0 0.0.0.0:669             0.0.0.0:*               LISTEN  
    ...
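To turn periodic netstat captures into numbers that can be charted, one option is to count TCP rows by state. The sketch below assumes the Linux column layout shown in the excerpt (protocol first, state sixth); the function name `count_tcp_states` is made up for this example.

```python
from collections import Counter

# Header plus three rows copied from the excerpt above.
SAMPLE = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:2049            0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:139             0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN
"""


def count_tcp_states(text):
    """Count TCP rows per connection state (LISTEN, ESTABLISHED, ...)."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        # TCP rows: proto, recv-q, send-q, local, foreign, state.
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            counts[fields[5]] += 1
    return counts


if __name__ == "__main__":
    for state, n in count_tcp_states(SAMPLE).items():
        print(state, n)
```

Running this over each capture and plotting the per-state counts over time makes sudden growth in a particular state easy to spot.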
    

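    Application server - nmon

    As the main analysis tool in this optimization effort, nmon plays an especially important role. The description below comes from its wiki and is worth reading when you have time: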

    nmon collects the following operating system statistics:
    CPU and CPU threads Utilisation
    CPU frequency for servers or virtual machines that can alter their clock rate
    GPU stats including utilisation, MHz and temperatures
    Physical and Virtual Memory use
    Disk read & write and transfers
    Disk Groups decided by the user
    Swap and Paging
    Network read & write and transfers
    Local File-systems
    Network File-system (NFS)
    Top Processes by CPU use, Memory size and I/O rates
    Kernel stats including Run Queue, context-switch, fork, Load Average & Uptime
    Large and Huge memory pages
    Virtual Machine stats (depending on the hardware) - useful for Linux running KVM to host virtual machines
    Resources in the Server and virtual machine

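    In short, nmon works like a snapshot of system performance overhead; combined with an nmon analysis tool, it gives a clear picture of every system metric.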
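For a quick look without a separate analysis tool, the capture file can also be read directly. The sketch below assumes the comma-separated layout nmon writes in file mode: a `ZZZZ` line carrying the wall-clock time of each snapshot, and `CPU_ALL` lines whose first occurrence names the columns. The sample lines and the function name `cpu_busy_by_snapshot` are invented for illustration and are not taken from the project's data; column names can differ between nmon versions.

```python
# Made-up capture fragment in the assumed nmon file layout.
SAMPLE = """\
CPU_ALL,CPU Total appserver,User%,Sys%,Wait%,Idle%,Busy,CPUs
ZZZZ,T0001,02:25:55,11-JAN-2018
CPU_ALL,T0001,12.5,3.5,1.0,83.0,,4
ZZZZ,T0002,02:26:25,11-JAN-2018
CPU_ALL,T0002,40.0,10.0,5.0,45.0,,4
"""


def cpu_busy_by_snapshot(text):
    """Return [(clock_time, user% + sys%), ...] for each CPU_ALL sample."""
    header = None
    clock = {}   # snapshot id (T0001, ...) -> wall-clock time
    busy = []
    for line in text.splitlines():
        parts = line.strip().split(",")
        if parts[0] == "ZZZZ" and len(parts) >= 3:
            clock[parts[1]] = parts[2]
        elif parts[0] == "CPU_ALL":
            if header is None:
                header = parts  # first CPU_ALL line names the columns
                continue
            row = dict(zip(header[2:], parts[2:]))
            total = float(row["User%"]) + float(row["Sys%"])
            busy.append((clock.get(parts[1], parts[1]), total))
    return busy


if __name__ == "__main__":
    for when, pct in cpu_busy_by_snapshot(SAMPLE):
        print(when, pct)
```

Looking columns up by header name rather than by position keeps the sketch tolerant of versions that add extra columns.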

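    Database server - alerts

    Understanding the database alert log is also a key part of understanding current performance.

    The log looks like the following. When an error appears, analyze and resolve it according to the specific situation.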

    2018-01-11-00.36.36.090562+480 I13363168A459      LEVEL: Error
    PID     : 2228842              TID  : 142490      PROC : db2sysc
    INSTANCE: db2             NODE : 000         DB   : TRADE
    EDUID   : 142490               EDUNAME: db2agent (**) 0
    FUNCTION: DB2 UDB, Query Gateway, sqlqg_fedstp_hook, probe:40
    MESSAGE : Unexpected error returned from outer RC=
    DATA #1 : Hexdump, 4 bytes
    0x07000007053F28D0 : 8126 0012                                  .&..
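With a large log, a first pass can group records by severity and reporting function so that recurring errors stand out. The sketch below assumes every record starts with a timestamp line containing `LEVEL:` and has a `FUNCTION:` line, as in the record above; the helper name `summarize` is hypothetical.

```python
import re
from collections import Counter

# Lines copied from the record above.
SAMPLE = """\
2018-01-11-00.36.36.090562+480 I13363168A459      LEVEL: Error
PID     : 2228842              TID  : 142490      PROC : db2sysc
INSTANCE: db2             NODE : 000         DB   : TRADE
FUNCTION: DB2 UDB, Query Gateway, sqlqg_fedstp_hook, probe:40
MESSAGE : Unexpected error returned from outer RC=
"""

# Record header: timestamp, timezone offset, record id, severity level.
HEADER = re.compile(
    r"^(\d{4}-\d{2}-\d{2}-\d{2}\.\d{2}\.\d{2}\.\d+)[+-]\d+\s+\S+\s+LEVEL:\s*(\w+)"
)
FUNCTION = re.compile(r"^FUNCTION:\s*(.+)$")


def summarize(text):
    """Count records per (level, function) pair."""
    counts = Counter()
    level = None
    for line in text.splitlines():
        line = line.strip()
        header = HEADER.match(line)
        if header:
            level = header.group(2)
            continue
        function = FUNCTION.match(line)
        if function and level:
            counts[(level, function.group(1).strip())] += 1
            level = None  # wait for the next record header
    return counts


if __name__ == "__main__":
    for (level, function), n in summarize(SAMPLE).most_common():
        print(n, level, function)
```

Sorting by count puts the most frequent error sources first, which helps decide which ones to investigate in detail.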
    
    

    Database server - snapshots

    Database snapshots serve as the main basis for analysis; a snapshot shows where the database spends its time. For example:

    ...
    
    Number of automatic storage paths          = 1
    Automatic storage path                     = /db2data
          Node number                          = 0
          State                                = In Use
          File system ID                       = 9223372079804448776
          Storage path free space (bytes)      = 69730709504
          File system used space (bytes)       = 139648946176
          File system total space (bytes)      = 209379655680
    
    ...
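Snapshot output is mostly `name = value` lines, so individual figures are easy to pull out for trending. The sketch below parses the storage path section shown above and derives the file system usage; the function names are made up, and in a full snapshot, where the same key can appear more than once, a flat dictionary like this keeps only the last occurrence.

```python
# Storage path section copied from the excerpt above.
SAMPLE = """\
Number of automatic storage paths          = 1
Automatic storage path                     = /db2data
      Node number                          = 0
      State                                = In Use
      File system ID                       = 9223372079804448776
      Storage path free space (bytes)      = 69730709504
      File system used space (bytes)       = 139648946176
      File system total space (bytes)      = 209379655680
"""


def parse_snapshot(text):
    """Collect 'name = value' lines into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" = ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def used_percent(fields):
    """File system usage as a percentage of total space."""
    used = int(fields["File system used space (bytes)"])
    total = int(fields["File system total space (bytes)"])
    return 100.0 * used / total


if __name__ == "__main__":
    snapshot = parse_snapshot(SAMPLE)
    print(snapshot["Automatic storage path"],
          round(used_percent(snapshot), 1))  # /db2data 66.7
```

For the excerpt, free plus used space equals the total, and usage comes out at about 66.7%.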
    
    
    

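    This article only lists the analysis methods; I will write up the concrete steps gradually as time allows.
    Making good use of tools matters, but performance tuning is more than that: you have to advance step by step and be prepared for a long campaign.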
