Analyzing nginx logs with GoAccess


Author: 三杯水Plus | Published 2020-09-18 12:23

    Introduction to GoAccess

    GoAccess is an open-source (MIT-licensed), real-time web log analyzer with an interactive viewer. It runs in a terminal on *nix systems or through your web browser.

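    It provides fast and valuable HTTP statistics for system administrators who need a visual server report on the fly. GoAccess parses the specified web log file and outputs the statistics to the X terminal. Its features are as follows: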

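    • General statistics: This panel shows several key metrics, such as the number of valid and invalid requests, the time taken to analyze the data, unique visitors, requested files, static files (CSS, ICO, JPG, etc.) and their full URLs, 404 errors, the size of the parsed log file, and the bandwidth consumed.
    • Unique visitors: This panel shows hits, unique visitors, and cumulative bandwidth per date. HTTP requests with the same IP, the same date, and the same User-Agent are counted as one unique visitor. Web crawlers are included by default.
      You can optionally pass --date-spec=hr to switch the analysis from per-day to per-hour, for example 05/Jun/2016:16. This is useful for tracking daily traffic at hourly granularity.
    • Requested files: This panel shows the most requested files on your server, with hits, unique visitors, percentage, cumulative bandwidth, protocol used, and request method.
    • Requested static files: Lists the most frequently requested static file types, such as JPG, CSS, SWF, JS, GIF, and PNG, with the same metrics as the previous panel. Additional static file extensions can be added in the configuration file.
    • 404 or not found: Shows the same metrics as the previous panels, but its data covers all pages that were not found, commonly known as the 404 status code.
    • Hosts: This panel shows detailed information about the hosts themselves. It is well suited to spotting aggressive crawlers and identifying who is eating your bandwidth.
      The expanded panel shows more information, such as the host's reverse DNS lookup result and its country and city. If the -a option is enabled, selecting an IP address and pressing Enter displays its list of User-Agents.
    • Operating systems: This panel shows which operating systems the hosts are using. GoAccess tries to provide as much detail as possible for each operating system.
    • Browsers: This panel shows which browsers the visiting hosts are using. GoAccess tries to provide as much detail as possible for each browser.
    • Time distribution: This panel reports by hour of the day, so it shows 24 data points, one for each hour.
      With --hour-spec=min it reports in ten-minute intervals instead and displays the time in the form 16:4. This helps in spotting peak traffic periods on the server.
    • Virtual hosts: This panel shows the different virtual hosts parsed from the access log. It is displayed only if %v is used in the log format.
    • Referrer URLs: If a host reached your site through another resource, or was linked or redirected to you from another host, the referring URL is shown in this panel. This panel is disabled by default; see the ignore-panel setting in the configuration file to enable it.
    • Referring sites: This panel shows only the host part, not the full URL.
    • Keyphrases: Reports the keyphrases used on Google Search, Google Cache, and Google Translate. Currently only Google Search over HTTP is supported. This panel is disabled by default; see the ignore-panel setting in the configuration file to enable it.
    • Geo location: Determines the geographic location from the IP address. Statistics are grouped by continent and country. Requires GoAccess to have geolocation (GeoIP) support.
    • HTTP status codes: The numeric status codes of the HTTP requests.
    • Remote user (HTTP authentication): The user that HTTP authentication identified as having permission to access the document. If the document is not password protected, this field is shown as "-". This panel is not enabled unless %e is given in the log format.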
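
The unique-visitor rule described in the list above (same IP, same date, same User-Agent) can be illustrated outside GoAccess with a short awk pipeline. This is only a sketch of the counting rule, not GoAccess's own implementation; the function name `count_unique_visitors`, the three sample lines, and the use of the standard combined log format are assumptions made for the example.

```shell
# Hypothetical sample lines in the standard "combined" format (not from a real server).
sample_log='203.0.113.5 - - [05/Jun/2016:16:01:02 +0800] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"
203.0.113.5 - - [05/Jun/2016:16:09:40 +0800] "GET /a.css HTTP/1.1" 200 128 "-" "Mozilla/5.0"
203.0.113.9 - - [05/Jun/2016:17:30:00 +0800] "GET / HTTP/1.1" 200 512 "-" "curl/7.29.0"'

# Split each line on double quotes: field 1 holds the IP and timestamp,
# field 6 holds the User-Agent. Count distinct (date, IP, UA) triples per date.
count_unique_visitors() {
    awk -F'"' '{
        split($1, head, " ")              # head[1] = IP, head[4] = [date:time
        date = substr(head[4], 2, 11)     # e.g. 05/Jun/2016
        key = date SUBSEP head[1] SUBSEP $6
        if (!(key in seen)) { seen[key] = 1; visitors[date]++ }
    }
    END { for (d in visitors) print d, visitors[d] }'
}

printf '%s\n' "$sample_log" | count_unique_visitors
```

The first two requests share IP, date, and User-Agent, so they collapse into one visitor; the third differs in IP and User-Agent and counts separately.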

    Using GoAccess

    Installing goaccess

    [root@VM_0_26_centos logs]# yum install goaccess
    Loaded plugins: fastestmirror, langpacks
    Repository epel is listed more than once in the configuration
    epel                                                                | 4.7 kB  00:00:00     
    extras                                                              | 2.9 kB  00:00:00     
    nux-dextop                                                          | 2.9 kB  00:00:00     
    os                                                                  | 3.6 kB  00:00:00     
    rpmfusion-free-updates                                              | 3.7 kB  00:00:00     
    rpmfusion-nonfree-updates                                           | 3.7 kB  00:00:00     
    updates                                                             | 2.9 kB  00:00:00     
    zabbix                                                              | 2.9 kB  00:00:00     
    zabbix-non-supported                                                |  951 B  00:00:00     
    (1/2): epel/7/x86_64/updateinfo                                     | 1.0 MB  00:00:00     
    (2/2): epel/7/x86_64/primary_db                                     | 6.9 MB  00:00:02     
    Loading mirror speeds from cached hostfile
     * nux-dextop: mirror.li.nux.ro
     * rpmfusion-free-updates: mirrors.ustc.edu.cn
     * rpmfusion-nonfree-updates: mirrors.ustc.edu.cn
    Resolving Dependencies
    --> Running transaction check
    ---> Package goaccess.x86_64 0:1.3-1.el7 will be installed
    --> Processing Dependency: libtokyocabinet.so.9()(64bit) for package: goaccess-1.3-1.el7.x86_64
    --> Running transaction check
    ---> Package tokyocabinet.x86_64 0:1.4.48-3.el7 will be installed
    --> Finished Dependency Resolution
    
    Dependencies Resolved
    
    ===========================================================================================
     Package                 Arch              Version                   Repository       Size
    ===========================================================================================
    Installing:
     goaccess                x86_64            1.3-1.el7                 epel            240 k
    Installing for dependencies:
     tokyocabinet            x86_64            1.4.48-3.el7              os              459 k
    
    Transaction Summary
    ===========================================================================================
    Install  1 Package (+1 Dependent package)
    
    Total download size: 699 k
    Installed size: 2.0 M
    Is this ok [y/d/N]: y
    Downloading packages:
    (1/2): goaccess-1.3-1.el7.x86_64.rpm                                | 240 kB  00:00:00     
    (2/2): tokyocabinet-1.4.48-3.el7.x86_64.rpm                         | 459 kB  00:00:00     
    -------------------------------------------------------------------------------------------
    Total                                                      1.3 MB/s | 699 kB  00:00:00     
    Running transaction check
    Running transaction test
    Transaction test succeeded
    Running transaction
      Installing : tokyocabinet-1.4.48-3.el7.x86_64                                        1/2 
      Installing : goaccess-1.3-1.el7.x86_64                                               2/2 
      Verifying  : tokyocabinet-1.4.48-3.el7.x86_64                                        1/2 
      Verifying  : goaccess-1.3-1.el7.x86_64                                               2/2 
    
    Installed:
      goaccess.x86_64 0:1.3-1.el7                                                              
    
    Dependency Installed:
      tokyocabinet.x86_64 0:1.4.48-3.el7 
    

    Viewing the usage

    [root@VM_0_26_centos logs]# goaccess -help
    
    GoAccess - 1.3
    
    Usage: goaccess [filename] [ options ... ] [-c][-M][-H][-S][-q][-d][...]
    The following options can also be supplied to the command:
    
    Log & Date Format Options
    
      --date-format=<dateformat>      - Specify log date format. e.g., %d/%b/%Y
      --log-format=<logformat>        - Specify log format. Inner quotes need to be
                                        escaped, or use single quotes.
      --time-format=<timeformat>      - Specify log time format. e.g., %H:%M:%S
    
    User Interface Options
    
      -c --config-dialog              - Prompt log/date/time configuration window.
      -i --hl-header                  - Color highlight active panel.
      -m --with-mouse                 - Enable mouse support on main dashboard.
      --color=<fg:bg[attrs, PANEL]>   - Specify custom colors. See manpage for more
                                        details and options.
      --color-scheme=<1|2|3>          - Schemes: 1 => Grey, 2 => Green, 3 => Monokai.
      --html-custom-css=<path.css>    - Specify a custom CSS file in the HTML report.
      --html-custom-js=<path.js>      - Specify a custom JS file in the HTML report.
      --html-prefs=<json_obj>         - Set default HTML report preferences.
      --html-report-title=<title>     - Set HTML report page title and header.
      --json-pretty-print             - Format JSON output w/ tabs & newlines.
      --max-items                     - Maximum number of items to show per panel.
                                        See man page for limits.
      --no-color                      - Disable colored output.
      --no-column-names               - Don't write column names in term output.
      --no-csv-summary                - Disable summary metrics on the CSV output.
      --no-html-last-updated          - Hide HTML last updated field.
      --no-parsing-spinner            - Disable progress metrics and parsing spinner.
      --no-progress                   - Disable progress metrics.
      --no-tab-scroll                 - Disable scrolling through panels on TAB.
    
    Server Options
    
      --addr=<addr>                   - Specify IP address to bind server to.
      --daemonize                     - Run as daemon (if --real-time-html enabled).
      --fifo-in=<path>                - Path to read named pipe (FIFO).
      --fifo-out=<path>               - Path to write named pipe (FIFO).
      --origin=<addr>                 - Ensure clients send the specified origin header
                                        upon the WebSocket handshake.
      --pid-file=<path>               - Write PID to a file when --daemonize is used.
      --port=<port>                   - Specify the port to use.
      --real-time-html                - Enable real-time HTML output.
      --ssl-cert=<cert.crt>           - Path to TLS/SSL certificate.
      --ssl-key=<priv.key>            - Path to TLS/SSL private key.
      --ws-url=<url>                  - URL to which the WebSocket server responds.
    
    File Options
    
      -                               - The log file to parse is read from stdin.
      -f --log-file=<filename>        - Path to input log file.
      -S --log-size=<number>          - Specify the log size, useful when piping in logs.
      -l --debug-file=<filename>      - Send all debug messages to the specified
                                        file.
      -p --config-file=<filename>     - Custom configuration file.
      --invalid-requests=<filename>   - Log invalid requests to the specified file.
      --no-global-config              - Don't load global configuration file.
    
    Parse Options
    
      -a --agent-list                 - Enable a list of user-agents by host.
      -b --browsers-file=<path>       - Use additional custom list of browsers.
      -d --with-output-resolver       - Enable IP resolver on HTML|JSON output.
      -e --exclude-ip=<IP>            - Exclude one or multiple IPv4/6. Allows IP
                                        ranges e.g. 192.168.0.1-192.168.0.10
      -H --http-protocol=<yes|no>     - Set/unset HTTP request protocol if found.
      -M --http-method=<yes|no>       - Set/unset HTTP request method if found.
      -o --output=file.html|json|csv  - Output either an HTML, JSON or a CSV file.
      -q --no-query-string            - Ignore request's query string. Removing the
                                        query string can greatly decrease memory
                                        consumption.
      -r --no-term-resolver           - Disable IP resolver on terminal output.
      --444-as-404                    - Treat non-standard status code 444 as 404.
      --4xx-to-unique-count           - Add 4xx client errors to the unique visitors
                                        count.
      --anonymize-ip                  - Anonymize IP addresses before outputting to report.
      --all-static-files              - Include static files with a query string.
      --crawlers-only                 - Parse and display only crawlers.
      --date-spec=<date|hr>           - Date specificity. Possible values: `date`
                                        (default), or `hr`.
      --double-decode                 - Decode double-encoded values.
      --enable-panel=<PANEL>          - Enable parsing/displaying the given panel.
      --hide-referer=<NEEDLE>         - Hide a referer but still count it. Wild cards
                                        are allowed. i.e., *.bing.com
      --hour-spec=<hr|min>            - Hour specificity. Possible values: `hr`
                                        (default), or `min` (tenth of a min).
      --ignore-crawlers               - Ignore crawlers.
      --ignore-panel=<PANEL>          - Ignore parsing/displaying the given panel.
      --ignore-referer=<NEEDLE>       - Ignore a referer from being counted. Wild cards
                                        are allowed. i.e., *.bing.com
      --ignore-statics=<req|panel>    - Ignore static requests.
                                        req => Ignore from valid requests.
                                        panel => Ignore from valid requests and panels.
      --ignore-status=<CODE>          - Ignore parsing the given status code.
      --num-tests=<number>            - Number of lines to test. >= 0 (10 default)
      --process-and-exit              - Parse log and exit without outputting data.
      --real-os                       - Display real OS names. e.g, Windows XP, Snow
                                        Leopard.
      --sort-panel=PANEL,METRIC,ORDER - Sort panel on initial load. For example:
                                        --sort-panel=VISITORS,BY_HITS,ASC. See
                                        manpage for a list of panels/fields.
      --static-file=<extension>       - Add static file extension. e.g.: .mp3.
                                        Extensions are case sensitive.
    
    GeoIP Options
    
      -g --std-geoip                  - Standard GeoIP database for less memory
                                        consumption.
      --geoip-database=<path>         - Specify path to GeoIP database file. i.e.,
                                        GeoLiteCity.dat, GeoIPv6.dat ...
    
    Other Options
    
      -h --help                       - This help.
      -V --version                    - Display version information and exit.
      -s --storage                    - Display current storage method. e.g., B+
                                        Tree, Hash.
      --dcf                           - Display the path of the default config
                                        file when `-p` is not used.
    
    Examples can be found by running `man goaccess`.
    
    For more details visit: http://goaccess.io
    GoAccess Copyright (C) 2009-2017 by Gerardo Orellana
    

    Obtaining the nginx log format
    The format conversion script is available at https://github.com/stockrt/nginx2goaccess/blob/master/nginx2goaccess.sh; its contents are as follows:

    [root@VM_0_26_centos logs]# cat nginx2goaccess.sh 
    #!/bin/bash
    #
    # Convert from this:
    #   http://nginx.org/en/docs/http/ngx_http_log_module.html
    # To this:
    #   https://goaccess.io/man
    #
    # Conversion table:
    #   $time_local         %d:%t %^
    #   $host               %v
    #   $http_host          %v
    #   $remote_addr        %h
    #   $request_time       %T
    #   $request_method     %m
    #   $request_uri        %U
    #   $server_protocol    %H
    #   $request            %r
    #   $status             %s
    #   $body_bytes_sent    %b
    #   $bytes_sent         %b
    #   $http_referer       %R
    #   $http_user_agent    %u
    #
    # Samples:
    #
    # log_format combined '$remote_addr - $remote_user [$time_local] '
    # '"$request" $status $body_bytes_sent '
    # '"$http_referer" "$http_user_agent"';
    #   ./nginx2goaccess.sh '$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent"'
    #
    # log_format compression '$remote_addr - $remote_user [$time_local] '
    # '"$request" $status $bytes_sent '
    # '"$http_referer" "$http_user_agent" "$gzip_ratio"';
    #   ./nginx2goaccess.sh '$remote_addr - $remote_user [$time_local] "$request" $status $bytes_sent "$http_referer" "$http_user_agent" "$gzip_ratio"'
    #
    # log_format main
    # '$remote_addr\t$time_local\t$host\t$request\t$http_referer\t$http_x_mobile_group\t'
    # 'Local:\t$status\t$body_bytes_sent\t$request_time\t'
    # 'Proxy:\t$upstream_cache_status\t$upstream_status\t$upstream_response_length\t$upstream_response_time\t'
    # 'Agent:\t$http_user_agent\t'
    # 'Fwd:\t$http_x_forwarded_for';
    #   ./nginx2goaccess.sh '$remote_addr\t$time_local\t$host\t$request\t$http_referer\t$http_x_mobile_group\tLocal:\t$status\t$body_bytes_sent\t$request_time\tProxy:\t$upstream_cache_status\t$upstream_status\t$upstream_response_length\t$upstream_response_time\tAgent:\t$http_user_agent\tFwd:\t$http_x_forwarded_for'
    #
    # log_format main
    # '${time_local}\t${remote_addr}\t${host}\t${request_method}\t${request_uri}\t${server_protocol}\t'
    # '${http_referer}\t${http_x_mobile_group}\t'
    # 'Local:\t${status}\t*${connection}\t${body_bytes_sent}\t${request_time}\t'
    # 'Proxy:\t${upstream_status}\t${upstream_cache_status}\t'
    # '${upstream_response_length}\t${upstream_response_time}\t${uri}${log_args}\t'
    # 'Agent:\t${http_user_agent}\t'
    # 'Fwd:\t${http_x_forwarded_for}';
    #   ./nginx2goaccess.sh '${time_local}\t${remote_addr}\t${host}\t${request_method}\t${request_uri}\t${server_protocol}\t${http_referer}\t${http_x_mobile_group}\tLocal:\t${status}\t*${connection}\t${body_bytes_sent}\t${request_time}\tProxy:\t${upstream_status}\t${upstream_cache_status}\t${upstream_response_length}\t${upstream_response_time}\t${uri}${log_args}\tAgent:\t${http_user_agent}\tFwd:\t${http_x_forwarded_for}'
    #
    # Author: Rogério Carvalho Schneider <stockrt@gmail.com>
    
    # Params
    log_format="$1"
    
    # Usage
    if [[ -z "$log_format" ]]; then
        echo "Usage: $0 '<log_format>'"
        exit 1
    fi
    
    # Variables map
    conversion_table="time_local,%d:%t_%^
    host,%v
    http_host,%v
    remote_addr,%h
    request_time,%T
    request_method,%m
    request_uri,%U
    server_protocol,%H
    request,%r
    status,%s
    body_bytes_sent,%b
    bytes_sent,%b
    http_referer,%R
    http_user_agent,%u"
    
    # Conversion
    for item in $conversion_table; do
        nginx_var=${item%%,*}
        goaccess_var=${item##*,}
        goaccess_var=${goaccess_var//_/ }
        log_format=${log_format//\$\{$nginx_var\}/$goaccess_var}
        log_format=${log_format//\$$nginx_var/$goaccess_var}
    done
    log_format=$(echo "$log_format" | sed 's/${[a-z_]*}/%^/g')
    log_format=$(echo "$log_format" | sed 's/$[a-z_]*/%^/g')
    
    # Config output
    echo "
    - Generated goaccess config:
    time-format %T
    date-format %d/%b/%Y
    log_format $log_format
    "
    
    # EOF
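
The conversion loop in the script relies on three Bash parameter expansions. The sketch below replays them on a single entry of the conversion table so that each step can be seen in isolation. The variable `fmt` and its shortened format string are made up for the illustration; everything else mirrors the script.

```shell
#!/bin/bash
# One "nginx_var,goaccess_var" pair taken from the script's conversion table.
item='time_local,%d:%t_%^'

nginx_var=${item%%,*}               # drop the longest suffix matching ",*" -> time_local
goaccess_var=${item##*,}            # drop the longest prefix matching "*," -> %d:%t_%^
goaccess_var=${goaccess_var//_/ }   # underscores stand in for spaces       -> %d:%t %^

# Hypothetical, shortened nginx format used only for this demonstration.
fmt='$remote_addr [$time_local] ${time_local}'
fmt=${fmt//\$\{$nginx_var\}/$goaccess_var}   # replace the ${time_local} form
fmt=${fmt//\$$nginx_var/$goaccess_var}       # replace the $time_local form

echo "$fmt"
```

The table stores `%d:%t_%^` with an underscore because `for item in $conversion_table` splits on whitespace; a literal space inside an entry would break it into two items. Variables without a table entry, such as `$remote_addr` here, are left for the script's final `sed` passes, which turn them into `%^`.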
    

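    Note: the log_format in the nginx configuration file is shown below. The string passed to the conversion script must match the log_format actually in use.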

          log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                            '$status $upstream_addr $body_bytes_sent "$http_referer" '
                            '"$http_user_agent" "$http_x_forwarded_for"';
    

    Generating the GoAccess log format

    [root@VM_0_26_centos logs]# sh nginx2goaccess.sh '$remote_addr - $remote_user [$time_local] "$request" $status $upstream_addr $body_bytes_sent "$http_referer" "$http_user_agent" "$http_x_forwarded_for"'
    
    - Generated goaccess config:
    time-format %T
    date-format %d/%b/%Y
    log_format %h - %^ [%d:%t %^] "%r" %s %^ %b "%R" "%u" "%^"
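
To see how the generated format lines up with an actual entry, it helps to take one log line apart by hand. The sketch below does this with awk for a hypothetical line written to match the `main` log_format above; the sample values and the function name `explain_line` are invented for the example. Fields mapped to `%^` (remote user, upstream address, X-Forwarded-For) are skipped, just as GoAccess ignores them.

```shell
# Hypothetical log line written to match the main log_format shown earlier.
line='203.0.113.5 - - [18/Sep/2020:12:23:00 +0800] "GET /index.html HTTP/1.1" 200 10.0.0.2:8080 612 "-" "curl/7.29.0" "-"'

# Print each value next to the GoAccess token that is meant to capture it.
explain_line() {
    awk -F'"' '{
        split($1, head, " ")    # IP, -, remote user, [date:time, zone]
        split($3, mid, " ")     # status, upstream address, body bytes
        stamp = substr(head[4], 2)
        colon = index(stamp, ":")
        print "%h " head[1]
        print "%d " substr(stamp, 1, colon - 1)
        print "%t " substr(stamp, colon + 1)
        print "%r " $2
        print "%s " mid[1]
        print "%b " mid[3]
        print "%R " $4
        print "%u " $6
    }'
}

printf '%s\n' "$line" | explain_line
```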
    

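    Setting the log format
    Copy the generated settings into the GoAccess configuration file. Note that GoAccess documents this configuration key as log-format (with a hyphen); if GoAccess rejects the log_format line shown below, rename the key.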

    [root@VM_0_26_centos logs]# cat /etc/goaccess/goaccess.conf 
    time-format %T
    date-format %d/%b/%Y
    log_format %h - %^ [%d:%t %^] "%r" %s %^ %b "%R" "%u" "%^"
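
If a separate per-project file is preferred over editing /etc/goaccess/goaccess.conf, the same three settings can be written with a here-document and passed to GoAccess through `-p`. A minimal sketch, assuming the file name `nginxlog.conf` (the name given to `-p` in the report command in this article) and the hyphenated `log-format` key that GoAccess documents for its configuration file:

```shell
# Write the three format settings to a local file. The quoted 'EOF' keeps
# the shell from interpreting %, ^ and " inside the here-document.
conf_file=./nginxlog.conf   # assumed name, matching the -p argument used for the report
cat > "$conf_file" <<'EOF'
time-format %T
date-format %d/%b/%Y
log-format %h - %^ [%d:%t %^] "%r" %s %^ %b "%R" "%u" "%^"
EOF

# Count how many of the three expected keys are present before running goaccess.
grep -c -E '^(time|date|log)-format ' "$conf_file"
```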
    

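    Generating the analysis report
    The -p option selects a custom configuration file; ./nginxlog.conf here presumably holds the same three format settings as the configuration shown above.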

    [root@VM_0_26_centos logs]# goaccess -f ./nginx_access.log -p ./nginxlog.conf -o day-report.html
    [root@VM_0_26_centos logs]# ls
    day-report.html      nginx_access.log             nginx2goaccess.sh           nginxlog.conf
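
For repeated use, the report command can be wrapped in a small function that checks its inputs first. A minimal sketch: the function name `generate_report` and its error messages are illustrative, and only the `-f`, `-p` and `-o` options listed in the help output are used.

```shell
# generate_report LOG CONF OUT: run goaccess only when the inputs are usable.
generate_report() {
    log=$1
    conf=$2
    out=$3
    [ -r "$log" ]  || { echo "cannot read log file: $log" >&2; return 1; }
    [ -r "$conf" ] || { echo "cannot read config file: $conf" >&2; return 1; }
    command -v goaccess >/dev/null 2>&1 || { echo "goaccess is not installed" >&2; return 127; }
    goaccess -f "$log" -p "$conf" -o "$out"
}

# Usage, with the file names from this article:
#   generate_report ./nginx_access.log ./nginxlog.conf day-report.html
```

Failing early on an unreadable log or configuration file gives a clearer message than letting goaccess start and fail during parsing.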
    

    Viewing the report
    Open day-report.html in a browser to view the report.

