nginx-kafka data collection

Author: XIAO_WS | Published: 2018-11-30 17:04

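    Traditional big-data collection usually has Flume collect the nginx logs and then passes the data on through Kafka.
    With ngx_kafka_module, nginx can send the data (user behavior logs) directly to Kafka.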

    Spending more time browsing GitHub is still a good way to learn a lot~

    nginx-kafka install script

    Note that the dependency packages differ between CentOS and Ubuntu.
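
One way to handle the CentOS/Ubuntu difference without commenting lines in and out is to pick the install command from whichever package manager is present. This is a minimal sketch, not part of the author's script: the function name `deps_install_cmd` is hypothetical, and the package lists are copied from install-nginx-kafka.sh.

```shell
#!/bin/bash
# Hypothetical helper: print the dependency install command that matches
# the package manager found on this machine.
deps_install_cmd() {
    if command -v apt-get >/dev/null 2>&1; then
        # Debian/Ubuntu package names
        echo "apt-get install -y gcc g++ libpcre3 libpcre3-dev zlib1g-dev libssl-dev make git wget curl vim"
    elif command -v yum >/dev/null 2>&1; then
        # CentOS/RHEL package names
        echo "yum install -y gcc gcc-c++ pcre-devel zlib-devel make git wget curl vim"
    else
        echo "no supported package manager found (apt-get or yum)" >&2
        return 1
    fi
}
```

It could be used as `cmd=$(deps_install_cmd) && sudo $cmd`, after refreshing the package index as the script does.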

    install-nginx-kafka.sh

    #!/bin/bash
    
    # CentOS
    #sudo yum update -y; sudo yum install -y gcc gcc-c++ pcre-devel zlib-devel make git wget curl vim
    # Ubuntu
    sudo apt-get update; sudo apt-get install -y gcc g++ libpcre3 libpcre3-dev zlib1g-dev libssl-dev make git wget curl vim
    
    cd /tmp
    git clone https://github.com/edenhill/librdkafka
    git clone https://github.com/brg-liuwei/ngx_kafka_module
    wget http://nginx.org/download/nginx-1.15.5.tar.gz
    
    cd /tmp/librdkafka
    ./configure && make && sudo make install
    
    # the nginx tarball was downloaded to /tmp, so extract it there
    cd /tmp
    tar -zxvf nginx-1.15.5.tar.gz
    
    cd /tmp/nginx-1.15.5
    ./configure --prefix=/usr/local/nginx_kafka --add-module=/tmp/ngx_kafka_module && make && sudo make install
    sudo ln -s /usr/local/nginx_kafka/sbin/nginx /usr/local/bin/nginx-kafka
    
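    # "sudo echo ... >> file" fails because the redirect runs as the calling user; use tee instead
    echo "/usr/local/lib" | sudo tee -a /etc/ld.so.conf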
    sudo ldconfig
    
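    1. Update the package sources and install the dependency libraries and tools
    2. Download the librdkafka, ngx_kafka_module, and nginx sources
    3. Build and install librdkafka
    4. Unpack the nginx source, then build and install it with ngx_kafka_module
    5. For convenience, create an nginx-kafka symlink (so it does not clash with any other nginx)
    6. If nginx fails at startup because it cannot find librdkafka.so.1, the error looks like this:
      error while loading shared libraries: librdkafka.so.1: cannot open shared object file: No such file or directory
    7. Register the library directory and refresh the linker cache (as root):
      echo "/usr/local/lib" >> /etc/ld.so.conf; ldconfig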
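
The shared-library steps can be checked and made safe to rerun. The following is a minimal sketch under assumptions: the function names are hypothetical, and `ldconfig -p` (which prints the linker cache) is the glibc form found on CentOS and Ubuntu.

```shell
#!/bin/bash
# Return 0 if the dynamic linker cache already knows about librdkafka.
has_librdkafka() {
    ldconfig -p 2>/dev/null | grep -q 'librdkafka\.so'
}

# Append a directory to the linker config only if it is not listed yet,
# so rerunning the install script does not add duplicate lines.
# $1 = directory, $2 = config file (defaults to /etc/ld.so.conf)
add_lib_dir() {
    local dir="$1"
    local conf="${2:-/etc/ld.so.conf}"
    grep -qxF "$dir" "$conf" 2>/dev/null || echo "$dir" >> "$conf"
}
```

Run as root, `has_librdkafka || { add_lib_dir /usr/local/lib && ldconfig; }` would only touch the configuration when the library is actually missing from the cache.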

    nginx-kafka.conf

    #user  nobody;
    worker_processes  1;
    #error_log  logs/error.log;
    #error_log  logs/error.log  notice;
    #error_log  logs/error.log  info;
    #pid        logs/nginx.pid;
    events {
        worker_connections  1024;
    }
    
    http {
        include       mime.types;
        default_type  application/octet-stream;
        #log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
        #                  '$status $body_bytes_sent "$http_referer" '
        #                  '"$http_user_agent" "$http_x_forwarded_for"';
        #access_log  logs/access.log  main;
        sendfile        on;
        #tcp_nopush     on;
        #keepalive_timeout  0;
        keepalive_timeout  65;
        #gzip  on;
        
        kafka;
        kafka_broker_list kafka-1:9092 kafka-2:9092 kafka-3:9092;  
        
        server {
            listen       80;
            server_name  localhost;
            #charset koi8-r;
            #access_log  logs/host.access.log  main;
            location = /kafka/log {
                    kafka_topic log;
            }
            location = /kafka/user {
                    kafka_topic user;
            }
            #error_page  404              /404.html;
            # redirect server error pages to the static page /50x.html
            #
            error_page   500 502 503 504  /50x.html;
            location = /50x.html {
                root   html;
            }
        }
    }
    
    1. kafka_broker_list specifies the Kafka cluster; list each broker as ip:port or host:port
    2. location blocks map URLs to topics, so each topic gets its own URL
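
With the configuration above, a quick way to exercise a location is to POST a message with curl. This is a sketch under assumptions: it assumes, as in the ngx_kafka_module README example, that the request body is what gets sent to the topic; the function names, the `NGINX_HOST` variable, and the sample JSON payload are hypothetical.

```shell
#!/bin/bash
# Build the collection URL for a topic, following the
# "location = /kafka/<topic>" layout in nginx-kafka.conf.
kafka_url() {
    echo "http://${NGINX_HOST:-localhost}/kafka/$1"
}

# POST one message to the topic's location and print the HTTP status code.
# $1 = topic, $2 = message body
send_event() {
    curl -s -o /dev/null -w '%{http_code}\n' -d "$2" "$(kafka_url "$1")"
}
```

For example, `send_event log '{"uid": 1, "action": "click"}'` posts one user-behavior record to the log topic's URL.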

    Starting nginx

    • Start the ZooKeeper and Kafka clusters (and create the topics)
      Omitted here.

    • Test the configuration file
      nginx-kafka -c nginx-kafka.conf -t

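    • Start nginx-kafka
      nginx-kafka -c nginx-kafka.conf
      (-s reload only signals an nginx that is already running; use it after changing the configuration: nginx-kafka -c nginx-kafka.conf -s reload)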

    • enjoy
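
The startup steps can be sketched as a small helper. This is a minimal sketch, not the author's script: the Kafka install path (`KAFKA_HOME`), the partition and replication counts, and all function names are assumptions for illustration. The pid file path is nginx's default under the `--prefix` used by the install script. `kafka-topics.sh` accepts `--bootstrap-server` from Kafka 2.2 on; older releases take `--zookeeper` with a ZooKeeper address instead.

```shell
#!/bin/bash
# Hypothetical helper for the startup steps; every default can be
# overridden through the environment.
NGINX_BIN="${NGINX_BIN:-nginx-kafka}"
NGINX_CONF="${NGINX_CONF:-nginx-kafka.conf}"
PID_FILE="${PID_FILE:-/usr/local/nginx_kafka/logs/nginx.pid}"

# Print the command that creates one topic.
# Assumed defaults: Kafka under /opt/kafka, 1 partition, replication factor 1.
topic_create_cmd() {
    echo "${KAFKA_HOME:-/opt/kafka}/bin/kafka-topics.sh --create" \
        "--bootstrap-server ${BOOTSTRAP:-kafka-1:9092}" \
        "--topic $1 --partitions ${PARTITIONS:-1}" \
        "--replication-factor ${REPLICATION:-1}"
}

# Print "reload" if the pid file points at a live process, else "start".
# kill -0 only probes the process; run as the user that owns nginx (root).
nginx_action() {
    if [ -s "$PID_FILE" ] && kill -0 "$(cat "$PID_FILE")" 2>/dev/null; then
        echo reload
    else
        echo start
    fi
}

# Validate the configuration, then start nginx or reload a running one.
start_or_reload() {
    "$NGINX_BIN" -c "$NGINX_CONF" -t || return 1
    if [ "$(nginx_action)" = reload ]; then
        "$NGINX_BIN" -c "$NGINX_CONF" -s reload
    else
        "$NGINX_BIN" -c "$NGINX_CONF"
    fi
}
```

A run might look like `for t in log user; do eval "$(topic_create_cmd "$t")"; done` on a Kafka host, followed by `start_or_reload` as root on the nginx host.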
