Search Engine

Author: Miracle001 | Published 2018-05-28 11:21
    Workflow
    Mechanism 1
    Mechanism 2
    index/search
    index -- database, type -- table, document -- row
    Inverted index
    Tokenization -- building the index
    Search component
    Indexing component
    Tagging -- building documents -- documentization
    build query -- e.g. whether to search for xxx
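    A toy sketch of the inverted-index idea in shell (the document ids and terms are invented for illustration; Elasticsearch builds real inverted indexes per field via Lucene):

```shell
# Toy inverted index: map each term to the ids of the documents containing it.
docs='1 elk stack guide
2 mysql guide'
index=$(printf '%s\n' "$docs" |
  while read -r id words; do
    for w in $words; do printf '%s %s\n' "$w" "$id"; done   # emit (term, doc-id) pairs
  done |
  sort |
  awk '{p[$1] = p[$1] " " $2} END {for (t in p) print t ":" p[t]}' |  # merge postings lists
  sort)
echo "$index"
```

    Searching for a term is then just a lookup of its postings list: a query for "guide" returns documents 1 and 2.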
    
    Filebeat   Real-time insight into log data.
    
    index  the database, split into multiple shards
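    Which of those shards a document lands on is deterministic; a minimal sketch, with a trivial stand-in for the hash (Elasticsearch actually murmur3-hashes the routing value, by default the document _id):

```shell
# shard = hash(_routing) % number_of_primary_shards
num_shards=5
doc_id=12                        # stand-in for hash(_id)
shard=$(( doc_id % num_shards ))
echo "document $doc_id -> shard $shard"
```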
    
    Hostname resolution  best not to rely on a DNS service
    vim /etc/hosts
    192.168.1.6  node1.fgq.com  node1
    192.168.1.7  node2.fgq.com  node2
    192.168.1.5  node3.fgq.com  node3
    192.168.1.10  node4.fgq.com  node4
    Time synchronization
    Download: https://www.elastic.co/cn/downloads  (RPM packages)
    
    node1/2/3
    rz  upload elasticsearch-6.2.4.rpm
    yum -y install java-1.8.0-openjdk-devel
    rpm -ivh elasticsearch-6.2.4.rpm
    rpm -ql elasticsearch |less  memory-hungry -- at least 2 GB recommended
    mkdir -pv /els/{data,logs}  keep the data out of the default /var/lib/
    tail /etc/passwd
    chown -R elasticsearch.elasticsearch /els/*
    cp /etc/elasticsearch/elasticsearch.yml{,.bak}
    vim /etc/elasticsearch/elasticsearch.yml
      cluster.name: myels
      node.name: node1/2/3  shows up in the logs
      rack awareness  not defined here
      path.data: /els/data
      path.logs: /els/logs
      network.host: 192.168.1.6/7/5  this node's own IP
      http.port: 9200  9200 serves clients; 9300 is the intra-cluster transport port
      discovery.zen.ping.unicast.hosts: ["node1", "node2","node3"]  the names must resolve via /etc/hosts, otherwise use IP addresses
      discovery.zen.minimum_master_nodes: 2  half the master-eligible nodes + 1, the quorum mechanism
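      The minimum_master_nodes value follows the usual quorum formula; for the 3-node cluster here:

```shell
# quorum = floor(master_eligible_nodes / 2) + 1
# so a 3-node cluster still has a majority of 2 after losing one node
nodes=3
quorum=$(( nodes / 2 + 1 ))
echo "minimum_master_nodes should be $quorum"
```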
    vim /etc/elasticsearch/jvm.options  with other installation methods the config file may not be here
      -Xms1g  default
      -Xmx1g
      leave the rest at the defaults
    Note: on node2/3, change the node name and IP address accordingly
    systemctl start elasticsearch.service
    free -m
    ss -ntl  ports 9300/9200 should be listening
    curl -XGET 'http://192.168.1.6:9200'
      or  curl -XGET '192.168.1.6:9200'
    curl -XGET 'http://192.168.1.7:9200'
    curl -XGET 'http://192.168.1.5:9200'  if it returns "You Know, for Search", the cluster node is up
      the ?pretty parameter gives nicely formatted output
    curl -XGET 'http://192.168.1.6:9200/_cluster/health?pretty'
    curl -XGET 'http://192.168.1.6:9200/_cluster/stats?pretty'
    curl -XGET 'http://192.168.1.6:9200/_cat?pretty'
    curl -XGET 'http://192.168.1.6:9200/_cat/nodes?pretty'
    curl -XGET 'http://192.168.1.6:9200/_cat/plugins?pretty'  no plugins installed yet
    curl -XGET 'http://192.168.1.6:9200/_cat/indices?pretty'  no index created yet
      or  curl -XGET '192.168.1.6:9200/_cat/indices?pretty'
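    A scripted health check only needs the "status" field out of the _cluster/health reply; a sketch against a sample (assumed, abridged) response:

```shell
# abridged sample of what GET /_cluster/health?pretty returns
health='{
  "cluster_name" : "myels",
  "status" : "green",
  "number_of_nodes" : 3
}'
# pull out the value of "status" (green / yellow / red)
status=$(echo "$health" | grep -o '"status" : "[a-z]*"' | cut -d'"' -f4)
echo "$status"
```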
    
    Building an index
    curl -XPUT '192.168.1.6:9200/books'  a document can also be supplied directly; the type is then created automatically
    curl -XPUT '192.168.1.6:9200/books'  create the index
    curl -XDELETE '192.168.1.6:9200/books'  delete the index
    curl -XGET '192.168.1.6:9200/_cat/indices?pretty'
    curl -XGET '192.168.1.6:9200/_cluster/health?pretty'
    
    Putting in content
    curl -XPUT '192.168.1.6:9200/books'  create the index
    curl -XPUT -H'Content-Type: application/json' '192.168.1.6:9200/books/computer/1' -d '{
    > "name": "ELK",
    > "date": "Dec 3, 2018",
    > "author": "Fgq"
    > }'
    Notes: -d supplies the JSON body; -H sets the request header
    curl -XGET '192.168.1.6:9200/books/computer/1?pretty'
    curl -XPUT -H'Content-Type: application/json' '192.168.1.6:9200/books/computer/2' -d '{
    > "name": "Mysql",
    > "date": "May 3,2017",
    > "author": "Csn"
    > }'
    curl -XPUT -H'Content-Type: application/json' '192.168.1.6:9200/books/computer/3' -d '{
    > "name": "Kubernetes",
    > "date": "Aug 2, 2016",
    > "author": "Gj"
    > }'
    curl -XPUT -H'Content-Type: application/json' '192.168.1.6:9200/books/computer/4' -d '{
    > "name": "ELK and Mysql",
    > "date": "Dec 9, 2018",
    > "author": "Flq"
    > }'
    curl -XGET '192.168.1.6:9200/books/computer/_search?pretty'  matches everything under computer
    curl -XGET '192.168.1.6:9200/books/computer/_search?q=elk&pretty'
    curl -XGET '192.168.1.6:9200/books/computer/_search?q=el*&pretty'  wildcard / fuzzy search
    curl -XGET '192.168.1.6:9200/books/computer/_search?q=kubernetes&pretty'
    curl -XGET '192.168.1.6:9200/books/computer/_search?q=dec&pretty'
    curl -XGET '192.168.1.6:9200/books/computer/_search?q=fgq&pretty'
    curl -XGET '192.168.1.6:9200/books/computer/_search?q=author:csn&pretty'
    curl -XGET -H'Content-Type:application/json' '192.168.1.6:9200/books/computer/_search?pretty' -d '
    > {
    >   "query": {
    >       "match_all": {}
    >   }
    > }'
      searches and matches everything under computer
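    The same request shape covers field-specific searches too. A hedged example body for a match query on the author field (equivalent in spirit to ?q=author:fgq above), sent the same way with -XGET -H'Content-Type: application/json' -d:

```json
{
  "query": {
    "match": { "author": "fgq" }
  }
}
```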
    Docs: https://www.elastic.co/guide/en/elasticsearch/reference/6.0/setup-upgrade.html
    
    
    Plugins
    node1
    /usr/share/elasticsearch/bin/elasticsearch-plugin -h  the command is not on the PATH
    /usr/share/elasticsearch/bin/elasticsearch-plugin install -h
    https://github.com/mobz/elasticsearch-head
    yum -y install git npm
    git clone https://github.com/mobz/elasticsearch-head.git
    cd elasticsearch-head/
    npm install
    npm run start &
    press Enter to get the prompt back; the process keeps running in the background
    ss -ntl  port 9100 should be listening
    Browser: 192.168.1.6:9100  see Figure 1 below
    
    node1/2/3  do this on all three nodes
    vim /etc/elasticsearch/elasticsearch.yml
    append at the bottom of the file:
    # ---------------------------------- HTTP CORS -----------------------------------
    http.cors.enabled: true
    http.cors.allow-origin: "*"
    systemctl restart elasticsearch.service
    now the head plugin can connect to each node, as in Figures 2, 3 and 4 below
    after clicking Connect, if nothing shows up, wait a few seconds and connect again
    
    Figure 1
    Figure 2
    Figure 3
    Figure 4
    Install Logstash
    node4 192.168.1.10
    yum -y install java-1.8.0-openjdk-devel
    rz  upload logstash-6.2.4.rpm
    rpm -ivh logstash-6.2.4.rpm
    rpm -ql logstash |less
    vim /etc/logstash/jvm.options
      -Xms256m  the default is fine
      -Xmx1g  the default is fine
    vim /etc/logstash/logstash.yml
      path.config: /etc/logstash/conf.d  every file under this directory is read as part of the configuration
    vim /etc/logstash/conf.d/test.conf
    input {
      stdin {}
    }
    
    output {
      stdout {}
    }
    version 5.5.1 needs "codec => rubydebug" added to get readable text output, as follows:
    input {
      stdin {}
    }
    output {
      stdout {
            codec => rubydebug
      }
    }
    /usr/share/logstash/bin/logstash -h  
    /usr/share/logstash/bin/logstash -t -f /etc/logstash/conf.d/test.conf  
      -t  test the configuration
      -f  specify the config file
      prints Result: OK.
    /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/test.conf
      prints "The stdin plugin is now waiting for input" and "successfully"
    hello logstash  type some input and press Enter; the following appears:
    {
              "host" => "node4.fgq.com",
           "message" => "hello logstash",
        "@timestamp" => 2018-05-27T10:11:05.548Z,
          "@version" => "1"
    }
    
    --------------------------------------------------------------------------
    Input Plugins
    node4
    yum -y install httpd
    vim /var/www/html/index.html
      test page
    systemctl start httpd.service
    Browser: 192.168.1.10
    node1: curl 192.168.1.10
    these requests generate log entries
    tail /var/log/httpd/access_log
    vim /etc/logstash/conf.d/test.conf
    input {
      file {
        path => ["/var/log/httpd/access_log"]
        start_position => "beginning"
      }
    }
    
    output {
      stdout {}
    } 
    /usr/share/logstash/bin/logstash -t -f /etc/logstash/conf.d/test.conf
    /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/test.conf
    the log entries are now printed
    
    
    Filter Plugins
    rpm -ql logstash |grep grok-pattern
    less /usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-patterns-core-4.1.2/patterns/grok-patterns
    less /usr/share/logstash/vendor/bundle/jruby/2.3.0/gems/logstash-patterns-core-4.1.2/patterns/httpd
    vim /etc/logstash/conf.d/test.conf
    input {
      file {
        path => ["/var/log/httpd/access_log"]
        start_position => "beginning"
      }
    }
    
    filter {
      grok {
        match => {
          "message" => "%{COMBINEDAPACHELOG}"
        }
      }
    }
    
    output {
      stdout {}
    }
    /usr/share/logstash/bin/logstash -t -f /etc/logstash/conf.d/test.conf
    /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/test.conf
      after a request, the parsed (filtered) fields appear
    node1: curl 192.168.1.10
    Browser: 192.168.1.10
    
    rpm -ql logstash |grep pattern
    there are no nginx or Tomcat patterns; you can define your own
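    What COMBINEDAPACHELOG does is bind named regexes to fields (clientip, timestamp, verb, request, response, bytes, referrer, agent). A rough shell analogue for two of those fields, run on a made-up sample line:

```shell
# sample Apache combined-format access log line (invented for illustration)
line='192.168.1.4 - - [27/May/2018:18:11:05 +0800] "GET / HTTP/1.1" 200 10 "-" "curl/7.29.0"'
clientip=$(echo "$line" | awk '{print $1}')                            # first field
response=$(echo "$line" | awk -F'" ' '{print $2}' | awk '{print $1}')  # status after the quoted request
echo "clientip=$clientip response=$response"
```

    grok does the same extraction declaratively and attaches each capture to the event as a named field.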
    
    
    Output Plugins--file (write to a path)
    vim /etc/logstash/conf.d/test.conf
    input {
      file {
        path => ["/var/log/httpd/access_log"]
        start_position => "beginning"
      }
    }
    
    filter {
      grok {
        match => {
          "message" => "%{COMBINEDAPACHELOG}"
        }
      }
    }
    
    output {
      file {
        path => ["/tmp/httpd_access_log.json"]
      }
    }
    /usr/share/logstash/bin/logstash -t -f /etc/logstash/conf.d/test.conf
    /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/test.conf
      after a request, the entries are saved into the file
    node1: curl 192.168.1.10
    Browser: 192.168.1.10
    less /tmp/httpd_access_log.json  entries in JSON format
    
    https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html
    how the output gets stored in the cluster; key options:
    hosts: 
      Value type is uri
      Default value is [//127.0.0.1]
    index: 
      Value type is string
      Default value is "logstash-%{+YYYY.MM.dd}"
    action:
      Value type is string
      Default value is "index"
    document_id  document_type  http_compression  path    routing  template
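    The default index name is a sprintf pattern resolved against each event's @timestamp; for an event stamped now (UTC), it expands roughly as:

```shell
# %{+YYYY.MM.dd} in logstash corresponds to a date-formatted suffix,
# giving one index per day (which makes retention/rotation easy)
index_name=$(date -u +"logstash-%Y.%m.%d")
echo "$index_name"
```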
    
    
    Output Plugins--elasticsearch
    vim /etc/logstash/conf.d/test.conf
    input {
      file {
        path => ["/var/log/httpd/access_log"]
        start_position => "beginning"
      }
    }
    
    filter {
      grok {
        match => {
          "message" => "%{COMBINEDAPACHELOG}"
        }
      }
    }
    
    output {
      elasticsearch {
        hosts => ["http://192.168.1.6:9200","http://192.168.1.7:9200","http://192.168.1.5:9200"]
        index => "logstash-%{+YYYY.MM.dd}"
        action => "index"
      }
    }
    /usr/share/logstash/bin/logstash -t -f /etc/logstash/conf.d/test.conf
    /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/test.conf
      after requests, the entries are stored in Elasticsearch
    node1/2: for i in {1..5};do curl 192.168.1.10 ;done
    view in a browser: http://192.168.1.6:9100/
    node1
    curl -XGET '192.168.1.6:9200/logstash-*/_search?pretty'
    curl -XGET '192.168.1.6:9200/logstash-*/_search?q=clientip:192.168.1.4&pretty'
    the experiment above is laid out in the topology diagram below

    Results (screenshot)
    Topology diagram
    the architecture of the experiment is shown in the topology diagram below
    here the beats server and the logstash server are colocated on one host
    node4  192.168.1.10
    rz  upload filebeat-6.2.4-x86_64.rpm
    yum -y install filebeat-6.2.4-x86_64.rpm
    rpm -ql filebeat |less
    cp /etc/filebeat/filebeat.yml{,.bak}
    vim /etc/filebeat/filebeat.yml
    - type: log
      enabled: true  (changed)
      paths:
        - /var/log/httpd/access_log*  the log may rotate  (changed)
    output.elasticsearch:
      hosts: ["192.168.1.6:9200","192.168.1.7:9200","192.168.1.5:9200"]  (changed)
      protocol: "http"  (changed)
    systemctl start filebeat.service
    systemctl status filebeat.service
    Browser: http://192.168.1.6:9100/  see Figure 1 below
    curl -XGET '192.168.1.6:9200/filebeat-*/_search?pretty'
    the log entries go straight to Elasticsearch, without any filtering or field splitting
    
    Sending the logs to logstash for processing first, then on to Elasticsearch
    vim /etc/filebeat/filebeat.yml
    #output.elasticsearch:
    #  hosts: ["192.168.1.6:9200","192.168.1.7:9200","192.168.1.5:9200"]  (changed)
    #  protocol: "http"
    output.logstash:
      hosts: ["192.168.1.10:5044"]
      if logstash is set up for SSL, the certificate and private key must be configured here as well
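      For reference, a hedged sketch of what that SSL configuration might look like in filebeat.yml (the file paths are assumptions, not part of this setup):

```yaml
output.logstash:
  hosts: ["192.168.1.10:5044"]
  # CA that signed the logstash server certificate (path is an assumption)
  ssl.certificate_authorities: ["/etc/filebeat/ca.crt"]
  # client certificate and key, only needed if logstash enforces mutual TLS
  ssl.certificate: "/etc/filebeat/filebeat.crt"
  ssl.key: "/etc/filebeat/filebeat.key"
```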
    vim /etc/logstash/conf.d/test.conf 
    only the input section needs to change
    input {
      beats {
        port => 5044
      }
    }   
          
    filter {
      grok {
        match => {
          "message" => "%{COMBINEDAPACHELOG}"
        }
      }
    }   
        
    output {
      elasticsearch {
        hosts => ["http://192.168.1.6:9200","http://192.168.1.7:9200","http://192.168.1.5:9200"]
        index => "logstash-%{+YYYY.MM.dd}"
        action => "index"
      }
    }
    /usr/share/logstash/bin/logstash -t -f /etc/logstash/conf.d/test.conf
    systemctl start logstash.service;  ss -ntl  port 5044 should be listening
    systemctl stop filebeat.service
    systemctl start filebeat.service
    in the head web UI, manually delete the filebeat and logstash indices, as in Figure 2 below
    keep only books, as in Figure 3
    systemctl restart logstash.service;  ss -ntl  port 5044
    after the restart, the logstash index shows up again, as in Figure 4
    node1: curl -XGET '192.168.1.6:9200/logstash-*/_search?q=clientip:192.168.1.4&pretty'
    the entries have now been filtered and split into fields
    
    
    Figure 1
    Figure 2
    Figure 3
    Figure 4
    Install Kibana
    rz  upload kibana-6.2.4-x86_64.rpm
    yum -y install kibana-6.2.4-x86_64.rpm 
    rpm -ql kibana |less
    vim /etc/kibana/kibana.yml
    server.port: 5601
    server.host: "192.168.1.6"
    elasticsearch.url: "http://192.168.1.6:9200"
    systemctl enable kibana.service
    systemctl start kibana.service;  ss -ntl  port 5601 should be listening
    Browser:
    192.168.1.6:5601  shows the page in Figure 1 below
    192.168.1.6:5601/status  check that everything is healthy, as in Figure 2
    setup steps: Figures 3 to 9
    Docs: https://www.elastic.co/guide/en/kibana/current/index.html
    
    
    Figure 1
    Figure 2
    Figure 3
    Figure 4
    Figure 5
    Figure 6
    Figure 7
    Figure 8
    Figure 9

        Title: 搜索引擎 (Search Engine)

        Link: https://www.haomeiwen.com/subject/fkhmjftx.html