Elasticsearch: The Definitive Guide in Practice - Getting Started


Author: SlowGO | Published 2019-07-08 15:02

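    The examples use Elasticsearch 7.2.

    Creating an employee directory

    Each employee document contains:

    • Name (first_name, last_name)
    • Age (age)
    • Short bio (about)
    • Interests (interests)

    Index one employee document. The document has type employee and belongs to the index megacorp: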

    curl -X PUT "localhost:9200/megacorp/employee/1?pretty" -H 'Content-Type: application/json' -d'
    {
    "first_name" : "John",
    "last_name" : "Smith",
    "age" : 25,
    "about" : "I love to go rock climbing",
    "interests": [ "sports", "music" ]
    }
    '
    

    The path /megacorp/employee/1 means /index name/type name/employee ID.
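
Elasticsearch 7.x deprecates custom mapping types. A request to /megacorp/employee/1 still works on 7.2 but returns a deprecation warning, and the typeless form uses _doc in place of the type name. The sketch below only builds the equivalent typeless request with Python's standard library and does not send it. The helper name build_index_request and the base URL http://localhost:9200 are assumptions made for illustration.

```python
import json
import urllib.request


def build_index_request(index, doc_id, doc, base_url="http://localhost:9200"):
    """Build (but do not send) a typeless index request: PUT /<index>/_doc/<id>."""
    url = f"{base_url}/{index}/_doc/{doc_id}"
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


employee = {
    "first_name": "John",
    "last_name": "Smith",
    "age": 25,
    "about": "I love to go rock climbing",
    "interests": ["sports", "music"],
}

req = build_index_request("megacorp", 1, employee)
print(req.get_method(), req.full_url)
# To send it to a running node: urllib.request.urlopen(req)
```

The rest of this walkthrough keeps the typed paths, which is what was run against 7.2.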

    Index two more employee documents:

    curl -X PUT "localhost:9200/megacorp/employee/2?pretty" -H 'Content-Type: application/json' -d'
    {
    "first_name" : "Jane",
    "last_name" : "Smith",
    "age" : 32,
    "about" : "I like to collect rock albums", 
    "interests": ["music" ]
    }
    '
    
    curl -X PUT "localhost:9200/megacorp/employee/3?pretty" -H 'Content-Type: application/json' -d'
    {
    "first_name" : "Douglas",
    "last_name" : "Fir",
    "age" : 35,
    "about" : "I like to build cabinets", 
    "interests": ["forestry" ]
    }
    '
    

    Search

    Retrieve the employee document with ID 1 of type employee from the megacorp index:

    curl -X GET "localhost:9200/megacorp/employee/1?pretty"
    

    Search for all employees:

    curl -X GET "localhost:9200/megacorp/employee/_search?pretty"
    

    Search for employees whose last_name is Smith:

    # Query-string form
    curl -X GET "localhost:9200/megacorp/employee/_search?q=last_name:Smith&pretty"
    
    # Query DSL form
    curl -X GET "localhost:9200/megacorp/employee/_search?pretty" -H 'Content-Type: application/json' -d'
    {
      "query": { 
        "match": {
            "last_name":"Smith"
        } 
      }
    }
    '
    

    To find employees whose last_name is Smith and who are older than 30, add a range filter:

    curl -X GET "localhost:9200/megacorp/employee/_search?pretty" -H 'Content-Type: application/json' -d'
    {
      "query":{
        "bool" : {
          "filter" : {
            "range" : {
              "age" : { "gt" : 30 }
            }
          },
          "must" : {
            "match" : {
              "last_name" : "Smith"
            }
          }
        }
      }
    }
    '
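
As a rough mental model, the bool query above keeps a document only if it satisfies both clauses: the must clause also contributes to the score, while the filter clause only includes or excludes. The pure-Python sketch below applies the same two conditions to the three sample employees in memory. It is an illustration, not how Elasticsearch executes the query; the helper names are hypothetical, and lowercasing stands in for the analyzer applied to text fields.

```python
EMPLOYEES = [
    {"first_name": "John", "last_name": "Smith", "age": 25},
    {"first_name": "Jane", "last_name": "Smith", "age": 32},
    {"first_name": "Douglas", "last_name": "Fir", "age": 35},
]


def matches_last_name(doc, text):
    # Stand-in for the match clause: case-insensitive comparison.
    return doc["last_name"].lower() == text.lower()


def age_greater_than(doc, bound):
    # Stand-in for the range filter {"age": {"gt": bound}}.
    return doc["age"] > bound


def search(docs, last_name, min_age_exclusive):
    # A document is a hit only when both conditions hold.
    return [
        d for d in docs
        if matches_last_name(d, last_name) and age_greater_than(d, min_age_exclusive)
    ]


hits = search(EMPLOYEES, "Smith", 30)
print([d["first_name"] for d in hits])  # ['Jane']
```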
    

    Full-text search

    Search for employees whose about field matches rock climbing:

    curl -X GET "localhost:9200/megacorp/employee/_search?pretty" -H 'Content-Type: application/json' -d'
    {
      "query" : { 
        "match" : {
          "about" : "rock climbing" 
        }
      } 
    }
    '
    

    Result:

    {
      ...
      "hits" : {
        ...
        "hits" : [
          {
            ...
            "_id" : "1",
            "_score" : 1.4167402,
            "_source" : {
              "first_name" : "John",
              "last_name" : "Smith",
              "age" : 25,
              "about" : "I love to go rock climbing",
              "interests" : [
                "sports",
                "music"
              ]
            }
          },
          {
            ...
            "_id" : "2",
            "_score" : 0.45895916,
            "_source" : {
              "first_name" : "Jane",
              "last_name" : "Smith",
              "age" : 32,
              "about" : "I like to collect rock albums",
              "interests" : [
                "music"
              ]
            }
          }
        ]
      }
    }
    

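    Two documents match. Each hit carries a _score, its relevance score, and hits are sorted by it in descending order by default.

    The first hit scores higher because its about field contains both rock and climbing, while the second contains only rock.

    Relevance is a central concept in Elasticsearch. A query against a traditional database, by contrast, treats a record as either matching or not matching.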
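
The ranking can be approximated by counting how many query terms each about field contains. The sketch below is a toy term-overlap count and deliberately not the real scoring: Elasticsearch 7.x uses BM25 by default, which also weighs term frequency, term rarity, and field length. The function names are hypothetical, and whitespace splitting stands in for the standard analyzer.

```python
DOCS = {
    "1": "I love to go rock climbing",
    "2": "I like to collect rock albums",
    "3": "I like to build cabinets",
}


def tokenize(text):
    # Crude stand-in for the standard analyzer: lowercase, split on whitespace.
    return text.lower().split()


def overlap(query, text):
    # Number of distinct query terms that occur in the document.
    return len(set(tokenize(query)) & set(tokenize(text)))


counts = {doc_id: overlap("rock climbing", about) for doc_id, about in DOCS.items()}
# Keep documents that match at least one term, best match first.
ranked = sorted((i for i, c in counts.items() if c > 0), key=lambda i: -counts[i])
print(counts)
print(ranked)
```

Document 3 matches neither term and is dropped, which mirrors the two hits in the response above.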

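    Phrase search

    The match query above splits rock climbing into two terms and accepts a document that contains either one. To match the two words as a single phrase, use match_phrase: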

    curl -X GET "localhost:9200/megacorp/employee/_search?pretty" -H 'Content-Type: application/json' -d'
    {
      "query" : {
        "match_phrase" : {
          "about" : "rock climbing"
        }
      }
    }
    '
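
The difference from match is that the terms must appear next to each other and in the same order. A minimal in-memory sketch of that rule follows; it assumes whitespace tokenization and hypothetical helper names, and it ignores options such as slop.

```python
ABOUTS = {
    "1": "I love to go rock climbing",
    "2": "I like to collect rock albums",
    "3": "I like to build cabinets",
}


def tokenize(text):
    # Crude stand-in for the standard analyzer: lowercase, split on whitespace.
    return text.lower().split()


def contains_phrase(phrase, text):
    # True when the phrase terms appear consecutively and in order.
    terms, tokens = tokenize(phrase), tokenize(text)
    n = len(terms)
    return any(tokens[i:i + n] == terms for i in range(len(tokens) - n + 1))


phrase_hits = [
    doc_id for doc_id, about in ABOUTS.items()
    if contains_phrase("rock climbing", about)
]
print(phrase_hits)  # ['1']
```

Only document 1 remains, because document 2 contains rock but not the phrase.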
    

    Highlighting search results

    curl -X GET "localhost:9200/megacorp/employee/_search?pretty" -H 'Content-Type: application/json' -d'
    {
      "query" : {
        "match_phrase" : {
          "about" : "rock climbing"
        }
      },
      "highlight" : {
        "fields" : {
          "about": {}
        }
      }
    }
    '
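
With the highlight section added, each hit in the response gains a highlight entry containing fragments of the about field in which the matched terms are wrapped in <em> tags by default. The sketch below imitates that wrapping on a single string. It is a simplified illustration with a hypothetical highlight function, not the highlighter Elasticsearch actually runs.

```python
def highlight(text, query, pre="<em>", post="</em>"):
    # Wrap every word that matches a query term, compared case-insensitively.
    terms = set(query.lower().split())
    return " ".join(
        f"{pre}{word}{post}" if word.lower() in terms else word
        for word in text.split()
    )


fragment = highlight("I love to go rock climbing", "rock climbing")
print(fragment)  # I love to go <em>rock</em> <em>climbing</em>
```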
    

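    Aggregations

    Elasticsearch has powerful aggregations that produce analytics over your data, similar to GROUP BY in SQL.

    For example, find the most popular interests among all employees: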

    curl -X GET "localhost:9200/megacorp/employee/_search?pretty" -H 'Content-Type: application/json' -d'
    {
      "aggs" : {
        "all_interests" : {
          "terms" : { "field": "interests.keyword" }
        }
      }
    }
    '
    

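    Note: aggregate on interests.keyword rather than interests. With the default dynamic mapping, interests is indexed as a text field with a keyword sub-field, and a terms aggregation on the text field itself fails with:
    Fielddata is disabled on text fields by default ...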

    Result:

    {
      ...
      "aggregations" : {
        "all_interests" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 0,
          "buckets" : [
            {
              "key" : "music",
              "doc_count" : 2
            },
            {
              "key" : "forestry",
              "doc_count" : 1
            },
            {
              "key" : "sports",
              "doc_count" : 1
            }
          ]
        }
      }
    }
    

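    music is the most common interest, shared by two employees.

    The aggregation above runs over all documents. Adding a query restricts it to the documents that match: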

    curl -X GET "localhost:9200/megacorp/employee/_search?pretty" -H 'Content-Type: application/json' -d'
    {
      "query": {
        "match": {
          "last_name": "smith"
        }
      },
      "aggs" : {
        "all_interests" : {
          "terms" : { "field": "interests.keyword" }
        }
      }
    }
    '
    

    The response contains both the matching hits and the aggregation results.
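
Both terms aggregations can be pictured as counting documents per distinct interest, first over every employee and then over only those matched by the query. The sketch below does this in memory with collections.Counter. It is an approximation under stated assumptions: the helper name terms_buckets is hypothetical, and the tie-breaking by key mirrors the bucket order seen in the result above.

```python
from collections import Counter

EMPLOYEES = [
    {"last_name": "Smith", "interests": ["sports", "music"]},
    {"last_name": "Smith", "interests": ["music"]},
    {"last_name": "Fir", "interests": ["forestry"]},
]


def terms_buckets(docs, field):
    # Count documents (not occurrences) per distinct value of a multi-valued field.
    counts = Counter(value for doc in docs for value in set(doc[field]))
    # Order by document count descending, then by key ascending.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


all_buckets = terms_buckets(EMPLOYEES, "interests")

# Restrict to employees matched by the query before aggregating.
smiths = [d for d in EMPLOYEES if d["last_name"].lower() == "smith"]
smith_buckets = terms_buckets(smiths, "interests")

print(all_buckets)
print(smith_buckets)
```

With the query applied, forestry disappears because Douglas Fir is no longer part of the aggregated set.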

    Aggregations can be nested. For example, compute the average age of employees for each interest:

    curl -X GET "localhost:9200/megacorp/employee/_search?pretty" -H 'Content-Type: application/json' -d'
    {
      "aggs" : {
        "all_interests" : {
          "terms" : { "field": "interests.keyword" },
          "aggs" : {
            "avg_age" : {
              "avg" : { "field": "age" }
            }
          }
        }
      }
    }
    '
    

    Result:

      ...
      "aggregations" : {
        "all_interests" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 0,
          "buckets" : [
            {
              "key" : "music",
              "doc_count" : 2,
              "avg_age" : {
                "value" : 28.5
              }
            },
            {
              "key" : "forestry",
              "doc_count" : 1,
              "avg_age" : {
                "value" : 35.0
              }
            },
            {
              "key" : "sports",
              "doc_count" : 1,
              "avg_age" : {
                "value" : 25.0
              }
            }
          ]
        }
      }
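
The averages can be checked by hand: music is shared by John (25) and Jane (32), so its average is (25 + 32) / 2 = 28.5, while forestry and sports each have a single employee. The sketch below reproduces the nested aggregation in memory; avg_age_by_interest is a hypothetical helper, not an Elasticsearch API.

```python
from collections import defaultdict

EMPLOYEES = [
    {"first_name": "John", "age": 25, "interests": ["sports", "music"]},
    {"first_name": "Jane", "age": 32, "interests": ["music"]},
    {"first_name": "Douglas", "age": 35, "interests": ["forestry"]},
]


def avg_age_by_interest(docs):
    # Outer step: bucket employees by interest (the terms aggregation).
    ages = defaultdict(list)
    for doc in docs:
        for interest in set(doc["interests"]):
            ages[interest].append(doc["age"])
    # Inner step: average the ages inside each bucket (the avg sub-aggregation).
    return {interest: sum(vals) / len(vals) for interest, vals in ages.items()}


result = avg_age_by_interest(EMPLOYEES)
for interest, avg in sorted(result.items()):
    print(interest, avg)
```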
    


          Article link: https://www.haomeiwen.com/subject/abnohctx.html