Elasticsearch 7.x Hands-On Primer, Part 03

Author: 众神开挂 | Published: 2020-03-21 17:02

Contents:

Basic operations for aggregations, nested aggregations, and drill-down analysis.

1. Aggregation analysis

Requirement 1: count the number of products under each tag

Request:

GET ecommerce/_search
{
  "aggs": {
    "group_by_tags": {
      "terms": {
        "field": "tags"
      }
    }
  }
}

Running this request as-is returns an error:

Fielddata is disabled on text fields by default. Set fielddata=true on [tags] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead

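In Elasticsearch, fielddata is disabled (false) by default on text fields, because enabling it loads the field's terms into memory and can consume a lot of it. As the error message notes, the alternative is to aggregate on a keyword field.

Enable fielddata on the tags field: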
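
The error message describes fielddata as being built "by uninverting the inverted index". The sketch below is only a toy model of that idea, not how Elasticsearch implements it: the index contents and the uninvert helper are hypothetical. It shows why the resulting per-document structure has to cover every document and is therefore costly to hold in memory.

```python
from collections import defaultdict

# Toy inverted index: term -> set of document ids (hypothetical data).
inverted_index = {
    "fangzhu": {1, 2},
    "meibai": {1, 3},
    "qingxin": {4},
}


def uninvert(index):
    """Build the doc -> terms view that grouping by term needs."""
    doc_terms = defaultdict(set)
    for term, doc_ids in index.items():
        for doc_id in doc_ids:
            doc_terms[doc_id].add(term)
    return dict(doc_terms)


fielddata_view = uninvert(inverted_index)
for doc_id in sorted(fielddata_view):
    print(doc_id, sorted(fielddata_view[doc_id]))
```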

PUT ecommerce/_mapping
{
  "properties": {
    "tags": {
      "type": "text",
      "fielddata": true
    }
  }
}

The request now succeeds. The doc_count field in each bucket is the number of documents that carry that tag.

{
  "took" : 339,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 6,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
         ········· omitted ···········
    ]
  },
  "aggregations" : {
    "group_by_tags" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "fangzhu",
          "doc_count" : 4
        },
        {
          "key" : "meibai",
          "doc_count" : 2
        },
        {
          "key" : "qingxin",
          "doc_count" : 1
        }
      ]
    }
  }
}
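
A client usually only needs the buckets from this response. The following is a minimal sketch, assuming the response has already been parsed into a Python dict; the tag_counts helper is a hypothetical name, and the response literal is a trimmed copy of the output above.

```python
# Trimmed copy of the response above; only the parts the helper reads are kept.
response = {
    "aggregations": {
        "group_by_tags": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {"key": "fangzhu", "doc_count": 4},
                {"key": "meibai", "doc_count": 2},
                {"key": "qingxin", "doc_count": 1},
            ],
        }
    }
}


def tag_counts(resp, agg_name="group_by_tags"):
    """Map each bucket key to its doc_count for a terms aggregation."""
    buckets = resp["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


counts = tag_counts(response)
print(counts)
print(sum(counts.values()))
```

The counts add up to 7 although the search matched 6 documents, because a document with several tags is counted once in each of its tag buckets.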

Tip: set "size": 0 to return only the aggregation results, as shown below:

GET ecommerce/_search
{
  "size": 0, 
  "aggs": {
    "group_by_tags": {
      "terms": {
        "field": "tags"
      }
    }
  }
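}

Requirement 2: for products whose name contains jiajieshi, count the number of products under each tag

This is straightforward: add a query clause. The result is not listed here.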

GET ecommerce/_search
{
  "query": {
    "match": {
      "name": "jiajieshi"
    }
  }, 
  "aggs": {
    "group_by_tags": {
      "terms": {
        "field": "tags"
      }
    }
  }
}
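
When a request carries both a query and aggs, the aggregation runs only over the documents the query matched. The sketch below imitates that order of operations in plain Python. The documents are invented for illustration and are not the contents of the ecommerce index, and the matches helper is a very rough stand-in for a match query (whitespace tokens only), not Elasticsearch's analysis chain.

```python
from collections import Counter

# Hypothetical documents, invented for illustration only.
docs = [
    {"name": "jiajieshi yagao", "tags": ["fangzhu"]},
    {"name": "gaolujie yagao", "tags": ["meibai", "fangzhu"]},
    {"name": "jiajieshi meibai yagao", "tags": ["meibai"]},
]


def matches(doc, field, term):
    """Rough stand-in for a match query: whitespace-token equality."""
    return term in doc[field].lower().split()


def count_tags(documents):
    """Rough stand-in for a terms aggregation on a multi-valued field."""
    counter = Counter()
    for doc in documents:
        counter.update(set(doc["tags"]))
    return dict(counter)


# Step 1: the query narrows the document set.
matched = [doc for doc in docs if matches(doc, "name", "jiajieshi")]
# Step 2: the aggregation only sees the matched documents.
filtered_counts = count_tags(matched)
print(filtered_counts)
```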

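2. Nested aggregations / drill-down analysis

Requirement 3: group first, then compute an average per group, i.e. the average price of the products under each tag

GET /ecommerce/_search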
{
  "size": 0,
  "aggs": {
    "group_by_tags": {
      "terms": {
        "field": "tags"
      },
      "aggs": {
        "avg_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}

The avg_price value returned in each bucket is the average price we need:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 6,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "group_by_tags" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "fangzhu",
          "doc_count" : 4,
          "avg_price" : {
            "value" : 27.25
          }
        },
        {
          "key" : "meibai",
          "doc_count" : 2,
          "avg_price" : {
            "value" : 42.0
          }
        },
        {
          "key" : "qingxin",
          "doc_count" : 1,
          "avg_price" : {
            "value" : 40.0
          }
        }
      ]
    }
  }
}
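
In the response, a sub-aggregation appears inside each parent bucket under its own name. A minimal sketch of reading it, assuming the response is already a Python dict; avg_price_by_tag is a hypothetical helper, and the literal is a trimmed copy of the output above.

```python
# Trimmed copy of the nested-aggregation response above.
response = {
    "aggregations": {
        "group_by_tags": {
            "buckets": [
                {"key": "fangzhu", "doc_count": 4, "avg_price": {"value": 27.25}},
                {"key": "meibai", "doc_count": 2, "avg_price": {"value": 42.0}},
                {"key": "qingxin", "doc_count": 1, "avg_price": {"value": 40.0}},
            ]
        }
    }
}


def avg_price_by_tag(resp):
    """Read the avg_price sub-aggregation out of every group_by_tags bucket."""
    buckets = resp["aggregations"]["group_by_tags"]["buckets"]
    return {bucket["key"]: bucket["avg_price"]["value"] for bucket in buckets}


averages = avg_price_by_tag(response)
print(averages)
```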

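Requirement 4: compute the average price of the products under each tag, sorted by average price in descending order

Add an order field to the terms aggregation that refers to the avg_price sub-aggregation: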

GET /ecommerce/_search
{
  "size": 0,
  "aggs": {
    "group_by_tags": {
      "terms": {
        "field": "tags",
        "order": {
          "avg_price": "desc"
        }
      },
      "aggs": {
        "avg_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}
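
The article does not list the response for this request. As a rough illustration of what the order clause does, the sketch below sorts the buckets from the Requirement 3 response by their avg_price value on the client side. It assumes the index is unchanged since that response, and order_by_sub_agg is a hypothetical helper, not an Elasticsearch API.

```python
# Buckets copied from the Requirement 3 response (default doc_count ordering).
buckets = [
    {"key": "fangzhu", "doc_count": 4, "avg_price": {"value": 27.25}},
    {"key": "meibai", "doc_count": 2, "avg_price": {"value": 42.0}},
    {"key": "qingxin", "doc_count": 1, "avg_price": {"value": 40.0}},
]


def order_by_sub_agg(items, agg_name, direction="desc"):
    """Sort buckets by the value of a single-value sub-aggregation."""
    return sorted(
        items,
        key=lambda bucket: bucket[agg_name]["value"],
        reverse=(direction == "desc"),
    )


ordered_keys = [b["key"] for b in order_by_sub_agg(buckets, "avg_price")]
print(ordered_keys)
```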

Requirement 5: group by the specified price ranges, then group by tag within each range, and finally compute the average price of each group

GET /ecommerce/_search
{
  "size": 0,
  "aggs": {
    "group_by_price": {
      "range": {
        "field": "price",
        "ranges": [
          {
            "from": 0,
            "to": 20
          },
          {
            "from": 20,
            "to": 40
          },
          {
            "from": 40,
            "to": 60
          }
        ]
      },
      "aggs": {
        "group_by_tags": {
          "terms": {
            "field": "tags",
            "order": {
              "avg_price": "desc"
            }
          },
          "aggs": {
            "avg_price": {
              "avg": {
                "field": "price"
              }
            }
          }
        }
      }
    }
  }
} 
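
Each range in a range aggregation includes its from value and excludes its to value, so a price of exactly 40 belongs to the 40 to 60 bucket, not to the 20 to 40 bucket. The sketch below mirrors that rule in plain Python; RANGES and range_key are hypothetical names used only for illustration.

```python
# The same three ranges as in the request above, as (from, to) pairs.
RANGES = [(0, 20), (20, 40), (40, 60)]


def range_key(price, ranges=RANGES):
    """Return the bucket key for a price: from is inclusive, to is exclusive."""
    for low, high in ranges:
        if low <= price < high:
            return f"{float(low)}-{float(high)}"
    return None  # the price falls outside every configured range


print(range_key(34))
print(range_key(40))
print(range_key(60))
```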

Response:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 6,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "group_by_price" : {
      "buckets" : [
        {
          "key" : "0.0-20.0",
          "from" : 0.0,
          "to" : 20.0,
          "doc_count" : 0,
          "group_by_tags" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [ ]
          }
        },
        {
          "key" : "20.0-40.0",
          "from" : 20.0,
          "to" : 40.0,
          "doc_count" : 4,
          "group_by_tags" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "meibai",
                "doc_count" : 1,
                "avg_price" : {
                  "value" : 34.0
                }
              },
              {
                "key" : "fangzhu",
                "doc_count" : 4,
                "avg_price" : {
                  "value" : 27.25
                }
              }
            ]
          }
        },
        {
          "key" : "40.0-60.0",
          "from" : 40.0,
          "to" : 60.0,
          "doc_count" : 2,
          "group_by_tags" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "meibai",
                "doc_count" : 1,
                "avg_price" : {
                  "value" : 50.0
                }
              },
              {
                "key" : "qingxin",
                "doc_count" : 1,
                "avg_price" : {
                  "value" : 40.0
                }
              }
            ]
          }
        }
      ]
    }
  }
}
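
A two-level response like this is often easier to consume as flat rows. The following is a minimal sketch, assuming the response is already a Python dict; flatten is a hypothetical helper, and the literal is a trimmed copy of the output above.

```python
# Trimmed copy of the range -> tags -> avg_price response above.
response = {
    "aggregations": {
        "group_by_price": {
            "buckets": [
                {"key": "0.0-20.0", "doc_count": 0,
                 "group_by_tags": {"buckets": []}},
                {"key": "20.0-40.0", "doc_count": 4,
                 "group_by_tags": {"buckets": [
                     {"key": "meibai", "doc_count": 1, "avg_price": {"value": 34.0}},
                     {"key": "fangzhu", "doc_count": 4, "avg_price": {"value": 27.25}},
                 ]}},
                {"key": "40.0-60.0", "doc_count": 2,
                 "group_by_tags": {"buckets": [
                     {"key": "meibai", "doc_count": 1, "avg_price": {"value": 50.0}},
                     {"key": "qingxin", "doc_count": 1, "avg_price": {"value": 40.0}},
                 ]}},
            ]
        }
    }
}


def flatten(resp):
    """Yield one (price_range, tag, doc_count, avg_price) row per inner bucket."""
    rows = []
    for price_bucket in resp["aggregations"]["group_by_price"]["buckets"]:
        for tag_bucket in price_bucket["group_by_tags"]["buckets"]:
            rows.append((
                price_bucket["key"],
                tag_bucket["key"],
                tag_bucket["doc_count"],
                tag_bucket["avg_price"]["value"],
            ))
    return rows


rows = flatten(response)
for row in rows:
    print(row)
```

The empty 0.0-20.0 range contributes no rows, and within each range the tags keep the descending avg_price order requested in the query.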

3. Further reading

Elasticsearch Notes (7): Aggregation Queries, from the 大数据布道 blog on CSDN
https://blog.csdn.net/alex_xfboy/article/details/86100037
