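Main topics:
Basic aggregation analysis, nested aggregations, and drill-down analysis
1. Aggregation analysis
Requirement 1: count the number of products under each tag
Request: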
GET ecommerce/_search
{
  "aggs": {
    "group_by_tags": {
      "terms": {
        "field": "tags"
      }
    }
  }
}
Running this request as-is returns an error:
Fielddata is disabled on text fields by default. Set fielddata=true on [tags] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead
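In Elasticsearch, fielddata is disabled (false) by default on text fields, because enabling it loads the field's terms into heap memory, which can use a lot of memory.
Enable fielddata on the tags field: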
PUT ecommerce/_mapping
{
  "properties": {
    "tags": {
      "type": "text",
      "fielddata": true
    }
  }
}
The aggregation now runs successfully. The doc_count of each bucket is the number of documents carrying that tag:
{
  "took" : 339,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 6,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      ......... (omitted) .........
    ]
  },
  "aggregations" : {
    "group_by_tags" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "fangzhu",
          "doc_count" : 4
        },
        {
          "key" : "meibai",
          "doc_count" : 2
        },
        {
          "key" : "qingxin",
          "doc_count" : 1
        }
      ]
    }
  }
}
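As the error message above suggests, an alternative to enabling fielddata is to aggregate on a keyword field. A minimal sketch, assuming tags was mapped dynamically so that a tags.keyword sub-field exists (that sub-field name is an assumption, not something shown in this post; check it with the mapping request first):

```console
# Inspect the mapping to see whether tags has a keyword sub-field
GET ecommerce/_mapping

# Assumes the tags.keyword sub-field exists; adjust the field name otherwise
GET ecommerce/_search
{
  "size": 0,
  "aggs": {
    "group_by_tags": {
      "terms": {
        "field": "tags.keyword"
      }
    }
  }
}
```

Keyword fields are aggregated from on-disk doc values rather than heap-resident fielddata, so this variant avoids the memory cost described above. The bucket keys are then the exact, unanalyzed tag values.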
Tip: set "size": 0 to return only the aggregation results and leave out the hits, as in this request:
GET ecommerce/_search
{
  "size": 0,
  "aggs": {
    "group_by_tags": {
      "terms": {
        "field": "tags"
      }
    }
  }
}
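Requirement 2: for products whose name contains jiajieshi, count the number of products under each tag
This only requires adding a query to the request; the aggregation then runs over the matching documents only. The result is not listed here.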
GET ecommerce/_search
{
  "query": {
    "match": {
      "name": "jiajieshi"
    }
  },
  "aggs": {
    "group_by_tags": {
      "terms": {
        "field": "tags"
      }
    }
  }
}
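2. Nested aggregations / drill-down analysis
Requirement 3: group first, then compute a per-group average: the average price of the products under each tag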
GET /ecommerce/_search
{
  "size": 0,
  "aggs": {
    "group_by_tags": {
      "terms": {
        "field": "tags"
      },
      "aggs": {
        "avg_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}
The avg_price value returned in each bucket is the average price we need:
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 6,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "group_by_tags" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "fangzhu",
          "doc_count" : 4,
          "avg_price" : {
            "value" : 27.25
          }
        },
        {
          "key" : "meibai",
          "doc_count" : 2,
          "avg_price" : {
            "value" : 42.0
          }
        },
        {
          "key" : "qingxin",
          "doc_count" : 1,
          "avg_price" : {
            "value" : 40.0
          }
        }
      ]
    }
  }
}
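A terms bucket can carry more than one metric. As a sketch that goes slightly beyond the requirement above, the stats aggregation returns count, min, max, avg, and sum of the price for each tag in a single request. The name price_stats is an arbitrary label chosen for this example:

```console
GET /ecommerce/_search
{
  "size": 0,
  "aggs": {
    "group_by_tags": {
      "terms": {
        "field": "tags"
      },
      "aggs": {
        "price_stats": {
          "stats": {
            "field": "price"
          }
        }
      }
    }
  }
}
```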
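Requirement 4: compute the average price of the products under each tag, sorted by average price in descending order
Add an "order" parameter to the terms aggregation that references the avg_price sub-aggregation: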
GET /ecommerce/_search
{
  "size": 0,
  "aggs": {
    "group_by_tags": {
      "terms": {
        "field": "tags",
        "order": {
          "avg_price": "desc"
        }
      },
      "aggs": {
        "avg_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}
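Besides the name of a sub-aggregation, "order" also accepts the built-in keys _count (the bucket's document count) and _key (the bucket key). A sketch that sorts the tag buckets alphabetically by tag while still computing the average price per tag:

```console
GET /ecommerce/_search
{
  "size": 0,
  "aggs": {
    "group_by_tags": {
      "terms": {
        "field": "tags",
        "order": {
          "_key": "asc"
        }
      },
      "aggs": {
        "avg_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}
```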
Requirement 5: group by the specified price ranges, then group by tag within each range, and finally compute the average price of each group
GET /ecommerce/_search
{
  "size": 0,
  "aggs": {
    "group_by_price": {
      "range": {
        "field": "price",
        "ranges": [
          {
            "from": 0,
            "to": 20
          },
          {
            "from": 20,
            "to": 40
          },
          {
            "from": 40,
            "to": 60
          }
        ]
      },
      "aggs": {
        "group_by_tags": {
          "terms": {
            "field": "tags",
            "order": {
              "avg_price": "desc"
            }
          },
          "aggs": {
            "avg_price": {
              "avg": {
                "field": "price"
              }
            }
          }
        }
      }
    }
  }
}
Response:
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 6,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "group_by_price" : {
      "buckets" : [
        {
          "key" : "0.0-20.0",
          "from" : 0.0,
          "to" : 20.0,
          "doc_count" : 0,
          "group_by_tags" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [ ]
          }
        },
        {
          "key" : "20.0-40.0",
          "from" : 20.0,
          "to" : 40.0,
          "doc_count" : 4,
          "group_by_tags" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "meibai",
                "doc_count" : 1,
                "avg_price" : {
                  "value" : 34.0
                }
              },
              {
                "key" : "fangzhu",
                "doc_count" : 4,
                "avg_price" : {
                  "value" : 27.25
                }
              }
            ]
          }
        },
        {
          "key" : "40.0-60.0",
          "from" : 40.0,
          "to" : 60.0,
          "doc_count" : 2,
          "group_by_tags" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "meibai",
                "doc_count" : 1,
                "avg_price" : {
                  "value" : 50.0
                }
              },
              {
                "key" : "qingxin",
                "doc_count" : 1,
                "avg_price" : {
                  "value" : 40.0
                }
              }
            ]
          }
        }
      ]
    }
  }
}
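In the response above, the product priced at 40 lands in the 40.0-60.0 bucket rather than in 20.0-40.0, because each range includes its "from" value and excludes its "to" value. Each range can also be given a custom key so the buckets are easier to read. A sketch with the hypothetical labels low, mid, and high (these names are made up for this example):

```console
GET /ecommerce/_search
{
  "size": 0,
  "aggs": {
    "group_by_price": {
      "range": {
        "field": "price",
        "ranges": [
          { "key": "low", "from": 0, "to": 20 },
          { "key": "mid", "from": 20, "to": 40 },
          { "key": "high", "from": 40, "to": 60 }
        ]
      }
    }
  }
}
```

With custom keys, each bucket's "key" in the response is the given label instead of the generated "20.0-40.0" style string.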
3. Further reading
Elasticsearch Notes (7): Aggregation Queries (大数据布道, CSDN blog)
https://blog.csdn.net/alex_xfboy/article/details/86100037