Elasticsearch Search Series: Collapse Search

Author: 走过分叉路 | Published: 2022-06-14 23:06

1. Basics

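  • The field used for collapsing must be a single-valued keyword or numeric field with doc_values enabled; multi-valued (array) fields are not supported.
  • Collapsing does not change the total count in the search response. To count the number of distinct groups after collapsing, use an aggregation.

Test data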
[
    {
        "name": "xiaomi phone",
        "desc": "shouji zhong de zhandouji",
        "price": 3999,
        "tags": [
            "xingjiabi",
            "fashao",
            "buka"
        ]
    },
    {
        "name": "xiaomi phone",
        "desc": "exercitation commodo cillum",
        "price": 23,
        "tags": [
            "sunt"
        ]
    },
    {
        "name": "几水即统",
        "desc": "labore commodo ullamco",
        "price": 46,
        "tags": [
            "qui",
            "proident",
            "ut Duis sint cillum"
        ]
    },
    {
        "name": "天常表效",
        "desc": "fugiat culpa dolor",
        "price": 46,
        "tags": [
            "non cupidatat aute magna occaecat",
            "veniam reprehenderit",
            "nulla commodo quis laborum",
            "fugiat ex minim nulla"
        ]
    },
    {
        "name": "上容等位得志",
        "desc": "consequat do laboris magna anim",
        "price": 46,
        "tags": [
            "amet ex ut sed aute",
            "dolor veniam",
            "consequat ut aute sunt fugiat",
            "ullamco ipsum sed",
            "tempor enim veniam eu consectetur"
        ]
    }
]
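
The difference between the total count and the number of groups can be checked against the test data. The sketch below does the counting in plain Python and then builds a request body that would ask Elasticsearch for the group count through a cardinality aggregation. The aggregation name unique_prices is an assumption chosen for illustration, and cardinality counts are approximate on large data sets.

```python
import json

# Prices of the five test documents, in the order they are listed above.
prices = [3999, 23, 46, 46, 46]

total_hits = len(prices)          # what hits.total.value reports
unique_groups = len(set(prices))  # how many hits remain after collapsing on price

# Hypothetical request body: collapse on price and count the groups with
# a cardinality aggregation (the name "unique_prices" is an assumption).
body = {
    "query": {"exists": {"field": "name"}},
    "collapse": {"field": "price"},
    "aggs": {"unique_prices": {"cardinality": {"field": "price"}}},
}

print(total_hits, unique_groups)
print(json.dumps(body, indent=2))
```

The body is only printed here; sending it to a cluster is left to whichever client is in use.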

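2. Search examples

2.1 Collapse search

Search request

  • The exists query on name is used here simply to match every document;
  • Results are collapsed on the price field;
  • Results are sorted by price in descending order; the sort also decides which document represents each price group;
  • from sets how many collapsed results to skip: with 3 collapsed results and from=1, the output starts at the second collapsed document.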
{
    "query": {
        "exists": {
            "field": "name"
        }
    },
    "collapse": {
        "field": "price"
    },
    "sort": [
        {
            "price": {
                "order": "desc"
            }
        }
    ],
    "from": 0
}

Search result

{
    "took": 2,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 5, // document count before collapsing
            "relation": "eq"
        },
        "max_score": null,
        "hits": [
            {
                "_index": "product",
                "_id": "1",
                "_score": null,
                "_source": {
                    "name": "xiaomi phone",
                    "desc": "shouji zhong de zhandouji",
                    "price": 3999,
                    "tags": [
                        "xingjiabi",
                        "fashao",
                        "buka"
                    ]
                },
                "fields": {
                    "price": [
                        3999
                    ]
                },
                "sort": [
                    3999
                ]
            },
            {
                "_index": "product",
                "_id": "3",
                "_score": null,
                "_source": {
                    "name": "几水即统",
                    "desc": "labore commodo ullamco",
                    "price": 46,
                    "tags": [
                        "qui",
                        "proident",
                        "ut Duis sint cillum"
                    ]
                },
                "fields": {
                    "price": [
                        46
                    ]
                },
                "sort": [
                    46
                ]
            },
            {
                "_index": "product",
                "_id": "2",
                "_score": null,
                "_source": {
                    "name": "xiaomi phone",
                    "desc": "exercitation commodo cillum",
                    "price": 23,
                    "tags": [
                        "sunt"
                    ]
                },
                "fields": {
                    "price": [
                        23
                    ]
                },
                "sort": [
                    23
                ]
            }
        ]
    }
}
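
The behaviour seen in this response can be modelled locally. The following is a minimal sketch, not how Elasticsearch implements collapsing: it sorts the test documents, keeps the first document for each price, and applies from to the collapsed list. The function name collapse_search and the _id "5" for the fifth document are assumptions, and ties are resolved by list order here, which a real cluster does not guarantee.

```python
# Test documents reduced to the fields that matter for collapsing.
# _id "5" is assumed; the article's responses only show _id 1 to 4.
docs = [
    {"_id": "1", "name": "xiaomi phone", "price": 3999},
    {"_id": "2", "name": "xiaomi phone", "price": 23},
    {"_id": "3", "name": "几水即统", "price": 46},
    {"_id": "4", "name": "天常表效", "price": 46},
    {"_id": "5", "name": "上容等位得志", "price": 46},
]


def collapse_search(docs, field, sort_field, descending=True, from_=0, size=10):
    """Local model of collapse: one top-sorted document per field value."""
    # sorted() is stable, so documents with equal sort values keep list order.
    ranked = sorted(docs, key=lambda d: d[sort_field], reverse=descending)
    seen = set()
    representatives = []
    for doc in ranked:
        if doc[field] not in seen:
            seen.add(doc[field])
            representatives.append(doc)
    # total counts matching documents, not groups; from_ skips collapsed hits.
    return {"total": len(docs), "hits": representatives[from_:from_ + size]}


first_page = collapse_search(docs, "price", "price")
skipped_one = collapse_search(docs, "price", "price", from_=1)
print(first_page["total"], [d["_id"] for d in first_page["hits"]])
print([d["_id"] for d in skipped_one["hits"]])
```

With from_=1 the first collapsed hit (price 3999) is skipped, while total stays at the number of matching documents.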

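2.2 Expanding collapse results

  • The key parameter is inner_hits. If a collapse key groups 100 matching documents, inner_hits returns only the top N of them, where N is set by the size parameter.
  • max_concurrent_group_searches: expanding groups with inner_hits sends additional requests, so it is worth capping how many of them run concurrently. The default is derived from the number of data nodes and the size of the search thread pool.
  • inner_hits can also be an array, which returns several views of each group along different dimensions, for example: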
"inner_hits": [
      {
        "name": "largest_responses",     
        "size": 3,
        "sort": [
          {
            "http.response.bytes": {
              "order": "desc"
            }
          }
        ]
      },
      {
        "name": "most_recent",             
        "size": 3,
        "sort": [
          {
            "@timestamp": {
              "order": "desc"
            }
          }
        ]
      }
    ]
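  • How expansion works:
    every inner_hits entry of every collapsed group is fetched with an additional query, so too many groups or inner_hits entries will slow the response down noticeably.
  • A second level of collapsing is supported inside inner_hits (the second level cannot define inner_hits of its own), for example: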
{
  "query": {
    "match": {
      "message": "GET /search"
    }
  },
  "collapse": {
    "field": "geo.country_name",
    "inner_hits": {
      "name": "by_location",
      "collapse": { "field": "user.id" },
      "size": 3
    }
  }
}
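
To illustrate what second-level collapsing returns, here is a small local model in Python. It is a sketch under stated assumptions: the log records, their scores, and the helper names top_per_key and two_level_collapse are all invented for illustration, and flat keys country and user stand in for geo.country_name and user.id.

```python
# Hypothetical log records; values and scores are invented for illustration.
logs = [
    {"country": "Canada", "user": "kimchy", "score": 0.9},
    {"country": "Canada", "user": "kimchy", "score": 0.8},
    {"country": "Canada", "user": "elkbee", "score": 0.7},
    {"country": "France", "user": "vlb44hny", "score": 0.6},
    {"country": "France", "user": "vlb44hny", "score": 0.5},
]


def top_per_key(records, key, limit=None):
    """Keep the best-scoring record for each distinct value of key."""
    ranked = sorted(records, key=lambda r: r["score"], reverse=True)
    seen, kept = set(), []
    for rec in ranked:
        if rec[key] not in seen:
            seen.add(rec[key])
            kept.append(rec)
    return kept[:limit]  # limit=None keeps every group


def two_level_collapse(records, outer, inner, size=3):
    """For each outer group, return up to size records, one per inner key."""
    result = {}
    for top in top_per_key(records, outer):
        members = [r for r in records if r[outer] == top[outer]]
        result[top[outer]] = top_per_key(members, inner, limit=size)
    return result


by_location = two_level_collapse(logs, "country", "user")
for country, hits in by_location.items():
    print(country, [h["user"] for h in hits])
```

Each country appears once, and inside it each user appears at most once, capped at size entries.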

Search request

{
    "query": {
        "exists": {
            "field": "name"
        }
    },
    "collapse": {
        "field": "price",
        "inner_hits":{
            "name":"top_2_price",
            "size":2,
            "sort":[
                {
                    "price":"desc"
                }
            ]
        },
        "max_concurrent_group_searches":1
    },
    "sort": [
        {
            "price": {
                "order": "desc"
            }
        }
    ],
    "from": 0
}

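Response
The price-46 group contains 3 documents (its inner_hits total.value is 3), and inner_hits returns the top 2 of them, as set by size.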

{
    "took": 22,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 5,
            "relation": "eq"
        },
        "max_score": null,
        "hits": [
            {
                "_index": "product",
                "_id": "1",
                "_score": null,
                "_source": {
                    "name": "xiaomi phone",
                    "desc": "shouji zhong de zhandouji",
                    "price": 3999,
                    "tags": [
                        "xingjiabi",
                        "fashao",
                        "buka"
                    ]
                },
                "fields": {
                    "price": [
                        3999
                    ]
                },
                "sort": [
                    3999
                ],
                "inner_hits": {
                    "top_2_price": {
                        "hits": {
                            "total": {
                                "value": 1,
                                "relation": "eq"
                            },
                            "max_score": null,
                            "hits": [
                                {
                                    "_index": "product",
                                    "_id": "1",
                                    "_score": null,
                                    "_source": {
                                        "name": "xiaomi phone",
                                        "desc": "shouji zhong de zhandouji",
                                        "price": 3999,
                                        "tags": [
                                            "xingjiabi",
                                            "fashao",
                                            "buka"
                                        ]
                                    },
                                    "sort": [
                                        3999
                                    ]
                                }
                            ]
                        }
                    }
                }
            },
            {
                "_index": "product",
                "_id": "3",
                "_score": null,
                "_source": {
                    "name": "几水即统",
                    "desc": "labore commodo ullamco",
                    "price": 46,
                    "tags": [
                        "qui",
                        "proident",
                        "ut Duis sint cillum"
                    ]
                },
                "fields": {
                    "price": [
                        46
                    ]
                },
                "sort": [
                    46
                ],
                "inner_hits": {
                    "top_2_price": {
                        "hits": {
                            "total": {
                                "value": 3,
                                "relation": "eq"
                            },
                            "max_score": null,
                            "hits": [
                                {
                                    "_index": "product",
                                    "_id": "3",
                                    "_score": null,
                                    "_source": {
                                        "name": "几水即统",
                                        "desc": "labore commodo ullamco",
                                        "price": 46,
                                        "tags": [
                                            "qui",
                                            "proident",
                                            "ut Duis sint cillum"
                                        ]
                                    },
                                    "sort": [
                                        46
                                    ]
                                },
                                {
                                    "_index": "product",
                                    "_id": "4",
                                    "_score": null,
                                    "_source": {
                                        "name": "天常表效",
                                        "desc": "fugiat culpa dolor",
                                        "price": 46,
                                        "tags": [
                                            "non cupidatat aute magna occaecat",
                                            "veniam reprehenderit",
                                            "nulla commodo quis laborum",
                                            "fugiat ex minim nulla"
                                        ]
                                    },
                                    "sort": [
                                        46
                                    ]
                                }
                            ]
                        }
                    }
                }
            },
            {
                "_index": "product",
                "_id": "2",
                "_score": null,
                "_source": {
                    "name": "xiaomi phone",
                    "desc": "exercitation commodo cillum",
                    "price": 23,
                    "tags": [
                        "sunt"
                    ]
                },
                "fields": {
                    "price": [
                        23
                    ]
                },
                "sort": [
                    23
                ],
                "inner_hits": {
                    "top_2_price": {
                        "hits": {
                            "total": {
                                "value": 1,
                                "relation": "eq"
                            },
                            "max_score": null,
                            "hits": [
                                {
                                    "_index": "product",
                                    "_id": "2",
                                    "_score": null,
                                    "_source": {
                                        "name": "xiaomi phone",
                                        "desc": "exercitation commodo cillum",
                                        "price": 23,
                                        "tags": [
                                            "sunt"
                                        ]
                                    },
                                    "sort": [
                                        23
                                    ]
                                }
                            ]
                        }
                    }
                }
            }
        ]
    }
}
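
The shape of this response can be reproduced with a local model of inner_hits. The sketch below is an approximation for illustration only: the function name expand_groups and the _id "5" for the fifth document are assumptions, and the extra-query count simply applies the rule that each group costs one additional query per inner_hits entry.

```python
# Test documents reduced to _id and price; _id "5" is assumed.
docs = [
    {"_id": "1", "price": 3999},
    {"_id": "2", "price": 23},
    {"_id": "3", "price": 46},
    {"_id": "4", "price": 46},
    {"_id": "5", "price": 46},
]


def expand_groups(docs, field, inner_size):
    """Model inner_hits: per group, its total and its top inner_size ids."""
    # Stable sort by price descending; ties keep list order.
    ranked = sorted(docs, key=lambda d: d["price"], reverse=True)
    groups = {}
    for doc in ranked:
        groups.setdefault(doc[field], []).append(doc)
    return [
        {
            "key": value,
            "total": len(members),
            "ids": [d["_id"] for d in members[:inner_size]],
        }
        for value, members in groups.items()
    ]


expanded = expand_groups(docs, "price", inner_size=2)
inner_hits_entries = 1  # the request defines a single inner_hits, top_2_price
extra_queries = len(expanded) * inner_hits_entries

for group in expanded:
    print(group)
print("extra queries:", extra_queries)
```

Three groups with one inner_hits entry each mean three additional queries, which max_concurrent_group_searches set to 1 would run one at a time.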

Article link: https://www.haomeiwen.com/subject/wretvrtx.html