ES Query Syntax Explained: the Search API (long article, be warned)

Author: 右耳菌 | Published: 2022-11-10 23:30

This article uses Elasticsearch 6.8 as its example version.

1. Search API (_search API)

The reference documentation for many versions is available at:
https://www.elastic.co/guide/en/elasticsearch/reference/index.html

A search endpoint can span multiple indices and multiple mapping types. Search parameters can be given either as URI request parameters or in the request body.

Search API endpoints (version 6.8)

GET /bank/account/_search
GET /twitter/_search?q=user:kimchy
GET /kimchy,elasticsearch/_search?q=tag:wow
GET /_all/_search?q=tag:wow
GET /_search?q=tag:wow

1. Search

# Search a single index
GET /bank/_search?q=address:bristol

# Search several indices
GET /bank,songs_v1/_search

# Search all indices
GET /_all/_search

# Search all indices (no index in the path)
GET /_search

2. URL Search

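Here you can also click Copy as curl to see another way of sending the same request. The documentation's description of URI search mentions this use as well:
but it can be handy for quick "curl tests".
Pasted into an editor such as Notepad, the copied command looks like this:
curl -X GET "localhost:9200/twitter/_search?q=user:kimchy&pretty"
This command can be run directly on Linux, which makes it convenient for testing.
A brief explanation of the response returned by the request above: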
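{
   "timed_out": false,  # whether the request timed out
   "took": 62,  # time taken by this query, here 62 ms
   "_shards":{ # shard information
       "total" : 1, # total number of shards queried
       "successful" : 1, # shards that were queried successfully
       "skipped" : 0, # shards that were skipped
       "failed" : 0 # shards that failed
   },
   "hits":{ # information about the matched documents
       "total" : 1, # total number of hits
       "max_score": 1.3862944, # highest score among all hits; with a single hit it equals that hit's score
       "hits" : [ # array of the matching documents
           {
               "_index" : "twitter",  # index the document belongs to
               "_type" : "_doc", # mapping type of the document
               "_id" : "0", # document id
               "_score": 1.3862944, # score of this hit
               "_source" : {  # the original document source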
                   "user" : "kimchy",
                   "date" : "2009-11-15T14:12:12",
                   "message" : "trying out Elasticsearch",
                   "likes": 0
               }
           }
       ]
   }
}
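
As a rough illustration of how a client might read these fields, the Python sketch below parses a trimmed copy of the response above with the standard json module. The summarize helper is a hypothetical name chosen for this example and is not part of any Elasticsearch client.

```python
import json

# Trimmed copy of the sample response above (comments removed so it is valid JSON).
RESPONSE = """
{
  "timed_out": false,
  "took": 62,
  "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
  "hits": {
    "total": 1,
    "max_score": 1.3862944,
    "hits": [
      {"_index": "twitter", "_type": "_doc", "_id": "0", "_score": 1.3862944,
       "_source": {"user": "kimchy", "date": "2009-11-15T14:12:12",
                   "message": "trying out Elasticsearch", "likes": 0}}
    ]
  }
}
"""


def summarize(raw):
    """Hypothetical helper: pull the commonly used fields out of a 6.x search response."""
    body = json.loads(raw)
    hits = body["hits"]["hits"]
    return {
        "took_ms": body["took"],
        "complete": not body["timed_out"] and body["_shards"]["failed"] == 0,
        "total": body["hits"]["total"],  # a plain integer in 6.x responses
        "ids": [hit["_id"] for hit in hits],
        "users": [hit["_source"]["user"] for hit in hits],
    }


summary = summarize(RESPONSE)
print(summary)
```

Checking timed_out and _shards.failed before trusting hits.total is a design choice in this sketch: a response can come back with partial results when some shards fail.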
Only some of the related parameters are listed here; see the official documentation for the full set.

3. Request Body Search

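From/Size: pagination. It is not recommended for deep paging, for the following reason:
When an index has multiple shards, every shard has to be queried and the results merged before the matching page can be picked out. Suppose there are 5 shards. To return the top 10, each of the 5 shards must return its own top 10 (the distribution of matching documents across shards is unknown, so there is no other way), and the 50 candidates are merged and sorted before the first 10 are kept. That seems harmless, but what about results 11-20? Now each of the 5 shards must return its top 20, the 100 candidates are merged, and only items 11-20 are kept. The problem is that the further back the page is, the more every shard has to return and the more the coordinating node has to sort: the work grows with from + size, so efficiency gets worse, not better.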
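
The growth described above can be made concrete with a small simulation. This is a minimal sketch in plain Python, not Elasticsearch code: the shards are ordinary lists, paged_search is a hypothetical function, and the "cost" is simply the number of candidates the coordinating step has to collect.

```python
import heapq
import random


def paged_search(shards, from_, size):
    """Simulate serving one from/size page over several shards.

    Returns the requested page and the number of candidates collected.
    """
    window = from_ + size
    # Each shard can only contribute its own top (from + size) entries.
    candidates = [sorted(shard)[:window] for shard in shards]
    fetched = sum(len(c) for c in candidates)
    # Merge the per-shard sorted lists, then cut out the requested page.
    merged = list(heapq.merge(*candidates))
    return merged[from_:from_ + size], fetched


random.seed(0)
values = list(range(1000))
random.shuffle(values)
shards = [values[i::5] for i in range(5)]  # 5 shards with 200 values each

page1, cost1 = paged_search(shards, 0, 10)
page2, cost2 = paged_search(shards, 10, 10)
deep, cost_deep = paged_search(shards, 190, 10)
print(cost1, cost2, cost_deep)  # 50 100 1000
```

Every page has the same size, yet the number of candidates collected rises from 50 to 100 to 1000 as the page moves deeper.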

The official recommendation is to use Scroll for bulk retrieval (it is not suited to interactive queries) and Search After in other cases.
Example usage:

# From/Size pagination
GET /bank/_search
{
 "sort": [
   {
     "account_number": {
       "order": "asc"
     }
   }
 ],
 "from": 0,
 "size": 10
}

Search After: pagination that addresses the from/size problem

Example:

# Pagination with Search After
GET /bank/_search
{
  "size": 6,
  "sort": [
    {
      "account_number": {
        "order": "asc"
      }
    }
  ]
}


# Next page: search_after holds the sort values of the last hit on the previous page
GET /bank/_search
{
  "size": 6,
  "sort": [
    {
      "account_number": {
        "order": "asc"
      }
    }
  ],
  "search_after": [1]
}

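This approach has a caveat of its own: the sort values should be unique per document, otherwise some documents may never be returned. For example, suppose the first request asks for 10 documents and the last one returned has the sort value 10, while many more documents with the sort value 10 come after it. The next request sends search_after with 10, so all the remaining documents whose value is 10 are skipped. Adding a field that is unique per document to the sort as a tiebreaker avoids this.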
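
The skipped-documents problem can be reproduced with a small simulation. This is a sketch in plain Python under simplifying assumptions: search_after_page and scan_all are hypothetical helpers, the documents are made-up data, and "after" is modelled as a strict greater-than comparison on the sort key.

```python
def search_after_page(docs, size, key, after=None):
    """Simulate one page: sort by key, keep entries strictly after `after`."""
    ordered = sorted(docs, key=key)
    if after is not None:
        ordered = [d for d in ordered if key(d) > after]
    return ordered[:size]


def scan_all(docs, size, key):
    """Page through everything, feeding the last hit's sort values into the next request."""
    seen, after = [], None
    while True:
        page = search_after_page(docs, size, key, after)
        if not page:
            return seen
        seen.extend(d["id"] for d in page)
        after = key(page[-1])


# Made-up data: documents 3 to 7 all share the sort value 10.
docs = [{"id": i, "n": 10 if 3 <= i <= 7 else i} for i in range(10)]

# Sorting on the non-unique field alone loses documents.
seen_plain = scan_all(docs, 6, key=lambda d: (d["n"],))
# Adding the unique id as a tiebreaker returns every document.
seen_tiebreak = scan_all(docs, 6, key=lambda d: (d["n"], d["id"]))

print(len(seen_plain), len(seen_tiebreak))  # 6 10
```

With the plain sort, the first page ends on one of the documents whose value is 10, and the other four are never returned.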

Field Collapsing: collapse search results on the values of a field

Example:

# field collapse
GET /bank/_search
{
 "query": {
   "match": {
     "address": "street"
   }
 }, 
 "collapse": {
   "field": "gender.keyword"
 }
}

# Expand each collapsed group with inner_hits
GET /bank/_search
{
 "query": {
   "match": {
     "address": "street"
   }
 }, 
 "collapse": {
   "field": "gender.keyword",
   "inner_hits" : {
     "name": "details",
     "size": 3,
     "sort": ["account_number"]
   }
 }
}

# Second level of collapsing
GET /bank/_search
{
 "query": {
   "match": {
     "address": "street"
   }
 },
 "collapse": {
   "field": "gender.keyword",
   "inner_hits": {
     "name": "age_coll",
     "collapse": {"field": "age"},
     "size": 3
   }
 }
}
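
To make the effect of collapse and inner_hits easier to picture, here is a minimal sketch in plain Python. It is an approximation for illustration only: collapse is a hypothetical function, the hits are made-up data, and group members are ordered by score rather than by a custom sort such as account_number.

```python
def collapse(hits, field, inner_size=0):
    """Keep the best-scoring hit for each distinct value of `field`.

    If inner_size > 0, each returned hit also carries up to inner_size
    members of its group under "inner_hits".
    """
    groups = {}
    for hit in sorted(hits, key=lambda h: h["_score"], reverse=True):
        groups.setdefault(hit["_source"][field], []).append(hit)
    collapsed = []
    for members in groups.values():
        top = dict(members[0])  # copy so the input hit is not modified
        if inner_size:
            top["inner_hits"] = members[:inner_size]
        collapsed.append(top)
    return collapsed


# Made-up hits for illustration.
hits = [
    {"_id": "1", "_score": 2.0, "_source": {"gender": "M", "age": 30}},
    {"_id": "2", "_score": 1.5, "_source": {"gender": "F", "age": 25}},
    {"_id": "3", "_score": 1.2, "_source": {"gender": "M", "age": 41}},
    {"_id": "4", "_score": 0.9, "_source": {"gender": "F", "age": 33}},
]

top_ids = [h["_id"] for h in collapse(hits, "gender")]
expanded = collapse(hits, "gender", inner_size=3)
inner_ids = [[m["_id"] for m in h["inner_hits"]] for h in expanded]
print(top_ids, inner_ids)
```

Four hits collapse to one per gender value, and the inner_hits variant shows which documents were folded into each group.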

Source filtering: control which parts of _source are returned

Example:

# _source
# Return _source (the default is true)
GET /bank/_search
{
 "query": {
   "match_all": {}
 },
 "_source": true
}

# Hide _source
GET /bank/_search
{
 "query": {
   "match_all": {}
 },
 "_source": false
}

# Return only the specified fields
GET /bank/_search
{
 "query": {
   "match_all": {}
 },
 "_source": ["address", "balance"]
}

# Return everything except the specified fields
GET /bank/_search
{
 "query": {
   "match_all": {}
 },
 "_source": {
   "excludes": ["address", "balance"]
 }
}

# includes: equivalent to listing the fields directly
GET /bank/_search
{
 "query": {
   "match_all": {}
 },
 "_source": {
   "includes": ["address", "balance"]
 }
}
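
The five _source forms above can be summarised in a small Python sketch. It is a simplification for illustration: filter_source is a hypothetical function that handles top-level field names only (no wildcards or dotted paths), and the sample document is made up.

```python
def filter_source(source, spec):
    """Apply a simplified _source spec to one document's source dict."""
    if spec is True:
        return dict(source)
    if spec is False:
        return None  # the hit carries no _source at all
    if isinstance(spec, list):
        spec = {"includes": spec}  # a bare list is shorthand for includes
    includes = spec.get("includes")
    excludes = set(spec.get("excludes", []))
    return {
        name: value
        for name, value in source.items()
        if (includes is None or name in includes) and name not in excludes
    }


# Made-up document for illustration.
doc = {"account_number": 20, "balance": 16418, "address": "282 Kings Place", "age": 36}

only_two = filter_source(doc, ["address", "balance"])
same_two = filter_source(doc, {"includes": ["address", "balance"]})
the_rest = filter_source(doc, {"excludes": ["address", "balance"]})
print(only_two, the_rest)
```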

Fields

Example:

# stored_fields

# stored_fields has no effect on fields that are not stored

PUT /songs_v20
PUT /songs_v20/_mapping/classic
{
 "properties": {
   "songName" : {"type": "text"},
   "singer" : {"type": "keyword"},
   "lyrics" : {
     "type": "text",
     "store": true
   }
 }
}

POST /songs_v20/classic
{
 "songName" : "could this be love",
 "singer" : "James",
 "lyrics" : "Could This Be love,Woke Up This Morning Just Sat In My Bed,8 a.m. First Thing In My Head,Is A Certain Someone Who's Always On My Mind,He Treats Me"
}


GET /songs_v20/_search
{
 "_source": true,
 "query": {
   "match_all": {}
 },
 "stored_fields": ["songName", "singer", "lyrics"]
}

Version: whether to return version information

Example:

# version: whether to return the version of each hit

GET /bank/account/_search
{
 "query": {
   "term": {
     "_id": {
       "value": "20"
     }
   }
 },
 "version": true
}

Script Fields: return values computed from fields

Example:

# script_fields: return values computed from fields
GET /bank/_search
{
 "_source": true,
 "query": {
   "term": {
     "_id": {
       "value": "20"
     }
   }
 },
 "script_fields": {
   "age_2year_later": {
     "script" : {
       "lang": "painless",
       "source" : "doc['age'].value + 2"
     }
   },
   "age_2year_before" : {
     "script" : {
       "lang": "painless",
       "source" : "doc['age'].value - 2"
     }
   }
 }
}

min_score: filter out hits whose score is too low

Example:

# Without min_score, some low-scoring hits are returned
GET /songs_v1/popular/_search
{
 "query": {
   "match": {
     "lyrics": "So many people all around the world"
   }
 }
}

# With min_score, the low-scoring hits are dropped
GET /songs_v1/_search
{
 "query": {
   "match": {
     "lyrics": "so many people all around world"
   }
 },
 "min_score" : 1
}

Sort: sorting

Example:

# sort
GET /bank/_search?size=4
{
 "_source": false,
 "query": {
   "term": {
     "state.keyword": {
       "value": "DC"
     }
   }
 },
 "sort": [
   {
     "age": {
       "order": "asc"
     }
   }
 ]
}

# sort mode

# min max avg
GET /numbers/_search
{
 "sort": [
   {
     "numbers": {
       "order": "desc",
       "mode": "min"
     }
   }
 ]
}


# One way to sort by the length of an array
GET /numbers/_search
{
 "sort": [
   {
     "_script": {
       "type": "number",
       "script": {
         "lang": "painless",
         "source": "doc['numbers'].length"
       },
       "order": "desc"
     }
   }
 ]
}
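
The two sorts above, by mode and by array length, can be pictured with a short Python sketch. It only imitates the ordering on in-memory data: sort_by_mode is a hypothetical function and the documents are made up.

```python
from statistics import mean

# Reducers for the modes named above.
MODES = {"min": min, "max": max, "avg": mean}


def sort_by_mode(docs, field, mode, order="asc"):
    """Order docs by one number derived from a multi-valued field."""
    reducer = MODES[mode]
    return sorted(docs, key=lambda d: reducer(d[field]), reverse=(order == "desc"))


# Made-up documents, each with an array of numbers.
docs = [
    {"id": "a", "numbers": [1, 9]},
    {"id": "b", "numbers": [4, 5, 6]},
    {"id": "c", "numbers": [3]},
]

by_min = [d["id"] for d in sort_by_mode(docs, "numbers", "min", "desc")]
by_max = [d["id"] for d in sort_by_mode(docs, "numbers", "max", "desc")]
# Counterpart of the script sort: order by the number of values in the array.
by_length = [d["id"] for d in sorted(docs, key=lambda d: len(d["numbers"]), reverse=True)]
print(by_min, by_max, by_length)
```

The same three documents come back in a different order for each choice, which is the point of picking a mode explicitly when a field holds several values.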

Highlighting: mark the matched text

Example:

# Highlighting

GET /bank/_search
{
 "query": {
   "match": {
     "address": "green"
   }
 },
 "highlight": {
   "fields": {
     "address": {}
   }
 }
}
# When highlighting is requested, the parts of the result that match the search are wrapped in a tag (<em> by default)

# The tag can be replaced
GET /bank/_search
{
 "query": {
   "match": {
     "address": "green"
   }
 },
 "highlight": {
   "fields": {
     "address": {
       "pre_tags": "<strong>",
       "post_tags": "</strong>"
     }
   }
 }
}
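
What the tags do to a matching field value can be approximated in a few lines of Python. This is a naive sketch for illustration: the real highlighter works on analyzed text, while the hypothetical highlight function below uses a case-insensitive whole-word regular expression, and the address is made up.

```python
import re


def highlight(text, term, pre_tag="<em>", post_tag="</em>"):
    """Wrap whole-word, case-insensitive matches of `term` in the given tags."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return pattern.sub(lambda m: pre_tag + m.group(0) + post_tag, text)


default_tags = highlight("702 Green Street", "green")
custom_tags = highlight("702 Green Street", "green", "<strong>", "</strong>")
print(default_tags)
print(custom_tags)
```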
