Elasticsearch index, doc_values, and

Author: 白奕新 | Published 2019-08-27 21:12

1. Forward index

(1) Description

GET /my_index/_search
{
  "query" : {
    "match" : {
      "body" : "brown"
    }
  },
  "aggs" : {
    "popular_terms": {
      "terms" : {
        "field" : "body"
      }
    }
  }
}

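This request runs a query and an aggregation together.
For the query, the inverted index quickly finds the IDs of the documents that contain "brown".
For the aggregation, using the inverted index alone would mean scanning every term in it and checking each term's posting list against those document IDs, which is very inefficient. The forward index instead stores a mapping from each document ID to the terms it contains, so the aggregation can read the terms of the matching documents directly and work from there.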
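The flow described above can be sketched in a few lines of Python. This is a toy model and not how Lucene actually stores data: the three sample documents, the whitespace tokenizer, and the names `inverted`, `forward`, and `search_and_aggregate` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical sample documents: doc id -> value of the "body" field.
docs = {
    1: "the quick brown fox",
    2: "the brown dog",
    3: "a lazy dog",
}

# Inverted index: term -> set of doc ids that contain the term.
inverted = defaultdict(set)
# Forward index: doc id -> list of terms in that document.
forward = {}
for doc_id, body in docs.items():
    terms = body.split()
    forward[doc_id] = terms
    for term in terms:
        inverted[term].add(doc_id)


def search_and_aggregate(term):
    """Find hits via the inverted index, then count terms via the forward index."""
    hits = sorted(inverted.get(term, set()))
    counts = Counter()
    for doc_id in hits:
        # Count each term once per document, like a per-document term count.
        counts.update(set(forward[doc_id]))
    return hits, counts


hits, counts = search_and_aggregate("brown")
print(hits)
print(dict(counts))
```

Only the documents returned by the search are visited during the aggregation; the third document, which does not contain "brown", never contributes to the counts.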

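(2) Inverted index vs. forward index

Inverted index

[Figure: 倒排索引.png (inverted index)]

Forward index

[Figure: 正排索引.png (forward index)]

(3) Conclusion

Searching uses the inverted index.
Aggregations and sorting use the forward index.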
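Sorting follows the same pattern as aggregation. A minimal sketch, assuming a hypothetical numeric field `price` and a made-up helper `search_sorted`: the inverted index yields the hits, and a forward column (doc id to value) supplies the sort key.

```python
# Hypothetical data: the inverted index entry for one term, and a forward
# (doc id -> value) column for a numeric field called "price".
inverted = {"brown": {1, 2, 4}}
price_column = {1: 30, 2: 10, 3: 50, 4: 20}


def search_sorted(term, column):
    """Look up hits in the inverted index, then sort them by a forward-index value."""
    hits = inverted.get(term, set())
    return sorted(hits, key=lambda doc_id: column[doc_id])


result = search_sorted("brown", price_column)
print(result)  # [2, 4, 1]
```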

2. doc_values

doc_values relates to the forward index: with doc_values enabled, a field's values are serialized and persisted to disk.

PUT /my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "status_code": {
          "type": "string",
          "index": "not_analyzed",
          "doc_values": true
        }
      }
    }
  }
}

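Value	Meaning	Notes
true	The field can be used for aggregations and sorting	Enabled by default for every field except analyzed string fields
false	The field cannot be used for aggregations or sorting	Saves disk space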
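To make "serialized and persisted to disk" concrete, here is a rough sketch of a column of values written to a file and read back for an aggregation. It only illustrates the idea of a doc-id-ordered column on disk; the file name `status_code.col`, the sample values, and the fixed-width integer layout are assumptions, not the format Elasticsearch or Lucene actually uses.

```python
import os
import tempfile
from array import array

# Hypothetical column: position i holds the status_code of doc id i.
status_codes = array("i", [200, 404, 200, 500, 200])

path = os.path.join(tempfile.mkdtemp(), "status_code.col")

# Persist the whole column to disk as one contiguous block.
with open(path, "wb") as f:
    status_codes.tofile(f)

# Read the column back from disk instead of keeping it on the heap.
loaded = array("i")
with open(path, "rb") as f:
    loaded.fromfile(f, len(status_codes))

# Aggregate over the column: count documents per status code.
counts = {}
for code in loaded:
    counts[code] = counts.get(code, 0) + 1
print(counts)  # {200: 3, 404: 1, 500: 1}
```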

3. index

The index parameter relates to the inverted index.

(1) Older versions

PUT /my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "status_code": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}

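Value	Meaning
not_analyzed	The value is not analyzed (tokenized) and is indexed as is. *Default for every type other than "String".*
no	For string fields, the field cannot be searched and no inverted index is built, which saves memory. *For a numeric field with this setting, sum returns 0.*
analyzed	The field is analyzed (tokenized). Default for "String".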
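The difference between the three values can be illustrated with a toy inverted-index builder. This is only a sketch: the lowercase-and-split step stands in for a real analyzer, and `terms_for`, `build_inverted`, and the two sample documents are hypothetical names and data, not part of Elasticsearch.

```python
from collections import defaultdict


def terms_for(value, index_setting):
    """Return the terms that would go into the inverted index (toy model)."""
    if index_setting == "analyzed":
        # Stand-in analyzer: lowercase and split on whitespace.
        return value.lower().split()
    if index_setting == "not_analyzed":
        # The whole value is indexed as a single, unchanged term.
        return [value]
    if index_setting == "no":
        # Nothing is indexed, so the field cannot be searched.
        return []
    raise ValueError(f"unknown index setting: {index_setting}")


def build_inverted(docs, index_setting):
    """Build term -> set of doc ids for the given index setting."""
    inverted = defaultdict(set)
    for doc_id, value in docs.items():
        for term in terms_for(value, index_setting):
            inverted[term].add(doc_id)
    return dict(inverted)


docs = {1: "Quick Brown Fox", 2: "Brown Dog"}  # hypothetical documents
for setting in ("analyzed", "not_analyzed", "no"):
    print(setting, build_inverted(docs, setting))
```

With analyzed, a search for the single term brown matches both documents; with not_analyzed, only the exact full value matches; with no, there is nothing to match against.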