Troubleshooting Summary: ES Cluster Status Is Red

Author: BarretX | Published: 2022-05-08 22:11

A three-node production cluster had been in red status for several days. The notes below trace the root cause and the steps taken to fix it.

Check the cluster status (here and below, `*` stands for the address of a node in the cluster): `curl "http://*:9200/_cluster/health?pretty"`

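(Figure: cluster health output)

The output shows two unassigned shards. The initial suspicion was that shard allocation had failed and left an index in an abnormal state, so the next step was to find that index and confirm.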
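
The two fields that matter at this step are `status` and `unassigned_shards`. A minimal sketch for pulling them out of the health response, assuming the one-field-per-line layout that `?pretty` produces; the helper name `health_field` is hypothetical and not part of the original notes:

```shell
#!/usr/bin/env bash
# Extract one top-level field from pretty-printed _cluster/health JSON on stdin.
# Relies on ?pretty putting each field on its own line; this is not a general JSON parser.
health_field() {
  sed -n "s/^ *\"$1\" *: *\"\{0,1\}\([^\",]*\)\"\{0,1\},\{0,1\} *\$/\1/p"
}

# Usage against a live cluster (replace 127.0.0.1 with a node address):
#   curl -s "http://127.0.0.1:9200/_cluster/health?pretty" | health_field status
#   curl -s "http://127.0.0.1:9200/_cluster/health?pretty" | health_field unassigned_shards
```

Where `jq` is installed, it is the more robust choice for this kind of extraction; the `sed` version only avoids the extra dependency.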

Identify the problem index

# List the status of all indices (v adds a header row)
curl -XGET "http://*:9200/_cat/indices?v"
# List only the indices whose health is red
curl -XGET "http://*:9200/_cat/indices?v&health=red"

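(Figure: the problem index)

This pinpoints the problem index, shown as `some_index_name` in the commands below.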
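
When the index list is long, it can help to reduce the `_cat/indices?v` output to just the names of the red indices. A small sketch, assuming the output includes the header row that `v` adds; the function name `red_indices` is made up for illustration:

```shell
#!/usr/bin/env bash
# Read `_cat/indices?v` output on stdin and print the names of red indices.
# Column positions are taken from the header row, so the `v` flag is required.
red_indices() {
  awk 'NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
       $col["health"] == "red" { print $col["index"] }'
}

# Usage against a live cluster (replace 127.0.0.1 with a node address):
#   curl -s "http://127.0.0.1:9200/_cat/indices?v" | red_indices
```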

Check how the shards of the problem index are allocated: `curl -XGET "http://*:9200/_cat/shards/some_index_name?v"`

(Figure: shard allocation state of the problem index)

Find out why allocation failed by adding the `unassigned.reason` and `unassigned.details` columns: `curl -XGET "http://*:9200/_cat/shards/some_index_name?v&h=n,index,shard,prirep,state,sto,sc,unassigned.reason,unassigned.details"`

Reason the shard allocation failed: timed out while obtaining the shard lock. ALLOCATION_FAILED failed shard on node [TFLMMP_nS6iPUx_C098SbA]: failed to create shard, failure IOException[failed to obtain in-memory shard lock]; nested: ShardLockObtainFailedException[[some_index_name][0]: obtaining shard lock timed out after 5000ms]
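
The cluster allocation explain API (`_cluster/allocation/explain`) offers a more detailed view of the same question. The original notes do not use it, so treat this as an optional cross-check. The sketch below assumes the failed copy is the primary of shard 0, as the log line above suggests; the function names and the host argument are hypothetical:

```shell
#!/usr/bin/env bash
# Build the JSON body that identifies one shard copy for the allocation explain API.
explain_body() {  # args: index shard primary(true|false)
  printf '{"index":"%s","shard":%s,"primary":%s}' "$1" "$2" "$3"
}

# Ask the cluster why that shard copy is unassigned.
explain_shard() {  # args: host index shard primary(true|false)
  curl -s -XGET "http://$1:9200/_cluster/allocation/explain?pretty" \
    -H 'Content-Type: application/json' \
    -d "$(explain_body "$2" "$3" "$4")"
}

# Usage (replace 127.0.0.1 with a node address):
#   explain_shard 127.0.0.1 some_index_name 0 true
```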

Retry allocation of the failed shards: `curl -XPOST "http://*:9200/_cluster/reroute?retry_failed=true"`

After the retry, the cluster status returned to green.
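
Rather than polling by hand after the retry, the health API can block until the cluster reaches a given status through its `wait_for_status` and `timeout` parameters. A sketch under the assumption that 60 seconds is long enough for the shards to recover; the function names are illustrative only:

```shell
#!/usr/bin/env bash
# Build a health URL that blocks until the cluster reaches the given status
# or the timeout elapses.
wait_url() {  # args: host status timeout
  printf 'http://%s:9200/_cluster/health?wait_for_status=%s&timeout=%s&pretty' "$1" "$2" "$3"
}

# Retry the failed allocations, then wait for the cluster to turn green.
retry_and_wait() {  # args: host
  curl -s -XPOST "http://$1:9200/_cluster/reroute?retry_failed=true" > /dev/null
  curl -s "$(wait_url "$1" green 60s)"
}

# Usage (replace 127.0.0.1 with a node address):
#   retry_and_wait 127.0.0.1
```

If the status is not reached in time, the response carries `"timed_out" : true`, which is the cue to go back to the unassigned reason.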

Preventing recurrence

Change the index settings to raise the number of times ES retries a failed shard allocation (the default is 5).

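# ES 6.0 and later reject request bodies sent without a Content-Type header
curl -XPUT "http://*:9200/some_index_name/_settings" -H 'Content-Type: application/json' -d'
{
  "index": {
    "allocation": {
      "max_retries": 20
    }
  }
}'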
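
After changing a setting it is worth reading it back. The sketch below wraps the update and a follow-up `GET` on `index.allocation.max_retries`; the function names and the input validation are additions for illustration, not something the original notes describe:

```shell
#!/usr/bin/env bash
# Build the settings body; reject anything that is not a non-negative integer.
max_retries_body() {  # args: retries
  case "$1" in
    ''|*[!0-9]*) echo "max_retries must be a non-negative integer" >&2; return 1 ;;
  esac
  printf '{"index":{"allocation":{"max_retries":%s}}}' "$1"
}

# Apply the setting to one index, then read it back to confirm it was stored.
set_max_retries() {  # args: host index retries
  local body
  body=$(max_retries_body "$3") || return 1
  curl -s -XPUT "http://$1:9200/$2/_settings" \
    -H 'Content-Type: application/json' -d "$body"
  curl -s -XGET "http://$1:9200/$2/_settings/index.allocation.max_retries?pretty"
}

# Usage (replace 127.0.0.1 with a node address):
#   set_max_retries 127.0.0.1 some_index_name 20
```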

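It is also worth checking the CPU and memory usage of the ES nodes. If both are high, the underlying cause is likely an excess of historical data, and the application layer should add logic to close or delete old indices on a schedule.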
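
One way such scheduled cleanup could look, as a sketch only: it assumes daily indices whose names end in a `YYYY.MM.DD` date (for example `logs-2022.05.08`), a naming scheme the original notes do not mention. The function names and the cutoff value are likewise hypothetical.

```shell
#!/usr/bin/env bash
# Read index names on stdin and print those whose trailing YYYY.MM.DD date is
# earlier than the cutoff. Zero-padded dates in this format sort
# lexicographically, so a plain string comparison is sufficient.
indices_older_than() {  # args: cutoff (YYYY.MM.DD)
  awk -v cutoff="$1" '
    match($1, /[0-9][0-9][0-9][0-9]\.[0-9][0-9]\.[0-9][0-9]$/) {
      if (substr($1, RSTART) < cutoff) print $1
    }'
}

# Close every index older than the cutoff; intended to be run from cron.
close_old_indices() {  # args: host cutoff
  curl -s "http://$1:9200/_cat/indices?h=index" | indices_older_than "$2" |
    while read -r idx; do
      curl -s -XPOST "http://$1:9200/$idx/_close"
      # To delete instead of close: curl -s -XDELETE "http://$1:9200/$idx"
    done
}

# Usage (replace 127.0.0.1 with a node address):
#   close_old_indices 127.0.0.1 2022.04.01
```

Closing keeps the data on disk and can be undone with `_open`, so it is the safer first step; deletion is irreversible.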

Article link: https://www.haomeiwen.com/subject/duugkrtx.html