ES Failures

Author: Ary_zz | Published 2019-10-18 10:41

    2019-10-18

    primary shard lost

    unassigned_info

    "can_allocate" : "no_valid_shard_copy", "allocate_explanation" : "cannot allocate because all found copies of the shard are either stale or corrupt"

    Several kinds of errors:

    • {
        "node_id" : "dBd4onKFSLSvrxgFCIP6GQ",
        "node_name" : "elasticsearch-data-86d6d959c5-ddlfd",
        "transport_address" : "172.16.38.77:9300",
        "node_decision" : "no",
        "store" : {
          "in_sync" : false,
          "allocation_id" : "SA157qPdRViXVK2ie2QgKg",
          "store_exception" : {
            "type" : "file_not_found_exception",
            "reason" : "no segments* file found in SimpleFSDirectory@/data/db/nodes/0/indices/sRSOtM-URGGeOs49IACD8w/1/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@42971bc2: files: [write.lock]"
          }
        }
      }
      
    • {
        "node_id" : "WJSf3f08Riuy-4kajyLb6A",
        "node_name" : "elasticsearch-data-86d6d959c5-8jb7x",
        "transport_address" : "172.16.126.15:9300",
        "node_decision" : "no",
        "store" : {
          "in_sync" : false,
          "allocation_id" : "A8L3_M7SQG-ZSy7zbBNWFg"
        }
      }
      
      
    • {
        "node_id" : "9S-fKqTkQg-06muuMl20Uw",
        "node_name" : "elasticsearch-data-86d6d959c5-j89bq",
        "transport_address" : "172.16.23.40:9300",
        "node_decision" : "no",
        "deciders" : [
          {
            "decider" : "disk_threshold",
            "decision" : "NO",
            "explanation" : "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], using more disk space than the maximum allowed [85.0%], actual free: [13.930268425592923%]"
          }
        ]
      },
      {
        "node_id" : "E2BvZJ4jQu2anQzaQrgCLA",
        "node_name" : "elasticsearch-data-86d6d959c5-f2957",
        "transport_address" : "172.16.63.6:9300",
        "node_decision" : "no",
        "deciders" : [
          {
            "decider" : "disk_threshold",
            "decision" : "NO",
            "explanation" : "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], using more disk space than the maximum allowed [85.0%], actual free: [14.635894063472007%]"
          }
        ]
      },
      {
        "node_id" : "F_iPC-LbQ9uH9NquGrnYmw",
        "node_name" : "elasticsearch-data-86d6d959c5-2dlpb",
        "transport_address" : "172.16.95.7:9300",
        "node_decision" : "no",
        "deciders" : [
          {
            "decider" : "disk_threshold",
            "decision" : "NO",
            "explanation" : "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], using more disk space than the maximum allowed [85.0%], actual free: [14.663439808514775%]"
          }
        ]
      },
      {
        "node_id" : "lj1_omL1RYOSo5xws7ibQg",
        "node_name" : "elasticsearch-data-86d6d959c5-khh96",
        "transport_address" : "172.16.53.13:9300",
        "node_decision" : "no",
        "deciders" : [
          {
            "decider" : "disk_threshold",
            "decision" : "NO",
            "explanation" : "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], using more disk space than the maximum allowed [85.0%], actual free: [14.5565950622084%]"
          }
        ]
      }
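
    The three kinds of entries above can be told apart mechanically. The following is a minimal sketch, assuming the entries have been collected into a Python list shaped like the excerpts above; the function name `summarize_node_decisions` and the reason strings are illustrative, not part of Elasticsearch.

```python
def summarize_node_decisions(node_decisions):
    """Map each node name to a short reason why it rejected the shard.

    `node_decisions` is assumed to be a list of per-node entries shaped
    like the allocation explain excerpts in this post.
    """
    summary = {}
    for node in node_decisions:
        name = node.get("node_name", node.get("node_id", "unknown"))
        store = node.get("store", {})
        blocking = [d["decider"] for d in node.get("deciders", [])
                    if d.get("decision") == "NO"]
        if "store_exception" in store:
            # The on-disk copy could not be opened (e.g. no segments file).
            reason = "unreadable copy: " + store["store_exception"].get("type", "unknown")
        elif store and not store.get("in_sync", True):
            # A copy exists but is not in the in-sync set, i.e. it is stale.
            reason = "stale copy"
        elif blocking:
            # No usable store info; an allocation decider said NO.
            reason = "blocked by " + ", ".join(blocking)
        else:
            reason = "no reason reported"
        summary[name] = reason
    return summary


# Trimmed-down versions of the three kinds of entries shown in this post.
decisions = [
    {"node_name": "elasticsearch-data-86d6d959c5-ddlfd", "node_decision": "no",
     "store": {"in_sync": False,
               "store_exception": {"type": "file_not_found_exception"}}},
    {"node_name": "elasticsearch-data-86d6d959c5-8jb7x", "node_decision": "no",
     "store": {"in_sync": False}},
    {"node_name": "elasticsearch-data-86d6d959c5-j89bq", "node_decision": "no",
     "deciders": [{"decider": "disk_threshold", "decision": "NO"}]},
]

for name, reason in summarize_node_decisions(decisions).items():
    print(name, "->", reason)
```

    Grouping nodes this way makes it easier to see that only the disk_threshold nodes are a capacity problem, while the other two never had a usable copy of the shard.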
      
      

    ES allocation strategy

    https://doc.yonyoucloud.com/doc/mastering-elasticsearch/chapter-4/43_README.html

    Data volume too large

    shards disk.indices disk.used disk.avail disk.total disk.percent host           ip             node
      4060      442.9gb   443.4gb     56.5gb      500gb           88 172.16.100.115 172.16.100.115 elasticsearch-data-f8449cccf-2qblf
      3719        439gb   442.2gb     57.7gb      500gb           88 172.16.3.50    172.16.3.50    elasticsearch-data-f8449cccf-bxpmh
       490      103.8gb   459.2gb     40.7gb      500gb           91 172.16.98.138  172.16.98.138  elasticsearch-data-f8449cccf-mr2bm
      3742      439.8gb   446.6gb     53.3gb      500gb           89 172.16.51.88   172.16.51.88   elasticsearch-data-f8449cccf-b74rw
      3631      463.9gb   464.4gb     35.5gb      500gb           92 172.16.98.137  172.16.98.137  elasticsearch-data-f8449cccf-gz62r
     11294                                                                                         UNASSIGNED
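
    To read a table like this against the disk watermarks, a small script helps. This is a rough sketch, assuming the text is whitespace-separated output like the table above and that the thresholds are the 85/90/95 defaults; `parse_cat_allocation` and `watermark_stage` are hypothetical helper names. Note that disk.percent is a rounded integer, so values right at a threshold may be classified imprecisely.

```python
def parse_cat_allocation(text):
    """Parse whitespace-separated allocation output into a list of dicts."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    rows = []
    for ln in lines[1:]:
        fields = ln.split()
        if len(fields) != len(header):
            # The UNASSIGNED row has only a shard count and a label; skip it.
            continue
        rows.append(dict(zip(header, fields)))
    return rows


def watermark_stage(disk_percent, low=85, high=90, flood=95):
    """Classify a disk usage percentage against the (assumed default) watermarks."""
    if disk_percent > flood:
        return "flood_stage"
    if disk_percent > high:
        return "high"
    if disk_percent > low:
        return "low"
    return "ok"


# Three rows copied from the table in this post, plus the UNASSIGNED row.
sample = """
shards disk.indices disk.used disk.avail disk.total disk.percent host ip node
  4060 442.9gb 443.4gb 56.5gb 500gb 88 172.16.100.115 172.16.100.115 elasticsearch-data-f8449cccf-2qblf
   490 103.8gb 459.2gb 40.7gb 500gb 91 172.16.98.138 172.16.98.138 elasticsearch-data-f8449cccf-mr2bm
  3631 463.9gb 464.4gb 35.5gb 500gb 92 172.16.98.137 172.16.98.137 elasticsearch-data-f8449cccf-gz62r
 11294 UNASSIGNED
"""

stages = {row["node"]: watermark_stage(int(row["disk.percent"]))
          for row in parse_cat_allocation(sample)}
for node, stage in stages.items():
    print(node, stage)
```

    With the numbers above, every data node is already past the low watermark, which is consistent with no node accepting the unassigned shards.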
    

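    Problems start once disk usage passes roughly 80%. Strictly speaking, the low watermark that blocks allocation defaults to 85%, as the disk_threshold explanations above show.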

    Quoting the ES documentation:

    cluster.routing.allocation.disk.threshold_enabled Defaults to true. Set to false to disable the disk allocation decider.

    cluster.routing.allocation.disk.watermark.low Controls the low watermark for disk usage. It defaults to 85%, meaning that Elasticsearch will not allocate shards to nodes that have more than 85% disk used. It can also be set to an absolute byte value (like 500mb) to prevent Elasticsearch from allocating shards if less than the specified amount of space is available. This setting has no effect on the primary shards of newly-created indices or, specifically, any shards that have never previously been allocated.

    cluster.routing.allocation.disk.watermark.high Controls the high watermark. It defaults to 90%, meaning that Elasticsearch will attempt to relocate shards away from a node whose disk usage is above 90%. It can also be set to an absolute byte value (similarly to the low watermark) to relocate shards away from a node if it has less than the specified amount of free space. This setting affects the allocation of all shards, whether previously allocated or not.

    cluster.routing.allocation.disk.watermark.flood_stage Controls the flood stage watermark. It defaults to 95%, meaning that Elasticsearch enforces a read-only index block (index.blocks.read_only_allow_delete) on every index that has one or more shards allocated on the node that has at least one disk exceeding the flood stage. This is a last resort to prevent nodes from running out of disk space. The index block must be released manually once there is enough disk space available to allow indexing operations to continue.
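
    The settings quoted above are changed through the cluster settings API, and the read-only block is cleared through the index settings API. The sketch below only builds the JSON request bodies; the helper names are hypothetical, the percentage values are examples rather than recommendations, and sending the bodies (for instance to `PUT /_cluster/settings` and `PUT /<index>/_settings`) is left to whatever HTTP client is in use.

```python
import json

WATERMARK_PREFIX = "cluster.routing.allocation.disk.watermark."


def watermark_settings(low="85%", high="90%", flood_stage="95%", persistent=True):
    """Build a cluster settings body that sets the three disk watermarks."""
    scope = "persistent" if persistent else "transient"
    return {
        scope: {
            WATERMARK_PREFIX + "low": low,
            WATERMARK_PREFIX + "high": high,
            WATERMARK_PREFIX + "flood_stage": flood_stage,
        }
    }


def release_read_only_block():
    """Build an index settings body that resets the flood-stage block.

    Setting the value to null (None in Python) removes the block.
    """
    return {"index.blocks.read_only_allow_delete": None}


# Example values only: raise the low and high watermarks slightly.
print(json.dumps(watermark_settings(low="87%", high="92%"), indent=2))
print(json.dumps(release_read_only_block()))
```

    Raising the watermarks only buys time. The block should be released after disk space has actually been freed, otherwise the node will cross the flood stage again.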

    Other issues

    Memory segmentation fault, core dump

    segmentation fault

    Java off-heap

    es memory limit 5 5 10


    Article link: https://www.haomeiwen.com/subject/wecimctx.html