Let's start with an exception:
org.elasticsearch.ElasticsearchStatusException: Elasticsearch exception [type=search_phase_execution_exception, reason=all shards failed]
at org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:177)
at org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:618)
at org.elasticsearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:594)
at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:501)
at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:474)
at org.elasticsearch.client.RestHighLevelClient.search(RestHighLevelClient.java:391)
at com.xxxx.assets.service.es.factory.rest.EsHighClientService.queryByPage(EsHighClientService.java:82)
... 21 common frames omitted
Suppressed: org.elasticsearch.client.ResponseException: method [POST], host [http://abc.xxxx.com:9900], URI [/blood_relation_index/blood_relation/_search?typed_keys=true&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&search_type=dfs_query_then_fetch&batched_reduce_size=512], status line [HTTP/1.1 500 Internal Server Error]{"error":{"root_cause":[{"type":"query_phase_execution_exception","reason":"Result window is too large, from + size must be less than or equal to: [10000] but was [20000]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting."}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"blood_relation_index","node":"RKah0wB7RDeQMmmawJqMHA","reason":{"type":"query_phase_execution_exception","reason":"Result window is too large, from + size must be less than or equal to: [10000] but was [20000]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting."}}]},"status":500}
at org.elasticsearch.client.RestClient$1.completed(RestClient.java:357)
at org.elasticsearch.client.RestClient$1.completed(RestClient.java:346)
at org.apache.http.concurrent.BasicFuture.completed(BasicFuture.java:122)
at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseCompleted(DefaultClientExchangeHandlerImpl.java:177)
at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.processResponse(HttpAsyncRequestExecutor.java:436)
at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.inputReady(HttpAsyncRequestExecutor.java:326)
at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265)
at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81)
at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39)
at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)
at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588)
... 1 common frames omitted
The key information, extracted from the ResponseException:
POST, HTTP 500
{
  "error": {
    "root_cause": [
      {
        "type": "query_phase_execution_exception",
        "reason": "Result window is too large, from + size must be less than or equal to: [10000] but was [20000]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting."
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "blood_relation_index",
        "node": "RKah0wB7RDeQMmmawJqMHA",
        "reason": {
          "type": "query_phase_execution_exception",
          "reason": "Result window is too large, from + size must be less than or equal to: [10000] but was [20000]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting."
        }
      }
    ]
  },
  "status": 500
}
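In plain words: the result window is too large. from + size must be less than or equal to [10000], but this query asked for [20000]. For a more efficient way to request large data sets, see the scroll API. The limit can also be changed through the index-level setting [index.max_result_window].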
Analysis
On the ES server, index.max_result_window is 10000 for this index, and the from + size of our query exceeded that limit.
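The condition the server enforces can be written down as a small check. This is only a sketch of the rule quoted in the error message, not Elasticsearch's own code; the function name is made up for illustration, and the split of the failing request into from=19900 and size=100 is an assumption, since the log only shows the total of 20000.

```python
DEFAULT_MAX_RESULT_WINDOW = 10000  # value reported in the error message


def exceeds_result_window(from_: int, size: int,
                          max_result_window: int = DEFAULT_MAX_RESULT_WINDOW) -> bool:
    """Return True when from + size is larger than the allowed window."""
    return from_ + size > max_result_window


# Assumed split of the failing request: only the total (20000) is in the log.
print(exceeds_result_window(19900, 100))  # True
# from + size equal to the limit is still accepted ("less than or equal to").
print(exceeds_result_window(9900, 100))   # False
```

Running the same check on the client side before sending a query makes it possible to return a friendly message instead of surfacing an HTTP 500.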
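Question: why was the limit exceeded?
Ordinary ES paginated queries
Suppose you page with size=100 and request page 100. Then from = (100 - 1) * 100 = 9900 and size = 100, so ES has to collect up to 10000 documents from each shard. With 3 shards that is 3 * 10000 documents in total, which are then merged, sorted, and filtered before the final 100 matching documents are returned. For page 101, from = 10000 and ES would collect 10100 documents from each shard, which is already past the limit of 10000.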
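The arithmetic above can be reproduced with a few lines. The helper names below are illustrative, and the counts are upper bounds: a shard that holds fewer matching documents simply returns fewer.

```python
from typing import Tuple


def page_to_from(page: int, size: int) -> int:
    """Convert a 1-based page number into the `from` offset."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return (page - 1) * size


def docs_collected(page: int, size: int, shards: int) -> Tuple[int, int]:
    """Return (per_shard, total) candidate counts for one page request."""
    per_shard = page_to_from(page, size) + size
    return per_shard, per_shard * shards


for page in (100, 101, 200):
    per_shard, total = docs_collected(page, size=100, shards=3)
    print(page, page_to_from(page, 100), per_shard, total)
# 100 9900 10000 30000
# 101 10000 10100 30300
# 200 19900 20000 60000
```

With size=100, page 200 gives from + size = 20000, which matches the value in the error message, although the log does not show the actual from and size that were sent.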
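The deep pagination problem
Clearly, the deeper the page, the more data ES has to pull from every shard, and performance drops accordingly: the memory and time a request needs grow with from + size on each shard, even though only size documents are returned in the end.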
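To make the cost visible, here is a toy simulation of the gather-and-merge step. It is a simplified model with plain lists of scores standing in for shards, not how Elasticsearch is implemented internally; all names are illustrative.

```python
import heapq
import random


def shard_top(scores, from_, size):
    """Each shard returns its own best from + size scores, highest first."""
    return heapq.nlargest(from_ + size, scores)


def coordinator_page(shards, from_, size):
    """Merge the per-shard candidates, then keep only the requested page."""
    candidates = [shard_top(scores, from_, size) for scores in shards]
    # Every candidate list is sorted in descending order, so a k-way merge
    # with reverse=True produces one globally sorted list.
    merged = list(heapq.merge(*candidates, reverse=True))
    return merged[from_:from_ + size], len(merged)


rng = random.Random(0)
shards = [[rng.random() for _ in range(500)] for _ in range(3)]
page, handled = coordinator_page(shards, from_=200, size=10)
print(len(page), handled)  # 10 630
```

Only 10 results are returned, but 630 candidates had to be collected and sorted to find them. Increase from_ and the second number grows with it while the first stays at 10.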
This is exactly why index.max_result_window is set to 10000: it keeps deep paging from exhausting ES memory and causing an OOM.
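As the error message says, the limit is an index-level setting and can be changed. The sketch below only builds the settings update request (method, path, and JSON body) and does not send it; the function name is illustrative, and the value 20000 is just the number from the failed query, not a recommendation. Raising the limit trades memory safety for deeper paging, so treat it as a last resort.

```python
import json


def max_result_window_request(index: str, new_limit: int):
    """Build the index settings update that changes index.max_result_window."""
    if new_limit < 1:
        raise ValueError("new_limit must be positive")
    path = f"/{index}/_settings"
    body = {"index": {"max_result_window": new_limit}}
    return "PUT", path, body


method, path, body = max_result_window_request("blood_relation_index", 20000)
print(method, path)
print(json.dumps(body))
```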
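Optimizations and fixes
Choose based on the scenario:
1. If the requirement does not call for deep paging, cap the paging depth and the amount of data a query may return.
2. Alternatively, restrict the interaction itself: disallow jumping to arbitrary pages, in which case scroll can be used to move through the results sequentially.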
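Both options can be sketched in a few lines. `max_page` computes the deepest page that still fits in the window, and `scroll_all` walks a result set with the scroll API. The `send(method, path, body)` callable is a hypothetical transport that performs the HTTP request and returns the decoded JSON response; it is an assumption for illustration and not part of any client library. The page size and keep-alive values are arbitrary.

```python
def max_page(size: int, max_result_window: int = 10000) -> int:
    """Deepest 1-based page for which from + size stays within the window."""
    if size < 1:
        raise ValueError("size must be positive")
    return max_result_window // size


def scroll_all(send, index, query, page_size=1000, keep_alive="1m"):
    """Yield every hit of a query, batch by batch, using the scroll API.

    `send` is a hypothetical callable: send(method, path, body) -> dict.
    """
    # The first request opens the scroll context and returns the first batch.
    resp = send("POST", f"/{index}/_search?scroll={keep_alive}",
                {"size": page_size, "query": query})
    scroll_id = resp["_scroll_id"]
    try:
        while resp["hits"]["hits"]:
            yield from resp["hits"]["hits"]
            # Each follow-up request returns the next batch.
            resp = send("POST", "/_search/scroll",
                        {"scroll": keep_alive, "scroll_id": scroll_id})
            scroll_id = resp["_scroll_id"]
    finally:
        # Release the scroll context on the server once we are done.
        send("DELETE", "/_search/scroll", {"scroll_id": scroll_id})


print(max_page(100))  # 100
```

A pagination endpoint can reject or clamp any page number above `max_page(size)`, while export-style jobs that need every document can iterate over `scroll_all(...)` instead of asking for ever larger from values.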