Query Timeout Settings Not Taking Effect in the ES High Level Client

Author: 梦想又照进现实 | Published: 2020-01-19 11:18

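    Background

    While upgrading our ES query client, we found that with version 6.3.2 the timeout settings had no effect: the HTTP and socket timeouts did nothing, and the timeout set on SearchSourceBuilder did not take effect either. This article records the investigation into the root cause.

    The problem in use

    Official API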
    https://www.elastic.co/guide/en/elasticsearch/client/java-rest/6.3/java-rest-high-search.html

    Version:

    <dependency>
        <groupId>org.elasticsearch.client</groupId>
        <artifactId>elasticsearch-rest-high-level-client</artifactId>
        <version>6.3.2</version>
    </dependency>
    

    Official explanation of the problem:

    The setTimeout method is not reliable. The explanation is as follows:
    Sadly, it is a best effort timeout, its not being checked on all places. Specifically, if you send a query that ends up being rewritten into many terms (fuzzy, or wildcard), that part (the rewrite part) does not check for a timeout.

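    In other words, the search timeout is applied on a best-effort basis and is not checked at every stage. In particular, when a query is rewritten into many terms (fuzzy or wildcard), the rewrite phase never checks the timeout.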

    Other references:
    Others online have run into similar problems:
    1.
    https://discuss.elastic.co/t/settimeout-of-searchrequestbuilder-not-working/11611/2

    2. A case from JD.com (京东)
    https://www.liangzl.com/get-article-detail-136496.html

    Problem and solution as described there:
    An official ES issue discusses this specifically:

    Sadly, it is a best effort timeout, its not being checked on all places. Specifically, if you send a query that ends up being rewritten into many terms (fuzzy, or wildcard), that part (the rewrite part) does not check for a timeout.

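    Link: Timeout on search not respected

    With the Transport API, the timeout is set in the final step of the query, in actionGet() or get(): call actionGet(timeout), whose signature is T actionGet(long var1, TimeUnit var3) throws ElasticsearchException;. With this set, if no result has been returned within the given time, a timeout exception is thrown. This is not a connection timeout but a processing timeout: the logic is that of a Java asynchronous Future timing out. That is still enough for our needs, because a timeout exception is thrown whenever processing has not finished within the set time.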

    FilteredQueryBuilder fqb = QueryBuilders.filteredQuery(QueryBuilders.matchAllQuery(), boolFilterBuilder);
    SearchResponse searchResponse = client.prepareSearch(indexName)
            .setTimeout(TimeValue.timeValueMillis(500))   // best-effort search timeout
            .setTypes(documentType)
            .setSearchType(SearchType.QUERY_THEN_FETCH)
            .setQuery(fqb)
            .addSort("created", SortOrder.DESC)
            .setFrom(0).setSize(30)
            .execute()
            .actionGet(1000);                             // client-side wait, in milliseconds
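
    The difference between a processing timeout and a connection timeout can be illustrated with the JDK alone. The following is a minimal sketch, not Elasticsearch code: slowSearch and searchWithDeadline are hypothetical names, and Thread.sleep stands in for a slow query. It shows that Future.get(timeout) only bounds how long the caller waits; the task itself keeps running unless it is cancelled.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class FutureTimeoutDemo {
    /** Hypothetical stand-in for a slow ES query: takes workMillis to finish. */
    static String slowSearch(long workMillis) throws InterruptedException {
        Thread.sleep(workMillis);
        return "hits";
    }

    /** Waits at most waitMillis for the result; returns "TIMEOUT" if the caller gives up first. */
    static String searchWithDeadline(long workMillis, long waitMillis) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<String> future = pool.submit(() -> slowSearch(workMillis));
            try {
                return future.get(waitMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // Only the wait was abandoned; cancel explicitly to stop the work too.
                future.cancel(true);
                return "TIMEOUT";
            }
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(searchWithDeadline(10, 1000));   // work finishes in time
        System.out.println(searchWithDeadline(2000, 100));  // caller gives up first
    }
}
```

    Note that cancelling a local Future does not stop work already running on the Elasticsearch side; it only frees the calling thread.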
    

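    RestHighLevelClient search source code

    Tracing the source of restHighLevelClient.search(searchRequest) shows that the only timeout parameter ultimately passed to the asynchronous listener is maxRetryTimeoutMillis. The source is as follows: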

    public Response performRequest(String method, String endpoint, Map<String, String> params,
                                   HttpEntity entity, HttpAsyncResponseConsumerFactory httpAsyncResponseConsumerFactory,
                                   Header... headers) throws IOException {
        SyncResponseListener listener = new SyncResponseListener(maxRetryTimeoutMillis);
        performRequestAsyncNoCatch(method, endpoint, params, entity, httpAsyncResponseConsumerFactory,
            listener, headers);
        return listener.get();
    }
    

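    In this setup, setting this timeout caused an IOException to be printed, but no IOException was thrown to the caller. Example of the printed message: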

    java.io.IOException: listener timeout after waiting for [200] ms
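
    To make the mechanism concrete, here is a simplified model of a synchronous listener that waits on a latch with a timeout. It is a sketch only, not the actual RestClient source: SyncListener, onSuccess and call are hypothetical names, and the way this model reports the timeout (by throwing from get()) is an assumption of the sketch. The message format is copied from the log line above.

```java
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class LatchListenerDemo {
    /** Simplified, hypothetical stand-in for a synchronous response listener. */
    static final class SyncListener {
        private final CountDownLatch latch = new CountDownLatch(1);
        private final AtomicReference<String> response = new AtomicReference<>();
        private final long timeoutMillis;

        SyncListener(long timeoutMillis) {
            this.timeoutMillis = timeoutMillis;
        }

        /** Called by the asynchronous side when the response arrives. */
        void onSuccess(String value) {
            response.set(value);
            latch.countDown();
        }

        /** Blocks the caller until a response arrives or the timeout elapses. */
        String get() throws IOException {
            try {
                if (!latch.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                    throw new IOException("listener timeout after waiting for [" + timeoutMillis + "] ms");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IOException("interrupted while waiting for response", e);
            }
            return response.get();
        }
    }

    /** Returns the response, or the exception message if the wait timed out. */
    static String call(long responseDelayMillis, long timeoutMillis) {
        SyncListener listener = new SyncListener(timeoutMillis);
        Thread responder = new Thread(() -> {
            try {
                Thread.sleep(responseDelayMillis);   // simulated server latency
                listener.onSuccess("ok");
            } catch (InterruptedException ignored) {
                Thread.currentThread().interrupt();
            }
        });
        responder.setDaemon(true);
        responder.start();
        try {
            return listener.get();
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(call(10, 1000));    // response arrives before the timeout
        System.out.println(call(2000, 200));   // latch wait expires first
    }
}
```

    As in the Future case, the timeout here bounds only the caller's wait; the request itself is not aborted by the latch expiring.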
    

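    Solution

    Following the principle of the Transport API described above, we add timeout control around the query calls of the high level REST client. The approach, roughly:
    1. Add a Spring asynchronous call.
    2. Use a Java Future to wait a bounded time for the result, and handle the timeout when it occurs.
    The AsyncResult code is as follows: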

    // Runs on the "taskExecutor" pool; must be called through the Spring proxy
    // (that is, from another bean) for @Async to take effect.
    @Async("taskExecutor")
    public Future<List<ProcessMessage>> searchProcessMessageListBySizeAsyn(String[] indexName, String[] typeName, Integer pageSize, BoolQueryBuilder boolQuery) {
        Assert.notNull(indexName, "the indexName string[] can't be null");
        Assert.notNull(typeName, "the typeName string[] can't be null");
        List<ProcessMessage> resultList = this.searchProcessMessageListBySize(indexName, typeName, pageSize, boolQuery);
        return new AsyncResult<>(resultList);
    }
    

    Code that retrieves the data from the Future:

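    try {
        // Wait at most `timeout` ms for the asynchronous search to finish
        list = futureResult.get(timeout, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();   // restore the interrupt flag
        e.printStackTrace();
    } catch (ExecutionException e) {
        e.printStackTrace();
    } catch (TimeoutException e) {
        // Log text: "ES query exceeded the configured timeout of {} ms; returning an empty result."
        log.info("ES查询超出时间设置:{}ms,进行空返回。", timeout);
        list = new ArrayList<>();
    }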
    

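    Result:
    When the whole query exceeds the timeout, execution enters the TimeoutException branch and prints the lines below. The first line says the ES query exceeded the configured 1000 ms and an empty result is returned; 耗时 is the elapsed time in milliseconds.

    ElasticSearchAPITest       : ES查询超出时间设置:1000ms,进行空返回。
    ElasticSearchAPITest       : get:0
    ElasticSearchAPITest       : 耗时:1015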
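
    The same pattern can be packaged as a small reusable helper. The sketch below uses only the JDK and a plain ExecutorService instead of Spring's @Async; getOrEmpty and demo are hypothetical names, and the sleeping task stands in for the real search call. Unlike the snippet above, it also cancels the Future on timeout and falls back to an empty list on every failure path.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimedSearch {
    /** Returns the future's result, or an empty list if it fails or exceeds timeoutMillis. */
    static <T> List<T> getOrEmpty(Future<List<T>> future, long timeoutMillis) {
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);                  // free the worker thread
            return new ArrayList<>();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();   // restore the interrupt flag
            future.cancel(true);
            return new ArrayList<>();
        } catch (ExecutionException e) {
            return new ArrayList<>();             // the search itself failed
        }
    }

    /** Runs a fake query taking queryMillis and returns how many hits the caller received. */
    static int demo(long queryMillis, long timeoutMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<List<String>> future = pool.submit(() -> {
                Thread.sleep(queryMillis);        // simulated query latency
                return Arrays.asList("doc1", "doc2");
            });
            return getOrEmpty(future, timeoutMillis).size();
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(10, 1000));    // query finishes in time
        System.out.println(demo(2000, 100));   // timeout, empty result
    }
}
```

    Returning an empty list hides the difference between "no hits" and "timed out", so callers that need to tell the two apart may prefer to propagate the TimeoutException instead.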
    

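    ES timeout settings

    1. Transport API settings
    Timeout for the client connecting to cluster nodes (client.transport.ping_timeout):
    Settings settings = Settings.builder().put("client.transport.sniff", true).build();
    TransportClient client = new PreBuiltTransportClient(settings);
    client.transport.ping_timeout: "The time to wait for a ping response from a node. Defaults to 5s." This is how long the client waits for a response to its ping; if none arrives, the node is considered unavailable. If network latency between the client and the cluster is high, or the connection is unstable, this value may need to be increased.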

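    Timeout in scroll
    SearchResponse scrollResp = client.prepareSearch(test)
            .addSort(FieldSortBuilder.DOC_FIELD_NAME, SortOrder.ASC)
            .setScroll(new TimeValue(60000))
            .setQuery(qb)
            .setSize(100).get();
    The time passed to setScroll enables scrolling with a keep-alive for the search context. In testing it had no visible effect as a query timeout (yet another "Schrödinger's parameter"), which is consistent with it being a keep-alive rather than a limit on execution time, so it is best not to rely on it for timeout control.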

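    2. RestHighLevel settings
    Setting the maximum retry timeout on RestClient:
    RestClient.builder(hostList.toArray(new HttpHost[0])).setMaxRetryTimeoutMillis(1000);
    With this setting, API operations are checked against the timeout and the IOException message is printed, but the main thread does not fail. The comment in the source explains: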

    //providing timeout is just a safety measure to prevent everlasting waits
    //the different client timeouts should already do their jobs
    
