SpringBoot Actuator health check for Elasticse

Author: 马飞凌凌漆 | Published: 2020-05-21 19:50

Environment

  • spring boot: 2.2.4.RELEASE
  • spring-data-elasticsearch: 3.2.4.RELEASE
  • Elasticsearch: 6.3.0

Problem description

I use Spring Data Elasticsearch to connect to Elasticsearch, with the following configuration:

spring:
  data:
    elasticsearch:
      cluster-name: aliyun-es
      cluster-nodes: 114.xx.xx.xx:9300

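This had been running fine, and I had confirmed that ports 9300 and 9200 of Elasticsearch were both working. But today, after adding Actuator to the project to monitor the application, the following error appeared: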

2020-05-21 19:03:47,183 WARN  [http-nio-8080-exec-3] org.springframework.boot.actuate.elasticsearch.ElasticsearchRestHealthIndicator [AbstractHealthIndicator.java:87] Elasticsearch health check failed
java.net.ConnectException: Timeout connecting to [localhost/127.0.0.1:9200]
    at org.elasticsearch.client.RestClient$SyncResponseListener.get(RestClient.java:959)
    at org.elasticsearch.client.RestClient.performRequest(RestClient.java:233)
    at org.springframework.boot.actuate.elasticsearch.ElasticsearchRestHealthIndicator.doHealthCheck(ElasticsearchRestHealthIndicator.java:60)
    at org.springframework.boot.actuate.health.AbstractHealthIndicator.health(AbstractHealthIndicator.java:82)
    at org.springframework.boot.actuate.health.HealthIndicator.getHealth(HealthIndicator.java:37)
    at org.springframework.boot.actuate.health.HealthEndpointWebExtension.getHealth(HealthEndpointWebExtension.java:95)
    at org.springframework.boot.actuate.health.HealthEndpointWebExtension.getHealth(HealthEndpointWebExtension.java:43)
    at org.springframework.boot.actuate.health.HealthEndpointSupport.getContribution(HealthEndpointSupport.java:108)
    at org.springframework.boot.actuate.health.HealthEndpointSupport.getAggregateHealth(HealthEndpointSupport.java:119)
    at org.springframework.boot.actuate.health.HealthEndpointSupport.getContribution(HealthEndpointSupport.java:105)
    at org.springframework.boot.actuate.health.HealthEndpointSupport.getHealth(HealthEndpointSupport.java:83)
    at org.springframework.boot.actuate.health.HealthEndpointSupport.getHealth(HealthEndpointSupport.java:70)
    at org.springframework.boot.actuate.health.HealthEndpointWebExtension.health(HealthEndpointWebExtension.java:81)
    at org.springframework.boot.actuate.health.HealthEndpointWebExtension.health(HealthEndpointWebExtension.java:70)
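
The claim that ports 9300 and 9200 are reachable can be double-checked from the application host with a plain TCP connect. The following is a minimal sketch using only the JDK; the class name PortCheck, the method canConnect and the 3-second timeout are illustrative choices, not part of the original project.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, unreachable, unresolved or timed out: all count as "not reachable".
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.out.println("usage: java PortCheck <host>");
            return;
        }
        String host = args[0];
        // 9300 is the transport port used by cluster-nodes; 9200 is the REST port.
        for (int port : new int[] {9300, 9200}) {
            System.out.println(host + ":" + port + " reachable = " + canConnect(host, port, 3000));
        }
    }
}
```

A successful connect only shows that the port is open from this machine. It says nothing about which address the health indicator actually uses, which is the real issue in the log above (localhost/127.0.0.1:9200).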

Solution

Look at the source of ElasticsearchRestHealthIndicator, where the error is raised:

    @Override
    protected void doHealthCheck(Health.Builder builder) throws Exception {
        Response response = this.client.performRequest(new Request("GET", "/_cluster/health/"));
        StatusLine statusLine = response.getStatusLine();
        if (statusLine.getStatusCode() != HttpStatus.SC_OK) {
            builder.down();
            builder.withDetail("statusCode", statusLine.getStatusCode());
            builder.withDetail("reasonPhrase", statusLine.getReasonPhrase());
            return;
        }
        try (InputStream inputStream = response.getEntity().getContent()) {
            doHealthCheck(builder, StreamUtils.copyToString(inputStream, StandardCharsets.UTF_8));
        }
    }

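The first line of the method shows that the health check sends a GET request to the /_cluster/health path. But why is the target address localhost:9200? My guess was that this is a Spring Boot default, so I looked through the Elasticsearch auto-configuration classes under org.springframework.boot.autoconfigure.elasticsearch and found this in RestClientProperties: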

@ConfigurationProperties(prefix = "spring.elasticsearch.rest")
public class RestClientProperties {

    /**
     * Comma-separated list of the Elasticsearch instances to use.
     */
    private List<String> uris = new ArrayList<>(Collections.singletonList("http://localhost:9200"));
}

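This uris property is the likely cause of the error. It defaults to http://localhost:9200 and is independent of spring.data.elasticsearch.cluster-nodes, so it has to be configured explicitly: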

spring:
  data:
    elasticsearch:
      cluster-name: aliyun-es
      cluster-nodes: 114.xx.xx.xx:9300
  elasticsearch:
    rest:
      uris: ["114.xx.xx.xx:9200"]
      connection-timeout: 10s
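
To confirm outside of Spring that the REST address answers the same request the health indicator sends, a small JDK-only probe can issue GET /_cluster/health directly. This is a sketch under assumptions: the class ClusterHealthProbe and its methods are hypothetical helpers, and the base address must be passed with an explicit scheme (for example http://host:9200).

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

class ClusterHealthProbe {

    /** Builds the health URL for a base address such as "http://host:9200". */
    static String healthUrl(String baseUri) {
        String base = baseUri.endsWith("/") ? baseUri.substring(0, baseUri.length() - 1) : baseUri;
        return base + "/_cluster/health";
    }

    /** Sends GET /_cluster/health and returns the body; throws on a non-200 status or a timeout. */
    static String fetchHealth(String baseUri, int timeoutMillis) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(healthUrl(baseUri)).toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(timeoutMillis); // time allowed to establish the connection
        conn.setReadTimeout(timeoutMillis);    // time allowed to wait for response data
        try {
            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status " + status);
            }
            try (InputStream in = conn.getInputStream()) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.out.println("usage: java ClusterHealthProbe <base-uri, e.g. http://host:9200>");
            return;
        }
        System.out.println(fetchHealth(args[0], 10_000));
    }
}
```

If this probe returns the cluster health JSON while Actuator still reports a failure, the problem is more likely in the application configuration than in the network.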

After restarting, a different error appeared:

2020-05-21 19:28:51,726 WARN  [http-nio-8080-exec-8] org.springframework.boot.actuate.elasticsearch.ElasticsearchHealthIndicator [AbstractHealthIndicator.java:87] Elasticsearch health check failed
org.elasticsearch.ElasticsearchTimeoutException: java.util.concurrent.TimeoutException: Timeout waiting for task.
    at org.elasticsearch.common.util.concurrent.FutureUtils.get(FutureUtils.java:79)
    at org.elasticsearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:54)
    at org.elasticsearch.action.support.AdapterActionFuture.actionGet(AdapterActionFuture.java:44)
    at org.springframework.boot.actuate.elasticsearch.ElasticsearchHealthIndicator.doHealthCheck(ElasticsearchHealthIndicator.java:79)
    at org.springframework.boot.actuate.health.AbstractHealthIndicator.health(AbstractHealthIndicator.java:82)
    at org.springframework.boot.actuate.health.HealthIndicator.getHealth(HealthIndicator.java:37)
    at org.springframework.boot.actuate.health.HealthEndpointWebExtension.getHealth(HealthEndpointWebExtension.java:95)

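The error is a timeout, and this time it is raised by ElasticsearchHealthIndicator rather than ElasticsearchRestHealthIndicator. Debugging showed that the timeout for the health-check request was only 100 milliseconds.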

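That is too short for my network environment, so the timeout needs to be increased. Since this health check is performed by Actuator, I looked at Actuator's auto-configuration classes for Elasticsearch and found the following in ElasticsearchHealthIndicatorProperties: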

@ConfigurationProperties(
    prefix = "management.health.elasticsearch",
    ignoreUnknownFields = false
)
@Deprecated
public class ElasticsearchHealthIndicatorProperties {
    private List<String> indices = new ArrayList();
    private Duration responseTimeout = Duration.ofMillis(100L);
}

responseTimeout is 100 milliseconds, which matches the value seen while debugging, so this should be the setting to change. Update the yml:

# actuator
management:
  endpoints:
    web:
      exposure:
        include: ['*']
  health:
    elasticsearch:
      response-timeout: 3s

Ran it again: no errors.
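
Why 100 ms fails while 3 s succeeds can be reproduced without Elasticsearch. The sketch below is not the Actuator implementation; it only simulates a response that takes 400 ms (an assumed latency) and waits for it with a bounded timeout, in the same spirit as the actionGet call in the stack trace. All class and method names are illustrative.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class ResponseTimeoutDemo {

    /** Simulates a health response that arrives after the given latency. */
    static CompletableFuture<String> simulatedHealthCall(Duration latency) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(latency.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "green";
        });
    }

    /** Waits at most responseTimeout for the simulated call; returns "TIMEOUT" if it is exceeded. */
    static String check(Duration latency, Duration responseTimeout) {
        try {
            return simulatedHealthCall(latency)
                    .get(responseTimeout.toMillis(), TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return "TIMEOUT";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "INTERRUPTED";
        } catch (ExecutionException e) {
            return "ERROR";
        }
    }

    public static void main(String[] args) {
        Duration latency = Duration.ofMillis(400); // assumed round-trip time to a remote cluster
        System.out.println("100 ms timeout: " + check(latency, Duration.ofMillis(100)));
        System.out.println("3 s timeout:    " + check(latency, Duration.ofSeconds(3)));
    }
}
```

With the assumed 400 ms latency, the first call should report TIMEOUT and the second should report green, which mirrors the behaviour before and after raising response-timeout.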

There is another way to make the error go away, though it is not a good one: disable Actuator's health check for Elasticsearch altogether:

management:
  health:
    elasticsearch:
      enabled: false
