hive on spark Timed out waiting

Author: Rex_2013 | Published: 2020-08-20 09:23

    Running a Hive on Spark query from beeline fails with the following error:

    ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException: Timed out waiting for client connection.
    INFO  : Completed executing command(queryId=root_20200819100850_49d1303d-4b6a-4ef2-968b-419f5a9dd036); Time taken: 90.058 seconds
    INFO  : Concurrency mode is disabled, not creating a lock manager
    Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException: Timed out waiting for client connection. (state=08S01,code=1)
    

    Because Hive runs its Spark jobs through YARN, check the ResourceManager log in the Hadoop logs directory:

    [root@node09 logs]# tail -f  hadoop-root-resourcemanager-node09.log
    
    2020-08-19 10:41:22,695 INFO org.apache.hadoop.yarn.client.api.impl.TimelineConnector: Exception caught by TimelineClientConnectionRetry, will try 1 more time(s).
    Message: java.net.ConnectException: Call From null to node09:8188 failed on connection exception: java.net.ConnectException: Connection refused (Connection refused); For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
    2020-08-19 10:41:23,696 ERROR org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher: Error when publishing entity [YARN_APPLICATION,application_1597802656468_0003]
    java.lang.RuntimeException: Failed to connect to timeline server. Connection retries limit exceeded. The posted timeline event may be missing
            at org.apache.hadoop.yarn.client.api.impl.TimelineConnector$TimelineClientConnectionRetry.retryOn(TimelineConnector.java:357)
            at org.apache.hadoop.yarn.client.api.impl.TimelineConnector$TimelineJerseyRetryFilter.handle(TimelineConnector.java:404)
            at com.sun.jersey.api.client.Client.handle(Client.java:652)
            at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682)
            at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74)
            at com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:570)
            at org.apache.hadoop.yarn.client.api.impl.TimelineWriter.doPostingObject(TimelineWriter.java:157)
            at org.apache.hadoop.yarn.client.api.impl.TimelineWriter$1.run(TimelineWriter.java:115)
            at org.apache.hadoop.yarn.client.api.impl.TimelineWriter$1.run(TimelineWriter.java:112)
            at java.security.AccessController.doPrivileged(Native Method)
            at javax.security.auth.Subject.doAs(Subject.java:422)
            at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
            at org.apache.hadoop.yarn.client.api.impl.TimelineWriter.doPosting(TimelineWriter.java:112)
            at org.apache.hadoop.yarn.client.api.impl.TimelineWriter.putEntities(TimelineWriter.java:92)
            at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:177)
            at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher.putEntity(TimelineServiceV1Publisher.java:370)
            at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher.access$100(TimelineServiceV1Publisher.java:52)
            at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher$TimelineV1EventHandler.handle(TimelineServiceV1Publisher.java:395)
            at org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher$TimelineV1EventHandler.handle(TimelineServiceV1Publisher.java:391)
            at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
            at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
            at java.lang.Thread.run(Thread.java:748)
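
The ERROR line in this log names a YARN application ID, and the kill step later in this post needs IDs of that form. A minimal sketch for pulling every distinct application ID out of a log, assuming a `grep` that supports `-o` (GNU and BSD both do); the function name `extract_app_ids` is made up for illustration:

```shell
# Print each distinct YARN application ID mentioned in the log text on stdin.
# IDs look like application_<cluster timestamp>_<sequence number>.
extract_app_ids() {
    grep -oE 'application_[0-9]+_[0-9]+' | sort -u
}

# Typical use (log file name taken from the session above):
#   extract_app_ids < hadoop-root-resourcemanager-node09.log
```

The function only reads stdin, so it works equally well on a saved log or on the output of another command.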
    
    
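    • Cause: earlier Hive on Spark applications stay in the RUNNING state and never release the memory they hold.
    • Symptom: with Hive configured to use the Spark engine and jobs submitted to YARN, a SQL statement returns the correct result, but after it finishes the application remains RUNNING and keeps its memory.

    Analysis: a Hive on Spark session behaves essentially like spark-shell.sh, which keeps its process alive. Later Hive on Spark queries in the same session then do not need to upload the Spark dependencies again, which speeds them up. The downside is that idle sessions keep holding cluster resources. A new session presumably cannot get the resources to start its own Spark application in time, so the client connection times out. The 90.058 seconds reported above is consistent with Hive's default 90-second hive.spark.client.server.connect.timeout.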

    • Approach: to run MapReduce or other kinds of jobs, switch to another queue or force-kill the idle Spark applications.

    • Steps:

      • Kill the applications that are no longer in use
    [root@node09 logs]# yarn application --kill  application_1597802656468_0002
    [root@node09 logs]# yarn application --kill  application_1597802656468_0003
    
      • Or switch the session to another queue (replace <queue_name> with the queue to use)
    0: jdbc:hive2://node09:10000/gmall> SET mapreduce.job.queuename=<queue_name>;
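
Before killing anything, it can help to see which applications are still RUNNING. The sketch below assumes the usual `yarn application -list -appStates RUNNING` output, in which each data row starts with the application ID. The helper name `filter_app_ids` is hypothetical, and the application name "Hive on Spark" is an assumption that may differ on your cluster:

```shell
# Read `yarn application -list` output on stdin and print application IDs.
# Optional $1: keep only rows containing this substring (for example an
# application name). Header and summary lines are skipped because their
# first field is not an application ID.
filter_app_ids() {
    pattern="${1:-}"
    awk -v pat="$pattern" '
        $1 ~ /^application_[0-9]+_[0-9]+$/ && (pat == "" || index($0, pat) > 0) {
            print $1
        }'
}

# Typical use on a cluster node (requires the yarn CLI on PATH):
#   yarn application -list -appStates RUNNING | filter_app_ids "Hive on Spark"
```

Review the printed IDs by hand before passing any of them to the kill command, since an ID in the list may belong to a session that someone is still using.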
    
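    • Note: whether you connect to HiveServer2 with beeline or DBeaver, if Hive on Spark is configured to run on YARN, each client that executes a query creates its own application. Closing the DBeaver connection or exiting beeline does not remove that application; it stays on the cluster.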

    Reference: https://blog.csdn.net/weixin_43976998/article/details/107395836?utm_medium=distribute.pc_relevant.none-task-blog-baidulandingword-1&spm=1001.2101.3001.4242

    Article link: https://www.haomeiwen.com/subject/ivhrjktx.html