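I recently downloaded the CDH QuickStart VM to experiment with and ran into a problem: the Spark Job History Server would not show Spark jobs that had already run. The server page displayed the following: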
Event log directory:
hdfs://quickstart.cloudera:8020/user/spark/applicationHistory
No completed applications found!
Did you specify the correct logging directory? Please verify your setting of spark.history.fs.logDirectory and whether you have the permissions to access it.
It is also possible that your application did not run to completion or did not stop the SparkContext.
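Checking /user/spark/applicationHistory on HDFS showed that the event logs of the completed Spark jobs were there. The Spark Job History Server's own log (excerpt below) did contain an error, and it turned out to be an HDFS file permission problem: the event logs are owned by cloudera:supergroup with mode -rwxrwx---, so the spark user, which runs the History Server, cannot read anything in this directory. The fix is simply to change the group of /user/spark/applicationHistory (including the files in it) to spark.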
2019-04-30 00:30:24,217 ERROR org.apache.spark.deploy.history.FsHistoryProvider: Exception encountered when attempting to load application log hdfs://quickstart.cloudera:8020/user/spark/applicationHistory/application_1554981419682_0001
org.apache.hadoop.security.AccessControlException: Permission denied: user=spark, access=READ, inode="/user/spark/applicationHistory/application_1554981419682_0001":cloudera:supergroup:-rwxrwx---
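To make the reasoning behind this error explicit, here is a minimal Python sketch that parses the AccessControlException message and works out which permission class (owner, group, or other) applies to the requesting user. The function name explain_denial and the user_groups argument are hypothetical, introduced only for illustration, and the sketch assumes the spark user belongs to a group named spark and not to supergroup. It only mimics the basic owner/group/other check and ignores ACLs and superuser rules.

```python
import re

# The message from the History Server log above.
LOG_LINE = (
    'Permission denied: user=spark, access=READ, '
    'inode="/user/spark/applicationHistory/application_1554981419682_0001"'
    ':cloudera:supergroup:-rwxrwx---'
)

# Fields: requesting user, requested access, path, owner, group, mode string.
PATTERN = re.compile(
    r'user=(?P<user>\w+), access=(?P<access>\w+), '
    r'inode="(?P<path>[^"]+)":(?P<owner>[^:]+):(?P<group>[^:]+):(?P<mode>\S{10})'
)

# Maps the access names used in the message to permission letters.
FLAGS = {"READ": "r", "WRITE": "w", "EXECUTE": "x"}


def explain_denial(line, user_groups):
    """Return (permission class, its rwx bits, whether access is allowed).

    user_groups is the set of groups the requesting user is assumed
    to belong to (hypothetical input, not read from the cluster).
    """
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a recognizable permission-denied message")
    f = match.groupdict()
    perms = f["mode"][1:]  # drop the leading file-type character
    if f["user"] == f["owner"]:
        cls, bits = "owner", perms[0:3]
    elif f["group"] in user_groups:
        cls, bits = "group", perms[3:6]
    else:
        cls, bits = "other", perms[6:9]
    # A combined access such as READ_EXECUTE needs every listed bit.
    needed = [FLAGS[part] for part in f["access"].split("_")]
    return cls, bits, all(flag in bits for flag in needed)


# Before the fix: spark is neither the owner nor in supergroup.
print(explain_denial(LOG_LINE, {"spark"}))

# After changing the group to spark, the group bits (rwx) apply.
fixed = LOG_LINE.replace(":supergroup:", ":spark:")
print(explain_denial(fixed, {"spark"}))
```

On the cluster itself, the group change would typically be made with a command along the lines of hdfs dfs -chgrp -R spark /user/spark/applicationHistory, run as an HDFS superuser; the -R flag applies it to the files inside the directory as well.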
After changing the permissions on the directory, I checked the History Server again, and the completed jobs were all listed.