Spark: Failed to create local dir

Author: 假文艺的真码农 | Published 2018-05-31 23:21

    Recently I ran into a puzzling exception, shown below:

    org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 271.0 failed 1 times, most recent failure: Lost task 0.0 in stage 271.0 (TID 544, localhost): java.io.IOException: Failed to create local dir in /tmp/blockmgr-4223dca8-7355-4ab2-98b9-87e763c7becd/1d.
            at org.apache.spark.storage.DiskBlockManager.getFile(DiskBlockManager.scala:87)
            at org.apache.spark.storage.DiskBlockManager.getFile(DiskBlockManager.scala:97)
            at org.apache.spark.shuffle.IndexShuffleBlockResolver.getIndexFile(IndexShuffleBlockResolver.scala:58)
            at org.apache.spark.shuffle.IndexShuffleBlockResolver.writeIndexFileAndCommit(IndexShuffleBlockResolver.scala:140)
            at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:127)
            at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:87)
            at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
            at org.apache.spark.scheduler.Task.run(Task.scala:107)
            at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:277)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
            at java.lang.Thread.run(Thread.java:745)
    

    Root-cause analysis:
    1. "Failed to create local dir" — when does Spark create temporary files?
    During a shuffle, Spark uses the DiskBlockManager to write map output to the local node. Blocks go to the memory store first; when the memory store runs out of room, Spark creates temporary files on disk, laid out in a two-level directory structure (e.g. blockmgr-4223dca8-7355-4ab2-98b9-87e763c7becd/1d in the exception above).
    2. So what is a shuffle, anyway?
    As a parallel computing framework, Spark splits a job into multiple tasks running on multiple nodes. The input of a reduce task may live on several nodes, so a shuffle is needed to gather all of the reduce input together. The sketch below illustrates this.
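    For instance, here is a minimal sketch (assuming a local[*] master; the app name and sample words are made up) that forces a shuffle with reduceByKey. It is at this shuffle boundary that map output, and hence the blockmgr-* temporary files, get written to local disk:

        import org.apache.spark.{SparkConf, SparkContext}

        object ShuffleDemo {
          def main(args: Array[String]): Unit = {
            val conf = new SparkConf().setAppName("shuffle-demo").setMaster("local[*]")
            val sc = new SparkContext(conf)
            // reduceByKey repartitions records by key, so each map task must
            // write its shuffle output under spark.local.dir/blockmgr-*/
            val counts = sc.parallelize(Seq("a", "b", "a", "c"), numSlices = 2)
              .map(word => (word, 1))
              .reduceByKey(_ + _) // shuffle boundary
            counts.collect().foreach(println)
            sc.stop()
          }
        }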
    3. How large is the memory store, and when does Spark overflow to the disk store?
    The memory store size depends on spark.executor.memory; by default it is roughly spark.executor.memory * 0.6 (the 0.6 comes from Spark's memory fraction setting — spark.memory.fraction in Spark 1.6+). When cached or shuffle data no longer fits in that budget, Spark spills to disk.
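    A sketch of how these knobs can be tuned from code; the property names are standard Spark 1.6+ configuration keys (earlier versions used spark.storage.memoryFraction), but the values here are only examples:

        import org.apache.spark.SparkConf

        // Property names are standard Spark keys; values are examples only.
        val conf = new SparkConf()
          .set("spark.executor.memory", "4g")         // total executor heap
          .set("spark.memory.fraction", "0.6")        // share of (heap - 300MB) for execution + storage; default 0.6
          .set("spark.memory.storageFraction", "0.5") // portion of the above reserved for storage; default 0.5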
    4. Temporary files are created under /tmp by default — how can this be changed?
    Set SPARK_LOCAL_DIRS in conf/spark-env.sh, or configure it in your program. Multiple paths can be given, comma-separated, to spread I/O across disks (see the sketch after the quoted docs below):

    SPARK_LOCAL_DIRS:
    Directory to use for "scratch" space in Spark, including map output files and RDDs that get stored on disk. This should be on a fast, local disk in your system. It can also be a comma-separated list of multiple directories on different disks.
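    Equivalently, the scratch location can be set programmatically through the spark.local.dir property (a sketch; the paths are placeholders):

        import org.apache.spark.SparkConf

        // Paths are placeholders; comma-separated directories on different
        // physical disks let Spark spread its scratch I/O across them.
        val conf = new SparkConf()
          .set("spark.local.dir", "/data1/spark-tmp,/data2/spark-tmp")
        // Note: SPARK_LOCAL_DIRS in conf/spark-env.sh (standalone) or
        // LOCAL_DIRS (YARN) overrides spark.local.dir if set.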
    

    5. Make sure there is enough disk space and that the Spark user has read/write permission on these directories; size the disks according to your workload.
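    As a quick sanity check along these lines (a hedged sketch — the paths and the 10 GB threshold are assumptions, not values from this article), each configured directory can be verified before submitting a job:

        import java.io.File

        // Paths and the free-space threshold are assumptions for illustration.
        val localDirs = Seq("/data1/spark-tmp", "/data2/spark-tmp")
        val minFreeBytes = 10L * 1024 * 1024 * 1024 // require at least 10 GB free
        localDirs.foreach { path =>
          val dir = new File(path)
          val ok = dir.isDirectory && dir.canWrite && dir.getUsableSpace >= minFreeBytes
          println(s"$path -> ${if (ok) "OK" else "check space/permissions"}")
        }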
