Spark: A Separate Log File for Each Job Run
Versions:
spark-2.4.3
hadoop-2.6.4
A few days ago I was working on a log-output problem in Spark local mode: every time a Spark job runs, that run's logs should go into their own log file. I'm writing it down here to share the implementation and the pitfalls I hit.
First, define a custom FileAppender, as follows:
```java
package com.demo.util;

import org.apache.log4j.FileAppender;
import org.apache.log4j.Layout;
import org.apache.log4j.spi.ErrorCode;

import java.io.File;
import java.io.IOException;

/**
 * A customized log4j appender that creates a new log file for every
 * run of the application.
 */
public class NewLogForEachRunFileAppender extends FileAppender {

    // Base directory for all per-run log files.
    private String homeDir = "E:\\tmp\\logs";
    // Optional sub-directory appended to homeDir (empty here).
    private String fmiLogDir = "";

    public NewLogForEachRunFileAppender() {
    }

    public NewLogForEachRunFileAppender(Layout layout, String filename,
            boolean append, boolean bufferedIO, int bufferSize) throws IOException {
        super(layout, filename, append, bufferedIO, bufferSize);
    }

    public NewLogForEachRunFileAppender(Layout layout, String filename,
            boolean append) throws IOException {
        super(layout, filename, append);
    }

    public NewLogForEachRunFileAppender(Layout layout, String filename)
            throws IOException {
        super(layout, filename);
    }

    @Override
    public void activateOptions() {
        if (fileName != null) {
            try {
                fileName = getNewLogFileName();
                setFile(fileName, fileAppend, bufferedIO, bufferSize);
            } catch (Exception e) {
                errorHandler.error("Error while activating log options", e,
                        ErrorCode.FILE_OPEN_FAILURE);
            }
        }
    }

    private String getNewLogFileName() {
        if (fileName != null) {
            final String DOT = ".";
            final String HYPHEN = "-";
            final File logFile = new File(fileName);
            final String baseName = logFile.getName();
            String newFileName;
            final int dotIndex = baseName.indexOf(DOT);
            if (dotIndex != -1) {
                // The file name has an extension, so insert the timestamp
                // between the file name and the extension.
                newFileName = baseName.substring(0, dotIndex) + HYPHEN
                        + System.currentTimeMillis() + baseName.substring(dotIndex);
            } else {
                // The file name has no extension, so just append the timestamp
                // at the end.
                newFileName = baseName + HYPHEN + System.currentTimeMillis();
            }
            System.out.println("=============log output dir====================");
            System.out.println(homeDir + fmiLogDir + File.separator + newFileName);
            return homeDir + fmiLogDir + File.separator + newFileName;
        }
        return null;
    }
}
```
Add a log4j configuration file under the resources directory:
![](https://img.haomeiwen.com/i12973579/5d200b3d34b25884.png)
Add the log output configuration:
![](https://img.haomeiwen.com/i12973579/8748f7e7f18087ad.png)
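The exact configuration is only visible in the screenshot above; as a rough sketch of how the custom appender might be wired in (the appender name `perRunFile`, the base file name `spark-job.log`, and the pattern layout are placeholders, not necessarily what the screenshot uses):

```properties
# Route the root logger to the custom per-run file appender (names are illustrative).
log4j.rootLogger=INFO, perRunFile

log4j.appender.perRunFile=com.demo.util.NewLogForEachRunFileAppender
# The appender prepends its homeDir (E:\tmp\logs) and inserts a timestamp into this base name.
log4j.appender.perRunFile.File=spark-job.log
log4j.appender.perRunFile.layout=org.apache.log4j.PatternLayout
log4j.appender.perRunFile.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n
```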
Run a quick test with a local test class: OK, the test passes and the log is written as expected:
![](https://img.haomeiwen.com/i12973579/ffe7e9580ef6a5ba.png)
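The test class itself is only shown in the screenshot; a minimal sketch of such a local test (the class name and log messages here are made up) could look like this:

```java
import org.apache.log4j.Logger;

// Minimal local test: each run should produce a new timestamped log file under E:\tmp\logs.
public class LogAppenderTest {

    private static final Logger LOG = Logger.getLogger(LogAppenderTest.class);

    public static void main(String[] args) {
        LOG.info("appender test started");
        LOG.warn("something worth noting");
        LOG.error("simulated error for the log file");
    }
}
```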
Output:
![](https://img.haomeiwen.com/i12973579/e81f992b8d8e4198.png)
Now build the jar and ship it to the test environment. After running one of the Spark jobs in the jar, I found that the custom log4j file did not take effect; Spark was still using its default log4j configuration file:
![](https://img.haomeiwen.com/i12973579/1bcbfbe715f5bf4e.png)
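For reference, the fix that most search results point to is setting the driver's log4j configuration when the SparkSession is initialized, roughly like the sketch below (the class name and file path are placeholders). Note that in local/client mode, `spark.driver.extraJavaOptions` set from application code arrives after the driver JVM has already started, which may be why this route leads nowhere:

```java
import org.apache.spark.sql.SparkSession;

public class SparkSessionLog4jConfig {
    public static void main(String[] args) {
        // Commonly suggested approach: point the driver JVM at a custom log4j file
        // via SparkSession config. It did not take effect in this case.
        SparkSession spark = SparkSession.builder()
                .appName("log4j-config-test")   // illustrative app name
                .master("local[*]")
                .config("spark.driver.extraJavaOptions",
                        "-Dlog4j.configuration=file:/path/to/log4j.properties")
                .getOrCreate();

        spark.stop();
    }
}
```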
Frustrating... Several rounds of changes got me nowhere, and no amount of googling helped either; most of the search results say to add the configuration when the SparkSession is initialized, much as sketched above, and copying that approach changed nothing. Then I changed the log4j configuration file under the resources directory, along with its path, and it worked:
![](https://img.haomeiwen.com/i12973579/7b6c43252e360a42.png)
Build the jar and test in the test environment again:
![](https://img.haomeiwen.com/i12973579/a4d4f339879d1192.png)
As for why I put the log4j configuration file under the org/apache/spark directory in resources, the reason is explained here:
https://www.jianshu.com/p/547892d6657e