Linux log instrumentation analysis: the mighty grep, awk, sort, and uniq
Requirements:
1. Instrument the logs so the time spent in each stage can be measured; aggregate the timings and generate charts so the results are easier to read.
2. Because the job runs asynchronously, jmeter is used to simulate concurrency with multiple threads on top of the log instrumentation; the per-stage timings are then analyzed together to see what can be optimized.
3. The logs are searched by keyword, grouped, summed per category, and written to a .csv file, which opens directly in Excel for charting.
- Search every .log file in a directory for lines containing xxx and write the matches to a.txt, or append them:
grep "xxx" *.log > a.txt     # overwrite a.txt
grep "xxx" *.log >> a.txt    # append to a.txt
grep -C 5 foo file    # print each line in file matching foo plus 5 lines of context before and after
grep -B 5 foo file    # print the match plus the 5 lines before it
grep -A 5 foo file    # print the match plus the 5 lines after it
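To see the context flags on a tiny input, a minimal sketch that needs no real log file (the input lines are made up):
printf 'a\nb\nfoo\nc\nd\n' | grep -A 1 foo
# foo
# c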
- awk splits each line on whitespace by default, with $1 being the first field; BEGIN and END blocks run before and after the line-by-line pass, and associative arrays can be used to group and sum:
awk '{print $1,$3}' a.txt
awk -F '|' '{print $2,$2} END { print "end----------" }'
awk '{s[$2] += $3} END {{printf "%s,","jobId(耗时均为ms)"} for(i in s){ printf "%s,",i } {printf "\n%s,",$1} for(i in s){ printf "%s,",s[i] }}'
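To see the group-and-sum idiom in isolation, here is a minimal sketch fed from a here-document (the field values are made up):
awk '{ s[$2] += $3 } END { for (k in s) print k, s[k] }' <<'EOF'
95789 hbase 228
95789 hbase 10
95789 tips 33
EOF
# output (for-in iteration order is unspecified):
# hbase 238
# tips 33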
- Deduplication:
uniq by default removes only adjacent duplicate lines
sort | uniq (or simply sort -u) removes all duplicates, keeping one copy of each line
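The difference shows up when duplicates are not adjacent (a minimal sketch):
printf 'a\nb\na\na\n' | uniq       # a b a  -> only the adjacent duplicates collapse
printf 'a\nb\na\na\n' | sort -u    # a b    -> all duplicates removed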
print appends a newline by default
printf "%s,", i does not append a newline
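For example (a minimal sketch):
seq 3 | awk '{ printf "%s,", $1 } END { print "done" }'
# 1,2,3,done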
- Script:
#!/bin/bash
# run ./logAnalysis.sh in the same directory as the logs; open the resulting logAnalysis.csv in Excel to analyze
i=0
list=()
for j in `grep "create a log gine named:" fos.log | awk -F ':' '{print $4}' | awk -F '-' '{print $1}' | sort -u`  # extract the distinct job ids; replace this with your own ls ...
do
list[$i]=\"$j\"
i=`expr $i + 1`
done
printf "统计的日志文件列表:\n\t"
echo ${list[@]}
# remove the results of the previous run first
rm -f source.log logAnalysis.log logAnalysis.csv
grep '|logAnalysis|' *.log > source.log
for j in `grep "create a log gine named:" fos.log | awk -F ':' '{print $4}' | awk -F '-' '{print $1}' | sort -u`  # iterate over all the log files in fos
do
grep '|'$j'|logAnalysis|' source.log > $j.log
if [[ -s $j.log ]]  # continue only if $j.log is non-empty
then
# cat $j.log >> source.log
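# Seed one zero-valued line per metric so every job's row in the CSV carries
# the same full set of columns, even when a metric never occurred for this job.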
echo '[2019-07-23 14:36:20,355] INFO 25534[main] - RobotJob.execute(99) - |'$j'|logAnalysis|查询tips耗时|0' >> $j.log
echo '[2019-07-23 14:36:20,355] INFO 25534[main] - RobotJob.execute(99) - |'$j'|logAnalysis|查询hbase耗时|0' >> $j.log
echo '[2019-07-23 14:36:20,355] INFO 25534[main] - RobotJob.execute(99) - |'$j'|logAnalysis|tips数量|0' >> $j.log
echo '[2019-07-23 14:36:20,355] INFO 25534[main] - RobotJob.execute(99) - |'$j'|logAnalysis|分层耗时|0' >> $j.log
echo '[2019-07-23 14:36:20,355] INFO 25534[main] - RobotJob.execute(99) - |'$j'|logAnalysis|整体耗时|0' >> $j.log
echo '[2019-07-23 14:36:20,355] INFO 25534[main] - RobotJob.execute(99) - |'$j'|logAnalysis|状态更新耗时|0' >> $j.log
echo '[2019-07-23 14:36:20,355] INFO 25534[main] - RobotJob.execute(99) - |'$j'|logAnalysis|写日志表耗时|0' >> $j.log
echo '[2019-07-23 14:36:20,355] INFO 25534[main] - RobotJob.execute(99) - |'$j'|logAnalysis|每次调用api耗时|0' >> $j.log
if [[ -s logAnalysis.csv ]]  # header row already written: only append this job's data row
then
grep '|'$j'|logAnalysis|' $j.log | awk -F "|" '{print $2,$4,$5}' | awk '{s[$2] += $3}END{{printf "\n%s,",$1} for(i in s){ printf "%s,",s[i] }}' >> logAnalysis.csv
else  # first job: write the header row (metric names) before the data row
grep '|'$j'|logAnalysis|' $j.log | awk -F "|" '{print $2,$4,$5}' | awk '{s[$2] += $3}END{{printf"%s,","jobId(耗时均为ms)"} for(i in s){ printf "%s,",i } {printf "\n%s,",$1} for(i in s){ printf "%s,",s[i] }}' >> logAnalysis.csv
fi
fi
rm -f $j.log  # remove the per-job temp file
done
# printf "\n源文件log: \n"
# cat source.log
# printf "\nlogAnalysis: \n"
# awk -F "|" '{print $2,$4,$5}' source.log > logAnalysis.log
# cat logAnalysis.log
printf "\n\nlogAnalysis.csv: \n"
cat logAnalysis.csv
- Source data (grep output):
RobotJob.execute(133) - |95789|logAnalysis|分层耗时|1032
95789-bdfdaabbaea94626a36b06f0fe817984-20190723201813.log:[2019-07-23 20:19:40,511] INFO 134826[main] - RobotJob.execute(169) - |95789|logAnalysis|状态更新耗时|86811
95789-bdfdaabbaea94626a36b06f0fe817984-20190723201813.log:[2019-07-23 20:19:40,750] INFO 135065[main] - RobotJob.execute(192) - |95789|logAnalysis|整体耗时|87050
fos.log:[2019-07-23 17:43:35,187] [INFO] [0] [] |logAnalysis|查询hbase耗时|228
fos.log:[2019-07-23 17:43:35,376] [INFO] [0] [] |logAnalysis|查询hbase耗时|10
fos.log:[2019-07-23 17:43:35,474] [INFO] [0] [] |logAnalysis|查询hbase耗时|90
fos.log:[2019-07-23 17:43:35,493] [INFO] [0] [] |logAnalysis|查询hbase耗时|12
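As a sanity check, the extraction and grouping stages can be run by hand on the three 95789 lines above (a minimal sketch; assumes those lines are in source.log):
grep '|95789|logAnalysis|' source.log | awk -F "|" '{print $2,$4,$5}' | awk '{ s[$2] += $3 } END { for (i in s) print i, s[i] }'
# 分层耗时 1032
# 状态更新耗时 86811
# 整体耗时 87050
In the full script the remaining metrics are seeded with zero-valued lines, so the 95789 row in logAnalysis.csv ends up with these three sums plus zeros for the other columns (column order depends on awk's for-in ordering).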
- Result output: import logAnalysis.csv into Excel, select the data, and insert a chart.
  [screenshots: the CSV imported into Excel, and the chart built from the selected data]