Contents
- Monitoring requirements
- Monitoring script
- Summary
Monitoring requirements
The logs contain a lot of information worth watching: response codes, response times, client IPs, and so on. There are plenty of log-monitoring tools out there, ELK for example, but to fit in with our existing monitoring I decided to parse the data out with Python, store it in Redis, and visualize it with Zabbix. The goal is to track, for each machine, LevelDB's per-minute average response time and timeout count.
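On the Zabbix side, the agent only needs to read the two pre-aggregated values back out of Redis. A minimal sketch, assuming the key names written by the script below; the item keys `leveldb.timeouts` and `leveldb.avetime`, the config path, and the host/port are made up for illustration:

```shell
# /etc/zabbix/zabbix_agentd.d/leveldb.conf (illustrative path)
# Each item simply fetches the value the Python script wrote to Redis.
UserParameter=leveldb.timeouts,redis-cli -h 127.0.0.1 -p 6379 GET level03_count
UserParameter=leveldb.avetime,redis-cli -h 127.0.0.1 -p 6379 GET level03_avetime
```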
Monitoring script
Only redis needs installing (datetime ships with the standard library):
pip install redis
#!/usr/bin/python
# -*- coding: utf-8 -*-
from datetime import datetime, timedelta
import sys
import redis
import mylog  # local logging helper module used by this script

count = 0    # requests seen in the target minute
alltime = 0  # accumulated response time (ms)
avetime = 0  # average response time (ms)
count1 = 0   # timeout count

# Use a generator to read the log file in reverse, newest lines first.
def read_reverse(filename):
    """Yield the lines of filename from last to first.

    Seeks backwards in steps of roughly 72 bytes (an assumed average
    line length for this log), so it never loads the whole file.
    """
    f = open(filename)
    f.seek(0, 2)  # jump to end of file
    last_position = f.tell()
    while True:
        line = f.readline()
        current_position = f.tell()
        i = 1
        # We seeked into the middle of the last unread line: step further
        # back until a full line (or the whole remaining file) is read.
        while current_position == last_position:
            if len(line) == current_position:
                # The line spans the whole remaining file: emit it and stop.
                yield line
                return
            i += 0.5
            f.seek(max(int(-72 * i), -current_position), 1)
            line = f.readline()
            current_position = f.tell()
        # Skip the partial line and re-read up to the last unread position.
        while current_position != last_position:
            line = f.readline()
            current_position = f.tell()
        yield line
        last_position = last_position - len(line)
        f.seek(max(-72, -last_position) - len(line), 1)
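For reference, the seek-based generator above is equivalent to the following much simpler (but memory-hungry) version, handy for sanity checks on small test files. `read_reverse_simple` is a name made up here, not part of the original script:

```python
def read_reverse_simple(filename):
    """Yield the lines of filename from last to first.

    Unlike the seek-based generator, this reads the whole file into
    memory -- fine for tests and small logs, unsuitable for huge ones.
    """
    with open(filename) as f:
        for line in reversed(f.readlines()):
            yield line
```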
# Format the timestamps.  To avoid missing data, aggregate the previous
# minute of the nginx log, stopping once we reach the minute before that.
time1 = datetime.now() - timedelta(minutes=1)
time2 = datetime.now() - timedelta(minutes=2)
fmt = '%Y-%m-%d:%H:%M'
Time1 = time1.strftime(fmt)
Time2 = time2.strftime(fmt)
ms = "ms"
try:
    f = read_reverse('/path/access.log')
    for line in f:
        # Minute-precision timestamp: join the date and time fields
        # (fields 1 and 2) and truncate, e.g. '2017-01-01:12:34'.
        c = ':'.join(':'.join(line.split(" ")[1:3]).split(".")[0].split(":")[0:3])
        # Response time is the last colon-separated field, e.g. '1.234ms\n'.
        b = line.split(":")[-1]
        if c == Time1:
            if b[-3:-1] == ms:
                count = count + 1
                alltime = float(b[0:5]) + alltime
                avetime = round(alltime / count, 3)
                # Anything over 20 ms is counted as a timeout.
                if float(b[0:5]) > 20:
                    count1 = count1 + 1
        elif c == Time2:
            # Reached the minute before the one we want: stop reading.
            break
except Exception as e:
    mylog.logging.error('%s %s' % (e, Time1))
# Write the results to Redis; zabbix-agent then reads them straight from there.
try:
    # ip, port and pwd are the Redis connection details (defined elsewhere).
    r = redis.Redis(host=ip, port=port, password=pwd)
    r.set('level03_count', count1)
    r.set('level03_avetime', avetime)
except Exception as e:
    mylog.logging.error('%s %s' % (e, Time1))
    sys.exit()
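To make the slicing in the parsing loop easier to follow, here is how it behaves on a fabricated log line in the shape the script assumes (date in field 1, time in field 2, response time as the last colon-separated field; a real nginx log_format may differ):

```python
# A made-up access-log line in the layout the script expects.
line = "10.0.0.1 2017-01-01 12:34:56.789 GET /get/key 200 rt:1.234ms\n"

# Minute-precision timestamp, same slicing as in the script.
c = ':'.join(':'.join(line.split(" ")[1:3]).split(".")[0].split(":")[0:3])
# Response-time field, same slicing as in the script.
b = line.split(":")[-1]

print(c)              # -> 2017-01-01:12:34
print(b[-3:-1])       # -> ms  (the line records a millisecond timing)
print(float(b[0:5]))  # -> 1.234
```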
Summary
This is one of my clumsier scripts... my mentor has called it out more than once. It does the job, but there is plenty left to improve: the variable naming is far from consistent, the time handling ought to be factored out into its own module, and so on. Messy as it is, in the spirit of open source I'm posting it anyway for the record. Beginners may find it a useful reference; veterans, please go easy on me 😭