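This script took a fair amount of time to finish. The complete version is on GitHub, since there is quite a lot of it.
The idea is to read the latest alarm data from the event_cases table in open-falcon's alarms database.
After filtering, a script is assembled from the local database according to the IP, and that script is then executed on the server with that IP.
Scripts are mapped to clusters, and the cluster data is synced incrementally from the CMDB database once a day.
How often the process runs is controlled by the main process script.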
https://github.com/xxwdll/my_scripts/tree/main/02falcon_auto-clean/auto-clean
Several of the core scripts are posted here. Later updates will be synced to GitHub and will not be added to this blog post.
The directory structure is as follows:
[Image: directory structure]
Start on boot
[Image: start on boot]
Fetching open-falcon data
#!/usr/bin/python3
import datetime
import os
import sys

import pymysql


def getTodayAndYesterday():
    """Return SQL LIKE patterns matching today's and yesterday's dates."""
    today = datetime.date.today()
    yesterday = today - datetime.timedelta(days=1)
    return '%' + str(today) + '%', '%' + str(yesterday) + '%'


# Run from the directory that contains this script.
BASE_DIR = sys.path[0]
os.chdir(BASE_DIR)

this_today, this_yesterday = getTodayAndYesterday()

# Up to 100 of the newest unresolved alarms from today or yesterday.
# The two date patterns are bound as query parameters.
sql = ("select endpoint, metric, priority, note, timestamp "
       "from event_cases "
       "where status = 'PROBLEM' "
       "and (timestamp like %s or timestamp like %s) "
       "order by timestamp desc limit 100;")

conn = pymysql.connect(host='10.16.20.35', user='falcon', passwd='******', db='alarms', port=3306, charset='utf8')
cursor = conn.cursor(cursor=pymysql.cursors.DictCursor)
cursor.execute(sql, (this_today, this_yesterday))
result_of_sql = cursor.fetchall()

# Print one alarm per line: endpoint metric priority note
for li in result_of_sql:
    this_str = str(li['endpoint'])
    this_str = this_str + " " + str(li['metric'])
    this_str = this_str + " " + str(li['priority'])
    this_str = this_str + " " + str(li['note'])
    #this_str = this_str + " " + str(li['timestamp'])
    print(this_str)

cursor.close()
conn.close()
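Each line printed by this script has the form `endpoint metric priority note`, and the shell script that follows pulls the IP and the mount point back out of it with `awk`. As a rough illustration of that parsing step, here is a minimal Python sketch. The function name `parse_alarm_line` and the sample line are hypothetical, and the sketch assumes the endpoint ends in `-<ip>` and that the metric carries comma-separated `key=value` tags including `mount=...`. Unlike the `awk` pipeline, which takes the second comma-separated field, it looks the tag up by name.

```python
def parse_alarm_line(line):
    """Split one 'endpoint metric priority note' line into (ip, mount).

    Returns None when the line is not a df.bytes alarm with a mount tag.
    """
    fields = line.split()
    if len(fields) < 2:
        return None
    endpoint, metric = fields[0], fields[1]
    if "df.bytes" not in metric or "mount=/" not in metric:
        return None

    # Assumption: the IP is the last '-'-separated part of the endpoint.
    ip = endpoint.rsplit("-", 1)[-1]

    # Assumption: tags are comma-separated key=value pairs after the metric name.
    mount = None
    for tag in metric.split(","):
        key, sep, value = tag.partition("=")
        if sep and key.endswith("mount"):
            mount = value
    if mount is None:
        return None
    return ip, mount


# Hypothetical sample line, not taken from a real alarm table.
sample = "app-192.0.2.10 df.bytes.free.percent/fstype=ext4,mount=/data 3 disk"
print(parse_alarm_line(sample))
```

Returning `None` for anything that is not a disk alarm plays the same role as the two `grep` filters in the shell version.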
Execution script
#!/bin/bash
# author : 021786
# date : 20201221
# info : fetch the latest alarm data from falcon and run the locally stored script
# v3 : add the alarm-error.log log, to make it easier to add or modify entries
cd "$(dirname "$0")" || exit 1
#./alarm-get-falcon.sh | grep df.bytes | grep mount=/ > temp-ip.txt
./get_falcon_info.py | grep df.bytes | grep mount=/ > temp-ip.txt
# Filter out the disk alarms and handle them one by one
sort -u temp-ip.txt | while read -r li;
do
    # The IP is the last '-'-separated part of the endpoint
    this_ip=$(echo "$li" | awk '{print $1}' | awk -F '-' '{print $NF}')
    # The mount point is the value of the second comma-separated tag of the metric
    this_path=$(echo "$li" | awk '{print $2}' | awk -F ',' '{print $2}' | awk -F '=' '{print $2}')
    # Look up the cleanup command stored for this IP's cluster and this mount point
    this_sql="SELECT command
        FROM cluster_info
        WHERE cluster_name IN
            (SELECT cluster_info FROM app_info WHERE ip='$this_ip')
        AND type_info='$this_path';"
    sqlite3 auto-clean.db "$this_sql" > temp-script
    this_command=$(cat temp-script)
    # How often this ip/path pair appears in the most recent entries of the clean log
    recent_count=$(grep -A3 'ip==>' ../log/alarm-clean.log | tail -15 | grep -A1 -w "$this_ip" | grep -c "$this_path")
    if [[ -n "$this_command" ]]; then
        if [[ "$recent_count" -le 2 ]]; then
            echo "ip==> " $this_ip
            echo "path==> " $this_path
            echo $(date) '==>start'
            echo "df -h" >> temp-script
            echo '---------'
            cat temp-script
            # Run the assembled script on the remote server
            ssh "$this_ip" "bash" < temp-script
            echo '---------'
        else
            # Already run more than twice recently: record it instead of running again
            echo '---USELESS---' >> ../log/alarm-error.log
            echo $(date) "==>" $this_ip "--" $this_path >> ../log/alarm-error.log
        fi
    else
        # No command is stored for this ip/path pair
        if [[ "$recent_count" -le 2 ]]; then
            echo "ip==> " $this_ip
            echo "path==> " $this_path
            echo $(date) '==>start'
            echo '---------'
            echo 'null'
            echo '---------'
        else
            echo '---NULL---' >> ../log/alarm-error.log
            echo $(date) "==>" $this_ip "--" $this_path >> ../log/alarm-error.log
        fi
    fi
done
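The lookup against `auto-clean.db` is the core of the script above: the IP selects a cluster through `app_info`, and the cluster together with the mount point selects a command in `cluster_info`. The sketch below reproduces that query in Python with bound parameters, which avoids pasting the IP and path into the SQL string. It only assumes the table and column names that appear in the query. The column types, the helper names `lookup_commands` and `build_demo_db`, and all sample rows are made up for illustration.

```python
import sqlite3

LOOKUP_SQL = """
SELECT command
FROM cluster_info
WHERE cluster_name IN (SELECT cluster_info FROM app_info WHERE ip = ?)
  AND type_info = ?
"""


def lookup_commands(conn, ip, mount):
    """Return the stored cleanup commands for this IP's cluster and mount point."""
    rows = conn.execute(LOOKUP_SQL, (ip, mount)).fetchall()
    return [row[0] for row in rows]


def build_demo_db():
    """Create an in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    # Assumption: plain TEXT columns; the real schema is not shown in the post.
    conn.execute(
        "CREATE TABLE cluster_info (cluster_name TEXT, type_info TEXT, command TEXT)"
    )
    conn.execute("CREATE TABLE app_info (ip TEXT, cluster_info TEXT)")
    conn.execute(
        "INSERT INTO cluster_info VALUES (?, ?, ?)",
        ("demo-cluster", "/data", "find /data/logs -name '*.log' -mtime +7 -delete"),
    )
    conn.execute("INSERT INTO app_info VALUES (?, ?)", ("192.0.2.10", "demo-cluster"))
    return conn


conn = build_demo_db()
print(lookup_commands(conn, "192.0.2.10", "/data"))
print(lookup_commands(conn, "192.0.2.10", "/"))
```

An empty result corresponds to the `null` branch of the shell script, where no command is stored for the alarm.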
Main process script
#!/bin/bash
i=1
while true;
do
    if [[ $i == 4500 ]]; then
        # Every 4500 loops: empty the running log, reset the counter, and run the update script
        : > ./log/alarm-running.log
        i=1
        bash ./lib/update.sh >> ./log/alarm-update.log 2>&1
    else
        i=$((i + 1))
    fi
    echo 'check time ' $(date) >> ./log/alarm-running.log
    # Fetch data from open-falcon and run the cleanup
    bash ./lib/falcon-clean-disk.sh >> ./log/alarm-clean.log 2>&1
    sleep 15
done
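The two numbers in this loop set the cadence: a cleanup pass every 15 seconds and an update every 4500 passes. That works out to at least 4500 × 15 s = 67500 s, or 18.75 hours, between updates, plus however long the cleanup passes themselves take, which is roughly the once-a-day sync mentioned at the top. The following Python sketch mirrors the counter logic so the arithmetic can be checked. The names `run_loop` and `min_hours_between_syncs` are illustrative, and `clean` and `sync` are stand-ins for the two shell scripts.

```python
import time

CHECK_INTERVAL_SECONDS = 15  # from the script: sleep 15
SYNC_EVERY_N_LOOPS = 4500    # from the script: counter threshold


def run_loop(clean, sync, iterations, sleep=time.sleep):
    """Run `iterations` passes, calling sync() once every SYNC_EVERY_N_LOOPS passes."""
    counter = 1
    for _ in range(iterations):
        if counter == SYNC_EVERY_N_LOOPS:
            counter = 1
            sync()
        else:
            counter += 1
        clean()
        sleep(CHECK_INTERVAL_SECONDS)


def min_hours_between_syncs():
    """Lower bound on the time between syncs, ignoring the time clean() takes."""
    return SYNC_EVERY_N_LOOPS * CHECK_INTERVAL_SECONDS / 3600


# Dry run with a no-op sleep so the example finishes immediately.
calls = []
run_loop(lambda: calls.append("clean"), lambda: calls.append("sync"),
         iterations=9000, sleep=lambda seconds: None)
print(calls.count("clean"), calls.count("sync"), min_hours_between_syncs())
```

Passing `sleep` in as a parameter is only there to make the dry run instant; the shell loop simply sleeps.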