Shell & Python automation based on open-falcon alarm data

Author: w_dll | Published: 2021-01-03 13:05

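    This script took some time to finish. The full version is on GitHub, since there is quite a lot of it.
    The idea is to fetch the latest alarm data from the event_cases table in open-falcon's alarms database.
    After filtering, a script is assembled from the local database according to the IP, and that script is then run on the server with that IP.
    Scripts are mapped to clusters, and cluster data is synced incrementally from the CMDB database once a day.
    How often the process runs is controlled by the main process script.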

    https://github.com/xxwdll/my_scripts/tree/main/02falcon_auto-clean/auto-clean
    A few of the core scripts are posted here. Later updates will be pushed to GitHub rather than to this blog.

    The directory structure is as follows:

    [Image: directory structure]

    Start at boot:

    [Image: start at boot]

    Fetching data from open-falcon

    #!/usr/bin/python3
    import pymysql, sys, os, datetime
    
    def getTodayAndYesterday():
      today=datetime.date.today()
      oneday=datetime.timedelta(days=1)
      yesterday=today-oneday
      today='%'+str(today)+'%'
      yesterday='%'+str(yesterday)+'%'
      return today, yesterday
    
    BASE_DIR = sys.path[0]
    os.chdir(BASE_DIR)
    
    this_today, this_yesterday = getTodayAndYesterday()
    
    # Select the latest unresolved alarms raised today or yesterday.
    # The %s placeholders are filled in by pymysql, which escapes the values.
    sql = ("select endpoint, metric, priority, note, timestamp "
    "from event_cases "
    "where "
    "status = 'PROBLEM' "
    "and "
    "(timestamp like %s or timestamp like %s) "
    "order by timestamp desc limit 100;"
    )
    
    conn = pymysql.connect(host='10.16.20.35', user='falcon',passwd='******', db='alarms', port=3306, charset='utf8')
    cursor = conn.cursor(cursor=pymysql.cursors.DictCursor)
    
    cursor.execute(sql, (this_today, this_yesterday))
    result_of_sql = cursor.fetchall()
    
    for li in result_of_sql :
      this_str = str(li['endpoint'])
      this_str = this_str + " " + str(li['metric'])
      this_str = this_str + " " + str(li['priority'])
      this_str = this_str + " " + str(li['note'])
      #this_str = this_str + " " + str(li['timestamp'])
      print(this_str)
    
    cursor.close()
    conn.close()
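To see what the two `like` patterns in the query look like, the date helper can be run with a fixed date. This is only a minimal sketch: `like_patterns` is a hypothetical variant of `getTodayAndYesterday` that takes the date as an argument, so the result does not depend on the day it is run.

```python
import datetime

def like_patterns(today):
    """Return SQL LIKE patterns for `today` and for the day before it."""
    yesterday = today - datetime.timedelta(days=1)
    return '%' + str(today) + '%', '%' + str(yesterday) + '%'

# timedelta takes care of month and year boundaries.
patterns = like_patterns(datetime.date(2021, 1, 1))
print(patterns)
```

Each pattern matches any timestamp whose text form contains that date, which is why the query covers alarms from both today and yesterday.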
    

    Execution script

    #!/bin/bash
    # author : 021786
    # date   : 20201221
    # info   : fetch the latest alarm data from falcon and run the locally stored script
    # v3     : added the alarm-error.log log, to make it easier to add or change entries
    cd "$(dirname "$0")"
    #./alarm-get-falcon.sh | grep df.bytes | grep mount=/ > temp-ip.txt
    ./get_falcon_info.py | grep df.bytes | grep mount=/ > temp-ip.txt
    # filter out the disk alarm entries and handle them
    cat temp-ip.txt | sort -u | while read li;
    do
      this_ip=`echo $li | awk '{print $1}' | awk -F '-' '{print $NF}'`
      this_path=`echo $li | grep df.bytes | grep mount=/ | awk '{print $2}' | awk -F ',' '{print $2}' | awk -F '=' '{print $2}'`
      this_sql='SELECT command
      FROM cluster_info
      WHERE
      cluster_name IN
      (select cluster_info from app_info where ip="'$this_ip'")
      and type_info="'$this_path'";'
      sqlite3 auto-clean.db "$this_sql" > temp-script
      this_commad=`cat temp-script`
      if [[ ! -z "$this_commad" ]];then
        existence_of_exec_info=`cat ../log/alarm-clean.log | grep -A3 'ip==>' | tail -15 | grep -A1 -w $this_ip | grep $this_path | wc -l`
        if [[ -z "$existence_of_exec_info" ]] || [[ "$existence_of_exec_info" -le "2" ]];then
          echo "ip==> " $this_ip
          echo "path==> " $this_path
          echo `date` '==>start'
          echo "df -h" >> temp-script
          echo '---------'
          cat temp-script
          ssh $this_ip "bash" < temp-script
          echo '---------'
        else
          echo '---USELESS---' >> ../log/alarm-error.log
          echo `date` "==>" $this_ip "--" $this_path >> ../log/alarm-error.log
        fi
      else
        existence_of_null_info=`cat ../log/alarm-clean.log | grep -A3 'ip==>' | tail -15 | grep -A1 -w $this_ip | grep $this_path | wc -l`
        if [[ -z "$existence_of_null_info" ]] || [[ "$existence_of_null_info" -le "2" ]];then
          echo "ip==> " $this_ip
          echo "path==> " $this_path
          echo `date` '==>start'
          echo '---------'
          echo 'null'
          echo '---------'
        else
          echo '---NULL---' >> ../log/alarm-error.log
          echo `date` "==>" $this_ip "--" $this_path >> ../log/alarm-error.log
        fi
      fi
    done
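The parsing and lookup steps of the script above can be sketched in Python as follows. This is an illustration under assumptions, not the author's code. The sample alarm line, the endpoint name `web-10.0.0.8`, the cluster name and the cleanup command are all made up. The table and column names come from the SQL in the script, but the column types and the in-memory database are assumptions. The lookup uses `?` placeholders instead of splicing values into the SQL string.

```python
import sqlite3

def parse_alarm(line):
    """Extract (ip, mount_path) from one 'endpoint metric priority note' line."""
    fields = line.split()
    ip = fields[0].split('-')[-1]      # text after the last '-' in the endpoint
    tag = fields[1].split(',')[1]      # second comma-separated part, e.g. 'mount=/data'
    path = tag.split('=')[1]           # value after '='
    return ip, path

def find_commands(conn, ip, path):
    """Look up the stored commands for the cluster that owns `ip` and for `path`."""
    sql = ("SELECT command FROM cluster_info "
           "WHERE cluster_name IN (SELECT cluster_info FROM app_info WHERE ip = ?) "
           "AND type_info = ?")
    return [row[0] for row in conn.execute(sql, (ip, path))]

# Hypothetical data standing in for auto-clean.db.
conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE app_info (ip TEXT, cluster_info TEXT)")
conn.execute("CREATE TABLE cluster_info (cluster_name TEXT, type_info TEXT, command TEXT)")
conn.execute("INSERT INTO app_info VALUES ('10.0.0.8', 'demo-cluster')")
conn.execute("INSERT INTO cluster_info VALUES "
             "('demo-cluster', '/data', 'find /data/logs -mtime +7 -delete')")

line = "web-10.0.0.8 df.bytes.free.percent/fstype=ext4,mount=/data 0 disk-usage"
ip, path = parse_alarm(line)
commands = find_commands(conn, ip, path)
print(ip, path, commands)
```

An empty result corresponds to the `null` branch of the shell script: the alarm is recorded, but nothing is run on the server.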
    

    Main process script

    #!/bin/bash
    i='1'
    while true;
    do
      if [[ $i == '4500' ]];then
        cat /dev/null > ./log/alarm-running.log
        i=1
        bash ./lib/update.sh >> ./log/alarm-update.log 2>&1
      else
        i=$((i+1))
      fi
      echo 'check time ' `date` >> ./log/alarm-running.log
      # fetch data from openfalcon and run the cleanup
      bash ./lib/falcon-clean-disk.sh >> ./log/alarm-clean.log 2>&1
      sleep 15
    done
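The counter in the main loop decides how often `update.sh` runs. The sketch below replays that arithmetic in Python. The constant and function names are hypothetical. The values 4500 and 15 are taken from the script, and the time spent inside the cleanup script is ignored, so the result is a lower bound.

```python
SLEEP_SECONDS = 15   # pause between rounds, as in the script
SYNC_EVERY = 4500    # counter value that triggers update.sh

def next_counter(i):
    """Return (new_counter, run_sync) for one round of the main loop."""
    if i == SYNC_EVERY:
        return 1, True
    return i + 1, False

# Count the rounds from a fresh counter until the first sync.
i, rounds = 1, 0
while True:
    rounds += 1
    i, run_sync = next_counter(i)
    if run_sync:
        break

min_hours = rounds * SLEEP_SECONDS / 3600
print(rounds, min_hours)
```

With these values the sync runs every 4500 rounds, which is at least 18.75 hours apart. Because each round also spends time in the cleanup script, the real interval is longer, which is roughly the once-a-day sync described in the introduction.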
    
