Checking ZooKeeper Node Status with Python

Author: 平凡的运维之路 | Source: published 2023-12-19 19:29

Checking ZooKeeper node information and status with Python

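Requirements

  • The project's core modules use ZooKeeper for distributed locking, but the status of the ZooKeeper nodes is not monitored, so monitoring needs to be added.
  • Monitor the status of the ZooKeeper nodes; if a node is abnormal or missing, output an error keyword or raise an alert.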

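Implementation

  • Use a Python ZooKeeper client module (kazoo) to fetch the status of the ZooKeeper nodes.
  • Have the Python script log error keywords so that abnormal node status is pushed to the network management system as an alert.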
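
As a rough illustration of the first point, a one-shot existence check can be written against any client object that offers an `exists(path)` method. `KazooClient.exists` behaves this way: it returns a stat object for an existing node and `None` otherwise. The `FakeClient` class and the `check_nodes` helper below are hypothetical stand-ins so that the sketch runs without a ZooKeeper server; they are not part of the script in this article.

```python
import logging

logger = logging.getLogger("zk_check_sketch")


def check_nodes(client, node_paths):
    """Return {path: True/False} and log an ERROR line for each missing node."""
    status = {}
    for path in node_paths:
        exists = client.exists(path) is not None
        status[path] = exists
        if exists:
            logger.info("node %s OK", path)
        else:
            # The ERROR level is the keyword an alerting system can match on.
            logger.error("node %s does not exist", path)
    return status


class FakeClient:
    """Hypothetical stand-in for KazooClient, holding a fixed set of nodes."""

    def __init__(self, existing):
        self._existing = set(existing)

    def exists(self, path):
        # KazooClient.exists returns a stat object or None; mimic that shape.
        return object() if path in self._existing else None


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    client = FakeClient(["/ctimanager/bj-cti1", "/ctimanager/bj-cti2"])
    print(check_nodes(client, ["/ctimanager/bj-cti1", "/ctimanager/bj-cti3"]))
```

With a real cluster, a started `KazooClient` could be passed in place of `FakeClient`, since only the `exists` method is used.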

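Limitations

  • The script uses DataWatch to monitor node data only. When a watched node does not exist and the data of another node (cti1) changes, the change is reported against the nonexistent node (cti5), as the log below shows. In the log messages, 不存在 means "does not exist" and 正常 means "OK".
# When modifying the node data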
[zk: localhost:2182(CONNECTED) 28] set /ctimanager/bj-cti1  {"agent":2,"test":1,"a1":2ok23121111111}
2023-12-20 19:26:35,654 140538316216128 check_zookeeper_node.py:73 INFO get config file  zookeeper node list: ['/ctimanager/bj-cti1', '/ctimanager/bj-cti2', '/ctimanager/bj-cti3', '/ctimanager/bj-cti4', '/ctimanager/bj-cti5']
2023-12-20 19:26:35,654 140538316216128 check_zookeeper_node.py:76 INFO zookeeper login host: 127.0.0.1:2182
2023-12-20 19:26:35,660 140538316216128 check_zookeeper_node.py:81 INFO use user login auth conn zookeeper 
2023-12-20 19:26:35,661 140538316216128 check_zookeeper_node.py:83 INFO zookeeper user info: digest admin:123456
2023-12-20 19:26:35,662 140538316216128 check_zookeeper_node.py:21 DEBUG 从zookeeper节点 /ctimanager/bj-cti1 的数据获取信息: b'{"agent":2,"test":1,"a1":2ok2}' 状态正常
2023-12-20 19:26:35,662 140538316216128 check_zookeeper_node.py:22 INFO 从zookeeper获取节点 /ctimanager/bj-cti1 正常
2023-12-20 19:26:35,662 140538316216128 check_zookeeper_node.py:21 DEBUG 从zookeeper节点 /ctimanager/bj-cti2 的数据获取信息: b'{"agent":1}' 状态正常
2023-12-20 19:26:35,663 140538316216128 check_zookeeper_node.py:22 INFO 从zookeeper获取节点 /ctimanager/bj-cti2 正常
2023-12-20 19:26:35,664 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti3 不存在
2023-12-20 19:26:35,665 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti4 不存在
2023-12-20 19:26:35,666 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti5 不存在
2023-12-20 19:26:40,915 140538087216896 check_zookeeper_node.py:21 DEBUG 从zookeeper节点 /ctimanager/bj-cti5 的数据获取信息: b'{"agent":2,"test":1,"a1":2ok2222}' 状态正常
2023-12-20 19:26:40,915 140538087216896 check_zookeeper_node.py:22 INFO 从zookeeper获取节点 /ctimanager/bj-cti5 正常
2023-12-20 19:26:45,674 140538316216128 check_zookeeper_node.py:21 DEBUG 从zookeeper节点 /ctimanager/bj-cti1 的数据获取信息: b'{"agent":2,"test":1,"a1":2ok2222}' 状态正常
2023-12-20 19:26:45,675 140538316216128 check_zookeeper_node.py:22 INFO 从zookeeper获取节点 /ctimanager/bj-cti1 正常
2023-12-20 19:26:45,675 140538316216128 check_zookeeper_node.py:21 DEBUG 从zookeeper节点 /ctimanager/bj-cti2 的数据获取信息: b'{"agent":1}' 状态正常
2023-12-20 19:26:45,675 140538316216128 check_zookeeper_node.py:22 INFO 从zookeeper获取节点 /ctimanager/bj-cti2 正常
2023-12-20 19:26:45,676 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti3 不存在
2023-12-20 19:26:45,677 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti4 不存在
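
The log line at 19:26:40 reports /ctimanager/bj-cti5 as healthy, yet it carries the same data that /ctimanager/bj-cti1 reports five seconds later. One plausible explanation, offered here as an assumption rather than a confirmed diagnosis, is Python's late-binding closures: a callback defined inside a `for` loop looks up the loop variable when it is called, not when it is defined, so a callback that fires after the loop has finished sees the last path in the list. The sketch below reproduces the effect without ZooKeeper; `make_callbacks_late` and `make_callbacks_bound` are illustrative names.

```python
def make_callbacks_late(paths):
    """Callbacks read `path` at call time, so all of them see the last value."""
    callbacks = []
    for path in paths:
        def on_change(data):
            return f"{path}: {data}"
        callbacks.append(on_change)
    return callbacks


def make_callbacks_bound(paths):
    """A factory function gives each callback its own copy of `path`."""
    def make(path):
        def on_change(data):
            return f"{path}: {data}"
        return on_change
    return [make(path) for path in paths]


paths = ["/ctimanager/bj-cti1", "/ctimanager/bj-cti5"]

late = make_callbacks_late(paths)
bound = make_callbacks_bound(paths)

# The first callback was created for bj-cti1, yet it reports bj-cti5.
print(late[0]("new data"))
# With the factory, the first callback keeps its own path.
print(bound[0]("new data"))
```

If this is indeed the cause, building each watcher callback through a small factory function like `make` would keep the reported path aligned with the node that actually changed.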

Code

  • Fetching the status of the ZooKeeper nodes
#!/usr/bin/python3
# -*- coding:utf-8 -*-
###############################################################
# Monitor the ZooKeeper nodes registered by platform modules and
# log an alert when a node is lost, e.g. after a module crashes.
# 2023-12-20 16:02
###############################################################

import configparser
import logging
import os
import sys
import time
from logging.handlers import RotatingFileHandler

from kazoo.client import KazooClient
from kazoo.exceptions import KazooException
from kazoo.handlers.threading import KazooTimeoutError

# Map the LogLevel setting to the logging module's constants.
LOG_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
}


def watch_nodes():
    # DataWatch calls the callback once on registration and again on every
    # data change. Each call to watch_nodes() registers another set of
    # watchers on top of the ones registered earlier.
    for node_path in node_paths:
        try:
            @zk.DataWatch(node_path)
            def on_data_change(data, stat):
                # Caveat: node_path is looked up when the callback runs,
                # not when it is defined.
                if data is None:
                    # "不存在": the node does not exist
                    logger.error(f"从zookeeper获取节点 {node_path} 不存在")
                else:
                    # "状态正常" / "正常": the node is healthy
                    logger.debug(f"从zookeeper节点 {node_path} 的数据获取信息: {data} 状态正常")
                    logger.info(f"从zookeeper获取节点 {node_path} 正常")
        except KazooException as e:
            # "监控节点 ... 时发生异常": an exception occurred while watching the node
            logger.error(f"从zookeeper监控节点 {node_path} 时发生异常:{e}")


def logger_func(log_level, log_file, max_log_size, backup_count):
    # Create the logger and apply the configured level, if it is a known one.
    logger = logging.getLogger("my_logger")
    if log_level in LOG_LEVELS:
        logger.setLevel(LOG_LEVELS[log_level])

    # Rotate the log file once it reaches max_log_size bytes.
    handler = RotatingFileHandler(log_file, maxBytes=max_log_size, backupCount=backup_count)

    # Log format: time, thread id, file:line, level, message.
    formatter = logging.Formatter("%(asctime)s %(thread)d %(filename)s:%(lineno)d %(levelname)s %(message)s")
    handler.setFormatter(formatter)

    # Attach the handler to the logger.
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    # The config file is looked up relative to the current working directory.
    dirpath = os.getcwd()
    cfgpath = os.path.join(dirpath, "cfg/config.ini")
    conf = configparser.ConfigParser()
    print("config file ---> ", cfgpath)
    conf.read(cfgpath, encoding="utf-8")
    GetLogDir = conf.get("Base", "LogDir")
    LogLevel = conf.get("Base", "LogLevel")
    CheckintervalDate = conf.getint("Base", "CheckintervalDate")
    max_log_size = conf.getint("Base", "max_log_size")
    backup_count = conf.getint("Base", "backup_count")
    zk_hosts = conf.get("zookeeper", "zookeeper_host")
    is_auth = conf.getboolean("zookeeper", "is_auth")
    username = conf.get("zookeeper", "zookeeper_user")
    password = conf.get("zookeeper", "zookeeper_passwd")
    zookeeper_node = conf.get("zookeeper", "zookeeper_node")

    # Write the log under the configured directory, creating it if needed.
    os.makedirs(GetLogDir, exist_ok=True)
    log_file = os.path.join(GetLogDir, "watch_zoo.log")
    logger = logger_func(LogLevel, log_file, max_log_size, backup_count)

    # List of node paths to monitor.
    node_paths = zookeeper_node.split(",")
    logger.info("get config file  zookeeper node list: " + str(node_paths))

    # Connect to ZooKeeper.
    logger.info("zookeeper login host: " + zk_hosts)
    zk = KazooClient(hosts=zk_hosts)
    try:
        zk.start()
        if is_auth:
            logger.info("use user login auth conn zookeeper ")
            zk.add_auth(scheme='digest', credential=username + ":" + password)
            # NOTE: this writes the password to the log in plain text.
            logger.info("zookeeper user info: digest " + str(username) + ":" + password)
        else:
            logger.info("not use user auth login zookeeper")

        while True:
            watch_nodes()
            time.sleep(CheckintervalDate)

    except (KazooException, KazooTimeoutError) as e:
        # zk.start() raises KazooTimeoutError when no connection can be made.
        if "Connection refused" in str(e):
            logger.error("conn zookeeper error: " + str(e) + " exit!!!")
        else:
            logger.error("conn other zookeeper error: " + str(e) + " exit!!!")
        # os has no exit() function; sys.exit() is the correct call.
        sys.exit(110)
    finally:
        zk.stop()
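  • Configuration file
[Base]
# Sleep interval between check runs, in seconds
CheckintervalDate=10
# Log level: DEBUG, INFO, WARNING, ERROR
LogLevel=INFO
# Log directory
LogDir=./log
# Maximum log file size in bytes (100 MB)
max_log_size = 104857600
# Maximum number of rotated log backups
backup_count = 10

[zookeeper]
# ZooKeeper host and port to connect to
zookeeper_host = 127.0.0.1:2182
# Authentication user
zookeeper_user = admin
# Authentication password
zookeeper_passwd = 123456
# Whether to connect to ZooKeeper with a username and password
is_auth = True
# Nodes to watch; ideally one entry for each CTI instance that registers itself
zookeeper_node = /ctimanager/bj-cti1,/ctimanager/bj-cti2,/ctimanager/bj-cti3,/ctimanager/bj-cti4,/ctimanager/bj-cti5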
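
To show how settings like these are typically consumed, the following sketch parses a trimmed copy of the file with the standard `configparser` module. The `load_settings` helper and the inline `SAMPLE` string are illustrative assumptions; stripping whitespace around each node path guards against spaces after the commas in the list.

```python
import configparser

SAMPLE = """
[Base]
CheckintervalDate=10
LogLevel=INFO

[zookeeper]
zookeeper_host = 127.0.0.1:2182
is_auth = True
zookeeper_node = /ctimanager/bj-cti1, /ctimanager/bj-cti2,/ctimanager/bj-cti3
"""


def load_settings(text):
    """Parse the INI text and return the values a monitor would need."""
    conf = configparser.ConfigParser()
    conf.read_string(text)
    raw_nodes = conf.get("zookeeper", "zookeeper_node")
    return {
        "interval": conf.getint("Base", "CheckintervalDate"),
        "log_level": conf.get("Base", "LogLevel"),
        "hosts": conf.get("zookeeper", "zookeeper_host"),
        "is_auth": conf.getboolean("zookeeper", "is_auth"),
        # Split the comma-separated list and drop stray spaces or empty items.
        "nodes": [p.strip() for p in raw_nodes.split(",") if p.strip()],
    }


settings = load_settings(SAMPLE)
print(settings["nodes"])
```

Using `getint` and `getboolean` lets `configparser` reject malformed values early instead of failing later inside the monitoring loop.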

Script output

  • Running the script
[devops@my-dev watch_zookeeper]$ ./check_zookeeper_node.py 
config file --->  /home/devops/Python/ABC/watch_zookeeper/cfg/config.ini
2023-12-20 19:20:18,946 139836511770432 check_zookeeper_node.py:73 INFO get config file  zookeeper node list: ['/ctimanager/bj-cti1', '/ctimanager/bj-cti2', '/ctimanager/bj-cti3', '/ctimanager/bj-cti4', '/ctimanager/bj-cti5']
2023-12-20 19:26:35,654 140538316216128 check_zookeeper_node.py:73 INFO get config file  zookeeper node list: ['/ctimanager/bj-cti1', '/ctimanager/bj-cti2', '/ctimanager/bj-cti3', '/ctimanager/bj-cti4', '/ctimanager/bj-cti5']
2023-12-20 19:26:35,654 140538316216128 check_zookeeper_node.py:76 INFO zookeeper login host: 127.0.0.1:2182
2023-12-20 19:26:35,660 140538316216128 check_zookeeper_node.py:81 INFO use user login auth conn zookeeper 
2023-12-20 19:26:35,661 140538316216128 check_zookeeper_node.py:83 INFO zookeeper user info: digest admin:123456
2023-12-20 19:26:35,662 140538316216128 check_zookeeper_node.py:21 DEBUG 从zookeeper节点 /ctimanager/bj-cti1 的数据获取信息: b'{"agent":2,"test":1,"a1":2ok2}' 状态正常
2023-12-20 19:26:35,662 140538316216128 check_zookeeper_node.py:22 INFO 从zookeeper获取节点 /ctimanager/bj-cti1 正常
2023-12-20 19:26:35,662 140538316216128 check_zookeeper_node.py:21 DEBUG 从zookeeper节点 /ctimanager/bj-cti2 的数据获取信息: b'{"agent":1}' 状态正常
2023-12-20 19:26:35,663 140538316216128 check_zookeeper_node.py:22 INFO 从zookeeper获取节点 /ctimanager/bj-cti2 正常
2023-12-20 19:26:35,664 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti3 不存在
2023-12-20 19:26:35,665 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti4 不存在
2023-12-20 19:26:35,666 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti5 不存在
2023-12-20 19:26:40,915 140538087216896 check_zookeeper_node.py:21 DEBUG 从zookeeper节点 /ctimanager/bj-cti5 的数据获取信息: b'{"agent":2,"test":1,"a1":2ok2222}' 状态正常
2023-12-20 19:26:40,915 140538087216896 check_zookeeper_node.py:22 INFO 从zookeeper获取节点 /ctimanager/bj-cti5 正常
2023-12-20 19:26:45,674 140538316216128 check_zookeeper_node.py:21 DEBUG 从zookeeper节点 /ctimanager/bj-cti1 的数据获取信息: b'{"agent":2,"test":1,"a1":2ok2222}' 状态正常
2023-12-20 19:26:45,675 140538316216128 check_zookeeper_node.py:22 INFO 从zookeeper获取节点 /ctimanager/bj-cti1 正常
2023-12-20 19:26:45,675 140538316216128 check_zookeeper_node.py:21 DEBUG 从zookeeper节点 /ctimanager/bj-cti2 的数据获取信息: b'{"agent":1}' 状态正常
2023-12-20 19:26:45,675 140538316216128 check_zookeeper_node.py:22 INFO 从zookeeper获取节点 /ctimanager/bj-cti2 正常
2023-12-20 19:26:45,676 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti3 不存在
2023-12-20 19:26:45,677 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti4 不存在
2023-12-20 19:26:45,678 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti5 不存在
2023-12-20 19:26:55,685 140538316216128 check_zookeeper_node.py:21 DEBUG 从zookeeper节点 /ctimanager/bj-cti1 的数据获取信息: b'{"agent":2,"test":1,"a1":2ok2222}' 状态正常
2023-12-20 19:26:55,685 140538316216128 check_zookeeper_node.py:22 INFO 从zookeeper获取节点 /ctimanager/bj-cti1 正常
2023-12-20 19:26:55,686 140538316216128 check_zookeeper_node.py:21 DEBUG 从zookeeper节点 /ctimanager/bj-cti2 的数据获取信息: b'{"agent":1}' 状态正常
2023-12-20 19:26:55,686 140538316216128 check_zookeeper_node.py:22 INFO 从zookeeper获取节点 /ctimanager/bj-cti2 正常
2023-12-20 19:26:55,687 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti3 不存在
2023-12-20 19:26:55,688 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti4 不存在
2023-12-20 19:26:55,689 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti5 不存在
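
Since alerting relies on error keywords in the log, a collector on the monitoring side could pick out the ERROR lines and the node paths they mention. The sketch below is one hypothetical way to do that for the log format shown above; the `missing_nodes` helper and the regular expression are assumptions, not part of the original script.

```python
import re

# Timestamp, thread id, file:line, then the ERROR level and a message
# that contains a node path starting with "/".
ERROR_LINE = re.compile(r"^(\S+ \S+) \d+ \S+:\d+ ERROR .*?(/\S+)")


def missing_nodes(lines):
    """Return the set of node paths mentioned in ERROR log lines."""
    found = set()
    for line in lines:
        match = ERROR_LINE.match(line)
        if match:
            found.add(match.group(2))
    return found


sample = [
    "2023-12-20 19:26:55,686 140538316216128 check_zookeeper_node.py:22 INFO 从zookeeper获取节点 /ctimanager/bj-cti2 正常",
    "2023-12-20 19:26:55,687 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti3 不存在",
    "2023-12-20 19:26:55,688 140538316216128 check_zookeeper_node.py:19 ERROR 从zookeeper获取节点 /ctimanager/bj-cti4 不存在",
]
print(sorted(missing_nodes(sample)))
```

Matching on the level field rather than on the message text keeps the collector independent of the wording of the log messages.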
