Nagios&Cacti

Author: 钟大發 | Published: 2017-04-05 17:14

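    Nagios + Cacti is honestly not as easy to use as Zabbix, but for service monitoring that only needs alerting and no graphs, Nagios is a good fit. Because of an IDC migration, I redeployed our old Nagios + Cacti environment from scratch.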

    Nagios:

    • Preparation:
    apt-get install autoconf gcc libc6 build-essential bc gawk dc gettext \
    libmcrypt-dev libssl-dev make unzip apache2 apache2-utils php5 libgd2-xpm-dev
    /usr/sbin/useradd -m -s /bin/bash nagios # create the nagios user
    /usr/sbin/groupadd nagcmd # create the nagcmd group, used for running external commands (e.g. nrpe)
    /usr/sbin/usermod -a -G nagcmd nagios
    /usr/sbin/usermod -a -G nagcmd www-data
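
After the `usermod` calls it can be worth confirming that both accounts really ended up in `nagcmd`, since a missing membership tends to show up only later as a permission error when the web interface submits external commands. A minimal sketch, assuming nothing beyond the standard `id` utility; `in_group` and `report_membership` are hypothetical helper names, not part of Nagios:

```shell
#!/bin/sh
# in_group USER GROUP: succeed if USER is a member of GROUP.
# Hypothetical helper for illustration; relies only on `id -nG`.
in_group() {
    user=$1
    group=$2
    # id -nG prints the user's group names separated by spaces.
    for g in $(id -nG "$user" 2>/dev/null); do
        [ "$g" = "$group" ] && return 0
    done
    return 1
}

# report_membership GROUP USER...: print one status line per user.
report_membership() {
    group=$1
    shift
    for u in "$@"; do
        if in_group "$u" "$group"; then
            echo "$u: in $group"
        else
            echo "$u: NOT in $group"
        fi
    done
}
```

For the setup above, the call would be `report_membership nagcmd nagios www-data`.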
    
    • Installation:
    tar zxvf nagios-4.3.1.tar.gz
    cd nagios-4.3.1
    ./configure --prefix=/opt/nagios --with-command-group=nagcmd --with-httpd-conf=/etc/apache2/sites-enabled
    make all
    make install
    make install-init
    make install-config
    make install-commandmode
    update-rc.d nagios defaults # register the init script so Nagios starts at boot
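
The make targets above have to run in that order, and a failure halfway through is easy to miss when they are pasted one by one. A rough sketch of a wrapper that stops at the first failing target; `run_make_targets` is a hypothetical name, and the `MAKE` override is only an assumption added here so the sequence can be previewed without building anything:

```shell
#!/bin/sh
# run_make_targets: run the Nagios make targets listed above in order,
# stopping at the first failure. Hypothetical wrapper, not shipped with Nagios.
# MAKE can be overridden (e.g. MAKE=echo) to preview the sequence.
run_make_targets() {
    make_cmd=${MAKE:-make}
    for target in all install install-init install-config install-commandmode; do
        echo ">>> make $target"
        # $make_cmd is left unquoted on purpose so MAKE="make -j4" also works.
        if ! $make_cmd "$target"; then
            echo "make $target failed; aborting" >&2
            return 1
        fi
    done
}
```

Run it from the extracted source directory after `./configure`; `MAKE=echo run_make_targets` only prints the targets.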
    
    • Nagios directory layout:
    root@10.1.1.208:nagios# ls
    bin  etc  libexec  log  sbin  share  var
    

    The main Nagios configuration files live under etc, and the plugins go under libexec.

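    • Configuring Nagios:
      Our company's Nagios mainly monitors server hardware status, such as whether the disks are healthy, and every check goes through NRPE to reduce the load on the monitoring server itself. The Nagios configuration is split across files: you can register as many object config files as needed in the main nagios.cfg.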
    # You can specify individual object config files as shown below:
    cfg_file=/opt/nagios/etc/objects/commands.cfg
    cfg_file=/opt/nagios/etc/objects/contacts.cfg
    cfg_file=/opt/nagios/etc/objects/timeperiods.cfg
    cfg_file=/opt/nagios/etc/objects/templates.cfg
    #
    cfg_file=/opt/nagios/etc/objects/service.cfg
    cfg_file=/opt/nagios/etc/objects/group.cfg
    # Definitions for monitoring the local (Linux) host
    #cfg_file=/opt/nagios/etc/objects/localhost.cfg
    cfg_file=/opt/nagios/etc/objects/host_debian.cfg
    cfg_file=/opt/nagios/etc/objects/host_centos.cfg
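
Since every object file has to be registered by hand, a typo in a `cfg_file=` path is a common reason for the configuration check to fail. A small sketch that lists the active entries and flags files that do not exist; the function names are hypothetical, and it assumes paths without spaces:

```shell
#!/bin/sh
# list_cfg_files MAIN_CFG: print the path of every active (uncommented)
# cfg_file= entry in the given main config file.
list_cfg_files() {
    sed -n 's/^[[:space:]]*cfg_file=//p' "$1"
}

# missing_cfg_files MAIN_CFG: print registered object files that do not
# exist. Returns non-zero if at least one is missing.
# Assumes the registered paths contain no whitespace.
missing_cfg_files() {
    rc=0
    for f in $(list_cfg_files "$1"); do
        if [ ! -f "$f" ]; then
            echo "missing: $f"
            rc=1
        fi
    done
    return $rc
}
```

With the layout used in this post, the call would be `missing_cfg_files /opt/nagios/etc/nagios.cfg`.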
    

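    Then just edit the corresponding files. Suppose I want to add a Linux server and monitor its disk status; the steps are as follows:
    1. Edit commands.cfg and add the corresponding command: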

    # check hardware Disk
    define command{
            command_name check_storage_disk_nrpe
            command_line /opt/nagios/libexec/check_storage_disk_nrpe $HOSTADDRESS$ check_storage_disk
    }
    

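    Put the corresponding script under libexec. Roughly speaking, Nagios asks the remote machine (through NRPE) to run the check_storage_disk command, and check_storage_disk is a monitoring script that lives on the remote machine.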

    #!/bin/bash
    PLUGINS=/opt/nagios/libexec
    CHECK_NRPE=$PLUGINS/check_nrpe
    host=$1
    comm=$2
    if [ $# -lt 2 ];then
        echo "Usage: $0 host command"
        exit 2
    fi
    #command_line    $USER1$/check_snmp_traffic $HOSTADDRESS$ public 3 " > 80 " " > 90 "
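    res=$("$CHECK_NRPE" -H "$host" -n -p 57000 -c "$comm")
    if [ $? -ne 0 ];then
        # Quote ${res}: the message contains spaces and would break the test otherwise.
        if [ "${res}" == "CHECK_NRPE: Socket timeout after 10 seconds." ];then
            # NOTE: a timeout exits 0, so Nagios shows it as OK;
            # use "exit 3" to surface it as UNKNOWN instead.
            echo "connect failed"
            exit 0
        else
            echo "Check Storage UNKNOWN"
            exit 3
        fi
    fi
    if [ "${res}" == "Storage Disk Normal" ];then
        echo "Check Storage OK"
        exit 0
    else
        echo "${res}"
        exit 2
    fi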
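
Nagios decides a service state purely from the plugin's exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), so it helps to run a wrapper like this by hand before wiring it into a service. A minimal sketch for that; `state_name` and `run_check` are hypothetical helpers and not part of the Nagios plugins:

```shell
#!/bin/sh
# state_name CODE: translate a plugin exit code into the Nagios state name.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "UNKNOWN (non-standard exit code $1)" ;;
    esac
}

# run_check COMMAND [ARGS...]: run a plugin by hand, print its output and
# the state Nagios would derive from it, and return the plugin's exit code.
run_check() {
    code=0
    output=$("$@" 2>&1) || code=$?
    echo "output: $output"
    echo "state:  $(state_name "$code")"
    return "$code"
}
```

For the command defined earlier, a manual test could look like `run_check /opt/nagios/libexec/check_storage_disk_nrpe 192.168.1.1 check_storage_disk`, ideally run as the nagios user.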
    

    The NRPE plugin can be downloaded from nagios.org.
    Then register the service in service.cfg:

    define service{
            use                             local-service
            hostgroup_name                  debian_servers
            service_description             hardware_disk_check
            check_command                   check_storage_disk_nrpe
            }
    

    Then create the host and host group definitions:

    define hostgroup{
            hostgroup_name  debian_servers
            alias           servers
            members         test
            }
    define host{
            use                     linux-server
            host_name              test
            alias                   01
            address                 192.168.1.1
            }
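
When many servers have to be added, writing each `define host` block by hand gets tedious. A possible sketch that prints host objects in the same shape as the one above; `host_definition` and `hosts_from_list` are hypothetical helper names, and the `linux-server` template is taken from the example:

```shell
#!/bin/sh
# host_definition NAME ALIAS ADDRESS: print a Nagios host object that
# inherits from the linux-server template used above.
host_definition() {
    cat <<EOF
define host{
        use                     linux-server
        host_name               $1
        alias                   $2
        address                 $3
        }
EOF
}

# hosts_from_list: read "name alias address" lines on stdin and print one
# host object per line; blank lines and lines starting with # are skipped.
hosts_from_list() {
    while read -r name host_alias address; do
        case "$name" in ''|\#*) continue ;; esac
        host_definition "$name" "$host_alias" "$address"
    done
}
```

The output can be appended to an object file such as host_debian.cfg, for example `host_definition test 01 192.168.1.1 >> /opt/nagios/etc/objects/host_debian.cfg`. Each new host still has to be added to the `members` line of its host group.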
    

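    Nagios web login is authenticated through Apache htpasswd, which is fairly simple: changing a password only needs an update to the htpasswd entry used by the CGIs. Changing the login user, on the other hand, requires updating the user authorization settings in cgi.cfg in addition to the Apache htpasswd file.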
    Then check the Nagios configuration:

    /opt/nagios/bin/nagios -v /opt/nagios/etc/nagios.cfg 
    

    Then start Nagios.
    By default the source install did not leave a service start script under init, so here is one:

    #!/bin/sh
    # 
    # chkconfig: 345 99 01
    # description: Nagios network monitor
    #
    # File : nagios
    #
    # Author : Jorge Sanchez Aymar (jsanchez@lanchile.cl)
    # 
    # Changelog :
    #
    # 1999-07-09 Karl DeBisschop <kdebisschop@infoplease.com>
    #  - setup for autoconf
    #  - add reload function
    # 1999-08-06 Ethan Galstad <egalstad@nagios.org>
    #  - Added configuration info for use with RedHat's chkconfig tool
    #    per Fran Boon's suggestion
    # 1999-08-13 Jim Popovitch <jimpop@rocketship.com>
    #  - added variable for nagios/var directory
    #  - cd into nagios/var directory before creating tmp files on startup
    # 1999-08-16 Ethan Galstad <egalstad@nagios.org>
    #  - Added test for rc.d directory as suggested by Karl DeBisschop
    # 2000-07-23 Karl DeBisschop <kdebisschop@users.sourceforge.net>
    #  - Clean out redhat macros and other dependencies
    # 2003-01-11 Ethan Galstad <egalstad@nagios.org>
    #  - Updated su syntax (Gary Miller)
    #
    # Description: Starts and stops the Nagios monitor
    #              used to provide network services status.
    #
      
    status_nagios ()
    {
    
        if test -x $NagiosCGIDir/daemonchk.cgi; then
            if $NagiosCGIDir/daemonchk.cgi -l $NagiosRunFile; then
                    return 0
            else
                return 1
            fi
        else
            if ps -p $NagiosPID > /dev/null 2>&1; then
                    return 0
            else
                return 1
            fi
        fi
    
        return 1
    }
    
    
    printstatus_nagios()
    {
    
        if status_nagios $1 $2; then
            echo "nagios (pid $NagiosPID) is running..."
        else
            echo "nagios is not running"
        fi
    }
    
    
    killproc_nagios ()
    {
    
        kill $2 $NagiosPID
    
    }
    
    
    pid_nagios ()
    {
    
        if test ! -f $NagiosRunFile; then
            echo "No lock file found in $NagiosRunFile"
            exit 1
        fi
    
        NagiosPID=`head -n 1 $NagiosRunFile`
    }
    
    
    # Source function library
    # Solaris doesn't have an rc.d directory, so do a test first
    if [ -f /etc/rc.d/init.d/functions ]; then
        . /etc/rc.d/init.d/functions
    elif [ -f /etc/init.d/functions ]; then
        . /etc/init.d/functions
    fi
    
    prefix=/opt/nagios
    exec_prefix=${prefix}
    NagiosBin=${exec_prefix}/bin/nagios
    NagiosCfgFile=${prefix}/etc/nagios.cfg
    NagiosStatusFile=${prefix}/var/status.dat
    NagiosRetentionFile=${prefix}/var/retention.dat
    NagiosCommandFile=${prefix}/var/rw/nagios.cmd
    NagiosVarDir=${prefix}/var
    NagiosRunFile=${prefix}/var/nagios.lock
    NagiosLockDir=/var/lock/subsys
    NagiosLockFile=nagios
    NagiosCGIDir=${exec_prefix}/sbin
    NagiosUser=nagios
    NagiosGroup=nagios
              
    
    # Check that nagios exists.
    if [ ! -f $NagiosBin ]; then
        echo "Executable file $NagiosBin not found.  Exiting."
        exit 1
    fi
    
    # Check that nagios.cfg exists.
    if [ ! -f $NagiosCfgFile ]; then
        echo "Configuration file $NagiosCfgFile not found.  Exiting."
        exit 1
    fi
              
    # See how we were called.
    case "$1" in
    
        start)
            echo -n "Starting nagios:"
            $NagiosBin -v $NagiosCfgFile > /dev/null 2>&1;
            if [ $? -eq 0 ]; then
                su - $NagiosUser -c "touch $NagiosVarDir/nagios.log $NagiosRetentionFile"
                rm -f $NagiosCommandFile
                touch $NagiosRunFile
                chown $NagiosUser:$NagiosGroup $NagiosRunFile
                $NagiosBin -d $NagiosCfgFile
                if [ -d $NagiosLockDir ]; then touch $NagiosLockDir/$NagiosLockFile; fi
                echo " done."
                exit 0
            else
                echo "CONFIG ERROR!  Start aborted.  Check your Nagios configuration."
                exit 1
            fi
            ;;
    
        stop)
            echo -n "Stopping nagios: "
    
            pid_nagios
            killproc_nagios nagios
    
            # now we have to wait for nagios to exit and remove its
            # own NagiosRunFile, otherwise a following "start" could
            # happen, and then the exiting nagios will remove the
            # new NagiosRunFile, allowing multiple nagios daemons
            # to (sooner or later) run - John Sellens
            #echo -n 'Waiting for nagios to exit .'
            for i in 1 2 3 4 5 6 7 8 9 10 ; do
                if status_nagios > /dev/null; then
                echo -n '.'
                sleep 1
                else
                break
                fi
            done
            if status_nagios > /dev/null; then
                echo ''
                echo 'Warning - nagios did not exit in a timely manner'
            else
                echo 'done.'
            fi
    
            rm -f $NagiosStatusFile $NagiosRunFile $NagiosLockDir/$NagiosLockFile $NagiosCommandFile
            ;;
    
        status)
            pid_nagios
            printstatus_nagios nagios
            ;;
    
        checkconfig)
            printf "Running configuration check..."
            $NagiosBin -v $NagiosCfgFile > /dev/null 2>&1;
            if [ $? -eq 0 ]; then
                echo " OK."
            else
                echo " CONFIG ERROR!  Check your Nagios configuration."
                exit 1
            fi
            ;;
    
        restart)
            printf "Running configuration check..."
            $NagiosBin -v $NagiosCfgFile > /dev/null 2>&1;
            if [ $? -eq 0 ]; then
                echo "done."
                $0 stop
                $0 start
            else
                echo " CONFIG ERROR!  Restart aborted.  Check your Nagios configuration."
                exit 1
            fi
            ;;
    
        reload|force-reload)
            printf "Running configuration check..."
            $NagiosBin -v $NagiosCfgFile > /dev/null 2>&1;
            if [ $? -eq 0 ]; then
                echo "done."
                if test ! -f $NagiosRunFile; then
                    $0 start
                else
                    pid_nagios
                    if status_nagios > /dev/null; then
                        printf "Reloading nagios configuration..."
                        killproc_nagios nagios -HUP
                        echo "done"
                    else
                        $0 stop
                        $0 start
                    fi
                fi
            else
                echo " CONFIG ERROR!  Reload aborted.  Check your Nagios configuration."
                exit 1
            fi
            ;;
    
        *)
            echo "Usage: nagios {start|stop|restart|reload|force-reload|status|checkconfig}"
            exit 1
            ;;
    
    esac
      
    # End of this script
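
To put the script above in place, it has to be copied to /etc/init.d, made executable, and registered for boot. A cautious sketch of the copy step; `install_init_script` is a hypothetical helper, and the optional ROOT argument is an assumption added here so the copy can be rehearsed in a scratch directory first:

```shell
#!/bin/sh
# install_init_script SRC [ROOT]: copy an init script to
# ROOT/etc/init.d/nagios with mode 755, then syntax-check it.
# ROOT defaults to "" (the real filesystem). Hypothetical helper.
install_init_script() {
    src=$1
    root=${2:-}
    dest="$root/etc/init.d/nagios"
    mkdir -p "$root/etc/init.d" || return 1
    cp "$src" "$dest" || return 1
    chmod 755 "$dest" || return 1
    # sh -n parses the script without executing anything.
    sh -n "$dest"
}
```

After a real install (run as root, without the ROOT argument), `update-rc.d nagios defaults` registers it and `/etc/init.d/nagios start` starts the daemon.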
    

    Then log in to the web interface and verify that everything shows up.

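    Cacti:

    Cacti is used for graphing monitoring data. Nagios can also produce graphs through pnp4nagios, but the experience is not great. Cacti works well for customized monitoring graphs, even though both rely on rrdtool underneath.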

    • Preparation:
    apt-get install rrdtool  php5 mysql-server
    

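    PHP 5 actually needs more packages than this; more on that later.
    After downloading Cacti, extract it and enter the directory, then log in to MySQL and import the Cacti schema: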

    mysql> create database cacti;
    Query OK, 1 row affected (0.00 sec)
    mysql> use cacti;
    mysql> source cacti.sql;
    mysql> GRANT ALL PRIVILEGES ON cacti.* TO 'cacti'@'127.0.0.1' IDENTIFIED BY 'cacti';
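
The same statements can be generated non-interactively, which is handy when the setup has to be repeated on another machine. A sketch under the assumption of a MySQL 5.x server like the one used here (`GRANT ... IDENTIFIED BY` was removed in MySQL 8.0); `cacti_db_sql` is a hypothetical helper that only prints SQL and never connects to a server:

```shell
#!/bin/sh
# cacti_db_sql DB USER PASSWORD HOST SCHEMA_FILE: print the SQL statements
# from the session above so they can be reviewed or piped into mysql.
# Values are inserted verbatim, so they must not contain quote characters.
cacti_db_sql() {
    cat <<EOF
CREATE DATABASE $1;
USE $1;
SOURCE $5;
GRANT ALL PRIVILEGES ON $1.* TO '$2'@'$4' IDENTIFIED BY '$3';
EOF
}
```

From the extracted Cacti directory, this could be used as `cacti_db_sql cacti cacti cacti 127.0.0.1 cacti.sql | mysql -u root -p`.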
    

    Edit the configuration file:

    vi include/config.php
    $database_type     = 'mysql';
    $database_default  = 'cacti';
    $database_hostname = '127.0.0.1';
    $database_username = 'cacti';
    $database_password = 'cacti';
    $database_port     = '3306';
    $database_ssl      = false;
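
A mismatch between these values and the GRANT statement is a typical cause of a blank or failing install page. One way to double-check is to read the values back out of the file and try them with the mysql client. A small sketch; `cacti_db_setting` is a hypothetical helper and only handles the single-quoted settings shown above:

```shell
#!/bin/sh
# cacti_db_setting CONFIG KEY: print the single-quoted value assigned to
# $database_KEY in a Cacti include/config.php (e.g. KEY=username).
# Prints nothing for unquoted values such as $database_ssl.
cacti_db_setting() {
    pattern='^[[:space:]]*[$]database_'"$2"'[[:space:]]*='
    grep "$pattern" "$1" | head -n 1 | cut -s -d "'" -f 2
}
```

For example, `mysql -h "$(cacti_db_setting include/config.php hostname)" -u "$(cacti_db_setting include/config.php username)" -p "$(cacti_db_setting include/config.php default)" -e 'SHOW TABLES'` should prompt for the password and then list the imported Cacti tables.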
    

    After that, open <server-ip>/cacti in a browser and the installation wizard appears.
    The default user is admin with password admin.


    Paste_Image.png

    This page lists any missing packages; just install them:

    Paste_Image.png

    Newer Cacti versions have an issue with MySQL time zone permissions, which is the error shown in the screenshot above. It needs to be fixed:

    mysql> GRANT SELECT ON mysql.time_zone_name TO cacti@'127.0.0.1';
    mysql_tzinfo_to_sql /usr/share/zoneinfo/ | mysql -u root -p mysql
    

    After that, click Next and the installation completes.

    Paste_Image.png

    After that, configure SNMP on the hosts to start monitoring and graphing.

    Useful links:
    http://exchange.nagios.org
    http://forums.cacti.net
