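This post records how I set up monitoring and alerting for Hadoop services with Prometheus and Grafana. The Grafana dashboard configuration is attached at the end; download it and adjust it for your environment, so there is no need to pay for a copy on CSDN.
I. How it works
Prometheus monitors Hadoop services through the javaagent mechanism: the exporter jar provided by Prometheus (jmx_prometheus_javaagent) is loaded into each service's process, where it converts JMX metrics into a uniform format and exposes them over an HTTP endpoint that the Prometheus server scrapes.
II. Configuration
1. Upload the exporter jar to every server in the Hadoop cluster and change its permissions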
/opt/software/prometheus_export/jmx_prometheus_javaagent-0.9.jar
chmod 777 /opt/software/prometheus_export/jmx_prometheus_javaagent-0.9.jar
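2. On every server, create one yaml configuration file per role under /opt/software/prometheus_export/. These files are referenced in steps 3 to 6. The files differ only in hostPort, and each hostPort must match the -Dcom.sun.management.jmxremote.port set for that role in those steps.
Port mapping: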
namenode:1234
datanode:1235
journalnode:1236
zkfc:1237
resourcemanager:1238
nodemanager:1239
hiveserver2:1240
hivemetastore:1241
zookeeper:1242
(The configuration files are available via the Youdao Note link at the end of this post.)
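Because the per-role files differ only in hostPort, they can be generated in a loop. The following is a minimal sketch under stated assumptions: the role-to-port pairs are copied from the mapping above, but the yaml body (lowercaseOutputName plus a catch-all rules entry) is only an illustrative jmx_exporter configuration, not the author's actual files, which are in the link at the end. OUT_DIR is a hypothetical variable; set it to /opt/software/prometheus_export on a real server.

```shell
#!/bin/sh
# Sketch: generate one jmx_exporter yaml per role; only hostPort differs.
# OUT_DIR is a hypothetical variable; it defaults to a temp dir so the
# script can be tried safely before writing to the real directory.
OUT_DIR="${OUT_DIR:-$(mktemp -d)}"
mkdir -p "$OUT_DIR"

# role:jmx_port pairs, copied from the port mapping above
ROLE_PORTS="namenode:1234 datanode:1235 journalnode:1236 zkfc:1237 resourcemanager:1238 nodemanager:1239 hiveserver2:1240 hivemetastore:1241 zookeeper:1242"

for pair in $ROLE_PORTS; do
  role="${pair%%:*}"   # text before the colon
  port="${pair##*:}"   # text after the colon
  # The yaml body below is an assumed, minimal configuration.
  cat > "$OUT_DIR/$role.yaml" <<EOF
hostPort: localhost:$port
lowercaseOutputName: true
rules:
  - pattern: ".*"
EOF
done

echo "wrote configs to $OUT_DIR"
```

Running it with OUT_DIR=/opt/software/prometheus_export writes namenode.yaml, datanode.yaml and so on into the directory that the following steps reference.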
3. HDFS
3.1 Edit hadoop-env.sh for the HDFS roles: namenode, datanode, journalnode and zkfc (resourcemanager and nodemanager are configured in yarn-env.sh in step 4). In Ambari this is the hadoop-env template under Advanced hadoop-env.
# jmx_exporter
export HDFS_NAMENODE_OPTS="$HDFS_NAMENODE_OPTS -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.port=1234 -javaagent:/opt/software/prometheus_export/jmx_prometheus_javaagent-0.9.jar=9222:/opt/software/prometheus_export/namenode.yaml"
export HDFS_DATANODE_OPTS="$HDFS_DATANODE_OPTS -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.port=1235 -javaagent:/opt/software/prometheus_export/jmx_prometheus_javaagent-0.9.jar=9322:/opt/software/prometheus_export/datanode.yaml"
export HADOOP_JOURNALNODE_OPTS="$HADOOP_JOURNALNODE_OPTS -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.port=1236 -javaagent:/opt/software/prometheus_export/jmx_prometheus_javaagent-0.9.jar=9522:/opt/software/prometheus_export/journalnode.yaml"
export HDFS_ZKFC_OPTS="$HDFS_ZKFC_OPTS -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.port=1237 -javaagent:/opt/software/prometheus_export/jmx_prometheus_javaagent-0.9.jar=9422:/opt/software/prometheus_export/zkfc.yaml"
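The four exports above differ only in the role's yaml file, the JMX port and the exporter's HTTP port, so the shared flags can be built by a small helper. This is a sketch, not part of the original setup: jmx_agent_opts is a hypothetical function name (it is not provided by Hadoop or jmx_exporter), while the paths and port numbers are taken from the lines above.

```shell
# Hypothetical helper (not part of Hadoop): prints the JMX and javaagent
# flags for one role. Args: role name, JMX port, exporter HTTP port.
EXPORTER_DIR=/opt/software/prometheus_export
EXPORTER_JAR="$EXPORTER_DIR/jmx_prometheus_javaagent-0.9.jar"

jmx_agent_opts() {
  printf '%s %s %s %s %s' \
    "-Dcom.sun.management.jmxremote.authenticate=false" \
    "-Dcom.sun.management.jmxremote.ssl=false" \
    "-Dcom.sun.management.jmxremote.local.only=false" \
    "-Dcom.sun.management.jmxremote.port=$2" \
    "-javaagent:$EXPORTER_JAR=$3:$EXPORTER_DIR/$1.yaml"
}

# Builds the same flags as the namenode and datanode lines above.
export HDFS_NAMENODE_OPTS="$HDFS_NAMENODE_OPTS $(jmx_agent_opts namenode 1234 9222)"
export HDFS_DATANODE_OPTS="$HDFS_DATANODE_OPTS $(jmx_agent_opts datanode 1235 9322)"
```

Whether a function definition is convenient inside an Ambari-managed template is a judgment call; the explicit one-line exports are easier to audit, the helper is easier to keep consistent.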
4. YARN
Edit yarn-env.sh. (In Ambari: the yarn-env template under Advanced yarn-env.)
export YARN_RESOURCEMANAGER_OPTS="$YARN_RESOURCEMANAGER_OPTS -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.port=1238 -javaagent:/opt/software/prometheus_export/jmx_prometheus_javaagent-0.9.jar=9622:/opt/software/prometheus_export/resourcemanager.yaml"
export YARN_NODEMANAGER_OPTS="$YARN_NODEMANAGER_OPTS -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.port=1239 -javaagent:/opt/software/prometheus_export/jmx_prometheus_javaagent-0.9.jar=9722:/opt/software/prometheus_export/nodemanager.yaml"
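5. Hive
In the hive-env template, append the matching line at the end of each service's block, e.g. if [ "$SERVICE" = "hiveserver2" ]; then (the first line below is for the metastore, the second for hiveserver2):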
export HADOOP_OPTS="$HADOOP_OPTS -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.port=1241 -javaagent:/opt/software/prometheus_export/jmx_prometheus_javaagent-0.9.jar=9822:/opt/software/prometheus_export/hivemetastore.yaml"
export HADOOP_OPTS="$HADOOP_OPTS -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.port=1240 -javaagent:/opt/software/prometheus_export/jmx_prometheus_javaagent-0.9.jar=9922:/opt/software/prometheus_export/hiveserver2.yaml"
6. ZooKeeper
In Ambari, edit the zookeeper-env template and add the following after export CLASSPATH=$CLASSPATH:/usr/share/zookeeper/*:
########################prometheus start##################
export ZK_JMX_OPTS="-Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.port=1242 -javaagent:/opt/software/prometheus_export/jmx_prometheus_javaagent-0.9.jar=9923:/opt/software/prometheus_export/zookeeper.yaml"
########################prometheus end###################
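On every ZooKeeper server, edit zkServer.sh and add $ZK_JMX_OPTS to the java command in the start branch:
if [ -f "$ZOOPIDFILE" ]; then
  if kill -0 `cat "$ZOOPIDFILE"` > /dev/null 2>&1; then
    echo $command already running as process `cat "$ZOOPIDFILE"`.
    exit 0
  fi
fi
nohup "$JAVA" "-Dzookeeper.log.dir=${ZOO_LOG_DIR}" "-Dzookeeper.log.file=${ZOO_LOG_FILE}" "-Dzookeeper.root.logger=${ZOO_LOG4J_PROP}" \
-cp "$CLASSPATH" $JVMFLAGS $ZK_JMX_OPTS $ZOOMAIN "$ZOOCFG" > "$_ZOO_DAEMON_OUT" 2>&1 < /dev/null &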
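After the services have been restarted, each exporter should answer on the HTTP port given in its -javaagent argument (for example 9222 for the namenode). A quick sanity check is to fetch /metrics and count the samples. The helper below is a sketch: count_samples is a hypothetical name, and the host in the usage comment is an assumption that depends on where the service runs.

```shell
# Hypothetical helper: reads Prometheus text exposition format on stdin
# and prints how many sample lines it contains (comment lines starting
# with '#', such as HELP and TYPE, and blank lines are ignored).
count_samples() {
  grep -v '^#' | grep -c '[^[:space:]]'
}

# Usage against a running exporter (host and port depend on your cluster):
#   curl -s http://localhost:9222/metrics | count_samples
```

A count of zero, or a connection error from curl, suggests the agent was not loaded or the port in the -javaagent argument is wrong.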
III. Monitoring rules
[Screenshot: monitoring rules]
IV. Grafana configuration
https://note.youdao.com/s/bUh0T93W