001. Building a CM 6.3 Big Data Platform on Alibaba Cloud CentOS 7.6

Author: CoderJed | Published: 2020-06-10 18:40

1. Cluster planning and configuration

OS          Hostname  Private IP    Memory  CPU      Role             System disk  Data disk  Data disk mount point
CentOS-7.6  node01    172.26.0.53   16GB    4 cores  Management node  50GB         100GB      /data
CentOS-7.6  node02    172.26.0.50   16GB    4 cores  Data node        40GB         100GB      /data
CentOS-7.6  node03    172.26.0.51   16GB    4 cores  Data node        40GB         100GB      /data
CentOS-7.6  node04    172.26.0.52   16GB    4 cores  Data node        40GB         100GB      /data

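2. System environment preparation

2.1 Disk configuration requirements for production

  • Put the system disk on RAID1 with at least 200 GB of capacity and manage it with LVM, so the system disk space can be resized dynamically. CM is installed on the system disk.
  • Put the management node's data disks on RAID5; all management-node data is stored on the data disks.
  • On data nodes, configure each data disk as its own single-disk RAID0 (hardware RAID), format it as xfs, mount it with noatime, and do not use LVM. Ideally the hardware is homogeneous.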
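
As a rough illustration of the xfs plus noatime recommendation, the sketch below only builds the /etc/fstab line for one data disk. The helper name fstab_line and the device name /dev/vdb are assumptions for the example, not values from my servers; check the real device name (for example with lsblk) before using anything like this.

```shell
# Hypothetical helper: print an /etc/fstab entry that mounts an xfs data disk
# with noatime. Arguments: device, mount point.
fstab_line() {
    printf '%s %s xfs defaults,noatime 0 0\n' "$1" "$2"
}

# Example with an assumed device name (/dev/vdb) and the /data mount point
# used in the cluster plan.
fstab_line /dev/vdb /data
```

The printed line could then be appended to /etc/fstab after the disk has been formatted with mkfs.xfs.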

2.2 Network configuration

  • Make sure IPv6 is not enabled (apply on all nodes)

    Edit /etc/sysctl.conf:

    # Disable IPv6
    net.ipv6.conf.all.disable_ipv6= 1
    net.ipv6.conf.default.disable_ipv6= 1
    net.ipv6.conf.lo.disable_ipv6= 1
    

    After the change, run sysctl -p.

    Add the following to /etc/sysconfig/network:

    NETWORKING_IPV6=no
    IPV6INIT=no
    
  • Hostname configuration

    [root@node01 ~]# cat /etc/hostname
    node01
    [root@node02 ~]# cat /etc/hostname
    node02
    [root@node03 ~]# cat /etc/hostname
    node03
    [root@node04 ~]# cat /etc/hostname
    node04
    
  • /etc/hosts entries (identical on all nodes)

    172.26.0.53 node01
    172.26.0.50 node02
    172.26.0.51 node03
    172.26.0.52 node04
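
Since the same entries have to be added on every node, it helps if the step can be repeated without creating duplicates. Below is a minimal sketch of that idea; the helper name ensure_line is my own, and the demo writes to a scratch file rather than the real /etc/hosts.

```shell
# Append a line to a file only if that exact line is not already present,
# so the step can be re-run safely on every node.
ensure_line() {
    # $1 = file, $2 = line
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Demo against a scratch file instead of the real /etc/hosts.
hosts_file=$(mktemp)
for entry in "172.26.0.53 node01" "172.26.0.50 node02" \
             "172.26.0.51 node03" "172.26.0.52 node04"; do
    ensure_line "$hosts_file" "$entry"
    ensure_line "$hosts_file" "$entry"   # the second call changes nothing
done
cat "$hosts_file"
```

The same helper would work for the lines added to /etc/sysconfig/network.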
    

2.3 Passwordless SSH login

Set up passwordless (key-based) login from node01 to every node in the cluster, node01 itself included:

[root@node01 ~]# ssh-keygen
Generating public/private rsa key pair.
# Just press Enter
Enter file in which to save the key (/root/.ssh/id_rsa): 
# Just press Enter
Enter passphrase (empty for no passphrase): 
# Just press Enter
Enter same passphrase again: 
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:wjErOh9znaB9DV6zQkFqvIWgAqABRI9XywCefACA+p4 root@node01
The key's randomart image is:
+---[RSA 2048]----+
|/=....  .        |
|*.=.+o.+         |
|+=.+ oB o        |
|..o  o * .       |
| .  . * S o      |
|  .. + * = o     |
| .o.+ o * o      |
|  Eo + . .       |
|    .            |
+----[SHA256]-----+

[root@node01 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@node01
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host 'node01 (172.26.0.53)' can't be established.
ECDSA key fingerprint is SHA256:RjnwwbdyitVDZL8ZBDSIchP6NIzcUgvnd+jItwp3D00.
ECDSA key fingerprint is MD5:8f:ea:10:8f:cf:3d:83:e2:e9:cc:af:ec:70:bf:1c:af.
# Type yes
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
# Enter the root user's password
root@node01's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'root@node01'"
and check to make sure that only the key(s) you wanted were added.
[root@node01 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@node02
[root@node01 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@node03
[root@node01 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@node04
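
With more nodes, typing ssh-copy-id once per host gets tedious. The sketch below only prints the command for each host so the list can be reviewed first; the function name copy_key_cmds is an assumption for illustration.

```shell
# Print (not run) the ssh-copy-id command for each host given as an argument.
# Pipe the output to sh to actually execute the commands.
copy_key_cmds() {
    for host in "$@"; do
        printf 'ssh-copy-id -i ~/.ssh/id_rsa.pub root@%s\n' "$host"
    done
}

copy_key_cmds node01 node02 node03 node04
```

Afterwards, something like ssh -o BatchMode=yes root@node02 true should succeed without a password prompt if the key was installed.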

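2.4 Disable SELinux

Alibaba Cloud servers ship with SELinux already disabled, so nothing needs to be done here. If your servers do need it, apply the following on all nodes:

# Switch SELinux to permissive mode for the running system
# (to keep it off after a reboot, also set SELINUX=disabled in /etc/selinux/config)
setenforce 0
# Check whether SELinux is enabled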
[root@node01 ~]# getenforce
Disabled
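
Because setenforce only affects the running system, the persistent setting lives in /etc/selinux/config. Here is a hedged sketch of that edit; the function name disable_selinux_in is made up for the example, and the demo works on a scratch file so nothing on the live system is touched.

```shell
# Set SELINUX=disabled in the given config file (on a real node this would be
# /etc/selinux/config). The path is a parameter so the edit can be tried on a
# copy first.
disable_selinux_in() {
    tmp=$(mktemp) || return 1
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$1" > "$tmp" && cat "$tmp" > "$1"
    rc=$?
    rm -f "$tmp"
    return "$rc"
}

# Demo on a scratch file rather than the live configuration.
cfg=$(mktemp)
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$cfg"
disable_selinux_in "$cfg"
cat "$cfg"
```

The change in the real file takes effect at the next reboot.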

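2.5 Disable the firewall

Alibaba Cloud servers ship with the firewall already disabled, so nothing needs to be done here. If your servers do need it, apply the following on all nodes: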

systemctl stop firewalld
systemctl disable firewalld

[root@node01 ~]# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:firewalld(1)

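2.6 Disable swapping

Apply on all nodes:

# First lower swappiness for the running system
# (a value of 1 minimizes swapping; it does not turn the swap partition off)
echo 1 > /proc/sys/vm/swappiness
# Then add this line to /etc/sysctl.conf
vm.swappiness = 1
# Then run sysctl -p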
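
If /etc/sysctl.conf already contains a vm.swappiness line, appending a second one is easy to get wrong. The sketch below rewrites an existing key or appends a new one; set_sysctl_key is a hypothetical helper name, and the demo runs against a scratch file.

```shell
# Set "key = value" in a sysctl-style file: rewrite the line if the key
# already exists, otherwise append it.
set_sysctl_key() {
    # $1 = file, $2 = key, $3 = value
    key_re=$(printf '%s' "$2" | sed 's/\./\\./g')   # escape dots for the regex
    if grep -q "^[[:space:]]*${key_re}[[:space:]]*=" "$1"; then
        tmp=$(mktemp) || return 1
        sed "s|^[[:space:]]*${key_re}[[:space:]]*=.*|$2 = $3|" "$1" > "$tmp" &&
            cat "$tmp" > "$1"
        rm -f "$tmp"
    else
        printf '%s = %s\n' "$2" "$3" >> "$1"
    fi
}

# Demo on a scratch file instead of /etc/sysctl.conf.
conf=$(mktemp)
printf 'vm.swappiness = 30\n' > "$conf"
set_sysctl_key "$conf" vm.swappiness 1                    # rewrites the line
set_sysctl_key "$conf" net.ipv6.conf.all.disable_ipv6 1   # appends a new line
cat "$conf"
```

On a real node the file argument would be /etc/sysctl.conf, followed by sysctl -p as described above.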

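2.7 Disable transparent huge pages

Apply on all nodes:

# First run these two commands
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled

# Make rc.local executable
chmod +x /etc/rc.d/rc.local

# Add the following to rc.local so the setting is applied again after a reboot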
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then 
    echo never > /sys/kernel/mm/transparent_hugepage/enabled 
fi
if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
    echo never > /sys/kernel/mm/transparent_hugepage/defrag 
fi
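
To confirm the setting took effect, note that the kernel marks the active mode in these files with square brackets, for example "always madvise [never]". A small sketch that extracts the bracketed value follows; thp_active_mode is a name I made up, and the demo reads a scratch file with sample contents.

```shell
# Print the active mode from a transparent_hugepage file, where the kernel
# marks the current choice with brackets, e.g. "always madvise [never]".
thp_active_mode() {
    sed -n 's/.*\[\(.*\)].*/\1/p' "$1"
}

# Demo with sample contents; on a real node pass
# /sys/kernel/mm/transparent_hugepage/enabled (or .../defrag) instead.
sample=$(mktemp)
printf 'always madvise [never]\n' > "$sample"
thp_active_mode "$sample"
```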

2.8 Cluster time synchronization

  • Make sure every machine in the cluster uses the Shanghai time zone

    [root@node01 ~]# timedatectl
          Local time: Thu 2020-06-04 16:18:43 CST
      Universal time: Thu 2020-06-04 08:18:43 UTC
            RTC time: Thu 2020-06-04 16:18:42
           Time zone: Asia/Shanghai (CST, +0800)
         NTP enabled: yes
    NTP synchronized: yes
     RTC in local TZ: yes
          DST active: n/a
          
    # If it needs to be set, use this command
    timedatectl set-timezone Asia/Shanghai
    
  • Install the NTP service on all nodes

    yum -y install ntp
    
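  • NTP server configuration

    Use node01 as the NTP server and edit /etc/ntp.conf.

    Note: on Alibaba Cloud servers, NTP installed through yum is configured by default to sync with Alibaba Cloud's time servers, so the file needs almost no changes. Just confirm the following settings:

    # Fall back to the local hardware clock when external time sources are unavailable
    server 127.127.1.0
    fudge  127.127.1.0 stratum 10
    # Which subnet may sync time from this server: use the cluster's private subnet and netmask
    # Use ifconfig to look up the private subnet and netmask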
    restrict 172.26.0.0 mask 255.255.240.0 nomodify notrap nopeer noquery
    

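    Physical servers, by contrast, sync with the CentOS time servers by default. In addition to the settings above, replace those with time servers located in China, for example:

    # Comment out the default CentOS time servers
    # server 0.centos.pool.ntp.org iburst
    # server 1.centos.pool.ntp.org iburst
    # server 2.centos.pool.ntp.org iburst
    # server 3.centos.pool.ntp.org iburst
    
    # Replace them with servers in China, for example the China zone of the NTP Pool Project
    # Search for more domestic servers if you need several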
    server cn.pool.ntp.org iburst
    
  • NTP client configuration

    On the other nodes in the cluster (node02 to node04), edit /etc/ntp.conf. The following is the complete client configuration file:

    driftfile  /var/lib/ntp/drift
    pidfile    /var/run/ntpd.pid
    logfile    /var/log/ntp.log
    
    restrict 172.26.0.53 nomodify notrap nopeer noquery
    server 172.26.0.53 iburst minpoll 4 maxpoll 10
    server 127.127.1.0
    fudge  127.127.1.0 stratum 10
    
  • Start the NTP service on all nodes

    systemctl start ntpd
    systemctl enable ntpd
    
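  • Check the synchronization status on every client

    # A leading * means synchronization is working; it marks the server currently being synced from
    # After ntpd starts on a client, it can take a while before the * appears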
    [root@node02 ~]# ntpq -p
         remote           refid      st t when poll reach   delay   offset  jitter
    ==============================================================================
    *node01          100.100.61.88    2 u    3   64   17    0.153  -31.350   6.619
     LOCAL(0)        .LOCL.          10 l  157   64    4    0.000    0.000   0.000
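
Checking for the * by eye on every client does not scale well, so here is a rough sketch that pulls the selected peer out of ntpq -p output. The function name ntp_selected_peer is hypothetical, and the demo feeds it the sample output shown above instead of calling ntpq.

```shell
# Read `ntpq -p` output on stdin and print the peer marked with "*",
# i.e. the server currently selected for synchronization.
# The exit status is non-zero when no peer has been selected yet.
ntp_selected_peer() {
    awk '/^[ \t]*\*/ { sub(/^[ \t]*\*/, ""); print $1; found = 1 }
         END { exit !found }'
}

# Demo with the sample output from node02.
ntpq_sample='     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*node01          100.100.61.88    2 u    3   64   17    0.153  -31.350   6.619
 LOCAL(0)        .LOCL.          10 l  157   64    4    0.000    0.000   0.000'
printf '%s\n' "$ntpq_sample" | ntp_selected_peer
```

On a client the call would be ntpq -p | ntp_selected_peer, which can be repeated until it succeeds.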
    

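3. Base software preparation

The official documentation below lists the hardware configurations, operating systems, databases and so on that CM 6.x supports: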

https://docs.cloudera.com/documentation/enterprise/6/release-notes/topics/rg_requirements_supported_versions.html

3.1 JDK

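Install JDK 1.8 on all nodes. Note:

JDK 8u40, 8u45, 8u60, and 8u242 are not supported due to JDK issues impacting CDH functionality. According to the official site, the latest tested Oracle JDK 1.8 release is 8u181. Also install the JDK under /usr/java; my JAVA_HOME is /usr/java/jdk.

# One pitfall: after extraction, the owner and group show up as numeric IDs
# Change them manually to root or to your own user
# Other archives may show the same problem after extraction and need the same manual fix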
[root@node01 java]# ll
total 4
lrwxrwxrwx 1 root root   12 Jun  4 18:56 jdk -> jdk1.8.0_181
drwxr-xr-x 7   10  143 4096 Jul  7  2018 jdk1.8.0_181

# Run on all nodes
chown -R root:root /usr/java
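
To guard against accidentally installing one of the unsupported updates, a check along these lines could be run on each node. It is only a sketch: both function names are my own, and the list covers just the four updates quoted above.

```shell
# Pull the update number out of a version string such as "1.8.0_181".
jdk8_update_of() {
    printf '%s\n' "${1##*_}"
}

# Succeed when a JDK 8 update number is not one of the unsupported updates
# quoted above (8u40, 8u45, 8u60, 8u242).
jdk8_update_supported() {
    case "$1" in
        40|45|60|242) return 1 ;;
        *) return 0 ;;
    esac
}

# Demo with the version used in this article.
u=$(jdk8_update_of 1.8.0_181)
if jdk8_update_supported "$u"; then
    echo "8u$u is not on the unsupported list"
fi
```

The version string itself can be read from the output of java -version on each node.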

3.2 Install httpd

In my environment, all the other single-node software that CM needs is installed on node01

yum -y install httpd
systemctl start httpd
systemctl enable httpd

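3.3 Install MySQL 5.7

Reference: Installing MySQL 5.7 on CentOS 7.6

Note: the article above is an introductory MySQL 5.7 installation guide I wrote; it does not cover the more complex configuration a production environment needs. The database here only stores metadata for the CM cluster and holds no business data, so a simple setup is enough. I installed it on node01.

Create the databases and users that CM needs: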

create database am default character set utf8;
create database cm default character set utf8;
create database rm default character set utf8;
create database hue default character set utf8;
create database hive default character set utf8;
create database oozie default character set utf8;
create database nav_as default character set utf8;
create database nav_ms default character set utf8;
create database sentry default character set utf8;
CREATE USER 'am'@'%' IDENTIFIED BY 'password';
CREATE USER 'cm'@'%' IDENTIFIED BY 'password';
CREATE USER 'rm'@'%' IDENTIFIED BY 'password';
CREATE USER 'hue'@'%' IDENTIFIED BY 'password';
CREATE USER 'hive'@'%' IDENTIFIED BY 'password';
CREATE USER 'oozie'@'%' IDENTIFIED BY 'password';
CREATE USER 'nav_as'@'%' IDENTIFIED BY 'password';
CREATE USER 'nav_ms'@'%' IDENTIFIED BY 'password';
CREATE USER 'sentry'@'%' IDENTIFIED BY 'password';
GRANT ALL PRIVILEGES ON am.* TO 'am'@'%';
GRANT ALL PRIVILEGES ON cm.* TO 'cm'@'%';
GRANT ALL PRIVILEGES ON rm.* TO 'rm'@'%';
GRANT ALL PRIVILEGES ON hue.* TO 'hue'@'%';
GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'%';
GRANT ALL PRIVILEGES ON oozie.* TO 'oozie'@'%';
GRANT ALL PRIVILEGES ON sentry.* TO 'sentry'@'%';
GRANT ALL PRIVILEGES ON nav_as.* TO 'nav_as'@'%';
GRANT ALL PRIVILEGES ON nav_ms.* TO 'nav_ms'@'%';
FLUSH PRIVILEGES;
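
The statements above follow one pattern per service, so they can also be generated instead of typed. The following is a minimal sketch of that; the function name cm_db_sql is an assumption, and 'password' remains a placeholder to replace with a real password.

```shell
# Print the CREATE DATABASE / CREATE USER / GRANT statements for each name.
# $1 = password placeholder, remaining arguments = database (and user) names.
cm_db_sql() {
    pw=$1; shift
    for db in "$@"; do
        printf "create database %s default character set utf8;\n" "$db"
        printf "CREATE USER '%s'@'%%' IDENTIFIED BY '%s';\n" "$db" "$pw"
        printf "GRANT ALL PRIVILEGES ON %s.* TO '%s'@'%%';\n" "$db" "$db"
    done
    printf 'FLUSH PRIVILEGES;\n'
}

cm_db_sql password am cm rm hue hive oozie nav_as nav_ms sentry
```

The output can be reviewed and then piped into the mysql client, for example with mysql -u root -p.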

3.4 Install the JDBC driver

Install the JDBC driver on all nodes

[root@node01 ~]# mkdir -p /usr/share/java/
[root@node01 ~]# mv ~/mysql-connector-java-5.1.48.jar /usr/share/java/
[root@node01 ~]# cd /usr/share/java/
[root@node01 java]# ln -s mysql-connector-java-5.1.48.jar mysql-connector-java.jar 
[root@node01 java]# ll
total 984
-rw-r--r-- 1 root root 1006956 Jun  4 18:50 mysql-connector-java-5.1.48.jar
lrwxrwxrwx 1 root root      31 Jun  4 18:51 mysql-connector-java.jar -> mysql-connector-java-5.1.48.jar
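
The transcript only shows node01, but the driver is needed on every node. One possible way to repeat the steps on the remaining nodes is sketched below; it prints the commands rather than running them, and the function name jdbc_push_cmds is hypothetical. It assumes the passwordless SSH login from section 2.3 is in place.

```shell
# Print the commands that would copy the driver jar to each host and create
# the version-independent symlink there. $1 = jar file name, rest = hosts.
jdbc_push_cmds() {
    jar=$1; shift
    for host in "$@"; do
        printf 'ssh root@%s mkdir -p /usr/share/java\n' "$host"
        printf 'scp /usr/share/java/%s root@%s:/usr/share/java/\n' "$jar" "$host"
        printf 'ssh root@%s ln -sf %s /usr/share/java/mysql-connector-java.jar\n' "$host" "$jar"
    done
}

jdbc_push_cmds mysql-connector-java-5.1.48.jar node02 node03 node04
```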

4. Cloudera Manager installation

4.1 Preparing the installation packages

Download the CM 6.3.1 packages:

https://archive.cloudera.com/cm6/6.3.1/redhat7/yum/RPMS/x86_64/cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm

https://archive.cloudera.com/cm6/6.3.1/redhat7/yum/RPMS/x86_64/cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm

https://archive.cloudera.com/cm6/6.3.1/redhat7/yum/RPMS/x86_64/cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm

https://archive.cloudera.com/cm6/6.3.1/redhat7/yum/RPMS/x86_64/cloudera-manager-server-db-2-6.3.1-1466458.el7.x86_64.rpm

https://archive.cloudera.com/cm6/6.3.1/redhat7/yum/RPMS/x86_64/enterprise-debuginfo-6.3.1-1466458.el7.x86_64.rpm

https://archive.cloudera.com/cm6/6.3.1/allkeys.asc

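Download the CDH 6.3.2 packages. Note: the official site provides RPM packages for CM 6.3.1 but no CDH 6.3.1 parcel, so the only option is to download CDH 6.3.2. In my testing this combination works: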

https://archive.cloudera.com/cdh6/6.3.2/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel

https://archive.cloudera.com/cdh6/6.3.2/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel.sha1

https://archive.cloudera.com/cdh6/6.3.2/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel.sha256

https://archive.cloudera.com/cdh6/6.3.2/parcels/manifest.json

Rename CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel.sha1 to CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel.sha, then upload all the packages to the server. I put the CM packages in /root/cm6.3.1 and the CDH packages in /root/cdh6.3.2:

[root@node01 ~]# ll cm6.3.1/
total 1199784
-rw-r--r-- 1 root root      14041 Oct 11  2019 allkeys.asc
-rw-r--r-- 1 2001 2001   10483568 Sep 25  2019 cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm
-rw-r--r-- 1 2001 2001 1203832464 Sep 25  2019 cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm
-rw-r--r-- 1 2001 2001      11488 Sep 25  2019 cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm
-rw-r--r-- 1 2001 2001      10996 Sep 25  2019 cloudera-manager-server-db-2-6.3.1-1466458.el7.x86_64.rpm
-rw-r--r-- 1 2001 2001   14209868 Sep 25  2019 enterprise-debuginfo-6.3.1-1466458.el7.x86_64.rpm
[root@node01 ~]# ll cdh6.3.2/
total 2033436
-rw-r--r-- 1 root root 2082186246 Jun  9 18:57 CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel
-rw-r--r-- 1 root root         40 Jun  9 17:20 CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel.sha
-rw-r--r-- 1 root root         64 Jun  9 17:21 CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel.sha256
-rw-r--r-- 1 root root      33887 Jun  9 17:20 manifest.json
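
The parcel is about 2 GB, so it is worth checking the download against the hash file before going further. Here is a sketch of that check, assuming the .sha file holds just the SHA-1 hex digest (its size of 40 bytes suggests so); verify_parcel is a name I made up, and the demo uses a small scratch file in place of the real parcel.

```shell
# Compare a parcel's SHA-1 with the hash stored next to it in "<parcel>.sha".
verify_parcel() {
    # $1 = parcel file
    expected=$(tr -d ' \n' < "$1.sha")
    actual=$(sha1sum "$1" | awk '{ print $1 }')
    [ "$expected" = "$actual" ]
}

# Demo with a scratch file standing in for the parcel.
parcel=$(mktemp)
printf 'demo parcel contents\n' > "$parcel"
sha1sum "$parcel" | awk '{ printf "%s", $1 }' > "$parcel.sha"
verify_parcel "$parcel" && echo "checksum OK"
```

For the real file the argument would be the path to CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel in /root/cdh6.3.2.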

Change into the CM package directory and generate the RPM repository metadata:

[root@node01 cm6.3.1]# yum install createrepo -y
[root@node01 cm6.3.1]# createrepo .
Spawning worker 0 with 2 pkgs
Spawning worker 1 with 1 pkgs
Spawning worker 2 with 1 pkgs
Spawning worker 3 with 1 pkgs
Workers Finished
Saving Primary metadata
Saving file lists metadata
Saving other metadata
Generating sqlite DBs
Sqlite DBs complete

4.2 HTTP server configuration

[root@node01 ~]# mv cm6.3.1 /var/www/html/

Note: on Alibaba Cloud servers, port 80 has to be opened

4.3 Create a LAN yum repository for CM

[root@node01 ~]# vim /etc/yum.repos.d/cm.repo

[cmrepo]
name = cm_repo
baseurl = http://node01/cm6.3.1
enabled = true
gpgcheck = false

# If the cmrepo line shows up, the repository is active
[root@node01 ~]# yum repolist
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
cmrepo                             | 2.9 kB  00:00:00     
cmrepo/primary_db                  | 6.8 kB  00:00:00     
repo id                            repo name                                        status
base/7/x86_64                      CentOS-7                                         10,070
cmrepo                             cm_repo                                               5
epel/x86_64                        Extra Packages for Enterprise Linux 7 - x86_64   13,314
extras/7/x86_64                    CentOS-7                                            397
mysql-connectors-community/x86_64  MySQL Connectors Community                          153
mysql-tools-community/x86_64       MySQL Tools Community                               110
mysql57-community/x86_64           MySQL 5.7 Community Server                          424
updates/7/x86_64                   CentOS-7                                            743
repolist: 25,216

4.4 Install Cloudera Manager Server

Install Cloudera Manager Server on node01:

[root@node01 ~]# yum -y install cloudera-manager-server
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
Resolving Dependencies
--> Running transaction check
---> Package cloudera-manager-server.x86_64 0:6.3.1-1466458.el7 will be installed
--> Processing Dependency: cloudera-manager-daemons = 6.3.1 for package: cloudera-manager-server-6.3.1-1466458.el7.x86_64
--> Running transaction check
---> Package cloudera-manager-daemons.x86_64 0:6.3.1-1466458.el7 will be installed
--> Finished Dependency Resolution
......

Installed:
  cloudera-manager-server.x86_64 0:6.3.1-1466458.el7                                                                                                                                

Dependency Installed:
  cloudera-manager-daemons.x86_64 0:6.3.1-1466458.el7                                                                                                                               

Complete!

After the installation, a cloudera directory has been created under /opt:

[root@node01 cloudera]# ll /opt/cloudera/
total 12
drwxr-xr-x 27 cloudera-scm cloudera-scm 4096 Jun 10 15:48 cm
drwxr-xr-x  2 cloudera-scm cloudera-scm 4096 Sep 25  2019 csd
drwxr-xr-x  2 cloudera-scm cloudera-scm 4096 Sep 25  2019 parcel-repo

Move the 4 files in the cdh6.3.2 directory into /opt/cloudera/parcel-repo:

[root@node01 ~]# mv cdh6.3.2/* /opt/cloudera/parcel-repo/
[root@node01 ~]# ll /opt/cloudera/parcel-repo/
total 2033436
-rw-r--r-- 1 root root 2082186246 Jun  9 18:57 CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel
-rw-r--r-- 1 root root         40 Jun  9 17:20 CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel.sha
-rw-r--r-- 1 root root         64 Jun  9 17:21 CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel.sha256
-rw-r--r-- 1 root root      33887 Jun  9 17:20 manifest.json

4.5 Initialize the database

[root@node01 ~]# /opt/cloudera/cm/schema/scm_prepare_database.sh mysql cm cm your_password
JAVA_HOME=/usr/java/jdk
Verifying that we can write to /etc/cloudera-scm-server
Creating SCM configuration file in /etc/cloudera-scm-server
......
INFO  Successfully connected to database.
All done, your SCM database is configured correctly!

4.6 Start Cloudera Manager Server

[root@node01 ~]# systemctl start cloudera-scm-server
[root@node01 ~]# systemctl status cloudera-scm-server
● cloudera-scm-server.service - Cloudera CM Server Service
   Loaded: loaded (/usr/lib/systemd/system/cloudera-scm-server.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2020-06-10 15:59:34 CST; 5s ago
  Process: 2051 ExecStartPre=/opt/cloudera/cm/bin/cm-server-pre (code=exited, status=0/SUCCESS)
 Main PID: 2054 (java)
   CGroup: /system.slice/cloudera-scm-server.service
           └─2054 /usr/java/jdk1.8.0_181/bin/java -cp .:/usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/java/postgresql-connector-j...
......

Startup takes a few minutes. Watch the startup log; the server is only really up once the following messages appear:

2020-06-10 16:00:45,469 INFO WebServerImpl:org.eclipse.jetty.server.AbstractConnector: Started ServerConnector@7dd12982{HTTP/1.1,[http/1.1]}{0.0.0.0:7180}
2020-06-10 16:00:45,479 INFO WebServerImpl:org.eclipse.jetty.server.Server: Started @70548ms
2020-06-10 16:00:45,479 INFO WebServerImpl:com.cloudera.server.cmf.WebServerImpl: Started Jetty server.
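
Rather than watching the log by hand, the wait can be scripted. The sketch below polls a log file for the "Started Jetty server" message; the function names are my own, and the log path in the usage note is an assumption to verify on your installation.

```shell
# Succeed once the log file contains the Jetty startup message.
cm_server_started() {
    grep -q 'Started Jetty server' "$1" 2>/dev/null
}

# Poll a log file until the message appears or the attempts run out.
wait_for_cm() {
    # $1 = log file, $2 = maximum attempts, $3 = seconds between attempts
    i=0
    while [ "$i" -lt "$2" ]; do
        cm_server_started "$1" && return 0
        i=$((i + 1))
        sleep "$3"
    done
    return 1
}

# Demo with a scratch log file.
log=$(mktemp)
printf 'INFO WebServerImpl:com.cloudera.server.cmf.WebServerImpl: Started Jetty server.\n' > "$log"
wait_for_cm "$log" 3 0 && echo "CM server is up"
```

On node01 the call might look like wait_for_cm /var/log/cloudera-scm-server/cloudera-scm-server.log 60 5, assuming the log is in that location.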

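Open the web UI. Note that on Alibaba Cloud servers port 7180 has to be opened. The username and password are both admin:

5. CM cluster initialization

6. Resolving the CM cluster warnings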

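After clicking Finish, you land on the cluster monitoring home page. Mine showed a lot of warnings. Resolving them takes patience: read each warning and adjust the configuration until the cluster meets CM's requirements.

Looking through the warnings, many of them concern insufficient space for log directories and for heap dumps

So we change the log directory and the heap dump directory of each component:

The heap dump warning details point to the key directory, /tmp

Beyond that, there are warnings about insufficient disk space in other storage locations:

For these, search the configuration for every directory that triggers a warning and move it under /data:

Then there are warnings about unreasonable memory settings:

For these, click through each link and change the value to the memory size CM recommends:

There is also a warning about too few DataNodes:

Here the only option is to adjust the check policy, that is, tell CM not to check the number of DataNodes. With a limited number of machines, relaxing the alert condition is the only way to clear it:

Then there are memory overcommit warnings:

These two warnings can only be resolved properly by adding memory. Raising the check thresholds would silence them, but that is just fooling yourself, so with limited machine resources I left these two warnings in place. By patiently reading the warnings and adjusting the configuration, most problems can be solved. In the end, the CM cluster I built looks like this:

At this point, CM 6.3 is up and running on Alibaba Cloud CentOS 7.6. I will keep exploring how to use CM in later articles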

Article link: https://www.haomeiwen.com/subject/apnptktx.html