Setting Up a CDH 6.3 Big Data Platform

Author: 只是甲 | Published 2020-12-04 14:12

    1. CDH Overview

    Cloudera's Distribution Including Apache Hadoop ("CDH") provides a web-based user interface and supports most Hadoop components, including HDFS, MapReduce, Hive, Pig, HBase, ZooKeeper, and Sqoop, which greatly simplifies installing and operating a big data platform.

    Because the component set is complete and installation and maintenance are straightforward, many companies have deployed CDH in production. This guide uses CDH 6.3.

    2. Pre-installation Preparation

    2.1 Environment

    Hosts:

    IP          Hostname
    10.31.1.123 hp1
    10.31.1.124 hp2
    10.31.1.125 hp3
    10.31.1.126 hp4

    Hardware:
    Each host: 4 CPU cores, 8 GB RAM, 500 GB disk

    Software versions:

    Name             Version
    OS               CentOS 7.8, 64-bit
    JDK              1.8
    Database         MySQL 5.6.49
    JDBC driver      MySQL Connector/J 5.1.38
    Cloudera Manager 6.3.1
    CDH              6.3.1

    2.2 Preparation Steps

    2.2.1 Hostname Configuration (all nodes)

    Set the hostname on each host:

    hostnamectl set-hostname hp1
    hostnamectl set-hostname hp2
    hostnamectl set-hostname hp3
    hostnamectl set-hostname hp4
    

    Configure /etc/hosts on all four machines:

    vi /etc/hosts
    127.0.0.1               localhost
    10.31.1.123             hp1
    10.31.1.124             hp2
    10.31.1.125             hp3
    10.31.1.126             hp4
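    Instead of editing the file four times, the finished /etc/hosts can be pushed from one node to the rest. A minimal sketch, assuming passwordless root SSH between the hosts:

```shell
# Push the finished /etc/hosts from hp1 to the other nodes.
# Assumes root SSH access to hp2-hp4 is already configured.
for h in hp2 hp3 hp4; do
    scp /etc/hosts "root@${h}:/etc/hosts"
done
```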
    

    Configure /etc/sysconfig/network on all four machines:

    # hp1 (10.31.1.123) shown as an example; repeat on the other three
    [root@10-31-1-123 ~]# more /etc/sysconfig/network
    # Created by anaconda
    HOSTNAME=hp1
    

    2.2.2 Firewall and SELinux Configuration (all nodes)

    Disable the firewall:

    systemctl disable firewalld
    systemctl stop firewalld
    

    Set SELinux to permissive mode:

    vi /etc/selinux/config
    # change SELINUX=enforcing to SELINUX=permissive
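    The same edit can be made non-interactively on every node. A sketch; `make_selinux_permissive` is a throwaway helper of ours, not a system tool:

```shell
# make_selinux_permissive CONFIG_FILE
# Rewrites SELINUX=enforcing to SELINUX=permissive in the given file.
make_selinux_permissive() {
    sed -i 's/^SELINUX=enforcing/SELINUX=permissive/' "$1"
}

# Usage on each node; setenforce then applies the change to the
# running kernel so no reboot is needed:
# make_selinux_permissive /etc/selinux/config
# setenforce 0
# getenforce   # should now print "Permissive"
```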
    

    2.2.3 NTP Configuration (all nodes)

    yum install -y ntp
    systemctl start ntpd
    systemctl enable ntpd
    

    hp1 serves as the NTP server; the other three nodes sync from it (10.31.1.123):

    vi /etc/ntp.conf
    # on hp1, allow clients from the local subnet:
    restrict 10.31.1.0 mask 255.255.255.0
    # on hp2/hp3/hp4, sync from hp1:
    server 10.31.1.123
    
    systemctl restart ntpd
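    After restarting ntpd, sync status can be checked on any client node; a quick sketch:

```shell
# List the peers; the line beginning with '*' is the selected time source
ntpq -p
# Exits 0 once the local clock is synchronized
ntpstat
```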
    

    2.2.4 Python (all nodes)

    CDH requires Python 2.7, which ships with CentOS 7, so no action is needed here.

    2.2.5 Database (master node)

    MySQL 5.6 is used here; the installation steps are omitted.

    2.2.6 Install the JDK (all nodes)

    JDK 1.8 is installed here:

    cd /usr/
    mkdir java
    cd java
    wget http://download.oracle.com/otn-pub/java/jdk/8u181-b13/96a7b8442fe848ef90c96a2fad6ed6d1/jdk-8u181-linux-x64.tar.gzAuthParam=1534129356_6b3ac55c6a38ba5a54c912855deb6a22
    mv jdk-8u181-linux-x64.tar.gzAuthParam\=1534129356_6b3ac55c6a38ba5a54c912855deb6a22 jdk-8u181-linux-x64.tar.gz
    tar -zxvf jdk-8u181-linux-x64.tar.gz 
    vi /etc/profile
    #java
    export JAVA_HOME=/usr/java/jdk1.8.0_181
    export PATH=$JAVA_HOME/bin:$PATH
    export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib
    source /etc/profile
    

    The following output confirms that Java is installed:

    [root@10-31-1-123 java]# java -version
    java version "1.8.0_181"
    Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
    Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
    

    2.2.7 Download the Installation Packages (all nodes)

    CM 6.3.1:
    https://archive.cloudera.com/cm6/6.3.1/repo-as-tarball/cm6.3.1-redhat7.tar.gz

    Parcel:
    https://archive.cloudera.com/cdh6/6.3.1/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567-el7.parcel
    https://archive.cloudera.com/cdh6/6.3.1/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567-el7.parcel.sha1
    https://archive.cloudera.com/cdh6/6.3.1/parcels/manifest.json

    The packages above are also bundled on a network drive:
    Link: https://pan.baidu.com/s/1UH50Uweyi7yg6bV7dl02mQ
    Access code: nx7p

    2.2.8 Install the MySQL JDBC Driver (master node)

    [root@10-31-1-123 mysql]# mkdir -p /usr/share/java
    [root@10-31-1-123 mysql]# cd /usr/share/java
    [root@10-31-1-123 java]# 
    [root@10-31-1-123 java]# ll
    total 832
    -rw-r--r--. 1 root root 848067 Jan 15 2014 mysql-connector-java-commercial-5.1.25-bin.jar
    [root@10-31-1-123 java]# 
    [root@10-31-1-123 java]# mv mysql-connector-java-commercial-5.1.25-bin.jar mysql-connector-java.jar
    [root@10-31-1-123 java]# ll
    total 832
    -rw-r--r--. 1 root root 848067 Jan 15 2014 mysql-connector-java.jar
    [root@10-31-1-123 java]# 
    

    2.2.9 Create the CM Metadata Database, Users, and the amon Database (master node)

    create database cmf DEFAULT CHARACTER SET utf8;
    create database amon DEFAULT CHARACTER SET utf8;
    grant all on cmf.* TO 'cmf'@'%' IDENTIFIED BY 'www.research.com';
    grant all on amon.* TO 'amon'@'%' IDENTIFIED BY 'www.research.com';
    flush privileges;
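    Before starting the CM server it is worth confirming that the new accounts can actually reach their databases. A sketch, using the credentials created above:

```shell
# Each command should print the database name if the grant works
mysql -h 10.31.1.123 -u cmf  -p'www.research.com' -e 'SELECT DATABASE();' cmf
mysql -h 10.31.1.123 -u amon -p'www.research.com' -e 'SELECT DATABASE();' amon
```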
    

    2.2.10 Adjust the Linux swappiness Parameter (all nodes)

    To keep swap activity from hurting performance, vm.swappiness is usually lowered to 0 (Cloudera recommends 10 or below).

    [root@hp1 mysql]# cd /usr/lib/tuned/
    [root@hp1 tuned]# grep "vm.swappiness" * -R
    latency-performance/tuned.conf:vm.swappiness=10
    throughput-performance/tuned.conf:vm.swappiness=10
    virtual-guest/tuned.conf:vm.swappiness = 30
    
    Change each of these values to 0, then copy the modified files to the other machines.
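    Besides editing the tuned profiles, the value can be set directly with sysctl so it takes effect immediately and survives reboots; a sketch:

```shell
# Apply to the running kernel immediately
sysctl -w vm.swappiness=0
# Persist the setting across reboots
echo 'vm.swappiness=0' >> /etc/sysctl.conf
sysctl -p
```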
    

    2.2.11 Disable Transparent Huge Pages (all nodes)

    [root@hp1 ~]# vim /etc/rc.local
    Add the following lines to the file:
    echo never > /sys/kernel/mm/transparent_hugepage/defrag
    echo never > /sys/kernel/mm/transparent_hugepage/enabled
    Copy this file to the other machines, then reboot all servers.
    
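    Note that on CentOS 7 /etc/rc.local is not executable by default, so the lines above will not run at boot until that is fixed; the setting can also be applied to the running system right away. A sketch:

```shell
# rc.local only runs at boot on CentOS 7 if it is executable
chmod +x /etc/rc.d/rc.local
# Apply to the running kernel immediately
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
# Verify: the active value is shown in brackets, e.g. "always madvise [never]"
cat /sys/kernel/mm/transparent_hugepage/enabled
```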

    3. CDH Deployment

    3.1 Offline Deployment of the CM Server and Agents

    3.1.1 Create the Software Directory and Unpack the Packages (all nodes)

    [root@10-31-1-123 cdh]# mkdir -p /opt/cloudera-manager
    [root@10-31-1-123 cloudera-manager]# cd /usr/local/cdh/
    [root@10-31-1-123 cdh]# ls -lrth
    total 3.3G
    -rw-r--r--. 1 root root  34K Nov 13 15:46 manifest.json
    -rw-r--r--. 1 root root 1.4G Nov 13 16:10 cm6.3.1-redhat7.tar.gz
    -rw-r--r--. 1 root root   40 Nov 13 16:10 CDH-6.3.1-1.cdh6.3.1.p0.1470567-el7.parcel.sha1
    -rw-r--r--. 1 root root 2.0G Nov 13 16:37 CDH-6.3.1-1.cdh6.3.1.p0.1470567-el7.parcel
    [root@10-31-1-123 cdh]# 
    [root@10-31-1-123 cdh]# tar -zxf cm6.3.1-redhat7.tar.gz -C /opt/cloudera-manager
    [root@10-31-1-123 cdh]# 
    

    3.1.2 Use hp1 as the CM Server and Deploy It Directly (master node)

    cd /opt/cloudera-manager/cm6.3.1/RPMS/x86_64/
    rpm -ivh cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm --nodeps --force
    rpm -ivh cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm --nodeps --force 
    

    Installation log:

    warning: cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID b0b19c9f: NOKEY
    Preparing...                          ################################# [100%]
    Updating / installing...
       1:cloudera-manager-daemons-6.3.1-14################################# [100%]
    [root@10-31-1-123 x86_64]# rpm -ivh cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm --nodeps --force 
    warning: cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID b0b19c9f: NOKEY
    Preparing...                          ################################# [100%]
    Updating / installing...
       1:cloudera-manager-server-6.3.1-146################################# [100%]
    Created symlink from /etc/systemd
    

    3.1.3 Deploy the CM Agent (all nodes)

    cd /opt/cloudera-manager/cm6.3.1/RPMS/x86_64
    rpm -ivh cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm --nodeps --force
    rpm -ivh cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm --nodeps --force
    

    3.1.4 Point the Agents at the Server Node hp1 (all nodes)

    sed -i "s/server_host=localhost/server_host=10.31.1.123/g" /etc/cloudera-scm-agent/config.ini
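    The same substitution wrapped in a small helper makes the intent explicit and easy to verify; `point_agent_at_server` is a throwaway function of ours, not part of CM:

```shell
# point_agent_at_server CONFIG_FILE SERVER_HOST
# Rewrites the server_host line of a CM agent config.ini.
point_agent_at_server() {
    local cfg=$1 host=$2
    sed -i "s/^server_host=.*/server_host=${host}/" "$cfg"
}

# Usage on each node, then confirm the result:
# point_agent_at_server /etc/cloudera-scm-agent/config.ini 10.31.1.123
# grep '^server_host' /etc/cloudera-scm-agent/config.ini
```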
    

    3.1.5 Server Configuration (master node)

     vim /etc/cloudera-scm-server/db.properties
    com.cloudera.cmf.db.type=mysql
    com.cloudera.cmf.db.host=10.31.1.123
    com.cloudera.cmf.db.name=cmf
    com.cloudera.cmf.db.user=cmf
    com.cloudera.cmf.db.password=www.research.com
    com.cloudera.cmf.db.setupType=EXTERNAL
    

    3.2 Deploy the Offline Parcel Repository (master node)

    3.2.1 Install httpd

    yum install -y httpd 
    

    3.2.2 Set Up the Parcel Repository (master node)

    Note that the .sha1 checksum file is renamed to .sha below; Cloudera Manager expects the .sha extension.

    mkdir -p /var/www/html/cdh6_parcel
    cp /usr/local/cdh/CDH-6.3.1-1.cdh6.3.1.p0.1470567-el7.parcel /var/www/html/cdh6_parcel/
    mv /usr/local/cdh/CDH-6.3.1-1.cdh6.3.1.p0.1470567-el7.parcel.sha1 /var/www/html/cdh6_parcel/CDH-6.3.1-1.cdh6.3.1.p0.1470567-el7.parcel.sha
    mv /usr/local/cdh/manifest.json /var/www/html/cdh6_parcel/
    systemctl start httpd
    

    3.2.3 Verify in a Browser

    http://10.31.1.123/cdh6_parcel/
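    The repository can also be checked from the command line on any node; a sketch:

```shell
# The directory listing should mention the parcel, its .sha file and manifest.json
curl -s http://10.31.1.123/cdh6_parcel/ | grep -i parcel
```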

    3.3 Start the Server (master node)

    [root@10-31-1-123 x86_64]#  systemctl start cloudera-scm-server
    [root@10-31-1-123 x86_64]#  ll /var/log/cloudera-scm-server/ 
    total 20
    -rw-r-----. 1 cloudera-scm cloudera-scm 19610 Nov 13 17:29 cloudera-scm-server.log
    -rw-r-----. 1 cloudera-scm cloudera-scm     0 Nov 13 17:29 cmf-server-nio.log
    -rw-r-----. 1 cloudera-scm cloudera-scm     0 Nov 13 17:29 cmf-server-perf.log
    [root@10-31-1-123 x86_64]# tail /var/log/cloudera-scm-server/cloudera-scm-server.log
            at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:195)
            at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:184)
            at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.acquireResource(C3P0PooledConnectionPool.java:200)
            at com.mchange.v2.resourcepool.BasicResourcePool.doAcquire(BasicResourcePool.java:1086)
            at com.mchange.v2.resourcepool.BasicResourcePool.doAcquireAndDecrementPendingAcquiresWithinLockOnSuccess(BasicResourcePool.java:1073)
            at com.mchange.v2.resourcepool.BasicResourcePool.access$800(BasicResourcePool.java:44)
            at com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask.run(BasicResourcePool.java:1810)
            at com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:648)
    2020-11-13 17:29:26,939 WARN C3P0PooledConnectionPoolManager[identityToken->2t3hq3ad1hj75spfrks4l|3be4f71]-HelperThread-#1:com.mchange.v2.resourcepool.BasicResourcePool: Having failed to acquire a resource, com.mchange.v2.resourcepool.BasicResourcePool@3b0ee03a is interrupting all Threads waiting on a resource to check out. Will try again in response to new client requests.
    2020-11-13 17:29:27,568 INFO main:com.cloudera.enterprise.CommonMain: Statistics not enabled, JMX will not be registered
    [root@10-31-1-123 x86_64]# 
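    On first start the server spends a while initializing its schema before the web UI comes up. Rather than re-running tail, one can poll until port 7180 opens; a minimal bash sketch (`wait_for_port` is a hypothetical helper of ours, relying on bash's /dev/tcp):

```shell
# wait_for_port HOST PORT [TIMEOUT_SECONDS]
# Returns 0 once a TCP connection to HOST:PORT succeeds,
# or 1 if the timeout (default 300 s) expires first.
wait_for_port() {
    local host=$1 port=$2 timeout=${3:-300} waited=0
    while ! (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; do
        sleep 1
        waited=$((waited + 1))
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
    done
    return 0
}

# Usage after `systemctl start cloudera-scm-server`:
# wait_for_port 10.31.1.123 7180 600 && echo "CM web UI is up"
```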
    

    3.4 Start the Agents (all nodes)

     systemctl start cloudera-scm-agent 
    

    3.5 Web UI Setup

    3.5.1 Log in on Port 7180 of the Master Node

    http://10.31.1.123:7180/
    

    Username: admin
    Password: admin


    3.5.2 Select the Free Edition

    3.5.3 Create the Cluster

    Enter a cluster name.


    Enter the cluster hosts; hostnames are used here.


    Select the repository. The versions must match exactly, or the installation will keep failing.
    Add the local repository: http://10.31.1.123/cdh6_parcel/
    Then replace {latest_version} with the current version, 6.3.1.

    Install the JDK.


    Configure SSH login.


    Wait for the installation to finish.


    Install the parcels.


    The initial installation is complete.


    Select the services:


    Default options:


    Create the databases:

    create database hive DEFAULT CHARACTER SET utf8;
    grant all on hive.* TO 'hive'@'%' IDENTIFIED BY 'hive';
    
    create database oozie DEFAULT CHARACTER SET utf8;
    grant all on oozie.* TO 'oozie'@'%' IDENTIFIED BY 'oozie';
    
    create database hue DEFAULT CHARACTER SET utf8;
    grant all on hue.* TO 'hue'@'%' IDENTIFIED BY 'hue';
    
    flush privileges;
    
    -- hue will fail without the MySQL client library; fix:
    mkdir /usr/lib64/mysql
    cp /usr/local/mysql/lib/libmysqlclient.so.18.1.0  /usr/lib64/mysql/
    cd /usr/lib64/mysql/
    ln -s libmysqlclient.so.18.1.0 libmysqlclient.so.18
    
    [root@hp1 mysql]# more /etc/ld.so.conf
    include ld.so.conf.d/*.conf
    /usr/lib64/mysql
    [root@hp1 mysql]# ldconfig
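    Whether the dynamic linker now finds the library can be confirmed from ldconfig's cache; a sketch:

```shell
# Should list libmysqlclient.so.18 resolving into /usr/lib64/mysql
ldconfig -p | grep libmysqlclient.so.18
```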
    
    Review the changes:


    All done:



    FAQ

    1. CDH file permission problem

    Hive complains about insufficient permissions on /user when running a query:

    hive> 
        > select count(*) from fact_sale;
    Query ID = root_20201119152619_16f496b5-2482-4efb-a26c-e18117b2f10c
    Total jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks determined at compile time: 1
    In order to change the average load for a reducer (in bytes):
      set hive.exec.reducers.bytes.per.reducer=<number>
    In order to limit the maximum number of reducers:
      set hive.exec.reducers.max=<number>
    In order to set a constant number of reducers:
      set mapreduce.job.reduces=<number>
    org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, inode="/user":hdfs:supergroup:drwxr-xr-x
    

    Solution:

    [root@hp1 ~]# 
    [root@hp1 ~]# hadoop fs -ls /
    Found 2 items
    drwxrwxrwt   - hdfs supergroup          0 2020-11-15 12:22 /tmp
    drwxr-xr-x   - hdfs supergroup          0 2020-11-15 12:21 /user
    [root@hp1 ~]# 
    [root@hp1 ~]# hadoop fs -chmod 777 /user
    chmod: changing permissions of '/user': Permission denied. user=root is not the owner of inode=/user
    [root@hp1 ~]# 
    [root@hp1 ~]# sudo -u hdfs hadoop fs -chmod 777 /user
    [root@hp1 ~]# 
    [root@hp1 ~]# 
    [root@hp1 ~]# hadoop fs -ls /
    Found 2 items
    drwxrwxrwt   - hdfs supergroup          0 2020-11-15 12:22 /tmp
    drwxrwxrwx   - hdfs supergroup          0 2020-11-15 12:21 /user
    [root@hp1 ~]# 
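    Opening /user to everyone works but is heavy-handed; a narrower alternative is to create an HDFS home directory for the querying user (root here) and hand ownership to it. A sketch on the same cluster:

```shell
# Give root its own HDFS home instead of chmod 777 on /user
sudo -u hdfs hadoop fs -mkdir -p /user/root
sudo -u hdfs hadoop fs -chown root:root /user/root
```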
    

    2. CDH YARN framework package problem

    A Hive insert into statement fails:

    > insert into fact_sale(id,sale_date,prod_name,sale_nums) values (1,'2011-08-16','PROD4',28);
    Query ID = root_20201119163832_f78a095d-2656-4da6-825f-64127e84b8b4
    Total jobs = 3
    Launching Job 1 out of 3
    Number of reduce tasks is set to 0 since there's no reduce operator
    20/11/19 16:38:32 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm69
    Starting Job = job_1605767427026_0013, Tracking URL = http://hp3:8088/proxy/application_1605767427026_0013/
    Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/hadoop job  -kill job_1605767427026_0013
    Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
    2020-11-19 16:39:12,211 Stage-1 map = 0%,  reduce = 0%
    Ended Job = job_1605767427026_0013 with errors
    Error during job, obtaining debugging information...
    FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
    MapReduce Jobs Launched: 
    Stage-Stage-1:  HDFS Read: 0 HDFS Write: 0 HDFS EC Read: 0 FAIL
    Total MapReduce CPU Time Spent: 0 msec
    

    Following the hints in the job output reveals the underlying error:
    the MapReduce task fails with "Download and unpack failed".

    Installing the missing YARN MapReduce framework package from Cloudera Manager resolves the issue.

    3. Hue Load Balancer fails to start

    The Hue Load Balancer role fails to start, and no log output appears, even after creating the expected log directory.

    Solution:

    yum -y install httpd
    yum -y install mod_ssl
    

    Then restart Hue.

