Installing hadoop/hbase/spark

Author: 曾小俊爱睡觉 | Published 2017-03-31 18:10

    Install the host Linux

    Download CentOS-6.5-x86_64-bin-minimal.iso and install it from a USB drive.

    Install the KVM virtualization packages

    yum install kvm libvirt python-virtinst qemu-kvm virt-viewer bridge-utils  # install the packages
    /etc/init.d/libvirtd start  # start the libvirt daemon
    

    You can also install KVM by searching for it in the graphical package manager.

    Virtualize the cluster machines

    Create a virtual machine in bridged mode (options: --name is the VM name, --ram the memory in MB,
    --vcpus the number of CPU cores, -f the disk image file, --cdrom the installation ISO,
    --graphics exposes a VNC console, --network bridges to br0, and --autostart starts the VM with the host):

    virt-install \
        --name=gateway \
        --ram 4096 \
        --vcpus=4 \
        -f /home/kvm/gateway.img \
        --cdrom /root/CentOS-6.5-x86_64-bin-minimal.iso \
        --graphics vnc,listen=0.0.0.0,port=5920 \
        --network bridge=br0 --force --autostart
    

    You can also create the VM from the graphical management interface. A minimal Linux install needs a desktop environment first:

    yum -y groupinstall Desktop
    yum -y groupinstall "X Window System"
    startx  # start the graphical session
    
    To boot into the graphical environment by default, edit /etc/inittab:

    id:5:initdefault: # runlevel 3 is the plain command line, 5 is graphical; the others are rarely used

    Preparation before setting up the environment

    1. Network settings
    Edit /etc/sysconfig/network-scripts/ifcfg-eth0:

    DEVICE=eth0
    TYPE=Ethernet
    ONBOOT=yes
    NM_CONTROLLED=yes
    BOOTPROTO=static
    HWADDR=52:54:00:00:9d:f1
    IPADDR=192.168.0.231
    PREFIX=24
    GATEWAY=192.168.0.1
    DNS1=192.168.0.1
    DNS2=8.8.8.8
    DEFROUTE=yes
    IPV4_FAILURE_FATAL=yes
    IPV6INIT=no
    NAME="System eth0"
    

    2. Disable SELinux
    Edit /etc/selinux/config:

    # This file controls the state of SELinux on the system.
    # SELINUX= can take one of these three values:
    #     enforcing - SELinux security policy is enforced.
    #     permissive - SELinux prints warnings instead of enforcing.
    #     disabled - No SELinux policy is loaded.
    SELINUX=disabled
    # SELINUXTYPE= can take one of these two values:
    #     targeted - Targeted processes are protected,
    #     mls - Multi Level Security protection.
    SELINUXTYPE=targeted
    

    3. Raise the open file limits
    Add the following to /etc/security/limits.conf, replacing root with the user you want these limits to apply to:

    root soft nofile 65535
    root hard nofile 65535
    root soft nproc 32000
    root hard nproc 32000
    

    4. Disable the firewall

    service iptables stop   # stop iptables now
    chkconfig --level 35 iptables off   # keep iptables disabled across reboots
    

    5. Set the hostname and hosts entries

    cat > /etc/sysconfig/network << EOF
    NETWORKING=yes
    HOSTNAME=spark-1
    GATEWAY=192.168.0.1
    EOF

    cat >> /etc/hosts << EOF
    192.168.0.231 spark-1
    192.168.0.232 spark-2
    192.168.0.233 spark-3
    192.168.0.234 spark-4
    EOF
    

    Setting up the environment

    • Passwordless SSH between all nodes
      Run the following on every machine:
    ssh-keygen
    touch ~/.ssh/authorized_keys
    

    Append the contents of every machine's id_rsa.pub to the authorized_keys file on every machine; the ssh-copy-id command does exactly this.
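
    A minimal sketch of that key exchange, assuming the four hostnames from /etc/hosts above and that the current user's key is being distributed:

    # Run on every node after ssh-keygen: push this node's public key to all hosts
    # (including itself) so any node can ssh to any other node without a password.
    for host in spark-1 spark-2 spark-3 spark-4; do
        ssh-copy-id -i ~/.ssh/id_rsa.pub "$host"   # asks for the password once per host
    done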

    • Install, configure, and start hadoop
      Download hadoop-2.7.3.tar.gz from the official site into ~
    tar zxvf hadoop-2.7.3.tar.gz
    mv hadoop-2.7.3 /usr/local
    echo 'export HADOOP_HOME=/usr/local/hadoop-2.7.3' >> /etc/profile
    echo 'export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin' >> /etc/profile
    source /etc/profile
    cd $HADOOP_HOME/etc/hadoop
    # Edit core-site.xml
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/usr/local/hadoop-2.7.3/var</value>
        </property>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://spark-1:9000</value>
        </property>
        <property>
            <name>fs.trash.interval</name>
            <value>2880</value>
        </property>
    # Edit hdfs-site.xml
        <property>
            <name>dfs.replication</name>
            <value>2</value>
        </property>
        <property>
            <name>dfs.permissions.enabled</name>
            <value>false</value>
        </property>
        <property>
            <name>dfs.namenode.http-address</name>
            <value>spark-1:50070</value>
        </property>
        <property>
            <name>dfs.namenode.secondary.http-address</name>
            <value>spark-2:50090</value>
        </property>
    # Edit mapred-site.xml (copy mapred-site.xml.template to mapred-site.xml first)
        <property>
            <name>mapred.job.tracker</name>
            <value>spark-1:8021</value>
        </property>
    # Edit slaves
    spark-1
    spark-2
    spark-3
    spark-4
    # Start hadoop (format the NameNode only once, on the first start)
    $HADOOP_HOME/bin/hadoop namenode -format
    $HADOOP_HOME/sbin/start-dfs.sh
    $HADOOP_HOME/sbin/start-yarn.sh
    $HADOOP_HOME/bin/hadoop dfsadmin -safemode leave
    
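    To confirm the cluster came up, a quick sanity check (not in the original guide); the expected daemons assume the layout configured above, with the NameNode and ResourceManager on spark-1 and the SecondaryNameNode on spark-2:

    # Run jps on each node and check for the expected daemons:
    #   spark-1: NameNode, DataNode, ResourceManager, NodeManager
    #   spark-2: SecondaryNameNode, DataNode, NodeManager
    #   spark-3/spark-4: DataNode, NodeManager
    jps
    hdfs dfsadmin -report             # should report 4 live datanodes
    hdfs dfs -mkdir -p /smoke-test    # trivial write to verify HDFS is usable
    hdfs dfs -rm -r /smoke-test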

    Install zookeeper

    As in step 1 above, extract zookeeper-3.4.6.tar.gz to /usr/local, add ZOOKEEPER_HOME to /etc/profile, and update PATH.
    Create the data and logs directories and the myid file (ZooKeeper reads myid from the dataDir configured in zoo.cfg):

    mkdir -p $ZOOKEEPER_HOME/data
    touch $ZOOKEEPER_HOME/data/myid
    echo 0 > $ZOOKEEPER_HOME/data/myid  # every node gets a unique id; install an odd number of nodes, and the id must match that host's server.N entry in zoo.cfg below (0 on spark-1, 1 on spark-2, 2 on spark-3)
    mkdir -p $ZOOKEEPER_HOME/logs
    

    Configure zookeeper

    cd $ZOOKEEPER_HOME/conf
    cp zoo_sample.cfg zoo.cfg
    cat > zoo.cfg << EOF
    tickTime=2000
    dataDir=/usr/local/zookeeper-3.4.6/data
    dataLogDir=/usr/local/zookeeper-3.4.6/logs
    clientPort=2181
    initLimit=10
    syncLimit=5
    server.0=spark-1:2888:3888
    server.1=spark-2:2888:3888
    server.2=spark-3:2888:3888
    EOF
    

    Start zookeeper; run this on every machine where it is installed:

    zkServer.sh start
    zkServer.sh status # check the status (expect one leader and the rest followers)
    
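    A small helper for bringing the ensemble up from one node, a sketch that assumes the passwordless ssh set up earlier, the same install path on spark-1 through spark-3, and the JAVA_HOME used elsewhere in this guide:

    for host in spark-1 spark-2 spark-3; do
        # non-interactive ssh may not source /etc/profile, so set JAVA_HOME and use the full path
        ssh "$host" 'JAVA_HOME=/usr/local/jdk /usr/local/zookeeper-3.4.6/bin/zkServer.sh start'
    done
    for host in spark-1 spark-2 spark-3; do
        ssh "$host" 'JAVA_HOME=/usr/local/jdk /usr/local/zookeeper-3.4.6/bin/zkServer.sh status'
    done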

    Install hbase

    Extract the archive and set the environment variables as above.
    Edit hbase-site.xml:

        <property>
                <name>hbase.rootdir</name>
                <value>hdfs://spark-1:9000/hbase</value>
        </property>
        <property>
                <name>hbase.cluster.distributed</name>
                <value>true</value>
        </property>
        <property>
                <name>hbase.master</name>
                <value>spark-1:60000</value>
        </property>
        <property>
                <name>hbase.zookeeper.quorum</name>
                <value>spark-1,spark-2,spark-3</value>
        </property>
    

    Edit hbase-env.sh and add the following:

    export JAVA_HOME=/usr/local/jdk  # Java install directory
    export HBASE_LOG_DIR=/usr/local/hbase-1.2.1/logs  # HBase log directory
    export HBASE_MANAGES_ZK=false  # true to use HBase's bundled ZooKeeper; false to use the ZooKeeper installed above

    Edit regionservers: add the hostnames of the machines that will run HBase and remove localhost (see the sketch after this block), then start HBase:

    $HBASE_HOME/bin/start-hbase.sh
    
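    A plausible regionservers file for this cluster, assuming region servers run on all four nodes (the original only says to list the HBase hosts):

    # /usr/local/hbase-1.2.1/conf/regionservers
    spark-1
    spark-2
    spark-3
    spark-4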

    Install spark

    Extract the archive and set the environment variables as above.
    Copy Hadoop's core-site.xml and hdfs-site.xml, plus HBase's hbase-site.xml, into Spark's conf directory.
    Edit spark-defaults.conf:

    spark.executor.memory       6g
    spark.eventLog.enabled      true
    spark.eventLog.dir      hdfs://spark-1:9000/spark-history
    spark.serializer        org.apache.spark.serializer.KryoSerializer
    spark.eventLog.compress     true
    spark.scheduler.mode        FAIR
    
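    The HDFS path in spark.eventLog.dir has to exist before jobs start logging to it; creating it is not covered in the original, but one way is:

    hdfs dfs -mkdir -p /spark-history    # matches spark.eventLog.dir and spark.history.fs.logDirectory below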

    Edit spark-env.sh:

    export HBASE_HOME=/usr/local/hbase-1.2.1
    export HIVE_HOME=/usr/local/hive-1.2.1
    export SPARK_CLASSPATH=$SPARK_CLASSPATH:/usr/local/spark-2.0.1-hadoop2.7/jars/hbase/*
    export SCALA_HOME=/usr/local/scala
    export JAVA_HOME=/usr/local/jdk
    export SPARK_MASTER_IP=spark-1
    export SPARK_WORKER_MEMORY=11g
    export SPARK_WORKER_CORES=4
    export SPARK_EXECUTOR_CORES=2
    export SPARK_EXECUTOR_MEMORY=6g
    export SPARK_DAEMON_MEMORY=11g
    export HADOOP_CONF_DIR=/usr/local/hadoop-2.7.3/etc/hadoop
    # export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=s1:2181,s2:2181,s3:2181 -Dspark.deploy.zookeeper.dir=/spark"  # ZooKeeper-based master recovery
    export SPARK_LOG_DIR=/usr/local/spark-2.0.1-hadoop2.7/logs
    export SPARK_HISTORY_OPTS="-Dspark.history.retainedApplications=10 -Dspark.history.fs.logDirectory=hdfs://spark-1:9000/spark-history"
    

    Copy the following jars from HBase's lib directory into Spark's jars/hbase directory:

    guava-12.0.1.jar  hbase-client-1.2.1.jar  hbase-common-1.2.1.jar  hbase-protocol-1.2.1.jar  hbase-server-1.2.1.jar  htrace-core-3.1.0-incubating.jar  metrics-core-2.2.0.jar  protobuf-java-2.5.0.jar
    
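    One way to do that copy, a sketch assuming the install paths used elsewhere in this guide:

    mkdir -p /usr/local/spark-2.0.1-hadoop2.7/jars/hbase
    cd /usr/local/hbase-1.2.1/lib
    cp guava-12.0.1.jar hbase-client-1.2.1.jar hbase-common-1.2.1.jar hbase-protocol-1.2.1.jar \
       hbase-server-1.2.1.jar htrace-core-3.1.0-incubating.jar metrics-core-2.2.0.jar \
       protobuf-java-2.5.0.jar /usr/local/spark-2.0.1-hadoop2.7/jars/hbase/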

    Start spark

    $SPARK_HOME/sbin/start-all.sh
    $SPARK_HOME/sbin/start-history-server.sh
    
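    To confirm Spark is running, a quick check assuming the default standalone web UI ports (8080 for the master, 18080 for the history server); which hosts run Workers depends on Spark's conf/slaves, which this guide does not configure explicitly:

    jps                                  # expect Master and HistoryServer on spark-1, Worker on each worker host
    curl -s http://spark-1:8080  | head  # standalone master web UI
    curl -s http://spark-1:18080 | head  # history server web UI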

    Tips for a faster install:

    • Make good use of rsync:
      Since almost all of the configuration is identical across machines, do the setup above on one machine first, sync the directories to the other machines with rsync -avz (see the sketch after this list), and then adjust whatever differs per machine.
    • Use Fabric:
      Write Fabric scripts to install the whole cluster; this requires solid shell skills.
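
    A minimal sketch of that rsync approach, assuming the install paths used above and passwordless ssh; files such as myid, ifcfg-eth0, and /etc/sysconfig/network still need per-host edits afterwards:

    for host in spark-2 spark-3 spark-4; do
        rsync -avz /usr/local/hadoop-2.7.3 /usr/local/zookeeper-3.4.6 \
                   /usr/local/hbase-1.2.1 /usr/local/spark-2.0.1-hadoop2.7 \
                   "$host":/usr/local/
        rsync -avz /etc/profile /etc/hosts "$host":/etc/
    done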
