Spark on YARN cluster setup (Part 2: Hadoop cluster setup)
Author: 无忧默言 · 2018-06-27 14:03
Master node configuration:
- Go to the /datamgt directory, download the binary package hadoop-2.7.6.tar.gz, then extract and rename it:
tar -zxvf hadoop-2.7.6.tar.gz && mv hadoop-2.7.6 hadoop
- Edit the global environment variables in /etc/profile (see the sketch below)
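The post doesn't list the profile entries themselves; a minimal sketch, assuming Hadoop was unpacked to /datamgt/hadoop as above:
# Append to /etc/profile, then apply with: source /etc/profile
export HADOOP_HOME=/datamgt/hadoop                    # install path from the tar step above
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin  # makes the hadoop/hdfs/yarn commands available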
- Edit the Hadoop configuration files
- Set JAVA_HOME
vim $HADOOP_HOME/etc/hadoop/hadoop-env.sh
#Replace export JAVA_HOME=${JAVA_HOME} with:
export JAVA_HOME=/usr/java/jdk1.8.0_65
- Edit slaves
vim $HADOOP_HOME/etc/hadoop/slaves
#Delete the original localhost entry and replace it with:
slave1
slave2
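For the slaves file (and the master:9000 URLs below) to work, these hostnames must resolve on every node; assuming the /etc/hosts mapping set up earlier in this series, it would look roughly like this (the IPs are placeholders):
# /etc/hosts on master, slave1 and slave2 (example addresses, adjust to your network)
192.168.1.10 master
192.168.1.11 slave1
192.168.1.12 slave2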
- Edit $HADOOP_HOME/etc/hadoop/core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/datamgt/hadoop/tmp</value>
  </property>
</configuration>
- Edit $HADOOP_HOME/etc/hadoop/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>master:50090</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/datamgt/hadoop/hdfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/datamgt/hadoop/hdfs/data</value>
  </property>
</configuration>
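The local paths referenced above don't exist yet; creating them up front avoids startup surprises (a precaution, not a step from the original post):
# Pre-create the directories for hadoop.tmp.dir, dfs.namenode.name.dir and
# dfs.datanode.data.dir; the scp step below copies them to the slaves
mkdir -p /datamgt/hadoop/tmp /datamgt/hadoop/hdfs/name /datamgt/hadoop/hdfs/data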
- Edit $HADOOP_HOME/etc/hadoop/yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>master:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>master:8088</value>
  </property>
</configuration>
- Edit $HADOOP_HOME/etc/hadoop/mapred-site.xml
#First copy mapred-site.xml.template to mapred-site.xml (command shown after this config), then edit it:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>master:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master:19888</value>
  </property>
</configuration>
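The copy step mentioned in the comment above is a single command:
# Create mapred-site.xml from the template shipped with Hadoop 2.x
cp $HADOOP_HOME/etc/hadoop/mapred-site.xml.template $HADOOP_HOME/etc/hadoop/mapred-site.xml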
Slave node configuration:
- Copy the hadoop folder from the master node to slave1 and slave2:
scp -r /datamgt/hadoop root@slave1:/datamgt && scp -r /datamgt/hadoop root@slave2:/datamgt
- Edit /etc/profile on slave1 and slave2; the process is the same as on the master
Starting the cluster:
- On the master node, format the NameNode before the first start (hadoop namenode -format still works in 2.7.x but is deprecated in favor of the hdfs form):
hdfs namenode -format
- On the master node, run:
/datamgt/hadoop/sbin/start-all.sh
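Note that start-all.sh brings up HDFS and YARN but not the MapReduce JobHistory Server configured in mapred-site.xml; to serve the history UI on master:19888, start it separately with the standard 2.x script:
# Start the JobHistory Server (listens on master:10020 / web UI on master:19888)
/datamgt/hadoop/sbin/mr-jobhistory-daemon.sh start historyserver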
Verify that the cluster started successfully:
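The original doesn't show the expected output; the usual check is jps on each node. With the configuration above you should see roughly:
# On master
jps   # expect: NameNode, SecondaryNameNode, ResourceManager (and JobHistoryServer if started)

# On slave1 / slave2
jps   # expect: DataNode, NodeManager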
Problems encountered
- Port 50070 not reachable:
At first I thought it was a port-listening problem (see "Hadoop HDFS的namenode WEB访问50070端口打不开解决方法").
Checking the NameNode log under hadoop/logs later revealed that port 9000 on the local machine was already occupied, which made the NameNode service fail to start.
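A quick way to find the process holding the port, assuming netstat or lsof is installed on the node:
# Show what is listening on 9000 (the fs.defaultFS port)
netstat -tlnp | grep 9000
# or
lsof -i :9000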