Installation and Setup
Install Java on every node and configure the environment variables.
Install and configure Hadoop on the Master node first.
The main configuration files are as follows:
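A minimal sketch of the Java environment variables, assuming the JDK was unpacked to /opt/jdk1.8.0 (the path is an assumption; adjust it to your install):

```shell
# Append to /etc/profile or ~/.bashrc on every node
# /opt/jdk1.8.0 is an assumed JDK path
export JAVA_HOME=/opt/jdk1.8.0
export PATH=$PATH:$JAVA_HOME/bin
```

Then reload with `source /etc/profile` and verify with `java -version`.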
core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoopM:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop/tmp</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>/opt/hadoop/hdfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/opt/hadoop/hdfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoopM:9001</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoopM:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoopM:19888</value>
</property>
</configuration>
yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoopM</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
<property>
<name>yarn.scheduler.fair.preemption</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>hadoopM:8088</value>
<description>The http address of the RM web application</description>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>hadoopM:8032</value>
<description>The address of the applications manager interface in the RM</description>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>hadoopM:8030</value>
<description>The address of the scheduler interface</description>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>hadoopM:8031</value>
<description>The address of the resource tracker interface</description>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>hadoopM:8033</value>
<description>The address of the RM admin interface</description>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>2048</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.log.server.url</name>
<value>http://hadoopM:19888/jobhistory/logs</value>
</property>
</configuration>
slaves
hadoopS1
hadoopS2
···
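For the hostnames hadoopM, hadoopS1, hadoopS2 used throughout the configuration to resolve, every node needs matching entries in /etc/hosts (or DNS). A sketch with placeholder IP addresses:

```text
# /etc/hosts on every node — the IP addresses below are placeholders
192.168.1.100  hadoopM
192.168.1.101  hadoopS1
192.168.1.102  hadoopS2
```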
Also configure the Java environment variable in hadoop-env.sh and yarn-env.sh.
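For example (the JDK path /opt/jdk1.8.0 is an assumption; adjust it to your install):

```shell
# In etc/hadoop/hadoop-env.sh and etc/hadoop/yarn-env.sh
# /opt/jdk1.8.0 is an assumed JDK path
export JAVA_HOME=/opt/jdk1.8.0
```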
After configuration, copy the hadoop directory from the Master node to each worker node; scp can be used for the transfer:
scp -r ./hadoop xx@hadoopSx:/.../hadoop
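start-all.sh reaches the worker nodes over SSH, so passwordless SSH from the Master to every worker should be set up first. A sketch — the user name xx follows the scp example above and is a placeholder:

```shell
# On the Master: generate a key pair (skip if one already exists)
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
# Copy the public key to each worker node ("xx" is a placeholder user)
ssh-copy-id xx@hadoopS1
ssh-copy-id xx@hadoopS2
```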
Starting and Stopping
Disable the firewall:
# CentOS
systemctl stop firewalld.service
systemctl disable firewalld.service   # keep it off across reboots
Start from the Master node: run start-all.sh and stop-all.sh under /hadoop/sbin/ to start and stop the cluster.
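A typical first-run sequence, assuming Hadoop is installed under /opt/hadoop (the install path is an assumption inferred from the directories in the configuration above):

```shell
# Format HDFS once, before the very first start (this destroys any existing HDFS data)
/opt/hadoop/bin/hdfs namenode -format
# Start HDFS and YARN from the Master node
/opt/hadoop/sbin/start-all.sh
# jps on the Master should list NameNode, SecondaryNameNode and ResourceManager
jps
# Stop everything
/opt/hadoop/sbin/stop-all.sh
```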