1. Environment Preparation
[root@host196 hadoop-2.8.5]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.74.10 host196
192.168.74.29 host197
192.168.74.30 host198
Install JDK 1.8 and a ZooKeeper cluster, and plan which role each machine will play.
Enable passwordless SSH login.
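A minimal sketch of setting up passwordless login from the master to every node, assuming everything runs as root (matching the shell prompts in this article):
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa     # generate a key pair without a passphrase
for h in host196 host197 host198; do
    ssh-copy-id root@$h                      # append the public key to each node's authorized_keys
done
ssh host197 hostname                         # should print host197 without asking for a password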
2. Installation Steps
cd /opt/
wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-2.8.5/hadoop-2.8.5.tar.gz
tar -zxvf hadoop-2.8.5.tar.gz
Configure the environment variables:
vi /etc/profile
export HADOOP_HOME=/opt/hadoop-2.8.5
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export ZOOKEEPER_HOME=/opt/zookeeper-3.4.8
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
source /etc/profile
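A quick check that the variables took effect:
echo $HADOOP_HOME    # should print /opt/hadoop-2.8.5
hadoop version       # should report Hadoop 2.8.5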
Edit hadoop-env.sh (this and all of the following configuration files live under $HADOOP_HOME/etc/hadoop):
cd /opt/hadoop-2.8.5/etc/hadoop
vi hadoop-env.sh
export JAVA_HOME=/usr/local/jdk1.8.0_111
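JAVA_HOME must point to a real JDK on every node; a quick sanity check:
ls /usr/local/jdk1.8.0_111/bin/java          # the java binary should exist at this path
/usr/local/jdk1.8.0_111/bin/java -version    # should report version 1.8.0_111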
Edit core-site.xml:
vi core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://host196:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/opt/hadoop-2.8.5/tmp</value>
    </property>
</configuration>
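Hadoop creates most directories on demand, but pre-creating hadoop.tmp.dir is a harmless extra safeguard (not in the original steps):
mkdir -p /opt/hadoop-2.8.5/tmp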
Edit hdfs-site.xml:
vi hdfs-site.xml
<configuration>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>host196:50090</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/opt/hadoop-2.8.5/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/opt/hadoop-2.8.5/tmp/dfs/data</value>
    </property>
</configuration>
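After saving, hdfs getconf confirms the values Hadoop actually sees:
hdfs getconf -confKey dfs.replication          # should print 2
hdfs getconf -confKey dfs.namenode.name.dir    # should print the name directory configured above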
Edit mapred-site.xml (it does not exist by default; copy it from the template first):
cp mapred-site.xml.template mapred-site.xml
vi mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>host196:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>host196:19888</value>
    </property>
</configuration>
Edit yarn-site.xml:
vi yarn-site.xml
<configuration>
    <!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>host196</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
Edit the slaves file (the hosts listed here will run the DataNode and NodeManager daemons):
vi slaves
host197
host198
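The start scripts contact each host listed here over SSH, so every entry must be reachable by name; a quick check using the passwordless login set up in section 1:
for h in $(cat slaves); do ssh $h hostname; done    # each worker should echo its own hostname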
Copy the hadoop directory to host197 and host198 with scp:
scp -r hadoop-2.8.5 host197:/opt/
scp -r hadoop-2.8.5 host198:/opt/
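The workers also need the environment variables from section 2; a sketch of syncing the profile, assuming the JDK is already installed at the same path on host197 and host198:
scp /etc/profile host197:/etc/profile
scp /etc/profile host198:/etc/profile
ssh host197 ". /etc/profile && hadoop version"    # confirm hadoop resolves on the worker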
Format the NameNode; this is needed only once, on the master node (DataNodes do not need to be formatted):
hdfs namenode -format
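If formatting succeeded, the name directory configured in hdfs-site.xml now holds a VERSION file with the new cluster ID:
cat /opt/hadoop-2.8.5/tmp/dfs/name/current/VERSION    # look for the clusterID= line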
3. Start/Stop Scripts
- [Start]
Disable the firewall first (on every node), then run the start scripts; once they finish, verify the daemons with jps as shown after this list.
systemctl stop firewalld.service
start-dfs.sh
start-yarn.sh
mr-jobhistory-daemon.sh start historyserver
or
start-all.sh
mr-jobhistory-daemon.sh start historyserver
- [Stop]
stop-all.sh
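After the start scripts finish, jps should show the expected daemons on each node, and the standard Hadoop 2.x web UIs become available:
jps    # on host196: expect NameNode, SecondaryNameNode, ResourceManager, JobHistoryServer
# run jps on host197/host198 as well: expect DataNode and NodeManager
# NameNode web UI:        http://host196:50070
# ResourceManager web UI: http://host196:8088
# JobHistory web UI:      http://host196:19888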
4. Quick Test
Create a directory on HDFS:
hadoop fs -mkdir -p /opt/hdfs_test/input
List the directories just created:
[root@host196 hadoop]# hadoop fs -ls /
Found 2 items
drwxr-xr-x - root supergroup 0 2018-10-18 17:54 /opt
drwxrwx--- - root supergroup 0 2018-10-18 17:40 /tmp
[root@host196 hadoop]# hadoop fs -ls /opt/hdfs_test
Found 2 items
drwxr-xr-x - root supergroup 0 2018-10-18 18:06 /opt/hdfs_test/input
drwxr-xr-x - root supergroup 0 2018-10-19 10:11 /opt/hdfs_test/output
Create a file named words.txt:
vi words.txt
hello zhangsan
hello lisi
hello wangwu
Upload it to the /opt/hdfs_test/input directory on HDFS:
hadoop fs -put words.txt /opt/hdfs_test/input
Download the file just uploaded into the local ~/data directory:
hadoop fs -get /opt/hdfs_test/input/words.txt ~/data
Delete a directory on HDFS (the wordcount job below fails if its output directory already exists; -rm -r removes non-empty directories, whereas -rmdir only removes empty ones):
hadoop fs -rm -r /opt/hdfs_test/output
Run one of the bundled MapReduce example programs:
hadoop jar /opt/hadoop-2.8.5/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.5.jar wordcount /opt/hdfs_test/input /opt/hdfs_test/output
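When the job completes, the reducer writes its results to part-r-00000 in the output directory; for the words.txt above the counts should look like this:
hadoop fs -cat /opt/hdfs_test/output/part-r-00000
# hello     3
# lisi      1
# wangwu    1
# zhangsan  1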
5. References
1. Hadoop-2.7.7 Quick Cluster Setup: https://blog.csdn.net/qq_33857413/article/details/82853037
2. The Hadoop Learning Path (4): Hadoop Cluster Setup and Simple Applications: https://www.cnblogs.com/qingyunzong/p/8496127.html
6. FAQ
1. HDFS file operations from a Hadoop client fail with the following exception: Permission denied: user=administrator, access=WRITE, inode="/":root:supergroup:drwxr-xr-x
Solution: add the following to hdfs-site.xml to turn off permission checking (dfs.permissions.enabled is the current key name in 2.x; the older dfs.permissions alias still works):
<property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
</property>
Alternatively, keep permission checking enabled and act as the HDFS superuser instead, e.g. to create a home directory for root:
sudo -u hdfs hadoop fs -mkdir /user/root
Switching to the hdfs user before running the command achieves the same effect.
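Another client-side workaround (standard Hadoop simple-authentication behavior, not from the original text): the HADOOP_USER_NAME environment variable overrides the user name the client presents to HDFS:
export HADOOP_USER_NAME=root    # subsequent hadoop fs commands run as root
hadoop fs -mkdir -p /user/administrator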