1. Installation Overview
Installing Hadoop from scratch on bare virtual machines: three VMs, one as master (172.16.100.26) and two as slave nodes (172.16.100.27 and 172.16.100.28). The plan: VM setup, passwordless SSH between nodes, Java environment configuration, then Hadoop installation and configuration from the tar package.
2. Virtual Machine Setup
- Check the Linux version the VM is running
[root@hadoop01 ~]# hostnamectl
Static hostname: localhost.localdomain
Transient hostname: hadoop01
Icon name: computer-vm
Chassis: vm
Machine ID: 149e71b7061342ff8e4064e72b2cd08b
Boot ID: 2c1726ea8845428c80c585e2eb1eb3f1
Virtualization: vmware
Operating System: CentOS Linux 7 (Core)
CPE OS Name: cpe:/o:centos:centos:7
Kernel: Linux 3.10.0-1062.el7.x86_64
Architecture: x86-64
- Set the hostname mappings
[root@hadoop01 ~]# vim /etc/hosts
172.16.100.26 hadoop01 master
172.16.100.27 hadoop02 node01
172.16.100.28 hadoop03 node02
(format: host IP, hostname, alias)
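Before moving on, it is worth confirming that every hostname actually appears in the file. A minimal sketch, run against a throwaway copy of the entries above (on a real node, point the grep at /etc/hosts itself):

```shell
# Demo copy of the three entries above; on a real node, check /etc/hosts itself.
cat > /tmp/hosts.demo <<'EOF'
172.16.100.26 hadoop01 master
172.16.100.27 hadoop02 node01
172.16.100.28 hadoop03 node02
EOF

# Every cluster hostname should appear on exactly one line (each prints 1).
for h in hadoop01 hadoop02 hadoop03; do
    grep -cw "$h" /tmp/hosts.demo
done
```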
3. Passwordless SSH Between Nodes
- Generate a key pair on each of the three machines; just press Enter at every prompt
[root@hadoop01 ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:7r464deCA7JumS7HciwrF5Cw2Ydsn7htEHCtYDYLvUU root@hadoop01
The key's randomart image is:
+---[RSA 2048]----+
| . oE |
|=+o o |
|=Xo= |
|=.O . |
| o = . S |
| = + .. |
| o X o o.. |
|= % o =.o . |
|o%o. .*+o |
+----[SHA256]-----+
- Inspect the generated files
[root@hadoop01 ~]# ll -a
total 36
dr-xr-x---. 4 root root 175 Nov 16 11:08 .
dr-xr-xr-x. 17 root root 224 Aug 28 00:52 ..
-rw-------. 1 root root 1236 Aug 28 00:52 anaconda-ks.cfg
-rw-------. 1 root root 4607 Nov 15 10:55 .bash_history
-rw-r--r--. 1 root root 18 Dec 29 2013 .bash_logout
-rw-r--r--. 1 root root 176 Dec 29 2013 .bash_profile
-rw-r--r--. 1 root root 176 Dec 29 2013 .bashrc
-rw-r--r--. 1 root root 100 Dec 29 2013 .cshrc
drwxr-----. 3 root root 19 Sep 2 11:03 .pki
drwx------. 2 root root 80 Nov 16 11:09 .ssh
-rw-r--r--. 1 root root 129 Dec 29 2013 .tcshrc
-rw-------. 1 root root 2923 Nov 16 11:07 .viminfo
[root@hadoop01 ~]# cd .ssh/
[root@hadoop01 .ssh]# ll
total 16
-rw-r--r--. 1 root root 1185 Nov 16 11:09 authorized_keys
-rw-------. 1 root root 1675 Nov 16 11:09 id_rsa
-rw-r--r--. 1 root root 395 Nov 16 11:09 id_rsa.pub
-rw-r--r--. 1 root root 552 Nov 16 11:09 known_hosts
- Append the id_rsa.pub generated on hadoop01 to authorized_keys
[root@hadoop01 .ssh]# cat id_rsa.pub >> authorized_keys
- Use xftp or xsecure to copy the authorized_keys file to a Windows machine
- Append the id_rsa.pub files from hadoop02 and hadoop03 to that authorized_keys file on the Windows machine
- Copy the authorized_keys file, now holding all three machines' public keys, back into each machine's .ssh directory
- Use ssh to connect to the other machines. Users other than root (or outside the root group) may still be prompted for a password; the cause is file permissions. Run chmod 700 .ssh/ and chmod 600 .ssh/authorized_keys, then run ssh again and the passwordless login succeeds
[root@hadoop01 ~]# ssh hadoop02
[root@hadoop01 ~]# ssh hadoop03
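The round-trip through Windows boils down to appending each node's public key to one authorized_keys file (`ssh-copy-id hadoop02` does the same append over the network). A sketch of the merge with placeholder key material standing in for the real id_rsa.pub files:

```shell
# Placeholder public keys standing in for the three real id_rsa.pub files.
mkdir -p /tmp/ssh-demo
echo "ssh-rsa AAAAB3...key1 root@hadoop01" > /tmp/ssh-demo/hadoop01.pub
echo "ssh-rsa AAAAB3...key2 root@hadoop02" > /tmp/ssh-demo/hadoop02.pub
echo "ssh-rsa AAAAB3...key3 root@hadoop03" > /tmp/ssh-demo/hadoop03.pub

# Merge all three into one authorized_keys, as done by hand above.
cat /tmp/ssh-demo/hadoop0*.pub > /tmp/ssh-demo/authorized_keys

# sshd rejects the file if it is group- or world-writable, hence chmod 600.
chmod 600 /tmp/ssh-demo/authorized_keys
wc -l < /tmp/ssh-demo/authorized_keys    # 3 keys merged
```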
4. Java Environment Configuration
- Download from: https://www.oracle.com/java/technologies/downloads/#java8
If the page fails to open and the browser complains about cookie restrictions, allow cookies in the browser settings.
- Create a java directory under /usr/local
[root@hadoop01 local]# mkdir java
- Use xftp or xsecure to upload the JDK tarball to /usr/local/java and extract it
[root@hadoop01 java]# tar -zxvf jdk-8u311-linux-x64.tar.gz
- Add the Java environment variables
[root@hadoop01 java]# vim /etc/profile
export JAVA_HOME=/usr/local/java/jdk1.8.0_311
export PATH=$PATH:$JAVA_HOME/bin
- Reload the profile so the changes take effect immediately
[root@hadoop01 java]# source /etc/profile
- Verify the Java setup
[root@hadoop01 java]# java -version
java version "1.8.0_311"
Java(TM) SE Runtime Environment (build 1.8.0_311-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.311-b11, mixed mode)
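A further sanity check is that the `java` resolved on PATH really lives under JAVA_HOME. A sketch with a stand-in directory (a fake script, not a real JDK; on the real machine substitute /usr/local/java/jdk1.8.0_311). Note the demo prepends $JAVA_HOME/bin so the stand-in wins; the profile above appends, which is fine when no other java is installed:

```shell
# Stand-in JDK directory for the demo; substitute the real install path.
JAVA_HOME=/tmp/java-demo/jdk1.8.0_311
mkdir -p "$JAVA_HOME/bin"
printf '#!/bin/sh\necho demo-java\n' > "$JAVA_HOME/bin/java"
chmod +x "$JAVA_HOME/bin/java"
export JAVA_HOME PATH="$JAVA_HOME/bin:$PATH"

# The resolved java should sit under JAVA_HOME.
command -v java
```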
5. Hadoop Installation and Configuration (tar package)
- Download from https://downloads.apache.org/hadoop/common/stable2/; the version used here is 2.10.1
- Create a hadoop user on each of the three machines
Do not install software directly as root. Root's privileges are broad, so mistakes are costly; and for network-facing programs, a compromised process running as root hands the attacker full root access.
[root@hadoop01 java]# adduser hadoop
[root@hadoop01 java]# su hadoop
[hadoop@hadoop01 java]$ cd ~
[hadoop@hadoop01 ~]$ ll
total 0
- As root, set the hadoop user's password
My bad memory can only hold the simplest one, so I ignore this warning
[root@hadoop01 ~]# passwd hadoop
Changing password for user hadoop.
New password:
BAD PASSWORD: The password is shorter than 8 characters
Retype new password:
passwd: all authentication tokens updated successfully.
- As the hadoop user, upload the tarball to the hadoop user's home directory and extract it there
Do this on the first node only; once the configuration is done, distribute it to the other nodes
[hadoop@hadoop01 ~]$ tar -zxvf hadoop-2.10.1.tar.gz
[hadoop@hadoop01 ~]$ ll
total 399012
drwxrwxrwx. 11 hadoop hadoop 173 Nov 13 09:08 hadoop-2.10.1
-rw-r--r--. 1 hadoop hadoop 408587111 Nov 12 11:07 hadoop-2.10.1.tar.gz
- Enter the Hadoop configuration directory and adjust the configs
[hadoop@hadoop01 ~]$ cd hadoop-2.10.1/etc/hadoop/
- vim hadoop-env.sh
[hadoop@hadoop01 hadoop]$ vim hadoop-env.sh
export JAVA_HOME=/usr/local/java/jdk1.8.0_311
- Create a data directory under hadoop-2.10.1 first, then vim core-site.xml
[hadoop@hadoop01 hadoop-2.10.1]$ mkdir data
[hadoop@hadoop01 hadoop-2.10.1]$ cd etc/hadoop
[hadoop@hadoop01 hadoop]$ vim core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop01:8082</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hadoop-2.10.1/data</value>
</property>
</configuration>
- vim hdfs-site.xml
[hadoop@hadoop01 hadoop]$ vim hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hadoop/hadoop-2.10.1/data/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoop/hadoop-2.10.1/data/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.secondary.http.address</name>
<value>hadoop02:50090</value>
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
<description>Disable HDFS permission checks; without this, non-root users hit permission errors</description>
</property>
<property>
<name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
<value>true</value>
<description>Adjust the replace-datanode-on-failure policy to avoid errors when appending to files</description>
</property>
<property>
<name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
<value>never</value>
<description>Adjust the replace-datanode-on-failure policy to avoid errors when appending to files</description>
</property>
</configuration>
- vim mapred-site.xml; have the MapReduce framework run on YARN
[hadoop@hadoop01 hadoop]$ cp mapred-site.xml.template mapred-site.xml
[hadoop@hadoop01 hadoop]$ vim mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
- vim yarn-site.xml
[hadoop@hadoop01 hadoop]$ vim yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop01</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
- vim slaves
[hadoop@hadoop01 hadoop]$ vim slaves
hadoop01
hadoop02
hadoop03
- Copy the Hadoop directory to the other nodes
The hadoop user has no passwordless login yet, so passwords will be prompted for
[hadoop@hadoop01 hadoop]$ scp -r ~/hadoop-2.10.1 hadoop@hadoop02:~
[hadoop@hadoop01 hadoop]$ scp -r ~/hadoop-2.10.1 hadoop@hadoop03:~
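The two scp commands above can be driven from one loop over the worker hostnames, which scales better if nodes are added later. A dry-run sketch that only prints the commands (drop the echo to actually copy):

```shell
# Dry run: print the distribution commands instead of executing them.
for host in hadoop02 hadoop03; do
    echo scp -r "$HOME/hadoop-2.10.1" "hadoop@$host:~"
done | tee /tmp/scp-plan.txt
```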
- Configure the hadoop user's environment variables and make them effective; do this on all three machines
[hadoop@hadoop01 hadoop]$ vim ~/.bashrc
export HADOOP_HOME=/home/hadoop/hadoop-2.10.1
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
[hadoop@hadoop01 hadoop]$ source ~/.bashrc
- Test the Hadoop installation (hadoop is now on PATH)
[hadoop@hadoop01 hadoop]$ hadoop version
Hadoop 2.10.1
Subversion https://github.com/apache/hadoop -r 1827467c9a56f133025f28557bfc2c562d78e816
Compiled by centos on 2020-09-14T13:17Z
Compiled with protoc 2.5.0
From source with checksum 3114edef868f1f3824e7d0f68be03650
This command was run using /home/hadoop/hadoop-2.10.1/share/hadoop/common/hadoop-common-2.10.1.jar
- Format the NameNode; do this on the master node only, and only once
[hadoop@hadoop01 hadoop]$ hadoop namenode -format
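Re-running the format generates a new clusterID in the namenode's VERSION file, and datanodes still holding the old ID will refuse to start. A sketch of the consistency check against stand-in VERSION files (the real ones live under the name and data directories configured in hdfs-site.xml, in a current/ subdirectory):

```shell
# Stand-in VERSION files; the real ones are at
# .../data/name/current/VERSION and .../data/data/current/VERSION.
mkdir -p /tmp/dfs-demo/name/current /tmp/dfs-demo/data/current
echo "clusterID=CID-demo-1234" > /tmp/dfs-demo/name/current/VERSION
echo "clusterID=CID-demo-1234" > /tmp/dfs-demo/data/current/VERSION

nn_id=$(grep -o 'CID-[^ ]*' /tmp/dfs-demo/name/current/VERSION)
dn_id=$(grep -o 'CID-[^ ]*' /tmp/dfs-demo/data/current/VERSION)
if [ "$nn_id" = "$dn_id" ]; then
    echo "clusterIDs match"
else
    echo "mismatch: clear the datanode data dir and restart it"
fi
```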
- Repeat the passwordless-login steps so the hadoop user can log in without a password, minding the same permission issues
Without passwordless login, starting the cluster is a hassle
- Start the HDFS services from the sbin directory
[hadoop@hadoop01 sbin]$ pwd
/home/hadoop/hadoop-2.10.1/sbin
[hadoop@hadoop01 sbin]$ ./start-dfs.sh
Starting namenodes on [hadoop01]
hadoop01: starting namenode, logging to /home/hadoop/hadoop-2.10.1/logs/hadoop-hadoop-namenode-hadoop01.out
hadoop02: starting datanode, logging to /home/hadoop/hadoop-2.10.1/logs/hadoop-hadoop-datanode-hadoop02.out
hadoop01: starting datanode, logging to /home/hadoop/hadoop-2.10.1/logs/hadoop-hadoop-datanode-hadoop01.out
hadoop03: starting datanode, logging to /home/hadoop/hadoop-2.10.1/logs/hadoop-hadoop-datanode-hadoop03.out
Starting secondary namenodes [hadoop02]
hadoop02: starting secondarynamenode, logging to /home/hadoop/hadoop-2.10.1/logs/hadoop-hadoop-secondarynamenode-hadoop02.out
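Besides the web UI, `jps` on each node should list the expected daemons: NameNode and DataNode on hadoop01, SecondaryNameNode and DataNode on hadoop02, DataNode on hadoop03. A sketch that checks a captured sample of jps output rather than a live JVM list (the sample lines are illustrative; capture the real output with `jps > /tmp/jps.txt`):

```shell
# Sample of what jps typically prints on the master (pid + main class).
cat > /tmp/jps-sample.txt <<'EOF'
2101 NameNode
2230 DataNode
2466 Jps
EOF

# Flag any expected daemon missing from the list.
for d in NameNode DataNode; do
    if grep -qw "$d" /tmp/jps-sample.txt; then
        echo "$d running"
    else
        echo "$d NOT running"
    fi
done
```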
- Verify the result in the web UI at http://172.16.100.26:50070/
- Start the YARN services from the sbin directory
[hadoop@hadoop01 sbin]$ pwd
/home/hadoop/hadoop-2.10.1/sbin
[hadoop@hadoop01 sbin]$ ./start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/hadoop-2.10.1/logs/yarn-hadoop-resourcemanager-hadoop01.out
hadoop03: starting nodemanager, logging to /home/hadoop/hadoop-2.10.1/logs/yarn-hadoop-nodemanager-hadoop03.out
hadoop02: starting nodemanager, logging to /home/hadoop/hadoop-2.10.1/logs/yarn-hadoop-nodemanager-hadoop02.out
hadoop01: starting nodemanager, logging to /home/hadoop/hadoop-2.10.1/logs/yarn-hadoop-nodemanager-hadoop01.out
- Verify the result in the web UI at http://172.16.100.26:8088/
6. Problems I Ran Into
- Mixed-up users: I used root where the hadoop user should have been used. Scenario: HDFS had once been started as root, so the generated log files were owned by root; when the hadoop user later ran ./start-dfs.sh, it failed with permission denied. There are two fixes, and either one must be applied on all three machines:
a. Switch to root, delete the root-owned log files, then rerun the command as the hadoop user.
b. Change the ownership of the files from root back to the hadoop user.
[hadoop@hadoop01 logs]$ pwd
/home/hadoop/hadoop-2.10.1/logs
[root@hadoop01 logs]# chown -R hadoop:hadoop hadoop-hadoop-datanode-hadoop01.log
[root@hadoop01 logs]# ll
total 664
-rw-rw-r--. 1 hadoop hadoop 171498 Nov 16 16:10 hadoop-hadoop-datanode-hadoop01.log
-rw-rw-r--. 1 hadoop hadoop 726 Nov 16 16:10 hadoop-hadoop-datanode-hadoop01.out
-rw-rw-r--. 1 hadoop hadoop 726 Nov 16 09:36 hadoop-hadoop-datanode-hadoop01.out.1
-rw-rw-r--. 1 hadoop hadoop 726 Nov 15 14:27 hadoop-hadoop-datanode-hadoop01.out.2
-rw-rw-r--. 1 hadoop hadoop 726 Nov 15 11:14 hadoop-hadoop-datanode-hadoop01.out.3
-rw-rw-r--. 1 hadoop hadoop 726 Nov 15 11:12 hadoop-hadoop-datanode-hadoop01.out.4
-rw-rw-r--. 1 hadoop hadoop 203436 Nov 16 16:11 hadoop-hadoop-namenode-hadoop01.log
-rw-rw-r--. 1 hadoop hadoop 6007 Nov 16 16:19 hadoop-hadoop-namenode-hadoop01.out
-rw-rw-r--. 1 hadoop hadoop 726 Nov 16 09:35 hadoop-hadoop-namenode-hadoop01.out.1
-rw-rw-r--. 1 hadoop hadoop 6007 Nov 15 14:27 hadoop-hadoop-namenode-hadoop01.out.2
-rw-rw-r--. 1 hadoop hadoop 6012 Nov 15 11:29 hadoop-hadoop-namenode-hadoop01.out.3
-rw-rw-r--. 1 hadoop hadoop 726 Nov 15 11:12 hadoop-hadoop-namenode-hadoop01.out.4
-rw-rw-r--. 1 hadoop hadoop 0 Nov 15 11:12 SecurityAuth-hadoop.audit
drwxr-xr-x. 2 hadoop hadoop 6 Nov 16 16:58 userlogs
-rw-rw-r--. 1 hadoop hadoop 122893 Nov 16 16:52 yarn-hadoop-nodemanager-hadoop01.log
-rw-rw-r--. 1 hadoop hadoop 1508 Nov 16 16:12 yarn-hadoop-nodemanager-hadoop01.out
-rw-rw-r--. 1 hadoop hadoop 1515 Nov 15 11:15 yarn-hadoop-nodemanager-hadoop01.out.1
-rw-rw-r--. 1 hadoop hadoop 106074 Nov 16 16:22 yarn-hadoop-resourcemanager-hadoop01.log
-rw-rw-r--. 1 hadoop hadoop 1524 Nov 16 16:12 yarn-hadoop-resourcemanager-hadoop01.out
-rw-rw-r--. 1 hadoop hadoop 1531 Nov 15 11:15 yarn-hadoop-resourcemanager-hadoop01.out.1
- When setting up passwordless login for the hadoop user, a password was still required. This is the same permissions problem; revisit the chmod 700 / chmod 600 step in Section 3.
With the configuration above, these were the only two problems I ran into.