master 192.168.179.129
worker1 192.168.179.130
worker2 192.168.179.131
OS: Ubuntu 18.04.2
Things to do on all three machines (using master as the example):
sudo apt-get install ssh
sudo apt-get install rsync
tar -xzvf hadoop-3.1.2.tar.gz
1. Change the hostname: rename ubuntu to master on the master node, and to worker1 / worker2 on the workers
sudo vim /etc/hostname
2. Add the mapping between IP addresses and hostnames
sudo vim /etc/hosts
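With the addresses listed at the top, the mapping added to /etc/hosts on every machine would look roughly like this (a sketch based on the IPs above):
192.168.179.129 master
192.168.179.130 worker1
192.168.179.131 worker2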
3. Edit /etc/profile
export JAVA_HOME=/usr/java/jdk1.8.0_221
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
export HADOOP_HOME=/home/YOUR_USERNAME/Downloads/hadoop-3.1.2
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
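After saving, reload the profile and do a quick sanity check that both paths resolve (standard commands, assuming the JDK really is at the path above):
source /etc/profile
java -version
$HADOOP_HOME/bin/hadoop version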
4. Edit the Hadoop configuration files
They live under hadoop-3.1.2/etc/hadoop.
Under hadoop-3.1.2/ create a directory hdfs, and inside it create name, tmp and data; if anything goes wrong later, just delete them and recreate.
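For example (assuming the tarball was unpacked under ~/Downloads as in step 3):
cd ~/Downloads/hadoop-3.1.2
mkdir -p hdfs/name hdfs/tmp hdfs/data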
hadoop-env.sh
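The usual change here is to set JAVA_HOME explicitly, since daemons started over ssh do not read /etc/profile (path taken from step 3):
export JAVA_HOME=/usr/java/jdk1.8.0_221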
core-site.xml (the last four properties are there so that Hive can connect later)
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/njupt4145438/Downloads/hadoop-3.1.2/hdfs/tmp</value>
</property>
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.YOUR_USERNAME.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.YOUR_USERNAME.groups</name>
<value>*</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/YOUR_USERNAME/Downloads/hadoop-3.1.2/hdfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/YOUR_USERNAME/Downloads/hadoop-3.1.2/hdfs/data</value>
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
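On Hadoop 3.x, MapReduce jobs submitted to YARN also need to locate the MapReduce libraries; the official setup guide adds a classpath property along these lines (a sketch, adjust to your layout):
<property>
<name>mapreduce.application.classpath</name>
<value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
</property>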
yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
workers
worker1
worker2
5. Distribute the files
Either clone the VM or copy everything over with scp -r.
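Roughly like this with scp (adjust user and paths to your setup):
scp -r ~/Downloads/hadoop-3.1.2 worker1:~/Downloads/
scp -r ~/Downloads/hadoop-3.1.2 worker2:~/Downloads/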
6. Passwordless ssh login
The start-up scripts are shell scripts run on master that talk to the other nodes over ssh.
On all three machines:
Append the public key id_rsa.pub to authorized_keys:
ssh-keygen -t rsa -P ""
cd ~/.ssh
cat id_rsa.pub >> authorized_keys
ssh localhost
exit
Passwordless login from master to worker1
On master, copy the public key into the home directory:
cp .ssh/id_rsa.pub ~/id_rsa_master.pub
Copy id_rsa_master.pub from master's home directory to worker1's home directory.
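One way to do that is scp (a sketch; substitute your actual user name):
scp ~/id_rsa_master.pub YOUR_USERNAME@worker1:~/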
Then, in worker1's home directory, run the following (and repeat the whole procedure for worker2):
cat id_rsa_master.pub >> .ssh/authorized_keys
Start-up scripts
cd hadoop-3.1.2
sbin/start-all.sh
sbin/stop-all.sh
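One step not shown above: on a brand-new cluster the NameNode has to be formatted once (on master only) before the first start, or HDFS will not come up:
bin/hdfs namenode -format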
Run jps on all three machines to check the processes (use the Oracle JDK, otherwise the jps tool is missing).
Now you can start playing with the cluster. If something goes wrong, don't panic: look at the log files under logs/.
// Make the HDFS directories required to execute MapReduce jobs:
$ bin/hdfs dfs -mkdir /user
$ bin/hdfs dfs -mkdir /user/<username>
// Upload a test.txt, then delete the local copy
$ bin/hdfs dfs -put test.txt
$ rm test.txt
$ bin/hdfs dfs -ls /user/njupt4145438
$ bin/hdfs dfs -get test.txt
// Disk usage report
$ bin/hdfs dfsadmin -report
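To confirm that YARN and MapReduce are wired up end to end, you can run one of the bundled examples (the examples jar ships with the 3.1.2 tarball; the exact path may differ in your unpacking):
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.2.jar pi 2 10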