Official address
Environment preparation
1. Download the installation package
https://mirrors.cnnic.cn/apache/hadoop/common/stable2/
~/opt/hadoop-3.2.1.tar.gz
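If you prefer to download from the command line (curl ships with macOS; this assumes the archive is still published under the mirror directory shown above):
mkdir -p ~/opt
cd ~/opt
curl -L -O https://mirrors.cnnic.cn/apache/hadoop/common/stable2/hadoop-3.2.1.tar.gz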
2. Extract the archive
tar xzf hadoop-3.2.1.tar.gz
mkdir -p ~/opt/hadoop
mv hadoop-3.2.1/* ~/opt/hadoop/
cd ~/opt/hadoop/
ls -l
-rw-r--r--@ 1 lifei staff 150569 9 10 22:35 LICENSE.txt
-rw-r--r--@ 1 lifei staff 22125 9 10 22:35 NOTICE.txt
-rw-r--r--@ 1 lifei staff 1361 9 10 22:35 README.txt
drwxr-xr-x@ 13 lifei staff 416 9 11 00:51 bin
drwxr-xr-x@ 3 lifei staff 96 9 10 23:58 etc
drwxr-xr-x@ 7 lifei staff 224 9 11 00:51 include
drwxr-xr-x@ 3 lifei staff 96 9 11 00:51 lib
drwxr-xr-x@ 14 lifei staff 448 9 11 00:51 libexec
drwxr-xr-x@ 29 lifei staff 928 9 10 23:58 sbin
drwxr-xr-x@ 4 lifei staff 128 9 11 01:11 share
3. Hadoop operating modes
- Local/standalone mode: after you download Hadoop, it is configured in standalone mode by default and runs as a single Java process.
- Pseudo-distributed mode: a simulation of a distributed deployment on a single machine. Each Hadoop daemon (HDFS, YARN, MapReduce, and so on) runs as a separate Java process. This mode is very useful for development.
- Fully distributed mode: a genuinely distributed cluster of at least two machines.
Installing Hadoop in local mode
Everything runs inside a single JVM, with no daemons at all. Standalone mode is well suited to running MapReduce programs during development, because they are easy to test and debug.
Setting up Hadoop
ls ~/.bashrc
ls: /Users/lifei/.bashrc: No such file or directory
vi ~/.bashrc
export HADOOP_HOME=/Users/lifei/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Save ~/.bashrc with :wq, then reload it:
source ~/.bashrc
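Note: on macOS, Terminal opens login shells, which read ~/.bash_profile rather than ~/.bashrc. If a newly opened terminal cannot find hadoop, let ~/.bash_profile pull in ~/.bashrc:
echo 'source ~/.bashrc' >> ~/.bash_profile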
Run: hadoop
Usage: hadoop [OPTIONS] SUBCOMMAND [SUBCOMMAND OPTIONS]
or hadoop [OPTIONS] CLASSNAME [CLASSNAME OPTIONS]
where CLASSNAME is a user-provided Java class
OPTIONS is none or any of:
--config dir Hadoop config directory
--debug turn on shell script debug mode
--help usage information
buildpaths attempt to add class files from build tree
hostnames list[,of,host,names] hosts to use in slave mode
hosts filename list of hosts to use in slave mode
loglevel level set the log4j level for this command
workers turn on worker mode
SUBCOMMAND is one of:
Admin Commands:
daemonlog get/set the log level for each daemon
Client Commands:
archive create a Hadoop archive
checknative check native Hadoop and compression libraries availability
classpath prints the class path needed to get the Hadoop jar and the required libraries
conftest validate configuration XML files
credential interact with credential providers
distch distributed metadata changer
distcp copy file or directories recursively
dtutil operations related to delegation tokens
envvars display computed Hadoop environment variables
fs run a generic filesystem user client
gridmix submit a mix of synthetic job, modeling a profiled from production load
jar <jar> run a jar file. NOTE: please use "yarn jar" to launch YARN applications, not this command.
jnipath prints the java.library.path
kdiag Diagnose Kerberos Problems
kerbname show auth_to_local principal conversion
key manage keys via the KeyProvider
rumenfolder scale a rumen input trace
rumentrace convert logs into a rumen trace
s3guard manage metadata on S3
trace view and modify Hadoop tracing settings
version print the version
Daemon Commands:
kms run KMS, the Key Management Server
SUBCOMMAND may print help when invoked w/o parameters or with -h.
Deployment succeeded.
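As an additional sanity check: hadoop version prints the build information, and since standalone mode leaves fs.defaultFS at its default value of file:///, the generic filesystem client simply operates on the local disk:
hadoop version
hadoop fs -ls ~/opt/hadoop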
Running Hadoop
Let's run the "Hello World" of the Hadoop world: counting the number of words.
Preparation
mkdir ~/hadoop_workspace
mkdir ~/hadoop_workspace/input
echo 'Lightbatis 增强 MyBatis 版Java 数据库持久层,更简洁列易用。Lightbatis 增强 MyBatis 版Java 数据库持久层,更简洁列易用。' > ~/hadoop_workspace/input/hello.txt
When the preparation is done, it looks like this:
(base) lifeideMacBook-Pro:input lifei$ pwd
/Users/lifei/hadoop_workspace/input
(base) lifeideMacBook-Pro:input lifei$ ls
hello.txt
(base) lifeideMacBook-Pro:input lifei$ cat hello.txt
Lightbatis 增强 MyBatis 版Java 数据库持久层,更简洁列易用。Lightbatis 增强 MyBatis 版Java 数据库持久层,更简洁列易用。
Run the job
cd ~/hadoop_workspace
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar wordcount input output
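Note that MapReduce refuses to start if the output directory already exists, so remove it before rerunning the job:
rm -rf ~/hadoop_workspace/output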
The console output during the run is as follows:
(base) lifeideMacBook-Pro:hadoop_workspace lifei$ hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar wordcount input output
2019-12-12 16:20:18,928 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-12-12 16:20:19,110 INFO impl.MetricsConfig: Loaded properties from hadoop-metrics2.properties
2019-12-12 16:20:24,165 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
2019-12-12 16:20:24,165 INFO impl.MetricsSystemImpl: JobTracker metrics system started
2019-12-12 16:20:29,421 INFO input.FileInputFormat: Total input files to process : 1
2019-12-12 16:20:29,470 INFO mapreduce.JobSubmitter: number of splits:1
2019-12-12 16:20:29,573 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1284066301_0001
2019-12-12 16:20:29,573 INFO mapreduce.JobSubmitter: Executing with tokens: []
2019-12-12 16:20:29,675 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
2019-12-12 16:20:29,675 INFO mapreduce.Job: Running job: job_local1284066301_0001
2019-12-12 16:20:29,676 INFO mapred.LocalJobRunner: OutputCommitter set in config null
2019-12-12 16:20:29,681 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 2
2019-12-12 16:20:29,681 INFO output.FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
2019-12-12 16:20:29,682 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
2019-12-12 16:20:29,714 INFO mapred.LocalJobRunner: Waiting for map tasks
2019-12-12 16:20:29,715 INFO mapred.LocalJobRunner: Starting task: attempt_local1284066301_0001_m_000000_0
2019-12-12 16:20:29,731 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 2
2019-12-12 16:20:29,731 INFO output.FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
2019-12-12 16:20:29,738 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
2019-12-12 16:20:29,738 INFO mapred.Task: Using ResourceCalculatorProcessTree : null
2019-12-12 16:20:29,741 INFO mapred.MapTask: Processing split: file:/Users/lifei/hadoop_workspace/input/hello.txt:0+153
2019-12-12 16:20:29,798 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
2019-12-12 16:20:29,798 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
2019-12-12 16:20:29,798 INFO mapred.MapTask: soft limit at 83886080
2019-12-12 16:20:29,798 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
2019-12-12 16:20:29,798 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
2019-12-12 16:20:29,801 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2019-12-12 16:20:29,806 INFO mapred.LocalJobRunner:
2019-12-12 16:20:29,806 INFO mapred.MapTask: Starting flush of map output
2019-12-12 16:20:29,806 INFO mapred.MapTask: Spilling map output
2019-12-12 16:20:29,806 INFO mapred.MapTask: bufstart = 0; bufend = 189; bufvoid = 104857600
2019-12-12 16:20:29,806 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214364(104857456); length = 33/6553600
2019-12-12 16:20:29,826 INFO mapred.MapTask: Finished spill 0
2019-12-12 16:20:29,841 INFO mapred.Task: Task:attempt_local1284066301_0001_m_000000_0 is done. And is in the process of committing
2019-12-12 16:20:29,843 INFO mapred.LocalJobRunner: map
2019-12-12 16:20:29,843 INFO mapred.Task: Task 'attempt_local1284066301_0001_m_000000_0' done.
2019-12-12 16:20:29,848 INFO mapred.Task: Final Counters for attempt_local1284066301_0001_m_000000_0: Counters: 18
File System Counters
FILE: Number of bytes read=316857
FILE: Number of bytes written=840334
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
Map-Reduce Framework
Map input records=1
Map output records=9
Map output bytes=189
Map output materialized bytes=172
Input split bytes=115
Combine input records=9
Combine output records=6
Spilled Records=6
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=0
Total committed heap usage (bytes)=257425408
File Input Format Counters
Bytes Read=153
2019-12-12 16:20:29,849 INFO mapred.LocalJobRunner: Finishing task: attempt_local1284066301_0001_m_000000_0
2019-12-12 16:20:29,849 INFO mapred.LocalJobRunner: map task executor complete.
2019-12-12 16:20:29,852 INFO mapred.LocalJobRunner: Waiting for reduce tasks
2019-12-12 16:20:29,852 INFO mapred.LocalJobRunner: Starting task: attempt_local1284066301_0001_r_000000_0
2019-12-12 16:20:29,860 INFO output.FileOutputCommitter: File Output Committer Algorithm version is 2
2019-12-12 16:20:29,860 INFO output.FileOutputCommitter: FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
2019-12-12 16:20:29,860 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
2019-12-12 16:20:29,860 INFO mapred.Task: Using ResourceCalculatorProcessTree : null
2019-12-12 16:20:29,864 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@67d50939
2019-12-12 16:20:29,866 WARN impl.MetricsSystemImpl: JobTracker metrics system already initialized!
2019-12-12 16:20:29,885 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=2672505600, maxSingleShuffleLimit=668126400, mergeThreshold=1763853824, ioSortFactor=10, memToMemMergeOutputsThreshold=10
2019-12-12 16:20:29,888 INFO reduce.EventFetcher: attempt_local1284066301_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
2019-12-12 16:20:29,923 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1284066301_0001_m_000000_0 decomp: 168 len: 172 to MEMORY
2019-12-12 16:20:29,930 INFO reduce.InMemoryMapOutput: Read 168 bytes from map-output for attempt_local1284066301_0001_m_000000_0
2019-12-12 16:20:29,932 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 168, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->168
2019-12-12 16:20:29,933 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
2019-12-12 16:20:29,934 INFO mapred.LocalJobRunner: 1 / 1 copied.
2019-12-12 16:20:29,934 INFO reduce.MergeManagerImpl: finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
2019-12-12 16:20:29,948 INFO mapred.Merger: Merging 1 sorted segments
2019-12-12 16:20:29,949 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 155 bytes
2019-12-12 16:20:29,957 INFO reduce.MergeManagerImpl: Merged 1 segments, 168 bytes to disk to satisfy reduce memory limit
2019-12-12 16:20:29,958 INFO reduce.MergeManagerImpl: Merging 1 files, 172 bytes from disk
2019-12-12 16:20:29,958 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
2019-12-12 16:20:29,958 INFO mapred.Merger: Merging 1 sorted segments
2019-12-12 16:20:29,958 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 155 bytes
2019-12-12 16:20:29,959 INFO mapred.LocalJobRunner: 1 / 1 copied.
2019-12-12 16:20:29,979 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
2019-12-12 16:20:29,981 INFO mapred.Task: Task:attempt_local1284066301_0001_r_000000_0 is done. And is in the process of committing
2019-12-12 16:20:29,982 INFO mapred.LocalJobRunner: 1 / 1 copied.
2019-12-12 16:20:29,982 INFO mapred.Task: Task attempt_local1284066301_0001_r_000000_0 is allowed to commit now
2019-12-12 16:20:29,983 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1284066301_0001_r_000000_0' to file:/Users/lifei/hadoop_workspace/output
2019-12-12 16:20:29,984 INFO mapred.LocalJobRunner: reduce > reduce
2019-12-12 16:20:29,984 INFO mapred.Task: Task 'attempt_local1284066301_0001_r_000000_0' done.
2019-12-12 16:20:29,984 INFO mapred.Task: Final Counters for attempt_local1284066301_0001_r_000000_0: Counters: 24
File System Counters
FILE: Number of bytes read=317233
FILE: Number of bytes written=840660
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
Map-Reduce Framework
Combine input records=0
Combine output records=0
Reduce input groups=6
Reduce shuffle bytes=172
Reduce input records=6
Reduce output records=6
Spilled Records=6
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=0
Total committed heap usage (bytes)=257425408
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Output Format Counters
Bytes Written=154
2019-12-12 16:20:29,984 INFO mapred.LocalJobRunner: Finishing task: attempt_local1284066301_0001_r_000000_0
2019-12-12 16:20:29,985 INFO mapred.LocalJobRunner: reduce task executor complete.
2019-12-12 16:20:30,682 INFO mapreduce.Job: Job job_local1284066301_0001 running in uber mode : false
2019-12-12 16:20:30,683 INFO mapreduce.Job: map 100% reduce 100%
2019-12-12 16:20:30,684 INFO mapreduce.Job: Job job_local1284066301_0001 completed successfully
2019-12-12 16:20:30,692 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=634090
FILE: Number of bytes written=1680994
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
Map-Reduce Framework
Map input records=1
Map output records=9
Map output bytes=189
Map output materialized bytes=172
Input split bytes=115
Combine input records=9
Combine output records=6
Reduce input groups=6
Reduce shuffle bytes=172
Reduce input records=6
Reduce output records=6
Spilled Records=12
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=0
Total committed heap usage (bytes)=514850816
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=153
File Output Format Counters
Bytes Written=154
Check the result of the run:
(base) lifeideMacBook-Pro:hadoop_workspace lifei$ ls -l output/
total 8
-rw-r--r-- 1 lifei staff 0 12 12 16:20 _SUCCESS
-rw-r--r-- 1 lifei staff 142 12 12 16:20 part-r-00000
View the contents of part-r-00000:
(base) lifeideMacBook-Pro:hadoop_workspace lifei$ cat output/part-r-00000
Lightbatis 1
MyBatis 2
增强 2
数据库持久层,更简洁列易用。 1
数据库持久层,更简洁列易用。Lightbatis 1
版Java 2
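The wordcount example tokenizes on whitespace only, which is why each whole Chinese phrase is counted as a single "word" above. As a quick cross-check, plain Unix tools reproduce the same counts (run from ~/hadoop_workspace):
tr ' ' '\n' < input/hello.txt | sort | uniq -c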
The standalone Hadoop installation is complete.