Setting Up a Standalone Spark Environment and Running a Word Count
Spark Environment Setup
Software versions: jdk1.7.0 | hadoop-2.6.5 | scala-2.11.4 | spark-1.6.2
Hadoop Installation
A running HDFS is a prerequisite: the word count below reads its input from hdfs://hadoop1:9000.
Spark Installation
In the conf/ directory of the Spark installation, create spark-env.sh from the bundled template:
mv spark-env.sh.template spark-env.sh
# Append the following to spark-env.sh:
export JAVA_HOME=/usr/local/bigdata/software/jdk1.7.0
export SCALA_HOME=/usr/local/bigdata/software/scala-2.11.4
export SPARK_MASTER_IP=hadoop1        # hostname of the master node
export SPARK_WORKER_MEMORY=2G         # memory each worker can hand to executors
- Configure environment variables:
export SPARK_HOME=/usr/local/bigdata/software/spark-1.6.2
export PATH=$SPARK_HOME/bin:$PATH
export PATH=$SPARK_HOME/sbin:$PATH
Running the Word Count
- Copy a local file to HDFS
# Create a test file, e.g. vim test, with the following contents:
you,jump
i,jump
you,jump
i,jump
you,jump
i,jump
# Copy it to HDFS
hdfs dfs -copyFromLocal test /words
- Enter the interactive command line by running spark-shell (a quick sanity check of the upload follows)
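spark-shell pre-creates a SparkContext named sc, so the uploaded file can be read back immediately. This is a minimal check, reusing the HDFS URL from the example below:
val lines = sc.textFile("hdfs://hadoop1:9000/words")
lines.count()    // 6, one per line of the test file
lines.first()    // "you,jump"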
- Run the Scala word count example
Enter the following command:
sc.textFile("hdfs://hadoop1:9000/words").
  flatMap(line => line.split(",")).
  map(word => (word, 1)).
  reduceByKey(_ + _).
  foreach(result => println(result._1 + " => " + result._2))
Result (each of the six input lines splits into two words, so jump occurs 6 times while you and i occur 3 times each):
you => 3
jump => 6
i => 3
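To see what each transformation contributes, the same job can be run one step at a time. The sketch below is equivalent to the one-liner above; collect() is safe here only because the result set is tiny, since it pulls everything back to the driver:
val lines  = sc.textFile("hdfs://hadoop1:9000/words")   // one record per line: "you,jump", "i,jump", ...
val words  = lines.flatMap(line => line.split(","))     // 12 words in total
val pairs  = words.map(word => (word, 1))               // (you,1), (jump,1), ...
val counts = pairs.reduceByKey(_ + _)                   // sums the 1s for each distinct word
counts.collect().foreach { case (w, n) => println(w + " => " + n) }
Note that foreach in the one-liner runs on the executors: in a local shell its println output reaches the console, but on a cluster it would land in the executor logs, which is why calling collect() before printing is the usual way to display small results.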
- Illustrated execution flow
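In place of a diagram, the execution plan can be read off the RDD lineage itself. toDebugString prints the dependency chain, and the shuffle introduced by reduceByKey is what splits the job into two stages (a sketch reusing the pipeline above):
val counts = sc.textFile("hdfs://hadoop1:9000/words").
  flatMap(_.split(",")).
  map((_, 1)).
  reduceByKey(_ + _)
println(counts.toDebugString)   // lineage from the ShuffledRDD back through the map/flatMap RDDs to the HadoopRDD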