Deploying Spark on Windows

Author: SmailTrey | Published 2019-09-26 13:49

    I. Downloading Spark and related software
    1. Spark downloads: http://spark.apache.org/downloads.html
    2. Hadoop downloads: https://archive.apache.org/dist/hadoop/common/
    3. Scala downloads: http://www.scala-lang.org/download/all.html
    II. Configuring the environment variables
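The article does not list the variables in text. As a rough sketch, the check below assumes the conventional names JAVA_HOME, SCALA_HOME, HADOOP_HOME, and SPARK_HOME (an assumption, not taken from the article) and reports any that are unset or blank. `EnvCheck` and its methods are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class EnvCheck {
    // Conventional variable names; assumed here, not confirmed by the article.
    public static final List<String> REQUIRED =
            Arrays.asList("JAVA_HOME", "SCALA_HOME", "HADOOP_HOME", "SPARK_HOME");

    /** Returns the required variables that are absent or blank in the given environment. */
    public static List<String> missing(Map<String, String> env) {
        List<String> result = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.trim().isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> missing = missing(System.getenv());
        if (missing.isEmpty()) {
            System.out.println("All expected variables are set.");
        } else {
            System.out.println("Not set: " + missing);
        }
    }
}
```

Environment variable changes on Windows only reach processes started afterwards, so run the check from a newly opened command prompt.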


    III. Verifying the setup with code

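    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming_2.11</artifactId>
        <!-- The Java code below uses the Spark 2.x API (flatMap takes a function returning an Iterator),
             so a 2.x release is needed; 2.4.4 is one example. -->
        <version>2.4.4</version>
    </dependency>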
    
    /**
      * Java version
      */
    package com.lin.spark;

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaReceiverInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import scala.Tuple2;

    import java.util.Arrays;

    public class NetworkWordCount {
        public static void main(String[] args) {
            new NetworkWordCount().getNetworkWordCount();
        }

        public void getNetworkWordCount() {
            // local[*] uses every available core. The socket receiver occupies one thread,
            // so at least two are needed for batches to be processed.
            SparkConf sparkConf = new SparkConf()
                    .setMaster("local[*]")
                    .setAppName("NetworkWordCount");
            JavaSparkContext javaSparkContext = new JavaSparkContext(sparkConf);
            // Group incoming data into 5-second batches.
            JavaStreamingContext javaStreamingContext =
                    new JavaStreamingContext(javaSparkContext, Durations.seconds(5));
            // Read newline-delimited text from a TCP server; replace the address with your own host.
            JavaReceiverInputDStream<String> lines =
                    javaStreamingContext.socketTextStream("192.168.10.132", 9999);
            // Split each line into words, pair each word with 1, and sum the counts per word.
            lines.flatMap(line -> Arrays.asList(line.split(" ")).iterator())
                    .mapToPair(word -> new Tuple2<>(word, 1))
                    .reduceByKey((a, b) -> a + b)
                    .print();
            try {
                javaStreamingContext.start();
                // Block until the job is stopped or fails.
                javaStreamingContext.awaitTermination();
            } catch (InterruptedException e) {
                e.printStackTrace();
            } finally {
                // Release the streaming context when the job ends.
                javaStreamingContext.close();
            }
        }
    }
     
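    /**
      * Scala version
      */
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object NetworkWordCount {
      def main(args: Array[String]): Unit = {
        // Two local threads: one for the socket receiver, one to process batches.
        val conf = new SparkConf().setMaster("local[2]").setAppName("NetworkWordCount")
        // Group incoming data into 5-second batches.
        val streamingContext = new StreamingContext(conf, Seconds(5))
        // Read newline-delimited text from a TCP server; replace the address with your own host.
        val lines = streamingContext.socketTextStream("192.168.10.132", 9999)
        val words = lines.flatMap(_.split(" "))
        val pairs = words.map(word => (word, 1))
        val wordCounts = pairs.reduceByKey(_ + _)
        wordCounts.print()
        streamingContext.start()
        // Block until the job is stopped.
        streamingContext.awaitTermination()
      }
    }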
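Both word count programs read lines from a TCP socket at 192.168.10.132:9999, so something has to be listening there and sending text before any counts appear. The article does not say what the author used. As a minimal sketch, the hypothetical `LineServer` below accepts one connection and sends it a few newline-terminated lines; the class name, the sample sentences, and the default port argument are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class LineServer {
    /** Accepts a single client and sends it each line, terminated by '\n'. */
    public static void serveOnce(ServerSocket server, List<String> lines) throws IOException {
        try (Socket client = server.accept();
             PrintWriter out = new PrintWriter(
                     new OutputStreamWriter(client.getOutputStream(), StandardCharsets.UTF_8))) {
            for (String line : lines) {
                out.print(line);
                out.print('\n');
            }
            out.flush();
        }
    }

    public static void main(String[] args) throws IOException {
        // Port 9999 matches the word count programs; pass another port as the first argument if needed.
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 9999;
        try (ServerSocket server = new ServerSocket(port)) {
            System.out.println("Listening on port " + port);
            serveOnce(server, Arrays.asList("hello spark", "hello streaming"));
        }
    }
}
```

Start the server first, then the word count job. If both run on the same machine, change the host passed to socketTextStream to "localhost". The server exits after one client, which is enough to see a single batch of counts printed.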
    

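    IV. Possible errors

    • Exception in thread "main" java.lang.NoSuchMethodError: scala.collection.immutable.HashSet$.empty()Lscala/collection/immutable/HashSet;

    • Fix: use a Scala version that matches the one the Spark artifact was built for. The _2.11 suffix of spark-streaming_2.11 means Scala 2.11.x.

    • Could not locate executable D:\Spark\hadoop-2.6.0\bin\winutils.exe in the Hadoop binaries

    • Fix: download hadoop-common-2.6.0-bin-master, copy the files from its bin directory into the bin directory of the Hadoop installation (here D:\Spark\hadoop-2.6.0\bin), then copy hadoop.dll into C:/Windows/System32.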
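Both errors above can be checked for before a job is started. The sketch below is one possible pre-flight check, not part of the original setup: `SparkPreflight` and its method names are hypothetical, the Scala versions in `main` are example values, and the version rule (major.minor must equal the artifact suffix) applies to Scala 2.x.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class SparkPreflight {

    /** Scala 2.x binary version, for example "2.11.8" gives "2.11". */
    public static String binaryVersion(String scalaVersion) {
        String[] parts = scalaVersion.split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Not a Scala version: " + scalaVersion);
        }
        return parts[0] + "." + parts[1];
    }

    /** True when the artifact suffix (the part after the last '_') equals the Scala binary version. */
    public static boolean scalaMatches(String artifactId, String scalaVersion) {
        int idx = artifactId.lastIndexOf('_');
        return idx >= 0 && artifactId.substring(idx + 1).equals(binaryVersion(scalaVersion));
    }

    /** Returns the files named in the winutils fix that are not present at the expected locations. */
    public static List<Path> missingNativeFiles(Path hadoopHome, Path system32) {
        List<Path> expected = new ArrayList<>();
        expected.add(hadoopHome.resolve("bin").resolve("winutils.exe"));
        expected.add(system32.resolve("hadoop.dll"));
        List<Path> missing = new ArrayList<>();
        for (Path p : expected) {
            if (!Files.isRegularFile(p)) {
                missing.add(p);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Example values only: substitute the Scala version and paths from your own machine.
        System.out.println("2.11.8 matches: " + scalaMatches("spark-streaming_2.11", "2.11.8"));
        System.out.println("2.10.6 matches: " + scalaMatches("spark-streaming_2.11", "2.10.6"));
        System.out.println("Missing files: " + missingNativeFiles(
                Paths.get("D:\\Spark\\hadoop-2.6.0"), Paths.get("C:\\Windows\\System32")));
    }
}
```

An empty "Missing files" list together with a matching Scala version suggests that neither of the two errors should come from these particular causes.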

    V. LaTeX-related links
    LaTeX/Colors: https://en.wikibooks.org/wiki/LaTeX/Colors
    LaTeX/Fonts: https://en.wikibooks.org/wiki/LaTeX/Fonts
