Spark Application Development: HelloWorld

Author: wangqiaoshi | Published: 2017-12-24 15:38

    Preparation

    Code example
    1. Install the Scala plugin
    Development tool: IntelliJ IDEA

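    2. Build file
    This example uses Maven as the build tool. In our experience sbt is slow to pull dependencies, and every time a dependency is added or updated it re-checks all of them, which uses a lot of CPU and slows development, so we recommend Maven.

    Plugins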

    <!-- in <properties> -->
    <scala.version>2.11.8</scala.version>

    <!-- in <build><plugins> -->
    <plugin>
        <groupId>org.scala-tools</groupId>
        <artifactId>maven-scala-plugin</artifactId>
        <executions>
            <execution>
                <goals>
                    <goal>compile</goal>
                    <goal>testCompile</goal>
                </goals>
            </execution>
        </executions>
        <configuration>
            <scalaVersion>${scala.version}</scalaVersion>
            <args>
                <!-- Spark 2.2 runs on Java 8, so target the 1.8 JVM -->
                <arg>-target:jvm-1.8</arg>
            </args>
        </configuration>
    </plugin>

    <!-- second declaration of the same plugin; assumed here to belong under <reporting><plugins> -->
    <plugin>
        <groupId>org.scala-tools</groupId>
        <artifactId>maven-scala-plugin</artifactId>
        <configuration>
            <scalaVersion>${scala.version}</scalaVersion>
        </configuration>
    </plugin>

    <!-- Dependencies, in <dependencies> -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.2.1</version>
    </dependency>

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>2.2.1</version>
    </dependency>

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-hive_2.11</artifactId>
        <version>2.2.1</version>
    </dependency>
    

    3. Write the code
    Data: people.json

    {"name":"zhangsan","age":25}
    {"name":"wangwu","age":20}
    {"name":"lisi","age":28}
    {"name":"mazi","age":18}
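
Spark reads a file like this as JSON Lines, one complete object per line, and infers integer fields as Long. A minimal sketch of loading it into a typed Dataset follows; the `Person` case class and the `ReadPeople` object are hypothetical names introduced for illustration and are not part of the original post.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Hypothetical case class mirroring one line of people.json.
// JSON integers are inferred as Long, so `age` is declared as Long.
case class Person(name: String, age: Long)

object ReadPeople {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("read people")
      .getOrCreate()
    import spark.implicits._

    // Columns are matched to the case class fields by name.
    val people: Dataset[Person] =
      spark.read.json("src/main/resources/people.json").as[Person]

    people.printSchema()
    people.show()
    spark.stop()
  }
}
```

Working with `Dataset[Person]` lets later transformations use `p.name` and `p.age` directly instead of `row.getAs[...]`.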
    

    Create a Scala object named HelloWorld.

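    import org.apache.spark.sql.SparkSession

    object HelloWorld {
      def main(args: Array[String]): Unit = {
        // Local session with two worker threads
        val spark = SparkSession
          .builder()
          .master("local[2]")
          .appName("hello world")
          .config("spark.some.config.option", "some-value")
          .getOrCreate()
        // Brings in the encoders needed by map() and toDF()
        import spark.implicits._

        // One JSON object per line; age is inferred as Long
        val peopleDF = spark.read.json("src/main/resources/people.json")

        // For each person, compute how many years they are past 18
        val newPeopleDF = peopleDF.map { row =>
          val name = row.getAs[String]("name")
          val age = row.getAs[Long]("age")
          (name, age - 18)
        }.toDF("name", "age_minus_18")

        newPeopleDF.show()
        spark.stop()
      }
    }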
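
The same transformation can also be written as a column expression rather than a row-by-row `map`. The sketch below is an alternative, not the author's version; the object name `HelloWorldColumns` and the column label `age_minus_18` are assumptions chosen for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object HelloWorldColumns {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("hello world columns")
      .getOrCreate()

    val peopleDF = spark.read.json("src/main/resources/people.json")

    // Subtract 18 from the age column and give the result a name
    // (the label "age_minus_18" is an illustrative choice).
    val newPeopleDF = peopleDF.select(
      col("name"),
      (col("age") - 18).as("age_minus_18")
    )

    newPeopleDF.show()
    spark.stop()
  }
}
```

Column expressions stay inside Spark SQL's optimizer and need no encoder, so `import spark.implicits._` is not required here.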
    

    Output:

    [Screenshot of the newPeopleDF.show() output]
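
To check the values by hand without starting Spark, the same arithmetic can be applied to the four records with plain Scala collections. This is only a sketch for verification; `ExpectedValues` is a hypothetical name, and the tab-separated printout is not the table layout that `show()` produces.

```scala
object ExpectedValues {
  // The four records from people.json, written out by hand
  val people: Seq[(String, Long)] = Seq(
    ("zhangsan", 25L),
    ("wangwu", 20L),
    ("lisi", 28L),
    ("mazi", 18L)
  )

  // Same arithmetic as the Spark job: age minus 18
  val result: Seq[(String, Long)] =
    people.map { case (name, age) => (name, age - 18) }

  def main(args: Array[String]): Unit =
    result.foreach { case (name, diff) => println(s"$name\t$diff") }
}
```

The resulting pairs are zhangsan 7, wangwu 2, lisi 10, and mazi 0.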
