
Spark and Functional Programming

Author: 诺之林 | Published 2021-04-20 16:20

    This article builds on Introduction to Functional Programming

    Contents: Introduction · Characteristics · Requirements · Benefits

    Introduction

    • Lisp is the ancestor of functional programming; its main modern dialect is Clojure

    • Scala is a multi-paradigm programming language that supports both OOP and FP (see the sketch after this list)

    • Spark itself is developed in Scala
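
    A minimal Scala sketch of the two paradigms side by side (User, users, and names are illustrative names, not from the article): a case class bundles named data (OOP), while higher-order functions transform collections of instances (FP).

    // OOP: a case class models data with named fields
    case class User(name: String, age: Int)

    // FP: higher-order functions (filter, map) transform the collection
    val users = List(User("Ann", 30), User("Bob", 25))
    val names = users.filter(_.age > 26).map(_.name)
    // names: List[String] = List(Ann)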

    Characteristics

    This means functions are treated as values: they can be assigned to variables, passed into other functions, and returned from functions.

    
    // a function literal assigned to a variable
    val f = (s: String) => println(s)
    
    // the function value passed as an argument to map
    Array("Hello", "Scala").map(f)
    

    Since data structures can't be changed, 'adding' or 'removing' something from an immutable collection means creating a new collection just like the old one, but with the needed change.

    /opt/services/spark/bin/spark-shell
    
    // map does not modify rdd; it produces a new RDD with the change applied
    val rdd = sc.parallelize(Array(1, 2, 3, 4, 5))
    // rdd: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at parallelize
    
    val mapRDD = rdd.map(i => 10 + i)
    // mapRDD: org.apache.spark.rdd.RDD[Int] = MapPartitionsRDD[1] at map
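    
    // a quick check (a sketch): the original rdd is untouched by map
    rdd.collect()
    // res0: Array[Int] = Array(1, 2, 3, 4, 5)
    mapRDD.collect()
    // res1: Array[Int] = Array(11, 12, 13, 14, 15)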
    

    Requirements

    • Pure functions, no side effects (both pairs are sketched in Scala after this list)
    Pure function: the same input always produces the same output
    
    No side effects: the function does not modify any variable outside itself
    
    • Expressions, no statements
    Expression: a pure computation that always returns a value
    
    Statement: performs some operation but returns no value
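
    A minimal Scala sketch of both pairs (addOne, impureAddOne, and counter are illustrative names, not from the article):
    
    // Pure: the result depends only on the input, and nothing outside is touched
    def addOne(x: Int): Int = x + 1
    
    // Impure: reads and writes counter outside the function, so the same input
    // can yield different outputs (a side effect)
    var counter = 0
    def impureAddOne(x: Int): Int = { counter += 1; x + counter }
    
    // Expression: if/else returns a value and can be assigned
    val parity = if (counter % 2 == 0) "even" else "odd"
    
    // Statement-like: println performs an action and returns Unit, not a useful value
    val unit: Unit = println(parity)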
    

    Benefits

    • Method chaining: pure functions are easier to reason about, so their calls compose naturally into chains (see the sketch after this list)
    (figure: function-programming-introduction-01.png)
    • Parallel programming: with no shared mutable state, multiple tasks can be processed simultaneously on multiple cores
    (figure: function-programming-introduction-02.png)
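
    A spark-shell sketch of both points (the numbers are illustrative): every transformation returns a new RDD, so calls chain; Spark then evaluates the chain in parallel across partitions.
    
    // each call returns a new RDD, so transformations chain without temporaries
    val total = sc.parallelize(1 to 100)
      .map(_ * 2)
      .filter(_ > 50)
      .reduce(_ + _)
    // the partitions are processed in parallel on the available cores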
