Secondary sort with Spark's sortByKey

Author: 南山小和尚 | Published: 2019-03-03 16:11

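    The basic idea is to define a custom key class for sortByKey, use map to turn each record into a pair whose key is an instance of that class, and then call the sortByKey operator. The key class extends Ordered, which gives sortByKey the ordering it needs, and Serializable, because keys are shipped between tasks during the shuffle. A basic implementation follows: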

    1. Define the custom key class

    // Composite key: compare by `first`, then by `second` when `first` ties.
    class SecondSortByKeyScala(val first: String, val second: Int)
        extends Ordered[SecondSortByKeyScala] with Serializable {

      override def compare(that: SecondSortByKeyScala): Int = {
        val byFirst = this.first.compareTo(that.first)
        // Fall back to the second field only when the first fields are equal.
        if (byFirst != 0) byFirst else Integer.compare(this.second, that.second)
      }
    }
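The ordering defined by the key class can be checked on a local list before involving Spark. The sketch below is only an illustration: it repeats the class so that it is self-contained, adds a `toString` for printing, and uses made-up sample keys (`"a"`, `"b"`) and a hypothetical `KeyOrderingCheck` object that are not part of the original article.

```scala
// Same key class as in step 1, repeated so the snippet is self-contained.
// The toString override is an addition for readable output.
class SecondSortByKeyScala(val first: String, val second: Int)
    extends Ordered[SecondSortByKeyScala] with Serializable {

  override def compare(that: SecondSortByKeyScala): Int = {
    val byFirst = this.first.compareTo(that.first)
    if (byFirst != 0) byFirst else Integer.compare(this.second, that.second)
  }

  override def toString: String = s"$first,$second"
}

object KeyOrderingCheck {
  // Hypothetical sample keys, chosen only for illustration.
  val keys: List[SecondSortByKeyScala] = List(
    new SecondSortByKeyScala("b", 1),
    new SecondSortByKeyScala("a", 9),
    new SecondSortByKeyScala("a", 3)
  )

  // Ordered[T] supplies an implicit Ordering[T], so `sorted` works directly.
  val ascending: List[String] = keys.sorted.map(_.toString)

  // Reversing the ascending result gives the descending order
  // that sortByKey(false) asks for.
  val descending: List[String] = keys.sorted.reverse.map(_.toString)

  def main(args: Array[String]): Unit = {
    println(ascending.mkString(" "))
    println(descending.mkString(" "))
  }
}
```

Keys with the same `first` value stay adjacent and are ordered among themselves by `second`, which is the property a secondary sort relies on.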

    2. The Spark driver code is as follows

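    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("spark1").master("local[1]").getOrCreate()
    val sc = spark.sparkContext

    val list = Array("xiao,76", "xiao,56", "xiao1,98", "xiao1,65",
      "xiao2,24", "xiao2,98", "xiao3,77", "xiao3,56", "xiao3,96")
    val rdd = sc.parallelize(list)

    // Key each line by (name, score) and keep the original line as the value.
    val sortStartValue = rdd.map { x =>
      val fields = x.split(",")
      (new SecondSortByKeyScala(fields(0), fields(1).toInt), x)
    }

    // ascending = false sorts in descending order on both fields.
    val rddsortbeing = sortStartValue.sortByKey(false)

    // collect() brings the rows to the driver so they print in sorted order.
    rddsortbeing.collect().foreach(x => println(x._2))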

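    3. The printed output is as follows

    xiao3,96
    xiao3,77
    xiao3,56
    xiao2,98
    xiao2,24
    xiao1,98
    xiao1,65
    xiao,76
    xiao,56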
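The expected ordering for the input in step 2 can be cross-checked without a Spark runtime. The sketch below is not the article's method: it swaps the custom key class for a plain `(String, Int)` tuple key, for which the Scala standard library already provides a lexicographic `Ordering`, and it sorts a local collection instead of an RDD. The `SecondarySortCheck` object and its `key` helper are hypothetical names introduced for illustration.

```scala
object SecondarySortCheck {
  // Same input strings as in step 2.
  val input: List[String] = List(
    "xiao,76", "xiao,56", "xiao1,98", "xiao1,65",
    "xiao2,24", "xiao2,98", "xiao3,77", "xiao3,56", "xiao3,96"
  )

  // Parse "name,score" into a (String, Int) key.
  def key(line: String): (String, Int) = {
    val fields = line.split(",")
    (fields(0), fields(1).toInt)
  }

  // Reverse the built-in tuple ordering: descending by name,
  // then descending by score, mirroring sortByKey(false).
  val sortedDesc: List[String] =
    input.sortBy(key)(Ordering[(String, Int)].reverse)

  def main(args: Array[String]): Unit = sortedDesc.foreach(println)
}
```

A tuple key should also work with Spark's sortByKey, since it only needs an implicit `Ordering` for the key type; a custom class like the one in step 1 is the more flexible choice when the fields need different sort directions or custom comparison rules.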

          Article title: Secondary sort with Spark's sortByKey

          Article link: https://www.haomeiwen.com/subject/ffxmuqtx.html