Spark 0.5.2 Source Analysis: collection shuffle

Author: 编程回忆录 | Published: 2018-10-08 22:39

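A collection shuffle rearranges the elements of a list into a random order and returns the result as a new list. In the Spark 0.5.2 source code, it is implemented as follows: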

  /**
    * Shuffle the elements of a collection into a random order, returning the
    * result in a new collection. Unlike scala.util.Random.shuffle, this method
    * uses a local random number generator, avoiding inter-thread contention.
    *
    * @param seq the collection whose elements are shuffled
    * @tparam T the element type
    * @return a new sequence holding the elements of seq in random order
    */
  def randomize[T: ClassManifest](seq: TraversableOnce[T]): Seq[T] = {
    randomizeInPlace(seq.toArray)
  }

  /**
    * Shuffle the elements of an array into a random order, modifying the
    * original array. Returns the original array.
    */
  def randomizeInPlace[T](arr: Array[T], rand: Random = new Random): Array[T] = {
    // Walk from the last index down to 1.
    for (i <- (arr.length - 1) to 1 by -1) {
      // Pick a random index j in [0, i) and swap arr(j) with arr(i).
      val j = rand.nextInt(i)
      val tmp = arr(j)
      arr(j) = arr(i)
      arr(i) = tmp
    }
    arr
  }

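What deserves attention here is that randomizeInPlace accepts the random number generator as a Random parameter (defaulting to a fresh instance on each call), so each caller shuffles with its own local generator and threads do not contend for a shared one.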
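
The effect of the Random parameter can be tried out with a small standalone sketch. The object name ShuffleSketch and the helper fisherYatesInPlace are hypothetical and not part of Spark. randomizeInPlace is copied from the listing above, while fisherYatesInPlace is the textbook Fisher-Yates variant that draws j from [0, i] instead of [0, i). It is included only for comparison and says nothing about what other Spark versions do.

```scala
import scala.util.Random

// Hypothetical wrapper object for this sketch; not part of Spark.
object ShuffleSketch {

  // Same loop as the quoted listing: j is drawn from [0, i).
  def randomizeInPlace[T](arr: Array[T], rand: Random = new Random): Array[T] = {
    for (i <- (arr.length - 1) to 1 by -1) {
      val j = rand.nextInt(i)
      val tmp = arr(j)
      arr(j) = arr(i)
      arr(i) = tmp
    }
    arr
  }

  // Textbook Fisher-Yates for comparison (an assumption of this sketch):
  // j is drawn from [0, i], so an element may also stay where it is.
  def fisherYatesInPlace[T](arr: Array[T], rand: Random = new Random): Array[T] = {
    for (i <- (arr.length - 1) to 1 by -1) {
      val j = rand.nextInt(i + 1)
      val tmp = arr(j)
      arr(j) = arr(i)
      arr(i) = tmp
    }
    arr
  }

  def main(args: Array[String]): Unit = {
    // Passing an explicitly seeded generator makes the shuffle reproducible.
    val a = randomizeInPlace(Array(1, 2, 3, 4, 5), new Random(42L))
    val b = randomizeInPlace(Array(1, 2, 3, 4, 5), new Random(42L))
    println(a.sameElements(b)) // true

    // With j in [0, i), no element ever ends up at its original index.
    val anyFixedPoint = (1 to 1000).exists { _ =>
      val out = randomizeInPlace(Array(0, 1, 2, 3, 4))
      out.indices.exists(k => out(k) == k)
    }
    println(anyFixedPoint) // false
  }
}
```

Because the generator is an argument, a test can pass a seeded Random and get the same order every time, while production callers can rely on the default per-call instance. The second check illustrates a property of the quoted loop: drawing j strictly below i is known as Sattolo's algorithm, which produces only cyclic permutations, so not every ordering of the input is reachable. A caller that needs a uniform shuffle over all permutations would draw j with rand.nextInt(i + 1), as in fisherYatesInPlace.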

      Article link: https://www.haomeiwen.com/subject/ibvjaftx.html