rdd dataframe streaming cache pe

Author: chailei | Published: 2018-09-11 10:25

RDD

  /**
   * Persist this RDD with the default storage level (`MEMORY_ONLY`).
   */
  def persist(): this.type = persist(StorageLevel.MEMORY_ONLY)

  /**
   * Persist this RDD with the default storage level (`MEMORY_ONLY`).
   */
  def cache(): this.type = persist()

Dataset

  /**
   * Persist this Dataset with the default storage level (`MEMORY_AND_DISK`).
   *
   * @group basic
   * @since 1.6.0
   */
  def persist(): this.type = {
    sparkSession.sharedState.cacheManager.cacheQuery(this)
    this
  }

  /**
   * Persist this Dataset with the default storage level (`MEMORY_AND_DISK`).
   *
   * @group basic
   * @since 1.6.0
   */
  def cache(): this.type = persist()

Streaming

  /** Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER) */
  def persist(): DStream[T] = persist(StorageLevel.MEMORY_ONLY_SER)

  /** Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER) */
  def cache(): DStream[T] = persist()

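In common: in all three APIs, cache() simply delegates to the no-argument persist().
Difference: the default storage level that persist() uses.
RDD defaults to MEMORY_ONLY.
Dataset defaults to MEMORY_AND_DISK.
Streaming (DStream) defaults to MEMORY_ONLY_SER.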
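
The defaults quoted above can be checked from user code. The following is a minimal sketch, assuming Spark 2.1 or later (where `Dataset.storageLevel` is available) and a local master. The object name `CacheDefaults` and the sample data are made up for illustration. The DStream case is left out because its storage level is not exposed through a public getter.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

// Hypothetical driver object, not part of Spark or the original article.
object CacheDefaults {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("cache-defaults")
      .master("local[*]") // assumption: run locally for the check
      .getOrCreate()
    import spark.implicits._

    // RDD: cache() delegates to persist(), which uses MEMORY_ONLY.
    val rdd = spark.sparkContext.parallelize(1 to 10).cache()
    println(s"RDD level:     ${rdd.getStorageLevel}")
    println(s"is MEMORY_ONLY: ${rdd.getStorageLevel == StorageLevel.MEMORY_ONLY}")

    // Dataset: cache() delegates to persist(), documented as MEMORY_AND_DISK.
    val ds = (1 to 10).toDS().cache()
    println(s"Dataset level: ${ds.storageLevel}")
    println(s"is MEMORY_AND_DISK: ${ds.storageLevel == StorageLevel.MEMORY_AND_DISK}")

    // To avoid relying on a default, pass the level explicitly.
    val explicit = spark.sparkContext
      .parallelize(1 to 10)
      .persist(StorageLevel.MEMORY_AND_DISK)
    println(s"Explicit level: ${explicit.getStorageLevel}")

    spark.stop()
  }
}
```

Because the defaults differ between the three APIs, passing an explicit `StorageLevel` to `persist` is a reasonable choice whenever the memory or disk behaviour matters.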

Article link: https://www.haomeiwen.com/subject/crvygftx.html