Spark-Streaming
Spark-Streaming, mind map
Spark-Streaming:
Entry point:
StreamingContext
DStream:
RDD queue
Custom data source (extend Receiver and implement the onStart and onStop methods)
Kafka data source
DStream transformations:
Stateless transformations:
transform
join
Stateful transformations:
updateStateByKey
Window operations:
window
countByWindow
reduceByWindow
reduceByKeyAndWindow(func, windowLength, slideInterval, [numTasks])
reduceByKeyAndWindow(func, invFunc, windowLength, slideInterval, [numTasks])
DStream output:
print()
saveAsTextFiles(prefix, [suffix])
saveAsObjectFiles(prefix, [suffix])
saveAsHadoopFiles(prefix, [suffix])
foreachRDD(func)
Graceful shutdown
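
The two reduceByKeyAndWindow variants listed under window operations differ only in how each window is computed: the first re-reduces every batch in the window, while the second adds the batches that enter the window and uses invFunc to remove the batches that leave it. The sketch below imitates that idea in plain Python without Spark. The names window_full and window_incremental are hypothetical, the window length and slide interval are counted in batches rather than in time, and dropping keys whose value falls back to zero is a choice made for this counting demo only.

```python
import operator


def window_full(batches, func, window_len, slide):
    """Recompute every window from scratch (no inverse function)."""
    results = []
    for end in range(slide, len(batches) + 1, slide):
        start = max(0, end - window_len)
        state = {}
        for batch in batches[start:end]:
            for key, value in batch:
                state[key] = func(state[key], value) if key in state else value
        results.append(state)
    return results


def window_incremental(batches, func, inv_func, window_len, slide):
    """Update the previous window: add entering batches, remove leaving ones."""
    results = []
    state = {}
    prev_start, prev_end = 0, 0
    for end in range(slide, len(batches) + 1, slide):
        start = max(0, end - window_len)
        # Batches that entered the window since the last emission.
        for batch in batches[prev_end:end]:
            for key, value in batch:
                state[key] = func(state[key], value) if key in state else value
        # Batches that slid out of the window since the last emission.
        for batch in batches[prev_start:start]:
            for key, value in batch:
                state[key] = inv_func(state[key], value)
        # Demo-specific assumption: a count of 0 means the key left the window.
        results.append({k: v for k, v in state.items() if v != 0})
        prev_start, prev_end = start, end
    return results


# Hypothetical input: four micro-batches of (word, 1) pairs.
batches = [
    [("a", 1), ("b", 1)],
    [("a", 1)],
    [("b", 1)],
    [("a", 1), ("a", 1)],
]

full = window_full(batches, operator.add, window_len=3, slide=1)
incremental = window_incremental(
    batches, operator.add, operator.sub, window_len=3, slide=1
)
print(full == incremental)
print(full[-1])
```

The incremental form only pays off when func has an inverse, such as addition and subtraction; a reduction like max has none, so it would have to use the first variant.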