Spark Source: external data sources
- https://blog.csdn.net/oopsoom/article/details/42064075
- Spark SQL source code analysis
https://blog.csdn.net/oopsoom/article/details/38257749
RDD
Spark SQL execution flow
Spark Catalyst
Memory
Spark Task
Spark tuning
1.http://marsishandsome.github.io/SparkSQL-Internal/03-performance-turning/
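Covers tuning from three angles: Spark parallelism, data format (columnar storage), and choosing a suitable number of tasks (the default of 200 is the value of spark.sql.shuffle.partitions).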
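To make the task-count point concrete, here is a minimal sketch in plain Python, not Spark code. It assumes one task per shuffle partition and uses a simple modulo as a stand-in for Spark's own hash partitioning; the names partition_for and task_sizes are hypothetical and exist only for illustration.

```python
from collections import Counter

# Assumption: 200 mirrors the default of spark.sql.shuffle.partitions.
DEFAULT_SHUFFLE_PARTITIONS = 200


def partition_for(key: int, num_partitions: int) -> int:
    """Toy stand-in for a hash partitioner: key modulo the partition count."""
    return key % num_partitions


def task_sizes(keys, num_partitions=DEFAULT_SHUFFLE_PARTITIONS):
    """Return the number of rows each task would process (one task per partition)."""
    counts = Counter(partition_for(k, num_partitions) for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


# 1000 rows spread over the default 200 partitions: many tiny tasks.
small = task_sizes(range(1000))
# The same rows over 8 partitions: fewer, larger tasks.
tuned = task_sizes(range(1000), num_partitions=8)

print(len(small), max(small))  # 200 5
print(len(tuned), max(tuned))  # 8 125
```

For a small dataset the default yields 200 tasks of 5 rows each, so scheduling overhead can outweigh the work done; for a very large dataset the same default can leave each task with too much data. That trade-off is why the partition count is worth tuning per workload.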
Spark storage
1.http://jerryshao.me/2013/10/08/spark-storage-module-analysis/
Covers the communication layer and the storage layer: how the driver and the executors communicate, and the core class BlockManager.
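The division of labor between the two layers can be pictured with a toy model in plain Python. This is a simplified sketch under stated assumptions, not Spark's API: it assumes each executor stores blocks locally and reports them to a driver-side registry that tracks where every block lives. ToyBlockMaster, ToyBlockManager, and the block ids are hypothetical names used only for illustration.

```python
from collections import defaultdict


class ToyBlockMaster:
    """Driver-side registry (hypothetical): block id -> executors holding it."""

    def __init__(self):
        self.locations = defaultdict(set)

    def report_block(self, executor_id, block_id):
        # Communication layer: an executor tells the driver about a new block.
        self.locations[block_id].add(executor_id)

    def get_locations(self, block_id):
        return sorted(self.locations.get(block_id, ()))


class ToyBlockManager:
    """Executor-side store (hypothetical): keeps blocks and reports them."""

    def __init__(self, executor_id, master):
        self.executor_id = executor_id
        self.master = master
        self.blocks = {}

    def put(self, block_id, data):
        # Storage layer: keep the data locally, then notify the driver.
        self.blocks[block_id] = data
        self.master.report_block(self.executor_id, block_id)

    def get_local(self, block_id):
        return self.blocks.get(block_id)


master = ToyBlockMaster()
exec1 = ToyBlockManager("exec-1", master)
exec2 = ToyBlockManager("exec-2", master)
exec1.put("rdd_0_0", [1, 2, 3])
exec2.put("rdd_0_0", [1, 2, 3])
exec2.put("rdd_0_1", [4, 5])

print(master.get_locations("rdd_0_0"))
print(exec1.get_local("rdd_0_1"))
```

The point of the sketch is the separation: the data itself stays on the executors, while the driver only holds metadata about block locations.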
Spark scheduling
Spark Streaming
1.Structured Streaming: implementation approach and overview
https://github.com/lw-lin/CoolplaySpark/blob/master/Structured%20Streaming%20%E6%BA%90%E7%A0%81%E8%A7%A3%E6%9E%90%E7%B3%BB%E5%88%97/1.1%20Structured%20Streaming%20%E5%AE%9E%E7%8E%B0%E6%80%9D%E8%B7%AF%E4%B8%8E%E5%AE%9E%E7%8E%B0%E6%A6%82%E8%BF%B0.md
2.Source analysis
https://github.com/lw-lin/CoolplaySpark/blob/master/Structured%20Streaming%20%E6%BA%90%E7%A0%81%E8%A7%A3%E6%9E%90%E7%B3%BB%E5%88%97/2.1%20Structured%20Streaming%20%E4%B9%8B%20Source%20%E8%A7%A3%E6%9E%90.md
3.Sink analysis
https://github.com/lw-lin/CoolplaySpark/blob/master/Structured%20Streaming%20%E6%BA%90%E7%A0%81%E8%A7%A3%E6%9E%90%E7%B3%BB%E5%88%97/2.2%20Structured%20Streaming%20%E4%B9%8B%20Sink%20%E8%A7%A3%E6%9E%90.md