References
Flink Data Streaming Fault Tolerance
High-throughput, low-latency, and exactly-once stream processing with Apache Flink™
An Overview of End-to-End Exactly-Once Processing in Apache Flink (with Apache Kafka, too!)
Advantages of true streaming
- low latency
- flow control
- and a true streaming programming model (e.g. session windows; with micro-batching, windows have to be aligned with the batch interval)
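The alignment point in the last bullet can be made concrete with a small sketch. It is plain Java, not Flink or Spark code, and the class and method names (`WindowingSketch`, `sessionWindows`, `fixedBatches`) are hypothetical. It assumes sorted, non-negative timestamps and compares gap-based session windows with grouping by a fixed batch interval.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: contrasts gap-based sessions with fixed intervals.
final class WindowingSketch {

    /** Starts a new session whenever the gap to the previous event exceeds maxGap. */
    static List<List<Long>> sessionWindows(List<Long> sortedTimestamps, long maxGap) {
        List<List<Long>> sessions = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        for (long ts : sortedTimestamps) {
            if (!current.isEmpty() && ts - current.get(current.size() - 1) > maxGap) {
                sessions.add(current);
                current = new ArrayList<>();
            }
            current.add(ts);
        }
        if (!current.isEmpty()) {
            sessions.add(current);
        }
        return sessions;
    }

    /** Groups events by the fixed interval [k * interval, (k + 1) * interval) they fall into. */
    static List<List<Long>> fixedBatches(List<Long> sortedTimestamps, long interval) {
        List<List<Long>> batches = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentBucket = -1;
        for (long ts : sortedTimestamps) {
            long bucket = ts / interval;
            if (!current.isEmpty() && bucket != currentBucket) {
                batches.add(current);
                current = new ArrayList<>();
            }
            currentBucket = bucket;
            current.add(ts);
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

For the timestamps 1, 2, 3, 9, 10, 11 with a session gap of 3, the sessions are [1, 2, 3] and [9, 10, 11]. With a fixed interval of 10 the second burst of activity is cut in two, which is why session logic on top of micro-batches needs extra bookkeeping across batch boundaries.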
Advantages of micro-batching
- high throughput (processing records in batches is more efficient)
- and exactly-once guarantees (a micro-batch either succeeds as a whole or fails as a whole)
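The all-or-nothing behaviour of a micro-batch can be sketched as follows. This is a simplified single-process model, not the implementation of any particular engine. `MicroBatchCommit` and its methods are hypothetical names, and an empty string stands in for a record that fails to process.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: a word count whose state changes per batch, atomically.
final class MicroBatchCommit {
    private Map<String, Integer> committed = new HashMap<>();

    /** Applies the whole batch to a working copy and publishes it only if every record succeeds. */
    boolean processBatch(List<String> words) {
        Map<String, Integer> working = new HashMap<>(committed);
        try {
            for (String word : words) {
                if (word == null || word.isEmpty()) {
                    throw new IllegalArgumentException("bad record");
                }
                working.merge(word, 1, Integer::sum);
            }
        } catch (IllegalArgumentException e) {
            return false; // the working copy is dropped; committed state is untouched
        }
        committed = working; // single reference swap: the batch becomes visible as a whole
        return true;
    }

    /** Returns an immutable copy of the committed counts. */
    Map<String, Integer> counts() {
        return Map.copyOf(committed);
    }
}
```

Because a failed batch leaves no partial updates behind, it can simply be retried, and each record ends up counted exactly once in the committed state.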
Flink combines both sets of properties
- it keeps the advantages of true streaming
- it uses asynchronous snapshots to achieve exactly-once; taking a snapshot does not interrupt processing of the data stream
For small state (e.g., counts or other statistical summaries), the overhead of backing up the state is usually negligible, while for large state, the checkpoint interval trades throughput against recovery time.
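A minimal sketch of the idea behind asynchronous snapshots, under strong simplifying assumptions: a single operator, an in-memory "checkpoint store", and a replayable input addressed by offset. It is not Flink's actual mechanism, which coordinates snapshots across operators by injecting barriers into the streams. `AsyncSnapshotSketch` and all of its members are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration only: one operator with word-count state.
final class AsyncSnapshotSketch {

    /** A completed checkpoint: the input offset and the state as of that offset. */
    static final class Checkpoint {
        final int offset;
        final Map<String, Integer> state;

        Checkpoint(int offset, Map<String, Integer> state) {
            this.offset = offset;
            this.state = state;
        }
    }

    private final Map<String, Integer> counts = new HashMap<>();
    private int offset = 0;

    void process(String word) {
        counts.merge(word, 1, Integer::sum);
        offset++;
    }

    /** Copies the state on the processing thread, then completes the checkpoint on another thread. */
    CompletableFuture<Checkpoint> snapshotAsync() {
        Map<String, Integer> frozen = new HashMap<>(counts); // synchronous part, cheap for small state
        int at = offset;
        // Stand-in for the slow write to durable storage; processing continues meanwhile.
        return CompletableFuture.supplyAsync(() -> new Checkpoint(at, frozen));
    }

    /** Rebuilds an operator from a checkpoint; the caller replays input from offset(). */
    static AsyncSnapshotSketch restore(Checkpoint checkpoint) {
        AsyncSnapshotSketch operator = new AsyncSnapshotSketch();
        operator.counts.putAll(checkpoint.state);
        operator.offset = checkpoint.offset;
        return operator;
    }

    Map<String, Integer> counts() {
        return Map.copyOf(counts);
    }

    int offset() {
        return offset;
    }
}
```

After a failure, the operator is restored from the last completed checkpoint and the input is replayed from the stored offset, so every record is reflected in the state exactly once. The number of records to replay grows with the time since the last checkpoint, which is the recovery-time side of the checkpoint interval tradeoff.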
Other problems with Storm’s record acknowledgment mechanism are low throughput and poor flow control, as the mechanism often falsely classifies records as failed under backpressure.
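The false classification can be illustrated with a toy queueing model. It assumes a single operator with a fixed service time per record and an acknowledgment timeout, and it is not Storm's API; `AckTimeoutSketch` and `timedOut` are hypothetical names. When records arrive faster than they can be processed, queueing delay grows until acknowledgments arrive after the timeout, even though no record was lost.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: timeout-based failure detection under load.
final class AckTimeoutSketch {

    /**
     * Returns the indexes of records whose acknowledgment arrives later than timeoutMs
     * after their arrival. Such records would be treated as failed and replayed.
     */
    static List<Integer> timedOut(long[] arrivalMs, long serviceMs, long timeoutMs) {
        List<Integer> result = new ArrayList<>();
        long operatorFreeAt = 0;
        for (int i = 0; i < arrivalMs.length; i++) {
            long start = Math.max(arrivalMs[i], operatorFreeAt); // wait while the operator is busy
            operatorFreeAt = start + serviceMs;                   // the ack is sent on completion
            if (operatorFreeAt - arrivalMs[i] > timeoutMs) {
                result.add(i);
            }
        }
        return result;
    }
}
```

With a service time of 30 ms and a timeout of 60 ms, records arriving every 30 ms are all acknowledged in time, while records arriving every 10 ms start to time out from the third record on. The model leaves out the replays themselves, which in a real system would add further load.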