Windows是处理无限流的核心。Windows将流分割为有限大小的“桶”,我们可以在其中应用计算。本文档关注的是如何在Flink中执行窗口操作,以及程序员如何从其提供的功能中获益最多。
下面给出了一个窗口Flink程序的一般结构。第一个片段指的是keyed流,而第一个代码片段是指non-keyed流。正如你所看到的,唯一的区别是keyed流调用keyBy(…),并且对于non-keyed流window(…)变成了windowAll(…)。这也将成为本页其余部分的路线图。
Keyed Windows
stream
.keyBy(...) <- keyed versus non-keyed windows
.window(...) <- required: "assigner"
[.trigger(...)] <- optional: "trigger" (else default trigger)
[.evictor(...)] <- optional: "evictor" (else no evictor)
[.allowedLateness(...)] <- optional: "lateness" (else zero)
[.sideOutputLateData(...)] <- optional: "output tag" (else no side output for late data)
.reduce/aggregate/fold/apply() <- required: "function"
[.getSideOutput(...)] <- optional: "output tag"
Non-Keyed Windows
stream
.windowAll(...) <- required: "assigner"
[.trigger(...)] <- optional: "trigger" (else default trigger)
[.evictor(...)] <- optional: "evictor" (else no evictor)
[.allowedLateness(...)] <- optional: "lateness" (else zero)
[.sideOutputLateData(...)] <- optional: "output tag" (else no side output for late data)
.reduce/aggregate/fold/apply() <- required: "function"
[.getSideOutput(...)] <- optional: "output tag"
网友评论