美文网首页
flink的TimeCharacteristic

flink的TimeCharacteristic

作者: ATNOW | 来源:发表于2019-09-26 16:45 被阅读0次

    概述:

    • flink的TimeCharacteristic枚举定义了三类值,分别是ProcessingTime、IngestionTime、EventTime

    • ProcessingTime是以operator处理的时间为准,它使用的是机器的系统时间来作为data stream的时间;

    • IngestionTime是以数据进入flink streaming data flow的时间为准;

    • EventTime是以数据自带的时间戳字段为准,应用程序需要指定如何从record中抽取时间戳字段

    • 指定为EventTime的source需要自己定义event time以及emit watermark,或者在source之外通过assignTimestampsAndWatermarks在程序手工指定

      TimeCharacteristic处于flink/streaming/api/目录下,代码如下
    /*
     * Licensed to the Apache Software Foundation (ASF) under one
     * or more contributor license agreements.  See the NOTICE file
     * distributed with this work for additional information
     * regarding copyright ownership.  The ASF licenses this file
     * to you under the Apache License, Version 2.0 (the
     * "License"); you may not use this file except in compliance
     * with the License.  You may obtain a copy of the License at
     *
     *     http://www.apache.org/licenses/LICENSE-2.0
     *
     * Unless required by applicable law or agreed to in writing, software
     * distributed under the License is distributed on an "AS IS" BASIS,
     * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     * See the License for the specific language governing permissions and
     * limitations under the License.
     */
    
    package org.apache.flink.streaming.api;
    
    import org.apache.flink.annotation.PublicEvolving;
    
    /**
     * The time characteristic defines how the system determines time for time-dependent
     * order and operations that depend on time (such as time windows).
     */
    @PublicEvolving
    public enum TimeCharacteristic {
    
        /**
         * Processing time for operators means that the operator uses the system clock of the machine
         * to determine the current time of the data stream. Processing-time windows trigger based
         * on wall-clock time and include whatever elements happen to have arrived at the operator at
         * that point in time.
         *
         * <p>Using processing time for window operations results in general in quite non-deterministic
         * results, because the contents of the windows depends on the speed in which elements arrive.
         * It is, however, the cheapest method of forming windows and the method that introduces the
         * least latency.
         */
        /**ProcessingTime是以operator处理的时间为准,它使用的是机器的系统时间来作为data stream的时间**/
        ProcessingTime,
    
        /**
         * Ingestion time means that the time of each individual element in the stream is determined
         * when the element enters the Flink streaming data flow. Operations like windows group the
         * elements based on that time, meaning that processing speed within the streaming dataflow
         * does not affect windowing, but only the speed at which sources receive elements.
         *
         * <p>Ingestion time is often a good compromise between processing time and event time.
         * It does not need and special manual form of watermark generation, and events are typically
         * not too much out-or-order when they arrive at operators; in fact, out-of-orderness can
         * only be introduced by streaming shuffles or split/join/union operations. The fact that
         * elements are not very much out-of-order means that the latency increase is moderate,
         * compared to event
         * time.
         */
        /**IngestionTime是以数据进入flink streaming data flow的时间为准**/
        IngestionTime,
    
        /**
         * Event time means that the time of each individual element in the stream (also called event)
         * is determined by the event's individual custom timestamp. These timestamps either exist in
         * the elements from before they entered the Flink streaming dataflow, or are user-assigned at
         * the sources. The big implication of this is that it allows for elements to arrive in the
         * sources and in all operators out of order, meaning that elements with earlier timestamps may
         * arrive after elements with later timestamps.
         *
         * <p>Operators that window or order data with respect to event time must buffer data until they
         * can be sure that all timestamps for a certain time interval have been received. This is
         * handled by the so called "time watermarks".
         *
         * <p>Operations based on event time are very predictable - the result of windowing operations
         * is typically identical no matter when the window is executed and how fast the streams
         * operate. At the same time, the buffering and tracking of event time is also costlier than
         * operating with processing time, and typically also introduces more latency. The amount of
         * extra cost depends mostly on how much out of order the elements arrive, i.e., how long the
         * time span between the arrival of early and late elements is. With respect to the
         * "time watermarks", this means that the cost typically depends on how early or late the
         * watermarks can be generated for their timestamp.
         *
         * <p>In relation to {@link #IngestionTime}, the event time is similar, but refers the the
         * event's original time, rather than the time assigned at the data source. Practically, that
         * means that event time has generally more meaning, but also that it takes longer to determine
         * that all elements for a certain time have arrived.
         */
        /**EventTime是以数据自带的时间戳字段为准,应用程序需要指定如何从record中抽取时间戳字段**/
        EventTime
    }
    

    不同时间种类 :

    不同时间种类

    Event Time 在数据最源头产生时带有时间戳,后面都需要用时间戳来进行运算。

    EventTime

    EventTime和ProcessingTime

    EventTime是用事件真实产生的时间戳去做 Re-bucketing ,重要性在于记录引擎输出运算结果的时间。简单来说,流式引擎连续 24 小时在运行、搜集资料,假设 Pipeline 里有一个 windows Operator 正在做运算,每小时能产生结果,何时输出 windows 的运算值,这个时间点就是 Event - Time 处理的精髓,用来表示该收的数据已经收到

    相关文章

      网友评论

          本文标题:flink的TimeCharacteristic

          本文链接:https://www.haomeiwen.com/subject/uqtgyctx.html