Flink Metrics and How to Use Them

Author: todd5167 | Published: 2019-09-27 20:13

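    Flink metrics expose internal runtime indicators to external systems. Examples are the JVM statistics of the running Flink framework, or custom indicators defined by an application built on Flink.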

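    Metric types

    Flink provides four metric types: Counter, Gauge, Histogram, and Meter. By extending a RichFunction we can obtain the MetricGroup through getRuntimeContext().getMetricGroup() and register metrics on it.

    Counter:
    Used to count things, for example the total number of input or output records.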

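    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.metrics.Counter;

    public class MyMapper extends RichMapFunction<String, String> {
      private transient Counter counter;

      @Override
      public void open(Configuration config) {
        // Register the counter on the operator's MetricGroup.
        this.counter = getRuntimeContext()
          .getMetricGroup()
          .counter("myCounter");
      }

      @Override
      public String map(String value) throws Exception {
        // One increment per processed record.
        this.counter.inc();
        return value;
      }
    }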
    

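    Gauge:
    Can expose a value of any type. To use it, implement the org.apache.flink.metrics.Gauge interface and override getValue. If the returned type is an arbitrary Object, that class should override toString, because the value is reported as a string.

    When a metric has to be computed from business logic, a Gauge is the more flexible choice.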

    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.metrics.Gauge;

    public class MyMapper extends RichMapFunction<String, String> {
      private transient int valueToExpose = 0;

      @Override
      public void open(Configuration config) {
        getRuntimeContext()
          .getMetricGroup()
          .gauge("MyGauge", new Gauge<Integer>() {
            @Override
            public Integer getValue() {
              // Read whenever the gauge is reported.
              return valueToExpose;
            }
          });
      }

      @Override
      public String map(String value) throws Exception {
        valueToExpose++;
        return value;
      }
    }
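    To illustrate the point about metrics computed from business logic, here is a minimal sketch that derives a dirty-data ratio from two counts. It uses only the JDK: Supplier<Double> stands in for Gauge<Double> so the example runs without Flink, and the names DirtyRatioSketch, record, and dirtyRatio are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Hypothetical stand-in: Supplier<Double> plays the role of Gauge<Double>,
// so this sketch compiles and runs without Flink on the classpath.
final class DirtyRatioSketch {
    private final AtomicLong total = new AtomicLong();
    private final AtomicLong dirty = new AtomicLong();

    // Evaluated lazily, each time the value is requested (as a gauge would be).
    final Supplier<Double> dirtyRatio = () -> {
        long seen = total.get();
        return seen == 0 ? 0.0 : (double) dirty.get() / seen;
    };

    // Called once per record; 'valid' is the outcome of the business check.
    void record(boolean valid) {
        total.incrementAndGet();
        if (!valid) {
            dirty.incrementAndGet();
        }
    }
}
```

    In a real job the body of the lambda would go into getValue() of a Gauge<Double> registered in open(), in the same way as MyGauge above.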
    

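    Meter:
    Used to measure an average rate (throughput). It is usually more convenient to use the MeterView implementation directly, which derives the rate from a Counter.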

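    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.metrics.Counter;
    import org.apache.flink.metrics.Meter;
    import org.apache.flink.metrics.MeterView;

    public class MyMapper extends RichMapFunction<Long, Long> {
      private transient Counter numInBytes;
      private transient Meter meter;

      @Override
      public void open(Configuration config) {
        // The counter must be registered first; MeterView derives its rate from it.
        this.numInBytes = getRuntimeContext()
          .getMetricGroup()
          .counter("numInBytes");
        // Rate of numInBytes, averaged over a 20-second time span.
        this.meter = getRuntimeContext()
          .getMetricGroup()
          .meter("myMeter", new MeterView(numInBytes, 20));
      }

      @Override
      public Long map(Long value) throws Exception {
        numInBytes.inc(value);
        return value;
      }
    }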
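    The idea of deriving an average rate from a counter over a time span can be sketched in plain Java. This is a simplified model and not Flink's MeterView source: the class name WindowedRateSketch, its methods, and the assumption that update() is called once per update interval by some scheduler are all hypothetical.

```java
// Hypothetical, simplified model of a counter-backed meter.
// Assumption: update() is invoked once every updateIntervalSeconds.
final class WindowedRateSketch {
    private final long[] snapshots;   // ring buffer of counter readings
    private final int timeSpanSeconds;
    private int newest = 0;           // index of the latest snapshot
    private long count = 0;           // running event count
    private double rate = 0.0;        // events per second over the time span

    WindowedRateSketch(int timeSpanSeconds, int updateIntervalSeconds) {
        this.timeSpanSeconds = timeSpanSeconds;
        // One extra slot so newest and oldest snapshots are a full span apart.
        this.snapshots = new long[timeSpanSeconds / updateIntervalSeconds + 1];
    }

    void markEvent(long n) {
        count += n;
    }

    void update() {
        newest = (newest + 1) % snapshots.length;
        snapshots[newest] = count;
        int oldest = (newest + 1) % snapshots.length;
        rate = (double) (snapshots[newest] - snapshots[oldest]) / timeSpanSeconds;
    }

    double getRate() {
        return rate;
    }
}
```

    With a 20-second span and a 5-second update interval, a steady 50 events per interval converges to a rate of 10 events per second once the buffer is full; before that, the reported rate is lower because the oldest slots still hold zero.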
    
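    Adding custom metrics

    Taking Kafka reads and writes in Flink 1.5 as an example, we add metrics such as RPS and dirtyData. For both reading and writing, the key is to first obtain the RuntimeContext, initialize the metrics, and pass them to the serialization schema in use. The metrics are then updated by overriding the serialization and deserialization methods.

    Kafka read/write demo without metrics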
    public class FlinkEtlTest {
        private static final Logger logger = LoggerFactory.getLogger(FlinkEtlTest.class);
    
        public static void main(String[] args) throws Exception {
            final ParameterTool params = ParameterTool.fromArgs(args);
            String jobName = params.get("jobName");
    
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    
            /** Kafka settings */
            String topic = "myTest01";
            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "localhost:9092");
            props.setProperty("zookeeper.quorum", "localhost:2181/kafka");
    
            // Read from Kafka with FlinkKafkaConsumer09, using SimpleStringSchema as the deserialization schema
            FlinkKafkaConsumer09<String> consumer09 = new FlinkKafkaConsumer09<>(topic, new SimpleStringSchema(), props);
            consumer09.setStartFromEarliest();
    
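            // Write to Kafka with FlinkKafkaProducer09, using SimpleStringSchema as the serialization schema.
            // Note: this demo writes back to the topic it reads from, so its own output is consumed again.
            String sinkBrokers = "localhost:9092";
            FlinkKafkaProducer09<String> myProducer = new FlinkKafkaProducer09<>(sinkBrokers, "myTest01", new SimpleStringSchema());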
    
    
            DataStream<String> kafkaDataStream = env.addSource(consumer09);
            kafkaDataStream = kafkaDataStream.map(str -> {
                logger.info("map receive {}",str);
                return str.toUpperCase();
            });
    
            kafkaDataStream.addSink(myProducer);
    
            env.execute(jobName);
        }
    
        
    }
    
    Adding metrics to the Kafka source
    • Extend FlinkKafkaConsumer09, obtain its RuntimeContext, and use the current MetricGroup to initialize the metrics.
    public class CustomerFlinkKafkaConsumer09<T> extends FlinkKafkaConsumer09<T> {

        private final CustomerSimpleStringSchema customerSimpleStringSchema;

        // The parent class has several constructors; only this one is wrapped here.
        // The schema passed in must be a CustomerSimpleStringSchema, otherwise the cast fails.
        public CustomerFlinkKafkaConsumer09(String topic, DeserializationSchema<T> valueDeserializer, Properties props) {
            super(topic, valueDeserializer, props);
            this.customerSimpleStringSchema = (CustomerSimpleStringSchema) valueDeserializer;
        }

        @Override
        public void run(SourceContext<T> sourceContext) throws Exception {
            // Pass the RuntimeContext to customerSimpleStringSchema
            customerSimpleStringSchema.setRuntimeContext(getRuntimeContext());
            // Initialize the metrics before any record is deserialized
            customerSimpleStringSchema.initMetric();
            super.run(sourceContext);
        }
    }
    
    
    • Override the deserialization method of SimpleStringSchema so that the metrics are updated whenever a record arrives.
    public class CustomerSimpleStringSchema extends SimpleStringSchema {
    
        private static final Logger logger = LoggerFactory.getLogger(CustomerSimpleStringSchema.class);
    
        public static final String DT_NUM_RECORDS_RESOVED_IN_COUNTER = "dtNumRecordsInResolve";
        public static final String DT_NUM_RECORDS_RESOVED_IN_RATE = "dtNumRecordsInResolveRate";
        public static final String DT_DIRTY_DATA_COUNTER = "dtDirtyData";
        public static final String DT_NUM_BYTES_IN_COUNTER = "dtNumBytesIn";
        public static final String DT_NUM_RECORDS_IN_RATE = "dtNumRecordsInRate";
    
        public static final String DT_NUM_BYTES_IN_RATE = "dtNumBytesInRate";
        public static final String DT_NUM_RECORDS_IN_COUNTER = "dtNumRecordsIn";
    
    
    
        protected transient Counter numInResolveRecord;
        //source RPS
        protected transient Meter numInResolveRate;
        //source dirty data
        protected transient Counter dirtyDataCounter;
    
        // tps
        protected transient Meter numInRate;
        protected transient Counter numInRecord;
    
        //bps
        protected transient Counter numInBytes;
        protected transient Meter numInBytesRate;
    
    
    
        private transient RuntimeContext runtimeContext;
    
        public void initMetric() {
            numInResolveRecord = runtimeContext.getMetricGroup().counter(DT_NUM_RECORDS_RESOVED_IN_COUNTER);
            numInResolveRate = runtimeContext.getMetricGroup().meter(DT_NUM_RECORDS_RESOVED_IN_RATE, new MeterView(numInResolveRecord, 20));
            dirtyDataCounter = runtimeContext.getMetricGroup().counter(DT_DIRTY_DATA_COUNTER);
    
            numInBytes = runtimeContext.getMetricGroup().counter(DT_NUM_BYTES_IN_COUNTER);
            numInRecord = runtimeContext.getMetricGroup().counter(DT_NUM_RECORDS_IN_COUNTER);
    
            numInRate = runtimeContext.getMetricGroup().meter(DT_NUM_RECORDS_IN_RATE, new MeterView(numInRecord, 20));
            numInBytesRate = runtimeContext.getMetricGroup().meter(DT_NUM_BYTES_IN_RATE , new MeterView(numInBytes, 20));
    
    
    
        }
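        // Source side: override deserialize to update the metrics
        @Override
        public String deserialize(byte[] value) {
            // Update the metrics for every incoming record
            numInBytes.inc(value.length);
            numInResolveRecord.inc();
            numInRecord.inc();
            try {
                return super.deserialize(value);
            } catch (Exception e) {
                // A record that cannot be deserialized counts as dirty data
                dirtyDataCounter.inc();
            }
            // Dirty records are passed downstream as an empty string
            return "";
        }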
    
    
        public void setRuntimeContext(RuntimeContext runtimeContext) {
            this.runtimeContext = runtimeContext;
        }
    }
    
    • Using the new source API
    CustomerFlinkKafkaConsumer09<String> consumer09 = new CustomerFlinkKafkaConsumer09<>(topic, new CustomerSimpleStringSchema(), props);
    
    
    Adding metrics to the Kafka sink
    • Extend FlinkKafkaProducer09 and override open to obtain the RuntimeContext, initialize the metrics, and pass them to CustomerSinkStringSchema.
    
    public class  CustomerFlinkKafkaProducer09<T> extends FlinkKafkaProducer09<T> {
    
        public static final String DT_NUM_RECORDS_OUT = "dtNumRecordsOut";
        public static final String DT_NUM_RECORDS_OUT_RATE = "dtNumRecordsOutRate";
    
        CustomerSinkStringSchema schema;
    
        public CustomerFlinkKafkaProducer09(String brokerList, String topicId, SerializationSchema serializationSchema) {
            super(brokerList, topicId, serializationSchema);
            this.schema = (CustomerSinkStringSchema) serializationSchema;
        }
    
    
    
        @Override
        public void open(Configuration configuration) {
            // Do not create the Kafka producer here: super.open() already does,
            // and creating it twice would leak the first instance.
            RuntimeContext ctx = getRuntimeContext();
            Counter counter = ctx.getMetricGroup().counter(DT_NUM_RECORDS_OUT);
            // Sink RPS: rate of the counter over a 20-second time span.
            // Registering the MeterView is enough; the reference is not needed afterwards.
            ctx.getMetricGroup().meter(DT_NUM_RECORDS_OUT_RATE, new MeterView(counter, 20));
            // Pass the counter to CustomerSinkStringSchema
            schema.setCounter(counter);

            super.open(configuration);
        }
    
    }
    
    
    • Override the serialization method of SimpleStringSchema
    public class CustomerSinkStringSchema extends SimpleStringSchema {
    
        private static final Logger logger = LoggerFactory.getLogger(CustomerSinkStringSchema.class);
    
        // Set at runtime in open(); transient so it is never shipped with the serialized schema
        private transient Counter sinkCounter;
    
        @Override
        public byte[] serialize(String element) {
            logger.info("sink data {}", element);
            sinkCounter.inc();
            return super.serialize(element);
        }
    
        public void setCounter(Counter counter) {
            this.sinkCounter = counter;
        }
    }
    
    
    • Using the new Kafka sink API
    CustomerFlinkKafkaProducer09<String> myProducer = new CustomerFlinkKafkaProducer09<>(sinkBrokers, "mqTest01", new CustomerSinkStringSchema());
    
    

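    The collected metrics now show up in the monitoring system, for example as flink_taskmanager_job_task_operator_dtDirtyData. Here dtDirtyData is the metric we added ourselves, and the prefix in front of it comes from the MetricGroup that the operator uses by default.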
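    The shape of that reported name can be sketched as follows. This is only an illustration under an assumption: the reporter joins a "flink" prefix, the scope components, and the metric name with underscores, as the example name suggests. The actual format depends on the reporter and the scope configuration, and MetricNameSketch and reportedName are hypothetical names.

```java
// Hypothetical helper: rebuilds a reported metric name from its parts.
// Assumption: the parts are joined with '_' under a "flink" prefix.
final class MetricNameSketch {
    static String reportedName(String metricName, String... scope) {
        StringBuilder sb = new StringBuilder("flink");
        for (String part : scope) {
            sb.append('_').append(part);
        }
        return sb.append('_').append(metricName).toString();
    }
}
```

    Calling reportedName("dtDirtyData", "taskmanager", "job", "task", "operator") yields the name quoted above, which makes it easy to see which part is the custom metric and which part is the scope.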

          Article link: https://www.haomeiwen.com/subject/wtctjctx.html