1. FlinkKafkaConsumer010
Flink already ships with a Kafka source implementation, FlinkKafkaConsumer010. Let's look at the concrete implementation first:
@PublicEvolving
public class FlinkKafkaConsumer010<T> extends FlinkKafkaConsumer09<T> {

    private static final long serialVersionUID = 2324564345203409112L;

    public FlinkKafkaConsumer010(String topic, DeserializationSchema<T> valueDeserializer, Properties props) {
        this(Collections.singletonList(topic), valueDeserializer, props);
    }

    public FlinkKafkaConsumer010(String topic, KeyedDeserializationSchema<T> deserializer, Properties props) {
        this(Collections.singletonList(topic), deserializer, props);
    }

    public FlinkKafkaConsumer010(List<String> topics, DeserializationSchema<T> deserializer, Properties props) {
        this(topics, new KeyedDeserializationSchemaWrapper<>(deserializer), props);
    }

    public FlinkKafkaConsumer010(List<String> topics, KeyedDeserializationSchema<T> deserializer, Properties props) {
        super(topics, deserializer, props);
    }

    @PublicEvolving
    public FlinkKafkaConsumer010(Pattern subscriptionPattern, DeserializationSchema<T> valueDeserializer, Properties props) {
        this(subscriptionPattern, new KeyedDeserializationSchemaWrapper<>(valueDeserializer), props);
    }

    @PublicEvolving
    public FlinkKafkaConsumer010(Pattern subscriptionPattern, KeyedDeserializationSchema<T> deserializer, Properties props) {
        super(subscriptionPattern, deserializer, props);
    }

    // ......
}
There is a whole family of Kafka consumer implementations, but they all ultimately inherit from FlinkKafkaConsumerBase, and that abstract class in turn extends RichParallelSourceFunction. Does that look familiar? It is very similar to RichSourceFunction, the abstract class a custom MySQL data source extends.
public abstract class FlinkKafkaConsumerBase<T>
        extends RichParallelSourceFunction<T>
        implements CheckpointListener, ResultTypeQueryable<T>, CheckpointedFunction
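To make the comparison concrete, here is a minimal sketch of a custom source built on RichSourceFunction; the class name and the emitted records are made up for illustration:

import org.apache.flink.streaming.api.functions.source.RichSourceFunction;

// Minimal custom-source sketch: emits a placeholder record once a second
// until the job is cancelled.
public class DemoSource extends RichSourceFunction<String> {

    private volatile boolean running = true;

    @Override
    public void run(SourceContext<String> ctx) throws Exception {
        while (running) {
            ctx.collect("hello");  // hand one record to downstream operators
            Thread.sleep(1000L);
        }
    }

    @Override
    public void cancel() {
        running = false;  // invoked by Flink when the job is stopped
    }
}

FlinkKafkaConsumerBase plays the same role, except that its parent, RichParallelSourceFunction, lets the source run with parallelism greater than one.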
Back to FlinkKafkaConsumer010: as the first snippet shows, it offers quite a few constructors, so we can simply pick whichever fits.
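For instance, besides a single topic string, the constructors above also take a topic list or a subscription Pattern. A quick sketch, where the topic names are placeholders and properties is assumed to be set up as in the full example below:

// requires: import java.util.Arrays; import java.util.regex.Pattern;
FlinkKafkaConsumer010<String> listConsumer = new FlinkKafkaConsumer010<>(
        Arrays.asList("topicA", "topicB"), new SimpleStringSchema(), properties);

FlinkKafkaConsumer010<String> patternConsumer = new FlinkKafkaConsumer010<>(
        Pattern.compile("metric-.*"), new SimpleStringSchema(), properties);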
1.1 Usage
package myflink.job;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.PrintSinkFunction;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;

import java.util.Properties;

/**
 * Kafka as a data source: consumes the messages in Kafka.
 * Tutorial: http://www.54tianzhisheng.cn/tags/Flink/
 */
public class KafkaDatasourceForFlinkJob {

    public static void main(String[] args) throws Exception {
        final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties properties = new Properties();
        properties.put("bootstrap.servers", "localhost:9092");
        properties.put("zookeeper.connect", "localhost:2181");
        properties.put("group.id", "metric-group");
        properties.put("auto.offset.reset", "latest");
        properties.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        properties.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        DataStreamSource<String> dataStreamSource = env.addSource(
                new FlinkKafkaConsumer010<String>(
                        "testjin",  // topic
                        new SimpleStringSchema(),
                        properties
                )
        ).setParallelism(1);

        // dataStreamSource.print(); // same effect as the sink below
        dataStreamSource.addSink(new PrintSinkFunction<>());

        env.execute("Flink add kafka data source");
    }
}
Notes:
a. Kafka-related settings are passed in directly through a Properties object: brokers, ZooKeeper, group id, serializer, deserializer, and so on.
b. The FlinkKafkaConsumer010 constructor is invoked with the topic and the Properties configuration.
c. SimpleStringSchema handles serialization and deserialization of String data only; if the messages in Kafka are not Strings, it will throw an error (see the sketch after this list for other payload types). Here is SimpleStringSchema's declaration:
public class SimpleStringSchema implements DeserializationSchema<String>, SerializationSchema<String>
d. The consumed messages are simply printed out.
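As mentioned in note c, for non-String payloads you implement DeserializationSchema<T> yourself. Here is a minimal sketch, assuming each Kafka message is a UTF-8 encoded number; the LongSchema name is made up, and a JSON-to-POJO schema would follow the same shape:

import org.apache.flink.api.common.serialization.DeserializationSchema;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import java.nio.charset.StandardCharsets;

public class LongSchema implements DeserializationSchema<Long> {

    @Override
    public Long deserialize(byte[] message) {
        // turn the raw bytes into the target type
        return Long.parseLong(new String(message, StandardCharsets.UTF_8));
    }

    @Override
    public boolean isEndOfStream(Long nextElement) {
        return false;  // a Kafka topic is an unbounded stream
    }

    @Override
    public TypeInformation<Long> getProducedType() {
        return TypeInformation.of(Long.class);
    }
}

An instance of it can then be passed to the FlinkKafkaConsumer010 constructor in place of SimpleStringSchema.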