美文网首页
flink基础——安装和demo

flink基础——安装和demo

作者: lifesmily | 来源:发表于2018-12-05 15:31 被阅读177次

    清华开源软件镜像
    https://mirrors.tuna.tsinghua.edu.cn/apache/

    1. 安装

    1) Mac上安装

    参考:
    https://segmentfault.com/a/1190000016901469
    https://www.jianshu.com/p/17676d34dd35
    启动位置:
    /usr/local/Cellar/apache-flink/1.6.2/libexec/bin
    启动与停止:

    $ ./start-cluster.sh
    $ ./stop-cluster.sh
    

    2)Windows上安装

    参考
    https://ci.apache.org/projects/flink/flink-docs-stable/start/flink_on_windows.html
    启动位置
    D:\flink-1.6.2-bin-scala_2\flink-1.6.2\bin
    启动

    $ start-cluster.bat 
    # Starting a local cluster with one JobManager process and one TaskManager process. 
    # You can terminate the processes via CTRL-C in the spawned shell windows.
    # Web interface by default on [http://localhost:8081/.]
    

    启动一个job

    进入目录,运行

    flink.bat run -c wikiedits.WikipediaAnalysis D:\Flink_Project\target\original-wiki-edits-1.0-SNAPSHOT.jar 127.0.0.1 9000
    

    2. 运行第一个job

    1) Java程序

    1. maven 模版
    $ mvn archetype:generate \  
    -DarchetypeGroupId=org.apache.flink \  
    -DarchetypeArtifactId=flink-quickstart-java \  
    -DarchetypeVersion=1.6.2
    
    mvn archetype:generate \
        -DarchetypeGroupId=org.apache.flink \
        -DarchetypeArtifactId=flink-quickstart-java \
        -DarchetypeVersion=1.6.1 \
        -DgroupId=wiki-edits \
        -DartifactId=wiki-edits \
        -Dversion=0.1 \
        -Dpackage=wikiedits \
        -DinteractiveMode=false
    
    1. 自定义任务
    package FlinkTest;
    
    import org.apache.flink.api.common.functions.FlatMapFunction;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStreamSource;
    import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.util.Collector;
    
    public class SocketTextStreamWordCount {
        public static void main(String[] args) throws Exception {
            //参数检查
            if (args.length != 2) {
                System.err.println("USAGE:\nSocketTextStreamWordCount <hostname> <port>");
                return;
            }
    
            String hostname = args[0];
            Integer port = Integer.parseInt(args[1]);
    
    
            // set up the streaming execution environment
            final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    
            //获取数据
            DataStreamSource<String> stream = env.socketTextStream(hostname, port);
    
            //计数
            SingleOutputStreamOperator<Tuple2<String, Integer>> sum = stream.flatMap(new LineSplitter())
                    .keyBy(0)
                    .sum(1);
    
            sum.print();
    
            env.execute("Java WordCount from SocketTextStream Example");
        }
    
        public static final class LineSplitter implements FlatMapFunction<String, Tuple2<String, Integer>> {
            @Override
            public void flatMap(String s, Collector<Tuple2<String, Integer>> collector) {
                String[] tokens = s.toLowerCase().split("\\W+");
    
                for (String token: tokens) {
                    if (token.length() > 0) {
                        collector.collect(new Tuple2<String, Integer>(token, 1));
                    }
                }
            }
        }
    }
    
    1. 打包
      进入工程目录(pom.xml所在目录),使用以下命令打包。
      $ maven clean package -Dmaven.test.skip=true

    2. 执行

    flink run -c FlinkTest.SocketTextStreamWordCount \
    /Users/dodoyuan/IdeaProjects/flink-quickstart/target/\
    original-flink-quickstart-1.8-SNAPSHOT.jar 127.0.0.1 9000
    

    完整参考来源:
    https://www.jianshu.com/p/17676d34dd35

    2) python程序

    1. 自定义任务
    from flink.plan.Environment import get_environment
    from flink.functions.GroupReduceFunction import GroupReduceFunction
    
    class Adder(GroupReduceFunction):
      def reduce(self, iterator, collector):
        count, word = iterator.next()
        count += sum([x[0] for x in iterator])
        collector.collect((count, word))
    
    env = get_environment()
    data = env.from_elements("Who's there?",
     "I think I hear them. Stand, ho! Who's there?")
    
    data \
      .flat_map(lambda x, c: [(1, word) for word in x.lower().split()]) \
      .group_by(1) \
      .reduce_group(Adder(), combinable=True) \
      .output()
    
    env.execute(local=True)
    
    1. 运行
      /usr/local/Cellar/apache-flink/1.6.2/libexec/bin/pyflink.sh wordcount.py
      参考:
      http://www.willmcginnis.com/2015/11/08/getting-started-with-python-and-apache-flink/

    2. 查看日志文件


      image.png

    其他参考:
    https://ci.apache.org/projects/flink/flink-docs-stable/quickstart/java_api_quickstart.html

    相关文章

      网友评论

          本文标题:flink基础——安装和demo

          本文链接:https://www.haomeiwen.com/subject/mawzxqtx.html