Flink on Yarn

Author: 阿呆少爷 | Published: 2019-02-20 13:23

    Installing and starting YARN

    Install and configure Hadoop by following the article 『 Hadoop 』mac下Hadoop的安装与使用 (Installing and using Hadoop on macOS), then start YARN.

    $ ./sbin/start-yarn.sh
    Starting resourcemanager
    Starting nodemanagers
    
    $ jps
    40186 ResourceManager
    40286 NodeManager
    40447 Jps
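
The manual check above can be scripted. The following is a minimal sketch, not part of the original article: the function name `check_yarn_daemons` is hypothetical, and it assumes input in the `<pid> <ClassName>` format that `jps` prints.

```shell
#!/usr/bin/env bash
# Report which YARN daemons appear in jps-style output read from stdin.
# Each input line is expected to look like "<pid> <ClassName>".
# Returns 0 if both daemons are present, 1 otherwise.
check_yarn_daemons() {
  local listing daemon missing=0
  listing="$(cat)"
  for daemon in ResourceManager NodeManager; do
    # Compare the second column exactly, so "Jps" or partial names never match.
    if printf '%s\n' "$listing" | awk '{print $2}' | grep -qx "$daemon"; then
      echo "$daemon: running"
    else
      echo "$daemon: NOT running"
      missing=1
    fi
  done
  return "$missing"
}

# Usage on a live machine (requires a JDK on PATH):
#   jps | check_yarn_daemons
```

Reading from stdin keeps the function independent of a running cluster, so it can also be tried against saved `jps` output.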
    

    Submitting a Flink job

    $ flink run -h
    2019-02-18 19:18:20,815 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - Found Yarn properties file under /var/folders/y4/qn5fmtgd75zcl1yqd0m37f600000gp/T/.yarn-properties-henshao.
    2019-02-18 19:18:20,815 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - Found Yarn properties file under /var/folders/y4/qn5fmtgd75zcl1yqd0m37f600000gp/T/.yarn-properties-henshao.
    
    Action "run" compiles and runs a program.
    
      Syntax: run [OPTIONS] <jar-file> <arguments>
      "run" action options:
         -c,--class <classname>               Class with the program entry point
                                              ("main" method or "getPlan()" method.
                                              Only needed if the JAR file does not
                                              specify the class in its manifest.
         -C,--classpath <url>                 Adds a URL to each user code
                                              classloader  on all nodes in the
                                              cluster. The paths must specify a
                                              protocol (e.g. file://) and be
                                              accessible on all nodes (e.g. by means
                                              of a NFS share). You can use this
                                              option multiple times for specifying
                                              more than one URL. The protocol must
                                              be supported by the {@link
                                              java.net.URLClassLoader}.
         -d,--detached                        If present, runs the job in detached
                                              mode
         -n,--allowNonRestoredState           Allow to skip savepoint state that
                                              cannot be restored. You need to allow
                                              this if you removed an operator from
                                              your program that was part of the
                                              program when the savepoint was
                                              triggered.
         -p,--parallelism <parallelism>       The parallelism with which to run the
                                              program. Optional flag to override the
                                              default value specified in the
                                              configuration.
         -q,--sysoutLogging                   If present, suppress logging output to
                                              standard out.
         -s,--fromSavepoint <savepointPath>   Path to a savepoint to restore the job
                                              from (for example
                                              hdfs:///flink/savepoint-1537).
         -sae,--shutdownOnAttachedExit        If the job is submitted in attached
                                              mode, perform a best-effort cluster
                                              shutdown when the CLI is terminated
                                              abruptly, e.g., in response to a user
                                              interrupt, such as typing Ctrl + C.
      Options for yarn-cluster mode:
         -d,--detached                        If present, runs the job in detached
                                              mode
         -m,--jobmanager <arg>                Address of the JobManager (master) to
                                              which to connect. Use this flag to
                                              connect to a different JobManager than
                                              the one specified in the
                                              configuration.
         -sae,--shutdownOnAttachedExit        If the job is submitted in attached
                                              mode, perform a best-effort cluster
                                              shutdown when the CLI is terminated
                                              abruptly, e.g., in response to a user
                                              interrupt, such as typing Ctrl + C.
         -yD <property=value>                 use value for given property
         -yd,--yarndetached                   If present, runs the job in detached
                                              mode (deprecated; use non-YARN
                                              specific option instead)
         -yh,--yarnhelp                       Help for the Yarn session CLI.
         -yid,--yarnapplicationId <arg>       Attach to running YARN session
         -yj,--yarnjar <arg>                  Path to Flink jar file
         -yjm,--yarnjobManagerMemory <arg>    Memory for JobManager Container with
                                              optional unit (default: MB)
         -yn,--yarncontainer <arg>            Number of YARN container to allocate
                                              (=Number of Task Managers)
         -ynl,--yarnnodeLabel <arg>           Specify YARN node label for the YARN
                                              application
         -ynm,--yarnname <arg>                Set a custom name for the application
                                              on YARN
         -yq,--yarnquery                      Display available YARN resources
                                              (memory, cores)
         -yqu,--yarnqueue <arg>               Specify YARN queue.
         -ys,--yarnslots <arg>                Number of slots per TaskManager
         -yst,--yarnstreaming                 Start Flink in streaming mode
         -yt,--yarnship <arg>                 Ship files in the specified directory
                                              (t for transfer)
         -ytm,--yarntaskManagerMemory <arg>   Memory per TaskManager Container with
                                              optional unit (default: MB)
         -yz,--yarnzookeeperNamespace <arg>   Namespace to create the Zookeeper
                                              sub-paths for high availability mode
         -z,--zookeeperNamespace <arg>        Namespace to create the Zookeeper
                                              sub-paths for high availability mode
    
      Options for default mode:
         -m,--jobmanager <arg>           Address of the JobManager (master) to which
                                         to connect. Use this flag to connect to a
                                         different JobManager than the one specified
                                         in the configuration.
         -z,--zookeeperNamespace <arg>   Namespace to create the Zookeeper sub-paths
                                         for high availability mode
    
    $ flink run -m yarn-cluster -yn 1 -yjm 1024 -ytm 1024 -c com.henshao.flink.StatefulSource target/flink-learning-1.0-SNAPSHOT.jar
    2019-02-18 19:16:26,002 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - Found Yarn properties file under /var/folders/y4/qn5fmtgd75zcl1yqd0m37f600000gp/T/.yarn-properties-henshao.
    2019-02-18 19:16:26,002 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - Found Yarn properties file under /var/folders/y4/qn5fmtgd75zcl1yqd0m37f600000gp/T/.yarn-properties-henshao.
    2019-02-18 19:16:26,606 INFO  org.apache.hadoop.yarn.client.RMProxy                         - Connecting to ResourceManager at /0.0.0.0:8032
    2019-02-18 19:16:26,773 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
    2019-02-18 19:16:26,773 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
    2019-02-18 19:16:26,787 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - The argument yn is deprecated in will be ignored.
    2019-02-18 19:16:26,787 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - The argument yn is deprecated in will be ignored.
    2019-02-18 19:16:26,945 WARN  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Neither the HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink YARN Client needs one of these to be set to properly load the Hadoop configuration for accessing YARN.
    2019-02-18 19:16:26,974 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Cluster specification: ClusterSpecification{masterMemoryMB=1024, taskManagerMemoryMB=1024, numberTaskManagers=1, slotsPerTaskManager=1}
    2019-02-18 19:16:27,280 WARN  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - The file system scheme is 'file'. This indicates that the specified Hadoop configuration path is wrong and the system is using the default Hadoop configuration values.The Flink YARN client needs to store its files in a distributed file system
    2019-02-18 19:16:27,281 WARN  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - The configuration directory ('/usr/local/Cellar/apache-flink/1.7.1/libexec/conf') contains both LOG4J and Logback configuration files. Please delete or rename one of them.
    2019-02-18 19:16:28,560 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Submitting application master application_1550485455954_0006
    2019-02-18 19:16:28,590 INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl         - Submitted application application_1550485455954_0006
    2019-02-18 19:16:28,591 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Waiting for the cluster to be allocated
    2019-02-18 19:16:28,593 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Deploying cluster, current state ACCEPTED
    2019-02-18 19:16:35,205 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - YARN application has been deployed successfully.
    Starting execution of program
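
The log above warns that neither HADOOP_CONF_DIR nor YARN_CONF_DIR is set and that the file system scheme fell back to 'file'. One way to address this is to export HADOOP_CONF_DIR before submitting. The following is a minimal sketch, not the author's setup: the helper name `use_hadoop_conf` is hypothetical, and it assumes the Hadoop installation keeps its configuration under `etc/hadoop`, as the standard Hadoop distribution layout does.

```shell
#!/usr/bin/env bash
# Point the Flink YARN client at a Hadoop configuration directory.
# Argument: path to a Hadoop installation (hypothetical; adjust to yours).
use_hadoop_conf() {
  local conf_dir="$1/etc/hadoop"
  # Refuse to export a directory that lacks the core Hadoop config files.
  if [ ! -f "$conf_dir/core-site.xml" ] || [ ! -f "$conf_dir/yarn-site.xml" ]; then
    echo "no core-site.xml/yarn-site.xml under $conf_dir" >&2
    return 1
  fi
  export HADOOP_CONF_DIR="$conf_dir"
}

# Usage, before calling `flink run -m yarn-cluster ...`:
#   use_hadoop_conf /path/to/hadoop
```

The same log also reports that the `yn` argument is deprecated and ignored, so `-yn 1` can likely be dropped from the submit command on this Flink version.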
    

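    When YARN is deployed on a single machine, at most one AppMaster can be used. As a result, only the JobManager starts, and no resources are left to start a TaskManager.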


    References

    1. 『 Hadoop 』mac下Hadoop的安装与使用 (Installing and using Hadoop on macOS)
    2. Flink on YARN部署快速入门指南 (Quick-start guide to deploying Flink on YARN)
