16-SparkCore03

Author: CrUelAnGElPG | Published: 2018-09-01 04:30

Spark on YARN

Submit Spark jobs to YARN and run them there.

In this setup Spark acts only as a client.

./spark-submit \
--class org.apache.spark.examples.SparkPi \
--master yarn \
/home/hadoop/app/spark-2.1.0-bin-2.6.0-cdh5.7.0/examples/jars/spark-examples_2.11-2.3.0.jar \
3

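--deploy-mode: client / cluster

--master yarn on its own defaults to client mode (the older yarn-client master)

the older yarn-cluster master = --master yarn --deploy-mode cluster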

--queue

--num-executors

--executor-cores

--executor-memory
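
The flags above can be assembled programmatically. Below is a minimal sketch in plain Python that only builds the argument list and does not launch anything. The helper name build_spark_submit, the queue name "default", and the resource values are assumptions for illustration; the flags themselves are the ones listed in these notes.

```python
import shlex


def build_spark_submit(app_jar, main_class, app_args=(), deploy_mode="client",
                       queue=None, num_executors=None, executor_cores=None,
                       executor_memory=None):
    """Return a spark-submit argv list for a YARN submission (hypothetical helper)."""
    if deploy_mode not in ("client", "cluster"):
        raise ValueError("deploy_mode must be 'client' or 'cluster'")
    argv = ["./spark-submit",
            "--class", main_class,
            "--master", "yarn",
            "--deploy-mode", deploy_mode]
    # Optional resource flags are emitted only when a value is given.
    optional = [("--queue", queue),
                ("--num-executors", num_executors),
                ("--executor-cores", executor_cores),
                ("--executor-memory", executor_memory)]
    for flag, value in optional:
        if value is not None:
            argv += [flag, str(value)]
    # The application jar comes after all options, followed by its own arguments.
    argv.append(app_jar)
    argv += [str(a) for a in app_args]
    return argv


cmd = build_spark_submit(
    "/home/hadoop/app/spark-2.1.0-bin-2.6.0-cdh5.7.0/examples/jars/spark-examples_2.11-2.3.0.jar",
    "org.apache.spark.examples.SparkPi",
    app_args=[3],
    deploy_mode="cluster",
    queue="default",        # assumed queue name
    num_executors=2,        # assumed values, tune for the real cluster
    executor_cores=1,
    executor_memory="1g",
)
print(" ".join(shlex.quote(part) for part in cmd))
```

Passing the resulting list to subprocess.run would perform the actual submission; that step is left out here because it needs a working Spark and YARN installation.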

40-50s ==> 10-15s

client vs cluster

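Where does the driver run?

client: in the client process that ran spark-submit

cluster: inside the AM (ApplicationMaster)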

SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=hdfs://hadoop000:8020/directory -Dspark.history.ui.port=7777"

coalesce vs repartition

200        200    1 record      200      200 

rdd1 -map-> rdd2 -filter--coalesce-> rddc --> save...

xxxx.coalesce(1)
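
The pipeline above can be pictured with a toy model in plain Python. This is not Spark code: partitions are modelled as lists of lists, and the functions coalesce and repartition below are simplified stand-ins written for this note, assuming that a no-shuffle coalesce merges whole partitions and that repartition redistributes individual records.

```python
def coalesce(partitions, n):
    """Toy no-shuffle coalesce: merge whole input partitions into at most n groups."""
    if not partitions:
        return []
    n = max(1, min(n, len(partitions)))  # without a shuffle the count cannot grow
    merged = [[] for _ in range(n)]
    for i, part in enumerate(partitions):
        # Neighbouring partitions land in the same output group.
        merged[i * n // len(partitions)].extend(part)
    return merged


def repartition(partitions, n):
    """Toy shuffle: every record is reassigned round-robin to one of n partitions."""
    out = [[] for _ in range(n)]
    k = 0
    for part in partitions:
        for record in part:
            out[k % n].append(record)
            k += 1
    return out


# rdd1: 200 partitions with 5 records each
rdd1 = [[p * 5 + i for i in range(5)] for p in range(200)]
# map keeps the partition count
rdd2 = [[x * 2 for x in part] for part in rdd1]
# filter also keeps the partition count, even if only one record survives
filtered = [[x for x in part if x == 84] for part in rdd2]
non_empty = sum(1 for part in filtered if part)

rddc = coalesce(filtered, 1)
print(len(filtered), non_empty, rddc)
```

In the model, saving filtered directly would produce 200 outputs, 199 of them empty, while saving rddc produces one. That is the motivation for calling coalesce before the save step.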

map vs mapPartitions
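
A small sketch of the difference, in plain Python with no Spark dependency. The functions have the shapes that PySpark's rdd.map and rdd.mapPartitions expect (one record in and one value out, versus an iterator in and an iterable out). expensive_setup is a hypothetical stand-in for costly initialisation, and the partitions are simulated with nested lists.

```python
setup_log = []


def expensive_setup():
    """Hypothetical stand-in for costly initialisation, e.g. loading a resource."""
    setup_log.append("setup")
    return int  # the "resource" is just a parser function here


def per_record(line):
    """Shape expected by rdd.map: called once for every record."""
    parser = expensive_setup()  # runs for every record
    return parser(line)


def per_partition(lines):
    """Shape expected by rdd.mapPartitions: called once with a partition's iterator."""
    parser = expensive_setup()  # runs once per partition
    for line in lines:
        yield parser(line)


partitions = [["1", "2", "3"], ["4", "5"]]  # simulated 2-partition dataset

setup_log.clear()
mapped = [per_record(x) for part in partitions for x in part]
map_setups = len(setup_log)

setup_log.clear()
mapped_parts = [y for part in partitions for y in per_partition(iter(part))]
partition_setups = len(setup_log)

print(mapped, map_setups)
print(mapped_parts, partition_setups)
```

Both variants produce the same values; only the number of setup calls differs (one per record versus one per partition). The trade-off is that a per-partition function that collects its whole input into a list, instead of yielding lazily, holds the entire partition in memory.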

foreach  vs foreachPartition

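foreachPartition

Use this one whenever output is involved (for example writing to an external store), so that setup such as opening a connection happens once per partition instead of once per record.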
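
As a rough illustration of that advice, the sketch below counts how many connections each style opens. It runs locally in plain Python; FakeSink is a hypothetical in-memory stand-in for a real store, and the two functions have the shapes that PySpark's rdd.foreach and rdd.foreachPartition expect.

```python
class FakeSink:
    """Hypothetical in-memory stand-in for an external store such as a database."""
    opened = 0
    rows = []

    def __init__(self):
        FakeSink.opened += 1  # count every "connection" that gets opened

    def write(self, row):
        FakeSink.rows.append(row)

    def close(self):
        pass  # a real sink would release the connection here

    @classmethod
    def reset(cls):
        cls.opened = 0
        cls.rows = []


def save_record(row):
    """Shape expected by rdd.foreach: one connection per record."""
    sink = FakeSink()
    try:
        sink.write(row)
    finally:
        sink.close()


def save_partition(rows):
    """Shape expected by rdd.foreachPartition: one connection per partition."""
    sink = FakeSink()
    try:
        for row in rows:
            sink.write(row)
    finally:
        sink.close()


partitions = [["a", "b", "c"], ["d", "e"]]  # simulated 2-partition dataset

FakeSink.reset()
for part in partitions:
    for row in part:
        save_record(row)
per_record_opens, per_record_rows = FakeSink.opened, list(FakeSink.rows)

FakeSink.reset()
for part in partitions:
    save_partition(iter(part))
per_partition_opens, per_partition_rows = FakeSink.opened, list(FakeSink.rows)

print(per_record_opens, per_partition_opens)
```

On a real cluster these functions run on the executors, so class-level counters like the ones here would not be visible on the driver; they exist only to make the local comparison observable.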
