Spark Local Mode

Author: ssttIsme | Published 2021-10-14 23:12

Local mode runs Spark in a single JVM on one machine, with threads standing in for cluster executors, so it is the quickest way to try Spark out.

    Download
    Get the Spark 3.0.0 prebuilt package (spark-3.0.0-bin-hadoop3.2.tgz) from
    http://archive.apache.org/dist/spark/spark-3.0.0/


    Extract and install
    [server@hadoop102 ~]$ cd /opt/software/
    [server@hadoop102 software]$ tar -zxvf spark-3.0.0-bin-hadoop3.2.tgz -C /opt/module/
    
    [server@hadoop102 software]$ cd /opt/module/
    [server@hadoop102 module]$ mv spark-3.0.0-bin-hadoop3.2/ spark-local
    [server@hadoop102 module]$ cd spark-local/
    [server@hadoop102 spark-local]$ cd data/
    [server@hadoop102 data]$ pwd
    /opt/module/spark-local/data
    [server@hadoop102 data]$ vim word.txt
    
    My Spark
    My Scala
    My Spark
    
    [server@hadoop102 data]$ cd ..
    [server@hadoop102 spark-local]$ pwd
    /opt/module/spark-local
    [server@hadoop102 spark-local]$ bin/spark-shell 
    21/10/14 23:06:17 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    Setting default log level to "WARN".
    To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
    Spark context Web UI available at http://hadoop102:4040
    Spark context available as 'sc' (master = local[*], app id = local-1634224020472).
    Spark session available as 'spark'.
    Welcome to
          ____              __
         / __/__  ___ _____/ /__
        _\ \/ _ \/ _ `/ __/  '_/
       /___/ .__/\_,_/_/ /_/\_\   version 3.0.0
          /_/
             
    Using Scala version 2.12.10 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_65)
    Type in expressions to have them evaluated.
    Type :help for more information.
    
    scala> var i =8
    i: Int = 8
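
    The startup banner shows two ready-made entry points: sc, a SparkContext with master local[*] (one worker thread per CPU core), and spark, a SparkSession. Beyond the assignment above, a quick sanity check (a sketch, run in the same session):

    // Entry points created automatically by spark-shell
    println(sc.master)      // local[*] in this setup
    println(spark.version)  // 3.0.0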
    

    Installation succeeded. Now try a word count:

    scala> sc.textFile("data/word.txt").flatMap(_.split(" ")).map((_,1)).reduceByKey(_+_).collect
    res0: Array[(String, Int)] = Array((Spark,2), (Scala,1), (My,3)) 
    scala> :quit
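
    The one-liner above packs four transformations together. Split into named steps it reads as follows (a sketch; same spark-shell session, with data/word.txt as created above):

    // Load lines relative to the directory spark-shell was launched from
    val lines  = sc.textFile("data/word.txt")   // "My Spark", "My Scala", "My Spark"
    val words  = lines.flatMap(_.split(" "))    // "My", "Spark", "My", "Scala", ...
    val pairs  = words.map(word => (word, 1))   // ("My",1), ("Spark",1), ...
    val counts = pairs.reduceByKey(_ + _)       // sum the 1s per distinct word
    counts.collect.foreach(println)             // (Spark,2), (Scala,1), (My,3)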
    

    While the shell is running, jobs can be monitored at http://hadoop102:4040/jobs/. Key columns:

    [Submitted] submission time
    [Duration] execution time
    [Tasks] tasks (succeeded/total)
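
    The UI address can also be read off the live context rather than remembered (a sketch; uiWebUrl returns an Option, empty when the UI is disabled):

    println(sc.uiWebUrl.getOrElse("Web UI disabled"))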


    Run the official example

    bin/spark-submit \
    --class org.apache.spark.examples.SparkPi \
    --master local[2] \
    ./examples/jars/spark-examples_2.12-3.0.0.jar \
    10
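
    Here --class selects the main class inside the example jar, --master local[2] runs locally with 2 worker threads, and the trailing 10 is SparkPi's argument: the number of slices (partitions) to split the sampling into, which matches the "10 output partitions" in the log below. SparkPi estimates Pi by Monte Carlo sampling; a minimal sketch of the same idea, runnable in spark-shell (not the exact example source):

    // Throw n darts at the square [-1,1] x [-1,1]; the fraction landing
    // inside the unit circle approaches Pi/4.
    val slices = 10
    val n = 100000 * slices
    val hits = sc.parallelize(1 to n, slices).map { _ =>
      val x = math.random * 2 - 1
      val y = math.random * 2 - 1
      if (x * x + y * y <= 1) 1 else 0
    }.reduce(_ + _)
    println(s"Pi is roughly ${4.0 * hits / n}")

    The actual run: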
    
    [server@hadoop102 spark-local]$ bin/spark-submit \
    > --class org.apache.spark.examples.SparkPi \
    > --master local[2] \
    > ./examples/jars/spark-examples_2.12-3.0.0.jar \
    > 10
    21/10/14 23:25:32 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    21/10/14 23:25:34 INFO SparkContext: Running Spark version 3.0.0
    21/10/14 23:25:34 INFO ResourceUtils: ==============================================================
    21/10/14 23:25:34 INFO ResourceUtils: Resources for spark.driver:
    
    21/10/14 23:25:34 INFO ResourceUtils: ==============================================================
    21/10/14 23:25:34 INFO SparkContext: Submitted application: Spark Pi
    21/10/14 23:25:34 INFO SecurityManager: Changing view acls to: server
    21/10/14 23:25:34 INFO SecurityManager: Changing modify acls to: server
    21/10/14 23:25:34 INFO SecurityManager: Changing view acls groups to: 
    21/10/14 23:25:34 INFO SecurityManager: Changing modify acls groups to: 
    21/10/14 23:25:34 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(server); groups with view permissions: Set(); users  with modify permissions: Set(server); groups with modify permissions: Set()
    21/10/14 23:25:36 INFO Utils: Successfully started service 'sparkDriver' on port 36880.
    21/10/14 23:25:36 INFO SparkEnv: Registering MapOutputTracker
    21/10/14 23:25:36 INFO SparkEnv: Registering BlockManagerMaster
    21/10/14 23:25:36 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
    21/10/14 23:25:36 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
    21/10/14 23:25:36 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
    21/10/14 23:25:36 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-8474253f-7a83-4398-adb6-ad8f1da6f88b
    21/10/14 23:25:37 INFO MemoryStore: MemoryStore started with capacity 413.9 MiB
    21/10/14 23:25:37 INFO SparkEnv: Registering OutputCommitCoordinator
    21/10/14 23:25:38 INFO Utils: Successfully started service 'SparkUI' on port 4040.
    21/10/14 23:25:38 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://hadoop102:4040
    21/10/14 23:25:39 INFO SparkContext: Added JAR file:/opt/module/spark-local/./examples/jars/spark-examples_2.12-3.0.0.jar at spark://hadoop102:36880/jars/spark-examples_2.12-3.0.0.jar with timestamp 1634225138989
    21/10/14 23:25:40 INFO Executor: Starting executor ID driver on host hadoop102
    21/10/14 23:25:40 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 42450.
    21/10/14 23:25:40 INFO NettyBlockTransferService: Server created on hadoop102:42450
    21/10/14 23:25:40 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
    21/10/14 23:25:40 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, hadoop102, 42450, None)
    21/10/14 23:25:40 INFO BlockManagerMasterEndpoint: Registering block manager hadoop102:42450 with 413.9 MiB RAM, BlockManagerId(driver, hadoop102, 42450, None)
    21/10/14 23:25:40 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, hadoop102, 42450, None)
    21/10/14 23:25:40 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, hadoop102, 42450, None)
    21/10/14 23:25:44 INFO SparkContext: Starting job: reduce at SparkPi.scala:38
    21/10/14 23:25:44 INFO DAGScheduler: Got job 0 (reduce at SparkPi.scala:38) with 10 output partitions
    21/10/14 23:25:44 INFO DAGScheduler: Final stage: ResultStage 0 (reduce at SparkPi.scala:38)
    21/10/14 23:25:44 INFO DAGScheduler: Parents of final stage: List()
    21/10/14 23:25:44 INFO DAGScheduler: Missing parents: List()
    21/10/14 23:25:44 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34), which has no missing parents
    21/10/14 23:25:45 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 3.1 KiB, free 413.9 MiB)
    21/10/14 23:25:45 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1816.0 B, free 413.9 MiB)
    21/10/14 23:25:45 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on hadoop102:42450 (size: 1816.0 B, free: 413.9 MiB)
    21/10/14 23:25:45 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1200
    21/10/14 23:25:45 INFO DAGScheduler: Submitting 10 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9))
    21/10/14 23:25:45 INFO TaskSchedulerImpl: Adding task set 0.0 with 10 tasks
    21/10/14 23:25:46 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, hadoop102, executor driver, partition 0, PROCESS_LOCAL, 7393 bytes)
    21/10/14 23:25:46 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, hadoop102, executor driver, partition 1, PROCESS_LOCAL, 7393 bytes)
    21/10/14 23:25:46 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
    21/10/14 23:25:46 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
    21/10/14 23:25:46 INFO Executor: Fetching spark://hadoop102:36880/jars/spark-examples_2.12-3.0.0.jar with timestamp 1634225138989
    21/10/14 23:25:46 INFO TransportClientFactory: Successfully created connection to hadoop102/192.168.100.102:36880 after 265 ms (0 ms spent in bootstraps)
    21/10/14 23:25:46 INFO Utils: Fetching spark://hadoop102:36880/jars/spark-examples_2.12-3.0.0.jar to /tmp/spark-997faf68-9d5f-48cf-bc5d-ba59624f1a01/userFiles-14e86a58-0a9b-4489-afab-90c89993477e/fetchFileTemp5339847072173054765.tmp
    21/10/14 23:25:47 INFO Executor: Adding file:/tmp/spark-997faf68-9d5f-48cf-bc5d-ba59624f1a01/userFiles-14e86a58-0a9b-4489-afab-90c89993477e/spark-examples_2.12-3.0.0.jar to class loader
    21/10/14 23:25:49 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 1000 bytes result sent to driver
    21/10/14 23:25:49 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 1043 bytes result sent to driver
    21/10/14 23:25:49 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, hadoop102, executor driver, partition 2, PROCESS_LOCAL, 7393 bytes)
    21/10/14 23:25:49 INFO Executor: Running task 2.0 in stage 0.0 (TID 2)
    21/10/14 23:25:49 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, hadoop102, executor driver, partition 3, PROCESS_LOCAL, 7393 bytes)
    21/10/14 23:25:49 INFO Executor: Running task 3.0 in stage 0.0 (TID 3)
    21/10/14 23:25:49 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 3362 ms on hadoop102 (executor driver) (1/10)
    21/10/14 23:25:49 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 3645 ms on hadoop102 (executor driver) (2/10)
    21/10/14 23:25:49 INFO Executor: Finished task 2.0 in stage 0.0 (TID 2). 1000 bytes result sent to driver
    21/10/14 23:25:49 INFO Executor: Finished task 3.0 in stage 0.0 (TID 3). 1000 bytes result sent to driver
    21/10/14 23:25:49 INFO TaskSetManager: Starting task 4.0 in stage 0.0 (TID 4, hadoop102, executor driver, partition 4, PROCESS_LOCAL, 7393 bytes)
    21/10/14 23:25:49 INFO Executor: Running task 4.0 in stage 0.0 (TID 4)
    21/10/14 23:25:49 INFO TaskSetManager: Starting task 5.0 in stage 0.0 (TID 5, hadoop102, executor driver, partition 5, PROCESS_LOCAL, 7393 bytes)
    21/10/14 23:25:49 INFO TaskSetManager: Finished task 2.0 in stage 0.0 (TID 2) in 517 ms on hadoop102 (executor driver) (3/10)
    21/10/14 23:25:49 INFO Executor: Running task 5.0 in stage 0.0 (TID 5)
    21/10/14 23:25:49 INFO TaskSetManager: Finished task 3.0 in stage 0.0 (TID 3) in 507 ms on hadoop102 (executor driver) (4/10)
    21/10/14 23:25:50 INFO Executor: Finished task 4.0 in stage 0.0 (TID 4). 1000 bytes result sent to driver
    21/10/14 23:25:50 INFO TaskSetManager: Starting task 6.0 in stage 0.0 (TID 6, hadoop102, executor driver, partition 6, PROCESS_LOCAL, 7393 bytes)
    21/10/14 23:25:50 INFO TaskSetManager: Finished task 4.0 in stage 0.0 (TID 4) in 375 ms on hadoop102 (executor driver) (5/10)
    21/10/14 23:25:50 INFO Executor: Running task 6.0 in stage 0.0 (TID 6)
    21/10/14 23:25:50 INFO Executor: Finished task 5.0 in stage 0.0 (TID 5). 1000 bytes result sent to driver
    21/10/14 23:25:50 INFO TaskSetManager: Starting task 7.0 in stage 0.0 (TID 7, hadoop102, executor driver, partition 7, PROCESS_LOCAL, 7393 bytes)
    21/10/14 23:25:50 INFO TaskSetManager: Finished task 5.0 in stage 0.0 (TID 5) in 382 ms on hadoop102 (executor driver) (6/10)
    21/10/14 23:25:50 INFO Executor: Running task 7.0 in stage 0.0 (TID 7)
    21/10/14 23:25:50 INFO Executor: Finished task 6.0 in stage 0.0 (TID 6). 1000 bytes result sent to driver
    21/10/14 23:25:50 INFO TaskSetManager: Starting task 8.0 in stage 0.0 (TID 8, hadoop102, executor driver, partition 8, PROCESS_LOCAL, 7393 bytes)
    21/10/14 23:25:50 INFO TaskSetManager: Finished task 6.0 in stage 0.0 (TID 6) in 386 ms on hadoop102 (executor driver) (7/10)
    21/10/14 23:25:50 INFO Executor: Finished task 7.0 in stage 0.0 (TID 7). 1000 bytes result sent to driver
    21/10/14 23:25:50 INFO TaskSetManager: Starting task 9.0 in stage 0.0 (TID 9, hadoop102, executor driver, partition 9, PROCESS_LOCAL, 7393 bytes)
    21/10/14 23:25:50 INFO TaskSetManager: Finished task 7.0 in stage 0.0 (TID 7) in 363 ms on hadoop102 (executor driver) (8/10)
    21/10/14 23:25:50 INFO Executor: Running task 8.0 in stage 0.0 (TID 8)
    21/10/14 23:25:50 INFO Executor: Running task 9.0 in stage 0.0 (TID 9)
    21/10/14 23:25:50 INFO Executor: Finished task 8.0 in stage 0.0 (TID 8). 1000 bytes result sent to driver
    21/10/14 23:25:51 INFO TaskSetManager: Finished task 8.0 in stage 0.0 (TID 8) in 380 ms on hadoop102 (executor driver) (9/10)
    21/10/14 23:25:51 INFO Executor: Finished task 9.0 in stage 0.0 (TID 9). 1000 bytes result sent to driver
    21/10/14 23:25:51 INFO TaskSetManager: Finished task 9.0 in stage 0.0 (TID 9) in 372 ms on hadoop102 (executor driver) (10/10)
    21/10/14 23:25:51 INFO DAGScheduler: ResultStage 0 (reduce at SparkPi.scala:38) finished in 6.446 s
    21/10/14 23:25:51 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
    21/10/14 23:25:51 INFO DAGScheduler: Job 0 is finished. Cancelling potential speculative or zombie tasks for this job
    21/10/14 23:25:51 INFO TaskSchedulerImpl: Killing all running tasks in stage 0: Stage finished
    21/10/14 23:25:51 INFO DAGScheduler: Job 0 finished: reduce at SparkPi.scala:38, took 6.875487 s
    Pi is roughly 3.143099143099143
    21/10/14 23:25:51 INFO SparkUI: Stopped Spark web UI at http://hadoop102:4040
    21/10/14 23:25:51 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
    21/10/14 23:25:51 INFO MemoryStore: MemoryStore cleared
    21/10/14 23:25:51 INFO BlockManager: BlockManager stopped
    21/10/14 23:25:51 INFO BlockManagerMaster: BlockManagerMaster stopped
    21/10/14 23:25:51 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
    21/10/14 23:25:51 INFO SparkContext: Successfully stopped SparkContext
    21/10/14 23:25:51 INFO ShutdownHookManager: Shutdown hook called
    21/10/14 23:25:51 INFO ShutdownHookManager: Deleting directory /tmp/spark-5c007d29-3e78-4b75-b410-8b4be66fcd29
    21/10/14 23:25:51 INFO ShutdownHookManager: Deleting directory /tmp/spark-997faf68-9d5f-48cf-bc5d-ba59624f1a01
    
