Download
Download the Spark 3.0.0 release from the Apache archive:
http://archive.apache.org/dist/spark/spark-3.0.0/
Extract and install
[server@hadoop102 ~]$ cd /opt/software/
[server@hadoop102 software]$ tar -zxvf spark-3.0.0-bin-hadoop3.2.tgz -C /opt/module/
[server@hadoop102 software]$ cd /opt/module/
[server@hadoop102 module]$ mv spark-3.0.0-bin-hadoop3.2/ spark-local
[server@hadoop102 module]$ cd spark-local/
[server@hadoop102 spark-local]$ cd data/
[server@hadoop102 data]$ pwd
/opt/module/spark-local/data
[server@hadoop102 data]$ vim word.txt
My Spark
My Scala
My Spark
[server@hadoop102 data]$ cd ..
[server@hadoop102 spark-local]$ pwd
/opt/module/spark-local
[server@hadoop102 spark-local]$ bin/spark-shell
21/10/14 23:06:17 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://hadoop102:4040
Spark context available as 'sc' (master = local[*], app id = local-1634224020472).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.0.0
      /_/
Using Scala version 2.12.10 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_65)
Type in expressions to have them evaluated.
Type :help for more information.
scala> var i = 8
i: Int = 8
The installation works. Let's try a word count:
scala> sc.textFile("data/word.txt").flatMap(_.split(" ")).map((_,1)).reduceByKey(_+_).collect
res0: Array[(String, Int)] = Array((Spark,2), (Scala,1), (My,3))
scala> :quit
While spark-shell is running, the monitoring page for the computation is available at http://hadoop102:4040/jobs/ (the UI shuts down together with the application). Key columns on the Jobs page:
[Submitted] when the job was submitted
[Duration] how long it ran
[Tasks] task progress
Run the official example. In the spark-submit invocation below, --class names the example's main class inside the jar, --master local[2] runs Spark locally with two worker threads, and the trailing 10 is the application argument (the number of slices SparkPi splits the work into; note the "10 output partitions" in the log).
bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
--master local[2] \
./examples/jars/spark-examples_2.12-3.0.0.jar \
10
[server@hadoop102 spark-local]$ bin/spark-submit \
> --class org.apache.spark.examples.SparkPi \
> --master local[2] \
> ./examples/jars/spark-examples_2.12-3.0.0.jar \
> 10
21/10/14 23:25:32 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
21/10/14 23:25:34 INFO SparkContext: Running Spark version 3.0.0
21/10/14 23:25:34 INFO ResourceUtils: ==============================================================
21/10/14 23:25:34 INFO ResourceUtils: Resources for spark.driver:
21/10/14 23:25:34 INFO ResourceUtils: ==============================================================
21/10/14 23:25:34 INFO SparkContext: Submitted application: Spark Pi
21/10/14 23:25:34 INFO SecurityManager: Changing view acls to: server
21/10/14 23:25:34 INFO SecurityManager: Changing modify acls to: server
21/10/14 23:25:34 INFO SecurityManager: Changing view acls groups to:
21/10/14 23:25:34 INFO SecurityManager: Changing modify acls groups to:
21/10/14 23:25:34 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(server); groups with view permissions: Set(); users with modify permissions: Set(server); groups with modify permissions: Set()
21/10/14 23:25:36 INFO Utils: Successfully started service 'sparkDriver' on port 36880.
21/10/14 23:25:36 INFO SparkEnv: Registering MapOutputTracker
21/10/14 23:25:36 INFO SparkEnv: Registering BlockManagerMaster
21/10/14 23:25:36 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
21/10/14 23:25:36 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
21/10/14 23:25:36 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
21/10/14 23:25:36 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-8474253f-7a83-4398-adb6-ad8f1da6f88b
21/10/14 23:25:37 INFO MemoryStore: MemoryStore started with capacity 413.9 MiB
21/10/14 23:25:37 INFO SparkEnv: Registering OutputCommitCoordinator
21/10/14 23:25:38 INFO Utils: Successfully started service 'SparkUI' on port 4040.
21/10/14 23:25:38 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://hadoop102:4040
21/10/14 23:25:39 INFO SparkContext: Added JAR file:/opt/module/spark-local/./examples/jars/spark-examples_2.12-3.0.0.jar at spark://hadoop102:36880/jars/spark-examples_2.12-3.0.0.jar with timestamp 1634225138989
21/10/14 23:25:40 INFO Executor: Starting executor ID driver on host hadoop102
21/10/14 23:25:40 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 42450.
21/10/14 23:25:40 INFO NettyBlockTransferService: Server created on hadoop102:42450
21/10/14 23:25:40 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
21/10/14 23:25:40 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, hadoop102, 42450, None)
21/10/14 23:25:40 INFO BlockManagerMasterEndpoint: Registering block manager hadoop102:42450 with 413.9 MiB RAM, BlockManagerId(driver, hadoop102, 42450, None)
21/10/14 23:25:40 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, hadoop102, 42450, None)
21/10/14 23:25:40 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, hadoop102, 42450, None)
21/10/14 23:25:44 INFO SparkContext: Starting job: reduce at SparkPi.scala:38
21/10/14 23:25:44 INFO DAGScheduler: Got job 0 (reduce at SparkPi.scala:38) with 10 output partitions
21/10/14 23:25:44 INFO DAGScheduler: Final stage: ResultStage 0 (reduce at SparkPi.scala:38)
21/10/14 23:25:44 INFO DAGScheduler: Parents of final stage: List()
21/10/14 23:25:44 INFO DAGScheduler: Missing parents: List()
21/10/14 23:25:44 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34), which has no missing parents
21/10/14 23:25:45 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 3.1 KiB, free 413.9 MiB)
21/10/14 23:25:45 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1816.0 B, free 413.9 MiB)
21/10/14 23:25:45 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on hadoop102:42450 (size: 1816.0 B, free: 413.9 MiB)
21/10/14 23:25:45 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1200
21/10/14 23:25:45 INFO DAGScheduler: Submitting 10 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9))
21/10/14 23:25:45 INFO TaskSchedulerImpl: Adding task set 0.0 with 10 tasks
21/10/14 23:25:46 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, hadoop102, executor driver, partition 0, PROCESS_LOCAL, 7393 bytes)
21/10/14 23:25:46 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, hadoop102, executor driver, partition 1, PROCESS_LOCAL, 7393 bytes)
21/10/14 23:25:46 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
21/10/14 23:25:46 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
21/10/14 23:25:46 INFO Executor: Fetching spark://hadoop102:36880/jars/spark-examples_2.12-3.0.0.jar with timestamp 1634225138989
21/10/14 23:25:46 INFO TransportClientFactory: Successfully created connection to hadoop102/192.168.100.102:36880 after 265 ms (0 ms spent in bootstraps)
21/10/14 23:25:46 INFO Utils: Fetching spark://hadoop102:36880/jars/spark-examples_2.12-3.0.0.jar to /tmp/spark-997faf68-9d5f-48cf-bc5d-ba59624f1a01/userFiles-14e86a58-0a9b-4489-afab-90c89993477e/fetchFileTemp5339847072173054765.tmp
21/10/14 23:25:47 INFO Executor: Adding file:/tmp/spark-997faf68-9d5f-48cf-bc5d-ba59624f1a01/userFiles-14e86a58-0a9b-4489-afab-90c89993477e/spark-examples_2.12-3.0.0.jar to class loader
21/10/14 23:25:49 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 1000 bytes result sent to driver
21/10/14 23:25:49 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 1043 bytes result sent to driver
21/10/14 23:25:49 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, hadoop102, executor driver, partition 2, PROCESS_LOCAL, 7393 bytes)
21/10/14 23:25:49 INFO Executor: Running task 2.0 in stage 0.0 (TID 2)
21/10/14 23:25:49 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, hadoop102, executor driver, partition 3, PROCESS_LOCAL, 7393 bytes)
21/10/14 23:25:49 INFO Executor: Running task 3.0 in stage 0.0 (TID 3)
21/10/14 23:25:49 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 3362 ms on hadoop102 (executor driver) (1/10)
21/10/14 23:25:49 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 3645 ms on hadoop102 (executor driver) (2/10)
21/10/14 23:25:49 INFO Executor: Finished task 2.0 in stage 0.0 (TID 2). 1000 bytes result sent to driver
21/10/14 23:25:49 INFO Executor: Finished task 3.0 in stage 0.0 (TID 3). 1000 bytes result sent to driver
21/10/14 23:25:49 INFO TaskSetManager: Starting task 4.0 in stage 0.0 (TID 4, hadoop102, executor driver, partition 4, PROCESS_LOCAL, 7393 bytes)
21/10/14 23:25:49 INFO Executor: Running task 4.0 in stage 0.0 (TID 4)
21/10/14 23:25:49 INFO TaskSetManager: Starting task 5.0 in stage 0.0 (TID 5, hadoop102, executor driver, partition 5, PROCESS_LOCAL, 7393 bytes)
21/10/14 23:25:49 INFO TaskSetManager: Finished task 2.0 in stage 0.0 (TID 2) in 517 ms on hadoop102 (executor driver) (3/10)
21/10/14 23:25:49 INFO Executor: Running task 5.0 in stage 0.0 (TID 5)
21/10/14 23:25:49 INFO TaskSetManager: Finished task 3.0 in stage 0.0 (TID 3) in 507 ms on hadoop102 (executor driver) (4/10)
21/10/14 23:25:50 INFO Executor: Finished task 4.0 in stage 0.0 (TID 4). 1000 bytes result sent to driver
21/10/14 23:25:50 INFO TaskSetManager: Starting task 6.0 in stage 0.0 (TID 6, hadoop102, executor driver, partition 6, PROCESS_LOCAL, 7393 bytes)
21/10/14 23:25:50 INFO TaskSetManager: Finished task 4.0 in stage 0.0 (TID 4) in 375 ms on hadoop102 (executor driver) (5/10)
21/10/14 23:25:50 INFO Executor: Running task 6.0 in stage 0.0 (TID 6)
21/10/14 23:25:50 INFO Executor: Finished task 5.0 in stage 0.0 (TID 5). 1000 bytes result sent to driver
21/10/14 23:25:50 INFO TaskSetManager: Starting task 7.0 in stage 0.0 (TID 7, hadoop102, executor driver, partition 7, PROCESS_LOCAL, 7393 bytes)
21/10/14 23:25:50 INFO TaskSetManager: Finished task 5.0 in stage 0.0 (TID 5) in 382 ms on hadoop102 (executor driver) (6/10)
21/10/14 23:25:50 INFO Executor: Running task 7.0 in stage 0.0 (TID 7)
21/10/14 23:25:50 INFO Executor: Finished task 6.0 in stage 0.0 (TID 6). 1000 bytes result sent to driver
21/10/14 23:25:50 INFO TaskSetManager: Starting task 8.0 in stage 0.0 (TID 8, hadoop102, executor driver, partition 8, PROCESS_LOCAL, 7393 bytes)
21/10/14 23:25:50 INFO TaskSetManager: Finished task 6.0 in stage 0.0 (TID 6) in 386 ms on hadoop102 (executor driver) (7/10)
21/10/14 23:25:50 INFO Executor: Finished task 7.0 in stage 0.0 (TID 7). 1000 bytes result sent to driver
21/10/14 23:25:50 INFO TaskSetManager: Starting task 9.0 in stage 0.0 (TID 9, hadoop102, executor driver, partition 9, PROCESS_LOCAL, 7393 bytes)
21/10/14 23:25:50 INFO TaskSetManager: Finished task 7.0 in stage 0.0 (TID 7) in 363 ms on hadoop102 (executor driver) (8/10)
21/10/14 23:25:50 INFO Executor: Running task 8.0 in stage 0.0 (TID 8)
21/10/14 23:25:50 INFO Executor: Running task 9.0 in stage 0.0 (TID 9)
21/10/14 23:25:50 INFO Executor: Finished task 8.0 in stage 0.0 (TID 8). 1000 bytes result sent to driver
21/10/14 23:25:51 INFO TaskSetManager: Finished task 8.0 in stage 0.0 (TID 8) in 380 ms on hadoop102 (executor driver) (9/10)
21/10/14 23:25:51 INFO Executor: Finished task 9.0 in stage 0.0 (TID 9). 1000 bytes result sent to driver
21/10/14 23:25:51 INFO TaskSetManager: Finished task 9.0 in stage 0.0 (TID 9) in 372 ms on hadoop102 (executor driver) (10/10)
21/10/14 23:25:51 INFO DAGScheduler: ResultStage 0 (reduce at SparkPi.scala:38) finished in 6.446 s
21/10/14 23:25:51 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
21/10/14 23:25:51 INFO DAGScheduler: Job 0 is finished. Cancelling potential speculative or zombie tasks for this job
21/10/14 23:25:51 INFO TaskSchedulerImpl: Killing all running tasks in stage 0: Stage finished
21/10/14 23:25:51 INFO DAGScheduler: Job 0 finished: reduce at SparkPi.scala:38, took 6.875487 s
Pi is roughly 3.143099143099143
21/10/14 23:25:51 INFO SparkUI: Stopped Spark web UI at http://hadoop102:4040
21/10/14 23:25:51 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
21/10/14 23:25:51 INFO MemoryStore: MemoryStore cleared
21/10/14 23:25:51 INFO BlockManager: BlockManager stopped
21/10/14 23:25:51 INFO BlockManagerMaster: BlockManagerMaster stopped
21/10/14 23:25:51 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
21/10/14 23:25:51 INFO SparkContext: Successfully stopped SparkContext
21/10/14 23:25:51 INFO ShutdownHookManager: Shutdown hook called
21/10/14 23:25:51 INFO ShutdownHookManager: Deleting directory /tmp/spark-5c007d29-3e78-4b75-b410-8b4be66fcd29
21/10/14 23:25:51 INFO ShutdownHookManager: Deleting directory /tmp/spark-997faf68-9d5f-48cf-bc5d-ba59624f1a01
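SparkPi estimates pi by Monte Carlo sampling: it scatters random points over the square [-1,1] x [-1,1] and counts the fraction that lands inside the unit circle, which approximates pi/4. A minimal sketch of the same idea, runnable in spark-shell (this mirrors the bundled example but is not its exact source; the variable names are ours):

val slices  = 10                     // plays the role of the "10" argument above
val samples = 100000 * slices
val inside = sc.parallelize(1 to samples, slices).map { _ =>
  val x = math.random * 2 - 1        // random point in [-1, 1] x [-1, 1]
  val y = math.random * 2 - 1
  if (x * x + y * y <= 1) 1 else 0   // does the point fall inside the unit circle?
}.reduce(_ + _)
println(s"Pi is roughly ${4.0 * inside / samples}")

With a million samples the printed estimate lands close to the 3.1431 reported by the example run above.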