Notes:
Cloudera CDH 6.3.1
Spark 2.4
1. Spark Installation Directory Structure
Main installation directories for the Spark component files:
{BIGDATA_HOME} stands for /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567
![](https://img.haomeiwen.com/i2638478/44b840de223e9a86.png)
Directory | Description |
---|---|
bin | Executables, including spark-submit and spark-shell |
etc | Configuration files |
lib, lib64 | Spark dependency directories |
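As a quick sanity check, the layout above can be verified from a shell. This is a sketch only: the parcel path is the one quoted above, and on a host without CDH installed each directory will simply report as missing.

```shell
# Parcel root as documented above (assumption: CDH 6.3.1 parcel is installed here)
BIGDATA_HOME=/opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567

# Report which of the documented subdirectories exist on this host
for d in bin etc lib lib64; do
  if [ -d "$BIGDATA_HOME/$d" ]; then
    echo "$d: present"
  else
    echo "$d: missing (not a CDH host?)"
  fi
done

# Which spark-submit is actually on the PATH
command -v spark-submit || echo "spark-submit not on PATH"
```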
2. Spark Log Directory Structure
Spark service log directory: /var/log/spark/
![](https://img.haomeiwen.com/i2638478/1c75b70257c84c65.png)
spark-history-server-<hostname>.log is the run log of the History Server service.
The log files under the lineage directory are logs written by running Spark applications.
Viewing Spark job logs:
![](https://img.haomeiwen.com/i2638478/5c80a2bfcf92b1a2.png)
![](https://img.haomeiwen.com/i2638478/04c2bd1253dc6a9c.png)
![](https://img.haomeiwen.com/i2638478/ac1bdef5a8ce956d.png)
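The logs above can also be inspected from a shell. The commands below are a sketch: the application id is a placeholder, and `yarn logs` only works for applications whose logs YARN has aggregated.

```shell
# Service log directory as documented above
LOG_DIR=/var/log/spark

# History Server run log (file name contains the local hostname)
tail -n 50 "$LOG_DIR"/spark-history-server-*.log 2>/dev/null \
  || echo "no History Server log on this host"

# Logs under the lineage directory, written by running Spark applications
ls "$LOG_DIR/lineage" 2>/dev/null || echo "no lineage logs on this host"

# Aggregated job logs for a finished YARN application; the id below is a
# placeholder taken from the ResourceManager UI, so the line is commented out.
# yarn logs -applicationId application_1570000000000_0001
```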
3. Maintenance Commands and Parameters
Since jobs are submitted with the spark-submit command, this section focuses on spark-submit's parameters.
```shell
spark-submit \
  --master MASTER_URL \
  --deploy-mode DEPLOY_MODE \
  --conf PROP=VALUE \
  --py-files PY_FILES \
  ... # other options
  <python file> [app arguments]
```
- Master URLs
  - Local: local, local[K], local[*]
  - Standalone: spark://HOST:PORT
  - Mesos: mesos://HOST:PORT
  - YARN: yarn (picks up the local Hadoop configuration); the yarn-client and yarn-cluster values are deprecated since Spark 2.0 in favor of --master yarn plus --deploy-mode client or cluster
For example:

```shell
spark-submit \
  --master yarn \
  --deploy-mode client \
  --executor-memory 512M \
  --driver-memory 512M \
  --num-executors 3 \
  --executor-cores 2 \
  --queue root.spark \
  sparkpi.py 100
```
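The sparkpi.py driver submitted above is not included in the post; a minimal sketch of such a script follows. The function names and the sample count of 100000 per partition are assumptions, modeled on the standard Pi example shipped with Spark.

```python
# Sketch of a sparkpi.py-style driver (assumed; not the exact script from the
# example). Estimates pi by Monte Carlo sampling spread across partitions.
import sys
from operator import add
from random import random

def inside(_):
    """Sample one point in [-1, 1]^2; return 1 if it lands in the unit circle."""
    x, y = random() * 2 - 1, random() * 2 - 1
    return 1 if x * x + y * y <= 1 else 0

def estimate_pi(hits, samples):
    """Circle/square area ratio is pi/4, so pi ~= 4 * hits / samples."""
    return 4.0 * hits / samples

if __name__ == "__main__":
    try:
        from pyspark.sql import SparkSession  # available when run via spark-submit
    except ImportError:
        print("pyspark not available; submit this file with spark-submit")
    else:
        # First CLI argument is the number of partitions, e.g. "100" above
        partitions = int(sys.argv[1]) if len(sys.argv) > 1 else 2
        n = 100000 * partitions
        spark = SparkSession.builder.appName("PythonPi").getOrCreate()
        hits = (spark.sparkContext
                .parallelize(range(n), partitions)
                .map(inside)
                .reduce(add))
        print("Pi is roughly %f" % estimate_pi(hits, n))
        spark.stop()
```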