- Wrote a Spark job and built it:
cd C:\Users\Administrator\IdeaProjects\SparkSQLProject
mvn clean package -DskipTests
- Packaged it as a jar, uploaded it with rz to the /home/hadoop/Downloads/ directory on the Spark server, and submitted it in local mode:
spark-submit \
--class com.xxx.cn.SQLContextApp \
--master local[2] \
/home/hadoop/Downloads/sparksql-1.0.jar \
/home/hadoop/app/spark-2.2.0-bin-2.6.0-cdh5.7.0/examples/src/main/resources/people.json
- Error message (key line):
Exception in thread "main" org.apache.spark.sql.AnalysisException: Path does not exist: hdfs://hadoop000:8020/home/hadoop/app/...
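The exception shows what went wrong: the local JSON path carries no scheme, so Hadoop resolves it against fs.defaultFS and looks for it on HDFS instead of the local disk. A minimal Python sketch of that resolution rule (the hdfs://hadoop000:8020 default comes from the error message above; the function name is made up for illustration):

```python
from urllib.parse import urlsplit

def resolve_against_default_fs(path, default_fs="hdfs://hadoop000:8020"):
    """Mimic Hadoop's behavior: a path without a scheme is resolved
    against fs.defaultFS; a path with an explicit scheme
    (file://, hdfs://) is left untouched."""
    if urlsplit(path).scheme:        # explicit scheme -> keep as-is
        return path
    return default_fs + path         # scheme-less -> default filesystem

# A bare local path is silently sent to HDFS ...
print(resolve_against_default_fs("/home/hadoop/Downloads/people.json"))
# ... while an explicit file:// URI stays local.
print(resolve_against_default_fs("file:///home/hadoop/Downloads/people.json"))
```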
- Solution:
https://stackoverflow.com/questions/27299923/how-to-load-local-file-in-sc-textfile-instead-of-hdfs
Try explicitly specifying sc.textFile("file:///path to the file/"). The error occurs when a Hadoop environment is set: SparkContext.textFile internally calls org.apache.hadoop.mapred.FileInputFormat.getSplits, which in turn uses org.apache.hadoop.fs.getDefaultUri if the scheme is absent. This method reads the "fs.defaultFS" parameter of the Hadoop conf. If you set the HADOOP_CONF_DIR environment variable, the parameter is usually set as "hdfs://..."; otherwise "file://".
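Following that advice, one way to avoid the problem without touching spark-env.sh is to always hand Spark an explicit file:// URI for local inputs. A hypothetical helper sketch (the function name is my own, not a Spark API):

```python
import os

def as_local_uri(path):
    """Turn a local filesystem path into an explicit file:// URI,
    so Spark/Hadoop never falls back to fs.defaultFS."""
    if path.startswith(("file://", "hdfs://")):
        return path                           # already has a scheme
    return "file://" + os.path.abspath(path)  # e.g. file:///home/...

# e.g. pass as_local_uri("/home/hadoop/.../people.json") as the
# input argument instead of the bare path.
```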
- So, comment out HADOOP_CONF_DIR in spark-env.sh:
vi spark-env.sh
export JAVA_HOME=/home/hadoop/app/jdk1.8.0_144
export SPARK_MASTER_HOST=hadoop000
export SPARK_WORKER_CORES=2
export SPARK_WORKER_MEMORY=1g
export SPARK_WORKER_INSTANCES=2
export HADOOP_HOME=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0
#export HADOOP_CONF_DIR=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0/etc/hadoop
- Ran spark-submit again, and this time it succeeded.