Java版本:
JavaSparkContext sc = ...;
SQLContext sqlContext = new SQLContext(sc);
DataFrame df = sqlContext.read().json("hdfs://spark1:9000/students.json");
df.show();
Scala版本:
val sc: SparkContext = ...
val sqlContext = new SQLContext(sc)
val df = sqlContext.read.json("hdfs://spark1:9000/students.json")
df.show()
案例 json数据源
{"id":1, "name":"leo", "age":18}
{"id":2, "name":"jack", "age":19}
{"id":3, "name":"marry", "age":17}
Java版本
public class DataFrameCreate {
public static void main(String[] args) {
SparkConf conf = newSparkConf().setAppName("DataFrameCreate").setMaster("local");
JavaSparkContext sc = new JavaSparkContext(conf);
SQLContext sqlContext = new SQLContext(sc);
DataFrame df = sqlContext.read().json("C:\\Users\\zhang\\Desktop\\students.json")
df.show();
}
}
运行到linux集群上面
- 打包 文件路径改成hdfs://spark1:9000/students.json
Sh文件
spark-submit \
--class sql.DataFrameCreate \
--num-executors 3 \
--driver-memory 100m \
--executor-memory 100m \
--executor-cores 3 \
--files /usr/local/hive/conf/hive-site.xml \
--driver-class-path /usr/local/hive/lib/mysql-connector-java-5.1.17.jar \
/sql/worldcount.jar \
Scala版本
object DataFrameCreate {
def main(args: Array[String]){
val conf = new SparkConf().setAppName("DataFrameCreate")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
val df = sqlContext.read.json("hdfs://spark1:9000/students.json")
df.show()
}
}
网友评论