Creating a DataFrame

Author: 一个人一匹马 | Published: 2019-02-21 15:45

    Java version:

    JavaSparkContext sc = ...;
    SQLContext sqlContext = new SQLContext(sc);
    DataFrame df = sqlContext.read().json("hdfs://spark1:9000/students.json");
    df.show();

    Scala version:

    val sc: SparkContext = ...
    val sqlContext = new SQLContext(sc)
    val df = sqlContext.read.json("hdfs://spark1:9000/students.json")
    df.show()

    Example: JSON data source

    {"id":1, "name":"leo", "age":18}
    {"id":2, "name":"jack", "age":19}
    {"id":3, "name":"marry", "age":17}
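Spark's JSON reader expects each line of the input file to hold one complete, self-contained JSON object, rather than a single multi-line JSON document. The following is a minimal sketch in plain Java (no Spark dependency) that writes the three sample records in that layout. The class name `StudentsJsonWriter` and the default file name `students.json` in the working directory are assumptions made for illustration and are not part of the original example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class StudentsJsonWriter {
    // The three sample records; each string is one complete JSON object.
    public static final List<String> RECORDS = Arrays.asList(
        "{\"id\":1, \"name\":\"leo\", \"age\":18}",
        "{\"id\":2, \"name\":\"jack\", \"age\":19}",
        "{\"id\":3, \"name\":\"marry\", \"age\":17}");

    // Writes the records to the target file, one per line, and returns the path.
    public static Path write(Path target) throws IOException {
        return Files.write(target, RECORDS, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // "students.json" is an assumed default; pass another path as the first argument.
        Path target = Paths.get(args.length > 0 ? args[0] : "students.json");
        System.out.println("Wrote " + write(target).toAbsolutePath());
    }
}
```

The resulting file can then be read locally, or uploaded to HDFS for the cluster run.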

    Java version

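    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.sql.DataFrame;
    import org.apache.spark.sql.SQLContext;

    public class DataFrameCreate {
        public static void main(String[] args) {
            // Run locally in a single thread
            SparkConf conf = new SparkConf().setAppName("DataFrameCreate").setMaster("local");
            JavaSparkContext sc = new JavaSparkContext(conf);
            SQLContext sqlContext = new SQLContext(sc);
            // Load the JSON file as a DataFrame
            DataFrame df = sqlContext.read().json("C:\\Users\\zhang\\Desktop\\students.json");
            // Print the rows as a table
            df.show();
        }
    }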
    

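    Running on a Linux cluster

    1. Package the application as a jar, first changing the file path to hdfs://spark1:9000/students.json. Also remove the hard-coded setMaster("local"): values set on SparkConf take precedence, so leaving it in would override the master supplied by spark-submit or the cluster configuration.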
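As an alternative to editing the path before each build, the input path could be passed as a program argument. The following is a minimal sketch of that idea in plain Java. The helper class `InputPath` and its `resolve` method are hypothetical names introduced here and do not appear in the original code; the default value is the HDFS path used in this article.

```java
public class InputPath {
    // Default taken from the cluster example in this article.
    public static final String DEFAULT_PATH = "hdfs://spark1:9000/students.json";

    // Returns the first program argument if one was given, otherwise the default path.
    public static String resolve(String[] args) {
        boolean hasArg = args != null && args.length > 0 && !args[0].isEmpty();
        return hasArg ? args[0] : DEFAULT_PATH;
    }

    public static void main(String[] args) {
        // In DataFrameCreate, this value would be passed to sqlContext.read().json(...).
        System.out.println("Reading from " + resolve(args));
    }
}
```

With spark-submit, any arguments placed after the jar path are handed to the application's `main` method, so the same jar could read a local file during testing and the HDFS file on the cluster.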

    Shell script

    spark-submit \
      --class sql.DataFrameCreate \
      --num-executors 3 \
      --driver-memory 100m \
      --executor-memory 100m \
      --executor-cores 3 \
      --files /usr/local/hive/conf/hive-site.xml \
      --driver-class-path /usr/local/hive/lib/mysql-connector-java-5.1.17.jar \
      /sql/worldcount.jar
    

    Scala version

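    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    object DataFrameCreate {
      def main(args: Array[String]): Unit = {
        // No master is set here; it comes from spark-submit or the cluster defaults
        val conf = new SparkConf().setAppName("DataFrameCreate")
        val sc = new SparkContext(conf)
        val sqlContext = new SQLContext(sc)
        // Load the JSON file from HDFS as a DataFrame
        val df = sqlContext.read.json("hdfs://spark1:9000/students.json")
        // Print the rows as a table
        df.show()
      }
    }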
    


    Article link: https://www.haomeiwen.com/subject/fuaryqtx.html