Spark: Reading MySQL Data as a DataFrame

Author: 利伊奥克儿 | Published 2018-10-12 15:18

    This post shows how to read MySQL data into a Spark DataFrame.

    import java.io.FileInputStream
    import java.util.Properties

    import org.apache.spark.sql.{DataFrame, SQLContext}
    import org.apache.spark.{SparkConf, SparkContext}
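
    // NOTE: the MySQL JDBC driver (com.mysql.jdbc.Driver) must be on the
    // classpath; in sbt, for example (pin the version to your server):
    //   libraryDependencies += "mysql" % "mysql-connector-java" % "5.1.47"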
    
      /**
        * Load the configuration file
        *
        * @param proPath path to the properties file
        * @return the loaded Properties
        */
      def getProPerties(proPath: String): Properties = {
        val properties: Properties = new Properties()
        properties.load(new FileInputStream(proPath))
        properties
      }
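
      // A quick sanity check of the loaded config (path is hypothetical):
      //   val props = getProPerties("/home/user/conf/job.properties")
      //   println(props.getProperty("mysql.url"))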
    
      /**
        * Fetch a MySQL table's data
        *
        * @param sqlContext
        * @param tableName name of the MySQL table to read
        * @param proPath   path to the configuration file
        * @return the MySQL table as a DataFrame
        */
      def readMysqlTable(sqlContext: SQLContext, tableName: String, proPath: String): DataFrame = {
        val properties: Properties = getProPerties(proPath)
        sqlContext
          .read
          .format("jdbc")
          .option("url", properties.getProperty("mysql.url"))
          .option("driver", properties.getProperty("mysql.driver"))
          .option("user", properties.getProperty("mysql.username"))
          .option("password", properties.getProperty("mysql.password"))
          .option("dbtable", tableName)
          .load()
      }
    
      /**
        * Fetch a MySQL table's data, with a filter condition
        *
        * @param sqlContext
        * @param table           name of the MySQL table to read
        * @param filterCondition filter condition (body of a SQL WHERE clause)
        * @param proPath         path to the configuration file
        * @return the filtered MySQL table as a DataFrame
        */
      def readMysqlTable(sqlContext: SQLContext, table: String, filterCondition: String, proPath: String): DataFrame = {
        val properties: Properties = getProPerties(proPath)
        // Wrap the condition in a subquery so the derived table acts as the data source
        val tableName = s"(select * from $table where $filterCondition) as t1"
        sqlContext
          .read
          .format("jdbc")
          .option("url", properties.getProperty("mysql.url"))
          .option("driver", properties.getProperty("mysql.driver"))
          .option("user", properties.getProperty("mysql.username"))
          .option("password", properties.getProperty("mysql.password"))
          .option("dbtable", tableName)
          .load()
      }
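
    Both helpers above pull the whole table (or subquery) through a single JDBC connection. For large tables, Spark's JDBC source can split the read across executors via its standard partitioning options. A minimal sketch along the same lines (the numeric, indexed column `id` and its bounds are assumptions; query the real min/max first in practice):

      def readMysqlTablePartitioned(sqlContext: SQLContext, tableName: String, proPath: String): DataFrame = {
        val properties: Properties = getProPerties(proPath)
        sqlContext
          .read
          .format("jdbc")
          .option("url", properties.getProperty("mysql.url"))
          .option("driver", properties.getProperty("mysql.driver"))
          .option("user", properties.getProperty("mysql.username"))
          .option("password", properties.getProperty("mysql.password"))
          .option("dbtable", tableName)
          .option("partitionColumn", "id") // assumed numeric, indexed column
          .option("lowerBound", "1")       // assumed min(id)
          .option("upperBound", "1000000") // assumed max(id)
          .option("numPartitions", "8")    // number of parallel JDBC reads
          .load()
      }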
    

    Usage example

    // Without a filter condition
    val conf: SparkConf = new SparkConf().setAppName(getClass.getSimpleName)
    val sc: SparkContext = new SparkContext(conf)
    val sqlContext: SQLContext = getSQLContext(sc)
    import sqlContext.implicits._
    val test_table_dataFrame: DataFrame = readMysqlTable(sqlContext, "TEST_TABLE", proPath).persist(PERSIST_LEVEL)
    ----------------------------------------------------------------------------------------------------
    // With a filter condition
    // Fetch the rows where task_id = ${task_id} as a DataFrame
    val conf: SparkConf = new SparkConf().setAppName(getClass.getSimpleName)
    val sc: SparkContext = new SparkContext(conf)
    val sqlContext: SQLContext = getSQLContext(sc)
    import sqlContext.implicits._
    val test_table_dataFrame = readMysqlTable(sqlContext, "TEST_TABLE", s"task_id=${task_id}", configPath)
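
    getSQLContext and PERSIST_LEVEL come from the author's JobBase, which is not shown. A minimal sketch consistent with the usage above, assuming no Hive support is needed:

    import org.apache.spark.storage.StorageLevel

    // Plain SQLContext over the given SparkContext (assumption: JobBase does the same)
    def getSQLContext(sc: SparkContext): SQLContext = new SQLContext(sc)

    // Cache in memory, spilling to disk when memory runs out (assumed default)
    val PERSIST_LEVEL = StorageLevel.MEMORY_AND_DISK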
    
    

    Configuration file excerpt

    # MySQL database settings
    mysql.driver=com.mysql.jdbc.Driver
    mysql.url=jdbc:mysql://0.0.0.0:3306/iptv?useSSL=false&autoReconnect=true&failOverReadOnly=false&rewriteBatchedStatements=true
    mysql.username=lillclol
    mysql.password=123456
    
    #hive
    hive.root_path=hdfs://ns1/user/hive/warehouse/
    

    This is an original write-up from my day-to-day work; please credit the source when reposting!
