Spark: 2.3.0
Livy: 0.5.0
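Download the binary (bin) package.
(Note: the Hadoop version in the lib directory of the downloaded Livy package is 2.7.3, which may cause problems!)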
Configure livy-env.sh
# - JAVA_HOME Java runtime to use. By default use "java" from PATH.
# - HADOOP_CONF_DIR Directory containing the Hadoop / YARN configuration to use.
# - SPARK_HOME Spark which you would like to use in Livy.
# - SPARK_CONF_DIR Optional directory where the Spark configuration lives.
# (Default: $SPARK_HOME/conf)
# - LIVY_LOG_DIR Where log files are stored. (Default: ${LIVY_HOME}/logs)
# - LIVY_PID_DIR Where the pid file is stored. (Default: /tmp)
# - LIVY_SERVER_JAVA_OPTS Java Opts for running livy server (You can set jvm related setting here,
# like jvm memory/gc algorithm and etc.)
# - LIVY_IDENT_STRING A name that identifies the Livy server instance, used to generate log file
# names. (Default: name of the user starting Livy).
# - LIVY_MAX_LOG_FILES Max number of log file to keep in the log directory. (Default: 5.)
# - LIVY_NICENESS Niceness of the Livy server process when running in the background. (Default: 0.)
JAVA_HOME=/usr/java/jdk1.8.0_172
HADOOP_CONF_DIR=/etc/hadoop/conf
SPARK_HOME=/usr/lib/apacheori/spark-2.3.0-bin-hadoop2.6
Configure livy.conf
livy.server.host = node203.hmbank.com
# What port to start the server on.
livy.server.port = 8998
# What spark master Livy sessions should use.
livy.spark.master = spark://node202.hmbank.com:7077
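The REST API does not support specifying the master; it has to be set in the configuration file above.
The configuration above points the master at a Spark standalone cluster.
When running Spark on YARN, set livy.spark.master to yarn instead.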
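Since the master comes from livy.conf, a request that creates a session only needs to say what kind of session it wants. Below is a minimal sketch of such a request, assuming Python 3 and only the standard library. The helper names build_create_session_request and create_session are made up for illustration; the host and port are the ones from the livy.conf above.

```python
import json
import urllib.request

# Host and port are taken from the livy.conf shown above.
LIVY_URL = "http://node203.hmbank.com:8998"


def build_create_session_request(base_url, kind="spark"):
    """Build (but do not send) a POST /sessions request.

    The body carries only the session kind. The Spark master is not part
    of it, because Livy reads it from livy.spark.master in livy.conf.
    """
    body = json.dumps({"kind": kind}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/sessions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_session(base_url, kind="spark"):
    """Send the request and return the parsed JSON describing the new session."""
    request = build_create_session_request(base_url, kind)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Usage (requires a running Livy server reachable at LIVY_URL):
#   session = create_session(LIVY_URL)
#   print(session["id"], session["state"])
```

Building the request separately from sending it makes it possible to inspect the URL and body without a running server.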