
2020-06-29 Setting up a Jupyter environment for Spark 3

Author: 缘起助 | Published 2020-06-29 20:55
  1. Download spark-3.0.0-bin-hadoop3.2.tgz
    Pick a mirror and download Spark to your local machine.

  2. Extract the archive
    tar zxf spark-3.0.0-bin-hadoop3.2.tgz

  3. Install Jupyter

pip3 install --user jupyterlab notebook
pip3 install --user findspark
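
findspark locates a local Spark installation at runtime and puts its bundled pyspark on sys.path, which is what lets an ordinary Python kernel import pyspark later. As a quick sanity check that the --user installs are importable (a minimal sketch; your version number will differ):

import findspark  # locates SPARK_HOME and exposes pyspark on sys.path
import notebook   # the Jupyter Notebook server package

print(notebook.__version__)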
  4. Create a new kernel for Jupyter
cd /usr/local/share/jupyter/kernels/
mkdir pyspark3
vi pyspark3/kernel.json

Paste the following into the file:

{
  "display_name": "PySpark",
  "language": "python",
  "argv": [
    "/usr/local/bin/python3",
    "-m",
    "ipykernel",
    "-f",
    "{connection_file}"
  ],
  "env": {
    "SPARK_HOME": "/Users/apple/Downloads/spark-3.0.0-bin-hadoop3.2/",
    "PYTHONPATH": "/Users/apple/Downloads/spark-3.0.0-bin-hadoop3.2/python/:/Users/apple/Downloads/spark-3.0.0-bin-hadoop3.2/python/lib/py4j-0.10.9-src.zip",
    "PYTHONSTARTUP": "/Users/apple/Downloads/spark-3.0.0-bin-hadoop3.2/python/pyspark/shell.py",
    "PYSPARK_SUBMIT_ARGS": "--master local[*] --conf spark.executor.cores=1 --conf spark.executor.memory=512m pyspark-shell"
  }
}
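
The env block does the wiring: SPARK_HOME points at the unpacked distribution, PYTHONPATH exposes the bundled pyspark and py4j sources, PYTHONSTARTUP points at Spark's shell bootstrap script, and PYSPARK_SUBMIT_ARGS pins the master and executor sizing. To confirm Jupyter can see the new kernel, a minimal check via jupyter_client (which ships as a dependency of notebook):

from jupyter_client.kernelspec import KernelSpecManager

# List all kernel names Jupyter can find; 'pyspark3' should appear.
print(sorted(KernelSpecManager().get_all_specs()))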
  5. Modify the pyspark settings
cd /Users/apple/Downloads/spark-3.0.0-bin-hadoop3.2/conf
cp spark-env.sh.template spark-env.sh
vi spark-env.sh

Paste the following into the file:

export JAVA_HOME=$(/usr/libexec/java_home -v 1.8)
export PYSPARK_PYTHON='python3'
export PYSPARK_DRIVER_PYTHON='python3'
export PYSPARK_DRIVER_PYTHON_OPTS=' -m jupyter notebook'
  6. Start pyspark
    Because spark-env.sh now points the driver Python at Jupyter, bin/pyspark launches a notebook server instead of the usual shell. Run in a terminal:
cd /Users/apple/Downloads/spark-3.0.0-bin-hadoop3.2/
bin/pyspark

Jupyter Notebook should now start. Click New and choose "PySpark".

  7. Try it out. In the notebook, enter:
import findspark

findspark.init()  # locate SPARK_HOME before importing pyspark

import pyspark
import random

# Create a local SparkContext; "Pi" is the application name shown in the UI.
sc = pyspark.SparkContext(appName="Pi")

sc

It should return something like:

SparkContext

Spark UI

Version: v3.0.0
Master: local[*]
AppName: Pi
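
Since the context is named "Pi", a natural first job is the classic Monte Carlo estimate of π from the Spark examples; a minimal sketch, reusing the sc and random defined above:

def inside(_):
    # Draw a random point in the unit square and keep it if it lands
    # inside the quarter circle of radius 1.
    x, y = random.random(), random.random()
    return x * x + y * y < 1

NUM_SAMPLES = 100000
count = sc.parallelize(range(NUM_SAMPLES)).filter(inside).count()
print("Pi is roughly %f" % (4.0 * count / NUM_SAMPLES))

With --master local[*], the sampling is spread across all local cores.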

  8. Thanks!
