Installing sparkmagic for Jupyter

Author: 金刚_30bf | Published: 2018-05-28 20:48

sparkMagic : https://github.com/jupyter-incubator/sparkmagic

  1. Download sparkmagic
    This is an offline environment, so download the package from https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/linux-64/ :
    sparkmagic-0.12.5-py36h8c657a7_0.tar.bz2

  2. Install it with conda:

conda install *.bz2 
  3. Check that ipywidgets is installed correctly (it should have been installed together with Jupyter):
jupyter nbextension enable --py --sys-prefix widgetsnbextension 
  4. Use pip show sparkmagic to find where the package is installed:
[root@node203 offlinePython3Pkg]# pip show sparkmagic
Name: sparkmagic
Version: 0.12.5
Summary: SparkMagic: Spark execution via Livy
Home-page: https://github.com/jupyter-incubator/sparkmagic
Author: Jupyter Development Team
Author-email: jupyter@googlegroups.org
License: BSD 3-clause
Location: /usr/anaconda3/lib/python3.6/site-packages

  5. Change to the install directory (the Location reported by pip show) and install the bundled kernels:
 jupyter-kernelspec install sparkmagic/kernels/sparkkernel
 jupyter-kernelspec install sparkmagic/kernels/pysparkkernel
 jupyter-kernelspec install sparkmagic/kernels/pyspark3kernel
 jupyter-kernelspec install sparkmagic/kernels/sparkrkernel
  6. Configure ~/.sparkmagic/config.json

  7. Enable the server extension

jupyter serverextension enable --py sparkmagic

This fails with:

[root@node203 jupyter]# jupyter serverextension enable --py sparkmagic
Traceback (most recent call last):
  File "/usr/anaconda3/bin/jupyter-serverextension", line 11, in <module>
    sys.exit(main())
  File "/usr/anaconda3/lib/python3.6/site-packages/jupyter_core/application.py", line 266, in launch_instance
    return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
  File "/usr/anaconda3/lib/python3.6/site-packages/traitlets/config/application.py", line 658, in launch_instance
    app.start()
  File "/usr/anaconda3/lib/python3.6/site-packages/notebook/serverextensions.py", line 293, in start
    super(ServerExtensionApp, self).start()
  File "/usr/anaconda3/lib/python3.6/site-packages/jupyter_core/application.py", line 255, in start
    self.subapp.start()
  File "/usr/anaconda3/lib/python3.6/site-packages/notebook/serverextensions.py", line 210, in start
    self.toggle_server_extension_python(arg)
  File "/usr/anaconda3/lib/python3.6/site-packages/notebook/serverextensions.py", line 199, in toggle_server_extension_python
    m, server_exts = _get_server_extension_metadata(package)
  File "/usr/anaconda3/lib/python3.6/site-packages/notebook/serverextensions.py", line 327, in _get_server_extension_metadata
    m = import_item(module)
  File "/usr/anaconda3/lib/python3.6/site-packages/traitlets/utils/importstring.py", line 42, in import_item
    return __import__(parts[0])
  File "/usr/anaconda3/lib/python3.6/site-packages/sparkmagic/__init__.py", line 3, in <module>
    from sparkmagic.serverextension.handlers import load_jupyter_server_extension
  File "/usr/anaconda3/lib/python3.6/site-packages/sparkmagic/serverextension/handlers.py", line 9, in <module>
    from sparkmagic.kernels.kernelmagics import KernelMagics
  File "/usr/anaconda3/lib/python3.6/site-packages/sparkmagic/kernels/__init__.py", line 1, in <module>
    from sparkmagic.kernels.kernelmagics import *
  File "/usr/anaconda3/lib/python3.6/site-packages/sparkmagic/kernels/kernelmagics.py", line 12, in <module>
    from hdijupyterutils.utils import generate_uuid
ModuleNotFoundError: No module named 'hdijupyterutils'

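A Python dependency is missing: download hdijupyterutils-0.12.5-py36hc0bb8fd_0.tar.bz2

An offline install is missing too many packages, and the dependencies are hard to manage by hand.

(Either install online through a proxy, or use a separate networked environment to download the required packages first.)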
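
Before enabling the extension on an offline machine, it can help to see in one pass which dependencies are still missing, instead of hitting one ModuleNotFoundError at a time. The following is a minimal sketch, not part of sparkmagic. The module list is an assumption derived from the packages conda pulls in for sparkmagic (requests-kerberos is assumed to be importable as requests_kerberos), so adjust it to your version.

```python
import importlib.util

# Assumed import names for sparkmagic and the packages conda installs with it.
REQUIRED_MODULES = [
    "sparkmagic",
    "hdijupyterutils",
    "autovizwidget",
    "plotly",
    "requests_kerberos",
]


def missing_modules(names):
    """Return the module names that cannot be found on the current sys.path."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Treat a broken or half-installed module as missing.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in missing_modules(REQUIRED_MODULES):
        print("missing:", name)
```

Run it with the same interpreter that Jupyter uses (here /usr/anaconda3/bin/python); every name it prints still needs a .tar.bz2 package downloaded and installed.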

  8. Install sparkmagic in a networked environment:
## Package Plan ##

  environment location: /usr/anaconda3

  added / updated specs: 
    - sparkmagic


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    pykerberos-1.1.14          |           py36_0          46 KB  https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    requests-kerberos-0.11.0   |           py36_0          15 KB  https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    plotly-2.0.11              |           py36_0         937 KB  https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    autovizwidget-0.12.1       |           py36_0          21 KB  https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    ca-certificates-2017.08.26 |       h1d4fec5_0         263 KB  https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
    hdijupyterutils-0.12.1     |           py36_0          13 KB  https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    sparkmagic-0.12.1          |           py36_0          64 KB  https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    certifi-2018.1.18          |           py36_0         144 KB  https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
    openssl-1.0.2n             |       hb7f436b_0         3.4 MB  https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
    krb5-1.13.2                |                0         3.5 MB  https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    ------------------------------------------------------------
                                           Total:         8.5 MB

The following NEW packages will be INSTALLED:

    autovizwidget:     0.12.1-py36_0         https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    hdijupyterutils:   0.12.1-py36_0         https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    krb5:              1.13.2-0              https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    plotly:            2.0.11-py36_0         https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    pykerberos:        1.1.14-py36_0         https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    requests-kerberos: 0.11.0-py36_0         https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free
    sparkmagic:        0.12.1-py36_0         https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free

The following packages will be UPDATED:

    ca-certificates:   2017.08.26-h1d4fec5_0 defaults                                                --> 2017.08.26-h1d4fec5_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
    certifi:           2018.1.18-py36_0      defaults                                                --> 2018.1.18-py36_0      https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
    openssl:           1.0.2n-hb7f436b_0     defaults                                                --> 1.0.2n-hb7f436b_0     https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main

Proceed ([y]/n)? y


Downloading and Extracting Packages
pykerberos 1.1.14: ###################################################################################################################################################################################### | 100% 
requests-kerberos 0.11.0: ############################################################################################################################################################################### | 100% 
plotly 2.0.11: ########################################################################################################################################################################################## | 100% 
autovizwidget 0.12.1: ################################################################################################################################################################################### | 100% 
ca-certificates 2017.08.26: ############################################################################################################################################################################# | 100% 
hdijupyterutils 0.12.1: ################################################################################################################################################################################# | 100% 
sparkmagic 0.12.1: ###################################################################################################################################################################################### | 100% 
certifi 2018.1.18: ###################################################################################################################################################################################### | 100% 
openssl 1.0.2n: ######################################################################################################################################################################################### | 100% 
krb5 1.13.2: ############################################################################################################################################################################################ | 100% 
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
  9. After the online install, the check succeeds:
[root@repo site-packages]# jupyter serverextension enable --py sparkmagic
Enabling: sparkmagic
- Writing config: /root/.jupyter
    - Validating...
      sparkmagic  OK
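
With the extension enabled, it may also be worth confirming that the four kernels installed earlier with jupyter-kernelspec install are actually registered. The sketch below parses the output of jupyter kernelspec list --json. It assumes each kernel is registered under the name of its source directory (sparkkernel, pysparkkernel, and so on), which is the default behavior but not something the steps above verify. The helper names are made up for this example.

```python
import json
import subprocess

# Assumed kernel names: the directory names passed to jupyter-kernelspec install.
EXPECTED_KERNELS = ["sparkkernel", "pysparkkernel", "pyspark3kernel", "sparkrkernel"]


def missing_kernels(listing_json, expected=EXPECTED_KERNELS):
    """Given the text of `jupyter kernelspec list --json`, return absent kernels."""
    installed = json.loads(listing_json).get("kernelspecs", {})
    return [name for name in expected if name not in installed]


def current_listing():
    """Run `jupyter kernelspec list --json` and return its output as text."""
    return subprocess.check_output(
        ["jupyter", "kernelspec", "list", "--json"], universal_newlines=True
    )
```

Calling missing_kernels(current_listing()) on the Jupyter host should return an empty list when all four kernels are in place.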

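  10. Install and configure Livy

  11. Configure ~/.sparkmagic/config.json
The example configuration file from the sparkmagic git repository is used.
Note: replace master in master:8998 with the hostname of the machine running Livy.
  12. Configure Jupyter
    Generate the configuration file with: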
 jupyter notebook --generate-config

The configuration files are located in ~/.jupyter/

  13. Change the notebook start directory in jupyter_notebook_config.json:
"notebook_dir":"/usr/anaconda3/dubook"
  14. Start the notebook server:
jupyter notebook --no-browser --allow-root --ip=node203.hmbank.com --port=8888 &
  15. Change the initial password

  16. Start using it.
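
The notebook_dir setting can also be written programmatically instead of editing the file by hand. This is only a sketch under one assumption: that in Jupyter's JSON config files, settings are grouped under the class name, so notebook_dir sits inside a "NotebookApp" object. The function name set_notebook_dir is invented for this example.

```python
import json
import os


def set_notebook_dir(config_path, notebook_dir):
    """Merge NotebookApp.notebook_dir into a Jupyter JSON config file.

    Existing settings in the file are kept; the file is created if absent.
    """
    config = {}
    if os.path.exists(config_path):
        with open(config_path) as f:
            config = json.load(f)
    # Assumption: JSON config options are nested under the class name.
    config.setdefault("NotebookApp", {})["notebook_dir"] = notebook_dir
    with open(config_path, "w") as f:
        json.dump(config, f, indent=2)
    return config
```

For the setup above, the call would be set_notebook_dir(os.path.expanduser("~/.jupyter/jupyter_notebook_config.json"), "/usr/anaconda3/dubook"), followed by a restart of the notebook server.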

Problems:

  • Importing pyspark fails with the following error:
import pyspark 

      "The code failed because of a fatal error:\n",
      "\tError sending http request and maximum retry encountered..\n",
      "\n",
      "Some things to try:\n",
      "a) Make sure Spark has enough available resources for Jupyter to create a Spark context.\n",
      "b) Contact your Jupyter administrator to make sure the Spark magics library is configured correctly.\n",
      "c) Restart the kernel.\n"

Solution:

  1. The Livy address configured in ~/.sparkmagic/config.json was wrong; this shows up in the log files under the logs directory there.
2018-05-28 18:56:52,884 ERROR   ReliableHttpClient      Request to 'http://node202.hmbank.com:8998/sessions' failed with 'HTTPConnectionPool(host='node202.hmbank.com', port=8998): Max retries exceeded with url: /sessions (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fef9964f7f0>: Failed to establish a new connection: [Errno 111] Connection refused',))'
2018-05-28 18:56:52,888 INFO    EventsHandler   InstanceId: 4301a914-5087-4d77-a82b-19d6b2d7be7d,EventName: notebookSessionCreationEnd,Timestamp: 2018-05-28 10:56:52.888202,SessionGuid: 0131d7ad-d110-4559-9c76-549ba0916eae,LivyKind: pyspark3,SessionId: -1,Status: not_started,Success: False,ExceptionType: HttpClientException,ExceptionMessage: Error sending http request and maximum retry encountered.
2018-05-28 18:56:52,888 ERROR   SparkMagics     Error creating session: Error sending http request and maximum retry encountered.
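
A wrong Livy address like this can be caught before opening a notebook. The sketch below collects every "url" entry from the sparkmagic config and requests /sessions on it, the same endpoint that appears in the log above. It deliberately does not assume specific section names in config.json; it simply looks at any top-level section that has a "url" key. The function names are hypothetical, and no authentication is handled.

```python
import json
import os
import urllib.error
import urllib.request


def livy_urls(config):
    """Map each top-level config section that has a "url" key to that URL."""
    return {
        section: values["url"]
        for section, values in config.items()
        if isinstance(values, dict) and "url" in values
    }


def is_reachable(base_url, timeout=5):
    """Return True if GET <base_url>/sessions answers with HTTP 200."""
    try:
        url = base_url.rstrip("/") + "/sessions"
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        return False


def check_config(path=os.path.expanduser("~/.sparkmagic/config.json")):
    """Return {section: reachable?} for every Livy URL in the config file."""
    with open(path) as f:
        config = json.load(f)
    return {section: is_reachable(url) for section, url in livy_urls(config).items()}
```

Running check_config() on the Jupyter host and seeing False for a section points at the same problem as the "Connection refused" entry in the log: the host or port does not match where Livy is listening.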

  • On the machine where Spark is deployed, starting python and running import pyspark fails:
>>> import pyspark
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'pyspark'

Solution:
Add the relevant environment variables:

export SPARK_HOME=/usr/lib/apacheori/spark-2.3.0-bin-hadoop2.6

export PYSPARK_PYTHON=/usr/anaconda3/bin/python

export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/pyspark.zip:$SPARK_HOME/python/lib/py4j-0.10.6-src.zip:$PYTHONPATH

py4j-0.10.6-src.zip and pyspark.zip are located under the python directory (python/lib) of the Spark install directory.
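
The same path setup can be done from inside Python, which avoids hard-coding the py4j version in PYTHONPATH. This is a rough sketch, assuming the standard layout of a Spark binary distribution (python/ and python/lib/ under SPARK_HOME); the helper names are invented for illustration.

```python
import glob
import os
import sys


def pyspark_paths(spark_home):
    """Return the entries that the PYTHONPATH export adds, in the same order."""
    python_dir = os.path.join(spark_home, "python")
    lib_dir = os.path.join(python_dir, "lib")
    # Glob for py4j so the version number does not have to be hard-coded.
    py4j_zips = sorted(glob.glob(os.path.join(lib_dir, "py4j-*-src.zip")))
    return [python_dir, os.path.join(lib_dir, "pyspark.zip")] + py4j_zips


def add_pyspark_to_sys_path(spark_home):
    """Prepend the Spark Python paths to sys.path, keeping their order."""
    for path in reversed(pyspark_paths(spark_home)):
        if path not in sys.path:
            sys.path.insert(0, path)
```

After add_pyspark_to_sys_path(os.environ["SPARK_HOME"]), import pyspark should resolve in the current interpreter, provided SPARK_HOME points at the Spark install directory.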
