Installing Apache Superset (from scratch)


Author: 偷油考拉 | Published 2023-09-30 17:56

    https://superset.apache.org/docs/installation/installing-superset-from-scratch

    1. System dependencies

    sudo apt-get install build-essential libssl-dev libffi-dev libsasl2-dev libldap2-dev default-libmysqlclient-dev
    sudo apt-get install  python3-dev python3-pip
    

    2. Python Virtual Environment

    sudo apt-get install python3-virtualenv python3-venv
    
    cd /data/git-project/superset-scratch 
    python3 -m venv venv
    source venv/bin/activate
    ./venv/bin/pip install apache-superset -i https://pypi.tuna.tsinghua.edu.cn/simple
    

    3. Initialization

    Initialize the database

    export FLASK_APP=superset
    mkdir superset_conf/ && touch superset_conf/superset_config.py
    export SUPERSET_CONFIG_PATH=`pwd`/superset_conf/superset_config.py
    superset db upgrade
    superset fab create-admin
    superset load_examples
    superset init
    # Building the JavaScript assets (compiling the frontend) is no longer required
    #cd superset-frontend
    #npm ci
    #npm run build
    #cd ..
    superset run -p 8088 --with-threads --reload --debugger
    

    Appendix 1. superset load_examples fails

    https://blog.csdn.net/yan15625123250/article/details/121077103
    Download the examples-data source from GitHub

    # Download the archive with either wget or curl
    wget -c 'https://codeload.github.com/apache-superset/examples-data/zip/refs/heads/master' -O master.zip
    curl 'https://codeload.github.com/apache-superset/examples-data/zip/refs/heads/master' -o master.zip
    # Or clone the repository
    git clone https://github.com/apache-superset/examples-data.git
    

    Serve the downloaded data from an HTTP site, for example http://127.0.0.1/superset/examples-data/
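One way to set up such a site is Python's built-in http.server module. The following is a minimal sketch, not the only option: the function names, the `www` directory and port 8000 are assumptions made for illustration. Note that the URL above uses port 80, which usually requires root privileges; on another port, the port number has to be included in BASE_URL as well.

```python
import functools
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_server(directory, host="127.0.0.1", port=8000):
    """Create an HTTP server that serves static files from `directory`."""
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer((host, port), handler)


def serve_in_background(server):
    """Run the server in a daemon thread and return that thread."""
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return thread
```

Assuming the unzipped data lives in a hypothetical `www/superset/examples-data/` directory, `make_server("www").serve_forever()` would expose it at http://127.0.0.1:8000/superset/examples-data/ until the process is interrupted.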

    Then edit site-packages/superset/examples/helpers.py as follows:

    #BASE_URL = "https://github.com/apache-superset/examples-data/blob/master/"
    # Change it to your own site
    BASE_URL = "http://127.0.0.1/superset/examples-data/"
    

    Edit the files under site-packages/superset/examples/configs/datasets/examples/

    # 1. Back up the YAML files
    tar czvf yaml.tar.gz *.yaml
    # 2. Verify the archive by listing its contents
    tar tzvf yaml.tar.gz
    # 3. Replace the raw.githubusercontent.com URLs
    sed -i 's|https://raw\.githubusercontent\.com/apache-superset/examples-data/master|http://127.0.0.1/superset/examples-data|g' *.yaml
    # 4. Replace the remaining github.com URLs
    sed -i 's|https://github\.com/apache-superset/examples-data/raw/master|http://127.0.0.1/superset/examples-data|g' *.yaml
    sed -i 's|https://github\.com/apache-superset/examples-data/raw/lowercase_columns_examples|http://127.0.0.1/superset/examples-data|g' *.yaml
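The same substitution can be sketched in Python, which makes a dry run easy before any file is touched. The three remote prefixes and the local prefix are copied from the sed commands above; the function names and the `dry_run` flag are illustrative assumptions, not part of Superset.

```python
from pathlib import Path

# Remote prefixes taken from the sed commands above.
REMOTE_PREFIXES = (
    "https://raw.githubusercontent.com/apache-superset/examples-data/master",
    "https://github.com/apache-superset/examples-data/raw/master",
    "https://github.com/apache-superset/examples-data/raw/lowercase_columns_examples",
)
LOCAL_PREFIX = "http://127.0.0.1/superset/examples-data"


def rewrite_urls(text, local_prefix=LOCAL_PREFIX):
    """Replace every known remote prefix in `text` with the local one."""
    for prefix in REMOTE_PREFIXES:
        text = text.replace(prefix, local_prefix)
    return text


def rewrite_yaml_files(directory, dry_run=True):
    """Rewrite all *.yaml files in `directory`; return the paths that change.

    With dry_run=True (the default) nothing is written to disk.
    """
    changed = []
    for path in sorted(Path(directory).glob("*.yaml")):
        original = path.read_text(encoding="utf-8")
        updated = rewrite_urls(original)
        if updated != original:
            changed.append(path)
            if not dry_run:
                path.write_text(updated, encoding="utf-8")
    return changed
```

Calling `rewrite_yaml_files(".")` from inside the datasets directory lists the files that would change; passing `dry_run=False` applies the edit. Keeping the tar backup from step 1 is still advisable.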
    
    

    Appendix 2. Common errors

    Error 1:
    AttributeError: module 'sqlparse.keywords' has no attribute 'FLAGS'
    Solution 1:
    https://github.com/apache/superset/issues/23742
    In superset/sql_parse.py sqlparse.keywords.FLAGS is used. However, sqlparse has removed the FLAGS variable from the code sqlparse/keywords.py.
    Downgrade sqlparse by running:

    ./venv/bin/pip install 'sqlparse==0.4.3' -i https://pypi.tuna.tsinghua.edu.cn/simple
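To confirm that the downgrade took effect, the installed version and the presence of the attribute can be checked from the virtual environment's Python. This is a small sketch using only the standard library; the helper names are made up for illustration.

```python
import importlib
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def module_has_attribute(module_name, attribute):
    """Return True or False, or None if the module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return hasattr(module, attribute)
```

Going by the issue linked above, `module_has_attribute("sqlparse.keywords", "FLAGS")` is expected to be true once `installed_version("sqlparse")` reports 0.4.3.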
    

    Error 2:
    Error: Could not locate a Flask application. You did not provide the "FLASK_APP" environment variable, and a "wsgi.py" or "app.py" module was not found in the current directory.
    Solution 2:

    export FLASK_APP=superset
    superset db upgrade
    

    Error 3:

    --------------------------------------------------------------------------------
                                        WARNING
    --------------------------------------------------------------------------------
    A Default SECRET_KEY was detected, please use superset_config.py to override it.
    Use a strong complex alphanumeric string and use a tool to help you generate 
    a sufficiently random sequence, ex: openssl rand -base64 42
    --------------------------------------------------------------------------------
    --------------------------------------------------------------------------------
    Refusing to start due to insecure SECRET_KEY
    

    Solution 3:
    https://superset.apache.org/docs/installation/configuring-superset/#configuring-superset

    Generate a secret key

    openssl rand -base64 42
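If openssl is not available, a comparable key can be produced with Python's standard library. This is a sketch of an equivalent approach, not something the Superset documentation prescribes; the function name is an assumption.

```python
import base64
import secrets


def generate_secret_key(num_bytes=42):
    """Return `num_bytes` of OS-provided randomness, Base64-encoded as text."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")
```

With the default of 42 bytes the result is a 56-character string, the same length that `openssl rand -base64 42` prints.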
    

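    Example key:
    hdEOxiInMglRP3WvHm4XCCN/kVmiVlhywfZ5iNM4cK5SkvBwGuTdEJ8A

    To configure Superset, create a superset_config.py file and either add its directory to PYTHONPATH or point the SUPERSET_CONFIG_PATH environment variable at it, as in this example: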

    mkdir -p superset_conf/ && touch superset_conf/superset_config.py
    export SUPERSET_CONFIG_PATH="$(pwd)/superset_conf/superset_config.py"
    

    The contents of superset_config.py are as follows:

    # Superset specific config
    ROW_LIMIT = 5000
    
    # Flask App Builder configuration
    # Your App secret key will be used for securely signing the session cookie
    # and encrypting sensitive information on the database
    # Make sure you are changing this key for your deployment with a strong key.
    # Alternatively you can set it with `SUPERSET_SECRET_KEY` environment variable.
    # You MUST set this for production environments or the server will refuse
    # to start and you will see an error in the logs accordingly.
    SECRET_KEY = 'hdEOxiInMglRP3WvHm4XCCN/kVmiVlhywfZ5iNM4cK5SkvBwGuTdEJ8A'
    
    # The SQLAlchemy connection string to your database backend
    # This connection defines the path to the database that stores your
    # superset metadata (slices, connections, tables, dashboards, ...).
    # Note that the connection information to connect to the datasources
    # you want to explore are managed directly in the web UI
    SQLALCHEMY_DATABASE_URI = 'sqlite:////data/git-project/superset-scratch/superset.db'
    
    # Flask-WTF flag for CSRF
    WTF_CSRF_ENABLED = True
    # Add endpoints that need to be exempt from CSRF protection
    WTF_CSRF_EXEMPT_LIST = []
    # A CSRF token that expires in 1 year
    WTF_CSRF_TIME_LIMIT = 60 * 60 * 24 * 365
    
    # Set this API key to enable Mapbox visualizations
    MAPBOX_API_KEY = ''
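Hard-coding the key in the file is convenient for a local test but easy to leak. As the comment in the configuration mentions, the key can also come from the `SUPERSET_SECRET_KEY` environment variable. A minimal sketch of reading it explicitly inside superset_config.py follows; the helper name and the error message are assumptions.

```python
import os


def secret_key_from_env(environ=os.environ, name="SUPERSET_SECRET_KEY"):
    """Return the secret key from the environment, failing loudly if unset."""
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before starting Superset")
    return value
```

In the configuration file this would be used as `SECRET_KEY = secret_key_from_env()`, replacing the literal string.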
    

    Error 4:
    ModuleNotFoundError: No module named 'marshmallow_enum'

    Solution 4:

    ./venv/bin/pip install marshmallow_enum -i https://pypi.tuna.tsinghua.edu.cn/simple
    

    Error 4b:
    No PIL installation found
    Solution 4b:

    ./venv/bin/pip install Pillow -i https://pypi.tuna.tsinghua.edu.cn/simple
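Both of these errors come from optional modules that are missing from the virtual environment. A quick way to see which ones are still absent, sketched with the standard library only (the function name is illustrative):

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Run with the virtual environment's interpreter, `missing_modules(["marshmallow_enum", "PIL"])` returns an empty list once both packages are installed. Note that Pillow is imported under the name `PIL`.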
    

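    Error 5:
    Running superset load_examples fails with urllib.error.URLError: <urlopen error [Errno 111] Connection refused>

    Solution 5:
    https://github.com/apache/superset/issues/9488
    The example data is downloaded from GitHub during initialization, and the download fails because access to GitHub is restricted in mainland China.
    The fix is to download the example data from GitHub over an unrestricted connection, serve it from a local HTTP file server, and change BASE_URL to the local address. Python's built-in http.server module is one way to start such a server; other tools work as well.

    Alternatively, configuring a proxy also solves the problem.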
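The traceback shows that the download goes through urllib, which picks up proxies from environment variables such as `http_proxy` and `https_proxy`. The sketch below only inspects what urllib would use and probes a URL; the helper names are assumptions, and any proxy address you export is specific to your own setup.

```python
import urllib.error
import urllib.request


def active_proxies():
    """Return the scheme-to-proxy mapping urllib will use by default."""
    return urllib.request.getproxies()


def can_reach(url, timeout=10):
    """Return True if `url` can be opened with the current proxy settings."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except (urllib.error.URLError, OSError):
        return False
```

After exporting the proxy variables in the shell, `active_proxies()` should list them, and `can_reach(...)` against one of the examples-data URLs gives a quick check before rerunning `superset load_examples`.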

    Error 6:
    We haven't found any Content Security Policy (CSP) defined in the configurations. Please make sure to configure CSP using the TALISMAN_ENABLED and TALISMAN_CONFIG keys or any other external software.Failing to configure CSP have serious security implications. Check https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP for more information. You can disable this warning using the CONTENT_SECURITY_POLICY_WARNING key.

    Solution 6:
    https://superset.apache.org/docs/installation/configuring-superset/
    https://github.com/apache/superset/blob/master/superset/config.py
    https://superset.apache.org/docs/security/#content-security-policy-csp
    https://pypi.org/project/flask-talisman/
    https://content-security-policy.com/
    https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Security-Policy#syntax

    TALISMAN_ENABLED = True
    TALISMAN_CONFIG = {
        "force_https": False,
        "content_security_policy": {"default-src": "'self'"},
    }
    
    

    This policy allows every resource type, but only from the same origin.
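To see what such a policy dictionary means on the wire, it helps to render it as the value of the Content-Security-Policy header. The function below is purely illustrative and is not flask-talisman's own code; it assumes each directive maps to either a single string or a list of source strings.

```python
def render_csp(policy):
    """Serialize a {directive: sources} mapping into a CSP header value."""
    parts = []
    for directive, sources in policy.items():
        if isinstance(sources, str):
            sources = [sources]
        parts.append(" ".join([directive, *sources]))
    return "; ".join(parts)
```

For the configuration above, `render_csp({"default-src": "'self'"})` yields `default-src 'self'`, which browsers read as "load everything only from this origin".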

    Error 7:
    /data/git-project/superset-scratch/venv/lib/python3.10/site-packages/flask_appbuilder/models/sqla/interface.py:64: SAWarning: relationship 'SqlaTable.slices' will copy column tables.id to column slices.datasource_id, which conflicts with relationship(s): 'Slice.table' (copies tables.id to slices.datasource_id). If this is not the intention, consider if these relationships should be linked with back_populates, or if viewonly=True should be applied to one or more if they are read-only. For the less common case that foreign key constraints are partially overlapping, the orm.foreign() annotation can be used to isolate the columns that should be written towards. To silence this warning, add the parameter 'overlaps="table"' to the 'SqlaTable.slices' relationship. (Background on this error at: https://sqlalche.me/e/14/qzyx) for prop in class_mapper(obj).iterate_properties:

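    Solution 7:
    https://superset.apache.org/docs/installation/configuring-superset/#using-a-production-metastore
    Switch the metastore to a production-grade database: PostgreSQL 10.x to 15.x, or MySQL 5.7 or 8.x.

    https://github.com/apache/superset/issues/23483
    According to the issue above, switching the database to MySQL 5.7 resolves the problem. (It had no effect in my own test.)

    Install the PostgreSQL / MySQL driver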

    ./venv/bin/pip install psycopg2-binary -i https://pypi.tuna.tsinghua.edu.cn/simple
    ./venv/bin/pip install mysqlclient -i https://pypi.tuna.tsinghua.edu.cn/simple
    

    Edit the configuration file superset_conf/superset_config.py:

    #SQLALCHEMY_DATABASE_URI = 'sqlite:////data/git-project/superset-scratch/superset.db'
    SQLALCHEMY_DATABASE_URI = 'postgresql://superset:superset@127.0.0.1/superset'
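If the database password contains characters such as `@` or `/`, they have to be percent-encoded before being placed in the URI. A small sketch using the standard library; the function name and parameters are assumptions for illustration.

```python
from urllib.parse import quote_plus


def build_postgres_uri(user, password, host, database, port=None):
    """Build a postgresql:// URI with the user and password percent-encoded."""
    netloc = host if port is None else f"{host}:{port}"
    return f"postgresql://{quote_plus(user)}:{quote_plus(password)}@{netloc}/{database}"
```

With the values used above, `build_postgres_uri("superset", "superset", "127.0.0.1", "superset")` reproduces the URI in the configuration file, and the result can be assigned to `SQLALCHEMY_DATABASE_URI`.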
    
