Centralized Profile Management in JupyterHub

Author: 北邮郭大宝 | Published: 2019-12-24 09:06

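A previous article described how to implement multiple profiles in JupyterHub. That approach falls short of typical enterprise requirements, such as configuring resources dynamically and managing access to profiles centrally. This article addresses those gaps. Because the real implementation involves sensitive development details, only a demo is shown here.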

1. Dependencies

MySQL
JupyterHub Helm chart

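2. Reading the source

In the spawner module of KubeSpawner, the start method is the entry point the hub uses to launch a pod. Its first step is to call load_user_options(), which loads the options the user selected.

[Image: 1577096388735.jpg]

load_user_options reads the configuration entries from profile_list.

[Image: 1577096409625.jpg]

The docstring of profile_list states explicitly that, instead of a static list, it can be a callable that takes the spawner instance and returns the list of profile dicts. This points to the approach: write a function that returns the profile_list we need.

[Image: 1577096736476.jpg]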
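To make the idea concrete, here is a simplified sketch of how a setting that may be either a static list or a callable can be resolved into a list of profile dicts. It is an illustration only and not KubeSpawner's actual code; resolve_profile_list and FakeSpawner are hypothetical names introduced for this example.

```python
import asyncio
import inspect


async def resolve_profile_list(spawner, profile_list):
    """Return a concrete list of profile dicts.

    Hypothetical helper: accepts a static list, a plain function,
    or a coroutine function that takes the spawner as its only argument.
    """
    if callable(profile_list):
        result = profile_list(spawner)
        # Support async callables by awaiting their result.
        if inspect.isawaitable(result):
            result = await result
        return result
    return profile_list


class FakeSpawner:
    """Stand-in for a spawner instance, used only in this illustration."""

    def __init__(self, username):
        self.username = username


def get_profile_list(spawner):
    # A real function would look the profiles up somewhere, e.g. a database.
    return [{"display_name": "Datascience environment for " + spawner.username}]


profiles = asyncio.run(resolve_profile_list(FakeSpawner("alice"), get_profile_list))
print(profiles)
```

The point of the sketch is that the function receives the spawner, so whatever it returns can depend on who is spawning.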

3. Code

3.1 MySQL

Create the table

create table profiles_test(
  id int(11) auto_increment not null,
  profile_name varchar(255) not null,
  profile_config varchar(255) not null,
  primary key(id)
);

Insert data

insert into profiles_test(profile_name, profile_config) values ('datascience', '{"display_name": "Datascience environment", "kubespawner_override": {"image_spec": "jupyter/datascience-notebook:0.1.2", "cpu_limit": 8, "mem_limit": "16G"}}');

insert into profiles_test(profile_name, profile_config) values ('deeplearn', '{"display_name": "DeepLearning environment", "kubespawner_override": {"image_spec": "ufoym/deepo:0.1.5", "cpu_limit": 4, "mem_limit": "8G"}}');
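Since profile_config is a varchar(255) column holding JSON, it can help to build and check the value in code before inserting it. The following is a minimal sketch under that assumption; encode_profile and MAX_CONFIG_LENGTH are hypothetical names, and the keys simply mirror the rows inserted above.

```python
import json

# Matches profile_config varchar(255) in the table definition above.
MAX_CONFIG_LENGTH = 255


def encode_profile(display_name, image_spec, cpu_limit, mem_limit):
    """Serialize one profile to JSON and make sure it fits the column."""
    profile = {
        "display_name": display_name,
        "kubespawner_override": {
            "image_spec": image_spec,
            "cpu_limit": cpu_limit,
            "mem_limit": mem_limit,
        },
    }
    encoded = json.dumps(profile)
    if len(encoded) > MAX_CONFIG_LENGTH:
        raise ValueError(
            "profile_config is %d characters, limit is %d"
            % (len(encoded), MAX_CONFIG_LENGTH)
        )
    return encoded


encoded = encode_profile(
    "Datascience environment", "jupyter/datascience-notebook:0.1.2", 8, "16G"
)
print(encoded)
```

A value that passes this check can be passed as a bound parameter to the insert statement. If profiles grow beyond 255 characters, widening the column would be the simpler fix.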

3.2 Code

Since this is only a demo, the code is deliberately crude: it is written inline in the values file of the Helm chart.

hub:
  extraConfig:
    myConfig.py: |
      import json

      from sqlalchemy import Column, Integer, String, create_engine
      from sqlalchemy.ext.declarative import declarative_base
      from sqlalchemy.orm import sessionmaker

      Base = declarative_base()

      class Profiles(Base):
        # Mirrors the profiles_test table created above.
        __tablename__ = 'profiles_test'

        id = Column(Integer, primary_key=True)
        profile_name = Column(String(255), nullable=False)
        profile_config = Column(String(255), nullable=False)

      # Replace the placeholders with real connection details.
      db_url = "mysql+mysqlconnector://<username>:<password>@<ip>:3306/app_jupyterhub"
      engine = create_engine(db_url)
      Session = sessionmaker(bind=engine)

      def get_profile_list(spawner):
        # Read every stored profile instead of hard-coding row ids.
        session = Session()
        try:
          rows = session.query(Profiles.profile_config).order_by(Profiles.id).all()
          # Each row holds one JSON-encoded profile dict.
          return [json.loads(row[0]) for row in rows]
        finally:
          # Always release the connection, even if the query fails.
          session.close()

      c.JupyterHub.spawner_class = 'kubespawner.KubeSpawner'
      c.KubeSpawner.profile_list = get_profile_list

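Because this is a demo, the function simply reads the configuration pre-stored in the MySQL table, decodes each entry into a dict, and returns them as a list.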
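The read-and-decode step can be tried without a MySQL server. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL, purely as an assumption for local experimentation; load_profiles is a hypothetical name, and the table and rows copy the ones from section 3.1.

```python
import json
import sqlite3


def load_profiles(conn):
    """Read every stored profile, ordered by id, and decode its JSON config."""
    rows = conn.execute(
        "SELECT profile_config FROM profiles_test ORDER BY id"
    ).fetchall()
    return [json.loads(row[0]) for row in rows]


# In-memory SQLite database standing in for the MySQL instance.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE profiles_test ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " profile_name VARCHAR(255) NOT NULL,"
    " profile_config VARCHAR(255) NOT NULL)"
)

sample = [
    ("datascience", {
        "display_name": "Datascience environment",
        "kubespawner_override": {
            "image_spec": "jupyter/datascience-notebook:0.1.2",
            "cpu_limit": 8,
            "mem_limit": "16G",
        },
    }),
    ("deeplearn", {
        "display_name": "DeepLearning environment",
        "kubespawner_override": {
            "image_spec": "ufoym/deepo:0.1.5",
            "cpu_limit": 4,
            "mem_limit": "8G",
        },
    }),
]
conn.executemany(
    "INSERT INTO profiles_test(profile_name, profile_config) VALUES (?, ?)",
    [(name, json.dumps(config)) for name, config in sample],
)

profiles = load_profiles(conn)
print([p["display_name"] for p in profiles])
```

The returned value has the shape that profile_list expects: a list of dicts, each with a display_name and a kubespawner_override.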

4. Result

[Image: 1577098494521.jpg]

5. Summary

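The core idea is simple: write a function that returns the profile_list, that is, a list of profile dicts. A few additional points:

Design the User and Profile tables (and any related ones). Looking profiles up in the database is what makes the profile configuration dynamic.
get_profile_list(spawner) can use information from the spawner instance. For example, spawner.user.id gives the user's id, so the function can return a different profile_list for each user. This enables centralized control and dynamic assignment of profiles, with the mapping configured in the database. A concrete implementation is not shown here.
The implementation can be built into the hub image.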
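The per-user idea from the second point can be sketched as follows. This is only an illustration under stated assumptions: the in-memory dicts ALL_PROFILES, USER_PROFILES, and DEFAULT_PROFILES are hypothetical stand-ins for the User and Profile tables, the user names are made up, and the lookup keys on spawner.user.name rather than the id for readability.

```python
from types import SimpleNamespace

# Hypothetical data that would normally come from the Profile table.
ALL_PROFILES = {
    "datascience": {
        "display_name": "Datascience environment",
        "kubespawner_override": {
            "image_spec": "jupyter/datascience-notebook:0.1.2",
            "cpu_limit": 8,
            "mem_limit": "16G",
        },
    },
    "deeplearn": {
        "display_name": "DeepLearning environment",
        "kubespawner_override": {
            "image_spec": "ufoym/deepo:0.1.5",
            "cpu_limit": 4,
            "mem_limit": "8G",
        },
    },
}

# Hypothetical user-to-profile mapping (a join table in a real design).
USER_PROFILES = {
    "alice": ["datascience", "deeplearn"],
    "bob": ["datascience"],
}

# Profiles offered to users without an explicit mapping.
DEFAULT_PROFILES = ["datascience"]


def get_profile_list(spawner):
    """Return only the profiles the spawning user is allowed to pick."""
    allowed = USER_PROFILES.get(spawner.user.name, DEFAULT_PROFILES)
    return [ALL_PROFILES[name] for name in allowed]


# Minimal stand-ins for spawner objects, used only to exercise the function.
alice = SimpleNamespace(user=SimpleNamespace(name="alice"))
bob = SimpleNamespace(user=SimpleNamespace(name="bob"))

print([p["display_name"] for p in get_profile_list(alice)])
print([p["display_name"] for p in get_profile_list(bob)])
```

In a real deployment the two dicts would be replaced by database queries, so that changing a user's allowed profiles only requires updating a row.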
