
Kolla-ansible (Ocata) Source Code Analysis

Author: JohnLee1100 | Published 2017-08-07 17:24



    Introduction

    The Kolla-ansible project provides a complete set of Ansible playbooks that deploy Docker images and then automate the deployment of the OpenStack components, supporting both all-in-one and multi-host environments.
    Source: https://github.com/openstack/kolla-ansible.git

    Source Tree Overview

    Top-level directories

    • ansible: the full Ansible playbook code, covering both Docker container deployment and OpenStack component deployment. Most of the source code lives here.
    • contrib: deployment environments for Heat, Magnum and Vagrant.
    • deploy-guide: deployment guides, mainly covering the all-in-one and multi-host setups.
    • doc: documentation.
    • etc: configuration files that are installed under /etc; an all-in-one deployment only needs a handful of changes here.
    • kolla_ansible: records the version information; the cmd subdirectory holds the two scripts for generating and merging passwords, which pbr packages into executable commands.
    • releasenotes: release notes describing new features.
    • specs: specifications for significant changes to the Kolla code base.
    • tests: functional test tooling, including tests for the custom ansible action plugin (merge_configs) and module (kolla_docker).
    • tools: scripts that interact with kolla, most of which can be run manually for pre- and post-install steps; some are invoked by tasks under the ansible directory.

    Second-level directories

    • ansible/action_plugins: custom ansible action plugins; two scripts that merge yml and config files.
    • ansible/group_vars: global variable definitions for the playbooks.
    • ansible/inventory: sample all-in-one and multinode inventory (hosts) files.
    • ansible/library: custom ansible modules; bslurp.py and kolla_docker.py are the most heavily used.
    • ansible/roles: one role per OpenStack component, covering almost every open-source project; the current release ships about 60 roles.
    • ansible/*.yml: the playbooks at the top of the ansible directory, mainly used for pre/post-install environment preparation and cleanup, database recovery, and other special system-level operations.

    Key Code Walkthrough

    setup.cfg is the packaging and installation configuration entry point; see the annotations in the listing below.

    [metadata]
    name = kolla-ansible  # project name
    summary = Ansible Deployment of Kolla containers
    description-file = README.rst
    author = OpenStack
    author-email = openstack-dev@lists.openstack.org
    home-page = http://docs.openstack.org/developer/kolla-ansible/
    license = Apache License, Version 2.0
    classifier =
        Environment :: OpenStack
        Intended Audience :: Information Technology
        Intended Audience :: System Administrators
        License :: OSI Approved :: Apache Software License
        Operating System :: POSIX :: Linux
        Programming Language :: Python
        Programming Language :: Python :: 2
        Programming Language :: Python :: 2.7
        Programming Language :: Python :: 3
        Programming Language :: Python :: 3.5
    
    [files]
    packages = kolla_ansible   # package name
    data_files =        # file mappings applied when pbr packages the project
        share/kolla-ansible/ansible = ansible/*
        share/kolla-ansible/tools = tools/validate-docker-execute.sh
        share/kolla-ansible/tools = tools/cleanup-containers
        share/kolla-ansible/tools = tools/cleanup-host
        share/kolla-ansible/tools = tools/cleanup-images
        share/kolla-ansible/tools = tools/stop-containers
        share/kolla-ansible/doc = doc/*
        share/kolla-ansible/etc_examples = etc/*
        share/kolla-ansible = tools/init-runonce
        share/kolla-ansible = tools/init-vpn
        share/kolla-ansible = tools/openrc-example
        share/kolla-ansible = setup.cfg
    
    scripts =        # executable scripts
        tools/kolla-ansible
    
    [entry_points]
    console_scripts =   # console commands mapped to the main() of the two Python files below
        kolla-genpwd = kolla_ansible.cmd.genpwd:main
        kolla-mergepwd = kolla_ansible.cmd.mergepwd:main
    
    [global]
    setup-hooks =
        pbr.hooks.setup_hook
    
    [pbr]  # packaging method
    
    [build_sphinx]
    all_files = 1
    build-dir = doc/build
    source-dir = doc
    
    [build_releasenotes]
    all_files = 1
    build-dir = releasenotes/build
    source-dir = releasenotes/source
    

    setup.py

    The installation script. Packaging is done through pbr, which reads setup.cfg during execution and also installs the dependencies listed in requirements.txt in the same directory. See https://julien.danjou.info/blog/2017/packaging-python-with-pbr for more background.

    import setuptools
    
    # In python < 2.7.4, a lazy loading of package `pbr` will break
    # setuptools if some other modules registered functions in `atexit`.
    # solution from: http://bugs.python.org/issue15881#msg170215
    try:
        import multiprocessing  # noqa
    except ImportError:
        pass
    
    setuptools.setup(
        setup_requires=['pbr>=2.0.0'],
        pbr=True)
    

    tools/kolla-ansible

    This script wraps ansible-playbook and tailors it for kolla: depending on the action requested, it selects the playbook and passes the appropriate configuration files.
    The basic variables defined near the top:

    find_base_dir
    INVENTORY="${BASEDIR}/ansible/inventory/all-in-one"
    PLAYBOOK="${BASEDIR}/ansible/site.yml"
    VERBOSITY=
    EXTRA_OPTS=${EXTRA_OPTS}
    CONFIG_DIR="/etc/kolla"
    PASSWORDS_FILE="${CONFIG_DIR}/passwords.yml"
    DANGER_CONFIRM=
    INCLUDE_IMAGES=
    

    find_base_dir is a function called at the beginning of the script (not expanded here) that locates the directory where the kolla-ansible script is installed.

    Argument parsing:
    while [ "$#" -gt 0 ]; do
    case "$1" in

        (--inventory|-i)
                INVENTORY="$2"
                shift 2
                ;;
    
        (--playbook|-p)
                PLAYBOOK="$2"
                shift 2
                ;;
    
        (--tags|-t)
                EXTRA_OPTS="$EXTRA_OPTS --tags $2"
                shift 2
                ;;
    
        (--verbose|-v)
                VERBOSITY="$VERBOSITY --verbose"
                shift 1
                ;;
    
        (--configdir)
                CONFIG_DIR="$2"
                shift 2
                ;;
    
        (--yes-i-really-really-mean-it)
                DANGER_CONFIRM="$1"
                shift 1
                ;;
    
        (--include-images)
                INCLUDE_IMAGES="$1"
                shift 1
                ;;
    
        (--key|-k)
                VAULT_PASS_FILE="$2"
                EXTRA_OPTS="$EXTRA_OPTS --vault-password-file=$VAULT_PASS_FILE"
                shift 2
                ;;
    
        (--extra|-e)
                EXTRA_OPTS="$EXTRA_OPTS -e $2"
                shift 2
                ;;
        (--passwords)
                PASSWORDS_FILE="$2"
                shift 2
                ;;
        (--help|-h)
                usage
                shift
                exit 0
                ;;
    
        (--)
                shift
                break
                ;;
    
        (*)
                echo "error"
                exit 3
                ;;
    esac
    done
    
    case "$1" in
    
    (prechecks)
            ACTION="Pre-deployment checking"
            EXTRA_OPTS="$EXTRA_OPTS -e action=precheck"
            ;;
    (check)
            ACTION="Post-deployment checking"
            EXTRA_OPTS="$EXTRA_OPTS -e action=check"
            ;;
    (mariadb_recovery)
            ACTION="Attempting to restart mariadb cluster"
            EXTRA_OPTS="$EXTRA_OPTS -e action=deploy -e common_run=true"
            PLAYBOOK="${BASEDIR}/ansible/mariadb_recovery.yml"
            ;;
    (destroy)
            ACTION="Destroy Kolla containers, volumes and host configuration"
            PLAYBOOK="${BASEDIR}/ansible/destroy.yml"
            if [[ "${INCLUDE_IMAGES}" == "--include-images" ]]; then
                EXTRA_OPTS="$EXTRA_OPTS -e destroy_include_images=yes"
            fi
            if [[ "${DANGER_CONFIRM}" != "--yes-i-really-really-mean-it" ]]; then
                cat << EOF
    WARNING:
        This will PERMANENTLY DESTROY all deployed kolla containers, volumes and host configuration.
        There is no way to recover from this action. To confirm, please add the following option:
        --yes-i-really-really-mean-it
    EOF
                exit 1
            fi
            ;;
    (bootstrap-servers)
            ACTION="Bootstraping servers"
            PLAYBOOK="${BASEDIR}/ansible/kolla-host.yml"
            EXTRA_OPTS="$EXTRA_OPTS -e action=bootstrap-servers"
            ;;
    (deploy)
            ACTION="Deploying Playbooks"
            EXTRA_OPTS="$EXTRA_OPTS -e action=deploy"
            ;;
    (deploy-bifrost)
            ACTION="Deploying Bifrost"
            PLAYBOOK="${BASEDIR}/ansible/bifrost.yml"
            EXTRA_OPTS="$EXTRA_OPTS -e action=deploy"
            ;;
    (deploy-servers)
            ACTION="Deploying servers with bifrost"
            PLAYBOOK="${BASEDIR}/ansible/bifrost.yml"
            EXTRA_OPTS="$EXTRA_OPTS -e action=deploy-servers"
            ;;
    (post-deploy)
            ACTION="Post-Deploying Playbooks"
            PLAYBOOK="${BASEDIR}/ansible/post-deploy.yml"
            ;;
    (pull)
            ACTION="Pulling Docker images"
            EXTRA_OPTS="$EXTRA_OPTS -e action=pull"
            ;;
    (upgrade)
            ACTION="Upgrading OpenStack Environment"
            EXTRA_OPTS="$EXTRA_OPTS -e action=upgrade -e serial=${ANSIBLE_SERIAL}"
            ;;
    (reconfigure)
            ACTION="Reconfigure OpenStack service"
            EXTRA_OPTS="$EXTRA_OPTS -e action=reconfigure -e serial=${ANSIBLE_SERIAL}"
            ;;
    (stop)
            ACTION="Stop Kolla containers"
            PLAYBOOK="${BASEDIR}/ansible/stop.yml"
            ;;
    (certificates)
            ACTION="Generate TLS Certificates"
            PLAYBOOK="${BASEDIR}/ansible/certificates.yml"
            ;;
    (genconfig)
            ACTION="Generate configuration files for enabled OpenStack services"
            EXTRA_OPTS="$EXTRA_OPTS -e action=config"
            ;;
    (*)     usage
            exit 0
            ;;
    esac                                                                
    

    This case statement maps the first positional argument, the action (deploy, post-deploy, stop, and so on), to the playbook and the extra options to use.

    The last three lines assemble the command and execute it:

    CONFIG_OPTS="-e @${CONFIG_DIR}/globals.yml -e @${PASSWORDS_FILE} -e CONFIG_DIR=${CONFIG_DIR}"
    CMD="ansible-playbook -i $INVENTORY $CONFIG_OPTS $EXTRA_OPTS $PLAYBOOK $VERBOSITY"
    process_cmd
    

    The arguments are combined into an ansible-playbook command line (CMD), and process_cmd is then called to run it.

    • Example 1: original command
      kolla-ansible deploy -i /home/all-in-one
      wrapped command
      ansible-playbook -i /home/all-in-one -e @/etc/kolla/globals.yml -e @/etc/kolla/passwords.yml -e CONFIG_DIR=/etc/kolla -e action=deploy /usr/share/kolla-ansible/ansible/site.yml
    • Example 2: original command
      kolla-ansible post-deploy
      wrapped command
      ansible-playbook -i /usr/share/kolla-ansible/ansible/inventory/all-in-one -e @/etc/kolla/globals.yml -e @/etc/kolla/passwords.yml -e CONFIG_DIR=/etc/kolla /usr/share/kolla-ansible/ansible/post-deploy.yml

    Ansible Playbook Walkthrough

    OpenStack has many components and most of them are structured in parallel, so rather than covering each one, this walkthrough uses neutron as the example; other important points are covered where they come up.

    ansible/library/

    This directory holds the custom ansible modules, which run on the target nodes: bslurp.py, kolla_container_facts.py, kolla_docker.py, kolla_toolbox.py, merge_configs.py and merge_yaml.py. The first four are covered one by one below; the last two are empty files whose real code lives in the action_plugins directory (the empty files map the same-named actions so they can be used just like modules).

    bslurp.py appears to handle file distribution between hosts through two functions: copy_from_host (I did not find any caller of it) and copy_to_host (used by the ceph role to push keyrings down to the osd nodes). Which path runs is decided by the module's dest parameter; see the annotated code below.

    def copy_from_host(module):
        # (a fair amount of code omitted here)
        module.exit_json(content=base64.b64encode(data), sha1=sha1, mode=mode,
                         source=src)
    
    def copy_to_host(module):
        compress = module.params.get('compress')
        dest = module.params.get('dest')
        mode = int(module.params.get('mode'), 0)
        sha1 = module.params.get('sha1')
        # src is the base64-encoded (and optionally compressed) payload
        src = module.params.get('src')
        # decode the base64 payload
        data = base64.b64decode(src)
        # decompress it if the compress flag is set
        raw_data = zlib.decompress(data) if compress else data

        # verify the data against the sha1 checksum
        if sha1:
            if os.path.exists(dest):
                if os.access(dest, os.R_OK):
                    with open(dest, 'rb') as f:
                        if hashlib.sha1(f.read()).hexdigest() == sha1:
                            module.exit_json(changed=False)
                else:
                    module.exit_json(failed=True, changed=False,
                                     msg='file is not accessible: {}'.format(dest))

            if sha1 != hashlib.sha1(raw_data).hexdigest():
                module.exit_json(failed=True, changed=False,
                                 msg='sha1 sum does not match data')

        # Write the data to dest. Robustness note: a full disk is not handled here;
        # a safer pattern is to write to a temporary file and then rename it to dest,
        # otherwise the destination file can easily be left truncated or empty.
        with os.fdopen(os.open(dest, os.O_WRONLY | os.O_CREAT, mode), 'wb') as f:
            f.write(raw_data)
        # exit through exit_json, as the ansible module interface requires
        module.exit_json(changed=True)
    
    
    def main():
        # argument spec dict, as the ansible module interface requires
        argument_spec = dict(
            compress=dict(default=True, type='bool'),
            dest=dict(type='str'),
            mode=dict(default='0644', type='str'),
            sha1=dict(default=None, type='str'),
            src=dict(required=True, type='str')
        )
        # create the AnsibleModule object
        module = AnsibleModule(argument_spec)
        # fetch the dest parameter
        dest = module.params.get('dest')

        try:
            if dest:
                # dest given: push the data down to the target host
                copy_to_host(module)
            else:
                # no dest: pull the file from the host
                copy_from_host(module)
        except Exception:
            # exit on exception, as the custom module conventions require
            module.exit_json(failed=True, changed=True,
                             msg=repr(traceback.format_exc()))
    
    
    # import module snippets
    from ansible.module_utils.basic import *  # noqa
    if __name__ == '__main__':
        main()
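
    As a usage sketch (the paths and group names below are illustrative, not copied from the ceph role), fetching a file from one host and fanning it out to the others looks roughly like this:

    # Illustrative sketch only: the paths, variable and group names are assumptions.
    - name: Fetch the keyring from the first monitor
      bslurp:
        src: "/etc/ceph/ceph.client.admin.keyring"
      register: ceph_admin_keyring
      delegate_to: "{{ groups['ceph-mon'][0] }}"
      run_once: True

    # No dest above means copy_from_host runs; dest below means copy_to_host runs.
    - name: Distribute the keyring to the other nodes
      bslurp:
        src: "{{ ceph_admin_keyring.content }}"
        dest: "/etc/ceph/ceph.client.admin.keyring"
        sha1: "{{ ceph_admin_keyring.sha1 }}"
        mode: "0600"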
    

    kolla_docker.py implements the container operations. Every OpenStack component is deployed as a container and every role's deployment uses this module, so it is the most important one.

    ...
    # pick the docker client class (older docker-py exposes Client, newer APIClient)
    def get_docker_client():
        try:
            return docker.Client
        except AttributeError:
            return docker.APIClient


    class DockerWorker(object):

        def __init__(self, module):
            # constructor; module is an AnsibleModule object
            self.module = module
            # keep a reference to the module parameters
            self.params = self.module.params
            self.changed = False

            # TLS not fully implemented
            # tls_config = self.generate_tls()

            # create the docker client object
            options = {
                'version': self.params.get('api_version')
            }
            self.dc = get_docker_client()(**options)
    # ....
    # ....
    
        # start a container; one of the actions this module supports
        def start_container(self):
            # pull the image if it is not already present
            if not self.check_image():
                self.pull_image()

            # look up the container
            container = self.check_container()
            # if it exists but differs from the requested config, remove it and look it up again
            if container and self.check_container_differs():
                self.stop_container()
                self.remove_container()
                container = self.check_container()

            # if the container does not exist, create it and look it up again
            if not container:
                self.create_container()
                container = self.check_container()

            # start the container if it is not already up
            if not container['Status'].startswith('Up '):
                self.changed = True
                self.dc.start(container=self.params.get('name'))

            # We do not want to detach so we wait around for container to exit
            # if not detaching, wait for the container to exit and fail on a non-zero return code
            if not self.params.get('detach'):
                rc = self.dc.wait(self.params.get('name'))
                if rc != 0:
                    self.module.fail_json(
                        failed=True,
                        changed=True,
                        msg="Container exited with non-zero return code"
                    )
                # if remove_on_exit is set, stop and remove the container afterwards
                if self.params.get('remove_on_exit'):
                    self.stop_container()
                    self.remove_container()
    
    
    def generate_module():
        # NOTE(jeffrey4l): add empty string '' to choices let us use
        # pid_mode: "{{ service.pid_mode | default ('') }}" in yaml
        # parameter spec dict, per the ansible module API conventions
        argument_spec = dict(
            common_options=dict(required=False, type='dict', default=dict()),
        # action is required, must be a str, and its value must be one of the choices below
            action=dict(required=True, type='str',
                        choices=['compare_container', 'compare_image',
                                 'create_volume', 'get_container_env',
                                 'get_container_state', 'pull_image',
                                 'recreate_or_restart_container',
                                 'remove_container', 'remove_volume',
                                 'restart_container', 'start_container',
                                 'stop_container']),
            api_version=dict(required=False, type='str', default='auto'),
            auth_email=dict(required=False, type='str'),
            auth_password=dict(required=False, type='str'),
            auth_registry=dict(required=False, type='str'),
            auth_username=dict(required=False, type='str'),
            detach=dict(required=False, type='bool', default=True),
            labels=dict(required=False, type='dict', default=dict()),
            name=dict(required=False, type='str'),
            environment=dict(required=False, type='dict'),
            image=dict(required=False, type='str'),
            ipc_mode=dict(required=False, type='str', choices=['host', '']),
            cap_add=dict(required=False, type='list', default=list()),
            security_opt=dict(required=False, type='list', default=list()),
            pid_mode=dict(required=False, type='str', choices=['host', '']),
            privileged=dict(required=False, type='bool', default=False),
            graceful_timeout=dict(required=False, type='int', default=10),
            remove_on_exit=dict(required=False, type='bool', default=True),
            restart_policy=dict(required=False, type='str', choices=[
                                'no',
                                'never',
                                'on-failure',
                                'always',
                                'unless-stopped']),
            restart_retries=dict(required=False, type='int', default=10),
            tls_verify=dict(required=False, type='bool', default=False),
            tls_cert=dict(required=False, type='str'),
            tls_key=dict(required=False, type='str'),
            tls_cacert=dict(required=False, type='str'),
            volumes=dict(required=False, type='list'),
            volumes_from=dict(required=False, type='list')
        )
    
        # required_if follows the ansible module API: e.g. the start_container action requires both the image and name parameters.
        required_if = [
            ['action', 'pull_image', ['image']],
            ['action', 'start_container', ['image', 'name']],
            ['action', 'compare_container', ['name']],
            ['action', 'compare_image', ['name']],
            ['action', 'create_volume', ['name']],
            ['action', 'get_container_env', ['name']],
            ['action', 'get_container_state', ['name']],
            ['action', 'recreate_or_restart_container', ['name']],
            ['action', 'remove_container', ['name']],
            ['action', 'remove_volume', ['name']],
            ['action', 'restart_container', ['name']],
            ['action', 'stop_container', ['name']]
        ]
        # instantiate the AnsibleModule
        module = AnsibleModule(
            argument_spec=argument_spec,
            required_if=required_if,
            bypass_checks=False
        )
        
        # the rest merges common_options, the environment and the remaining special parameters
        new_args = module.params.pop('common_options', dict())
    
        # NOTE(jeffrey4l): merge the environment
        env = module.params.pop('environment', dict())
        if env:
            new_args['environment'].update(env)
    
        for key, value in module.params.items():
            if key in new_args and value is None:
                continue
            new_args[key] = value
    
        # if pid_mode = ""/None/False, remove it
        if not new_args.get('pid_mode', False):
            new_args.pop('pid_mode', None)
        # if ipc_mode = ""/None/False, remove it
        if not new_args.get('ipc_mode', False):
            new_args.pop('ipc_mode', None)
    
        module.params = new_args
        # return the configured AnsibleModule instance
        return module
    
    
    def main():
        module = generate_module()
    
        try:
            dw = DockerWorker(module)
            # TODO(inc0): We keep it bool to have ansible deal with consistent
            # types. If we ever add method that will have to return some
            # meaningful data, we need to refactor all methods to return dicts.
            # result is only a bool indicating whether the method named by the
            # action parameter ran successfully
            result = bool(getattr(dw, module.params.get('action'))())
            module.exit_json(changed=dw.changed, result=result)
        except Exception:
            module.exit_json(failed=True, changed=True,
                             msg=repr(traceback.format_exc()))
    
    # import module snippets
    from ansible.module_utils.basic import *  # noqa
    if __name__ == '__main__':
        main()
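
    As a usage sketch, a task driving this module looks like the following; per the required_if list above, the start_container action needs at least image and name (the image and volume values here are illustrative, not copied from a real role):

    # Illustrative sketch only: the image and volume values are assumptions.
    - name: Start the neutron-server container
      kolla_docker:
        action: "start_container"
        common_options: "{{ docker_common_options }}"
        name: "neutron_server"
        image: "{{ neutron_server_image_full }}"
        volumes:
          - "{{ node_config_directory }}/neutron-server/:{{ container_config_directory }}/:ro"
          - "kolla_logs:/var/log/kolla/"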
    

    kolla_toolbox.py runs ansible commands inside the kolla_toolbox container. It is not expanded in full here; only the key code is shown.

    # build the command line; it always starts with an ansible command run against localhost
    def gen_commandline(params):
        command = ['ansible', 'localhost']
        ....
        ....
        return command
    
    # main entry point
    def main():
        ....
    ....
        client = get_docker_client()(
            version=module.params.get('api_version'))
        # build the command line (a list of arguments)
        command_line = gen_commandline(module.params)
        # list the running containers named kolla_toolbox
        kolla_toolbox = client.containers(filters=dict(name='kolla_toolbox',
                                                       status='running'))
        if not kolla_toolbox:
            module.fail_json(msg='kolla_toolbox container is not running.')
        # there is normally only one, so take the first element; reusing the kolla_toolbox variable name here is a bit confusing
        kolla_toolbox = kolla_toolbox[0]
        # execute the command inside the container
        job = client.exec_create(kolla_toolbox, command_line)
        output = client.exec_start(job)
    ....
    ....
        module.exit_json(**ret)
    

    kolla_container_facts.py calls the docker client's Python API to collect facts about the specified containers; only a name value needs to be passed. The result has the form dict(changed=..., _containers=[]); the code is not expanded here.

    ansible/action_plugins/

    This directory holds the custom action plugins, which run on the control (master) node; by defining empty files with the same names under the library directory, they can also be used as modules. There are two files here, merge_configs.py and merge_yaml.py, used to merge conf and yml configuration files. Let's look at merge_yaml.py.
    merge_yaml.py receives multiple yml files through the task's sources parameter, merges them, and writes the result to dest on the target node. During the merge it also renders the template variables, and it finally calls the copy module to ship the rendered data over. The annotated code follows.
    from ansible.plugins import action

    # inherit from the parent class action.ActionBase
    class ActionModule(action.ActionBase):
    
        TRANSFERS_FILES = True
    
        def read_config(self, source):
            result = None
            # Only use config if present
            if os.access(source, os.R_OK):
                with open(source, 'r') as f:
                    template_data = f.read()
                # render the template variables here: by the time the copy
                # module runs, some of these variables may no longer be
                # available, so the rendering has to happen up front
                template_data = self._templar.template(template_data)
                # turn the YAML data into a dict
                result = safe_load(template_data)
            return result or {}
    
        # every custom action plugin must implement run()
        def run(self, tmp=None, task_vars=None):
            # task_vars holds the variables passed into this task from outside,
            # e.g. host vars, group vars, config vars, etc.
            if task_vars is None:
                task_vars = dict()
            # a custom action plugin must call the parent class run() first
            result = super(ActionModule, self).run(tmp, task_vars)
    
            # NOTE(jeffrey4l): Ansible 2.1 add a remote_user param to the
            # _make_tmp_path function.  inspect the number of the args here. In
            # this way, ansible 2.0 and ansible 2.1 are both supported
            # create the temporary directory, compatible with ansible 2.0 and later
            make_tmp_path_args = inspect.getargspec(self._make_tmp_path)[0]
            if not tmp and len(make_tmp_path_args) == 1:
                tmp = self._make_tmp_path()
            if not tmp and len(make_tmp_path_args) == 2:
                remote_user = (task_vars.get('ansible_user')
                               or self._play_context.remote_user)
                tmp = self._make_tmp_path(remote_user)
            # save template args.
            # _task.args are this task's arguments; save the value of its
            # 'vars' key into extra_vars
            extra_vars = self._task.args.get('vars', list())
            # back up the templar's currently available variables
            old_vars = self._templar._available_variables
            # merge all of task_vars and extra_vars into temp_vars
            temp_vars = task_vars.copy()
            temp_vars.update(extra_vars)
            # hand the merged variables to the templar (template) object
            self._templar.set_available_variables(temp_vars)
    
            output = {}
            # fetch the task's sources argument; it may be a single file
            # or a list of several files
            sources = self._task.args.get('sources', None)
            # if it is not a list, wrap it into a one-element list
            if not isinstance(sources, list):
                sources = [sources]
            # iterate over sources, read each file and merge it in;
            # dict.update() deduplicates keys, which effectively performs the merge
            for source in sources:
                output.update(self.read_config(source))
    
            # restore the templar's original variables
            self._templar.set_available_variables(old_vars)
            # transfer the merged output to the remote target host; the remote path is stored in xfered
            remote_path = self._connection._shell.join_path(tmp, 'src')
            xfered = self._transfer_data(remote_path,
                                         dump(output,
                                              default_flow_style=False))
            # copy this task's arguments to use as the arguments of the next module call
            new_module_args = self._task.args.copy()
            # set src in new_module_args, as the copy module expects
            new_module_args.update(
                dict(
                    src=xfered
                )
            )
            # drop the sources argument, which the copy module does not accept
            del new_module_args['sources']
            # run the copy module with the new arguments and task_vars
            result.update(self._execute_module(module_name='copy',
                                               module_args=new_module_args,
                                               task_vars=task_vars,
                                               tmp=tmp))
            # return result, as the action plugin interface requires
            return result
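
    A usage sketch: the plugin is invoked like a module, with a list of sources that are merged in order and a dest on the target host (the file paths below are illustrative):

    # Illustrative sketch only: the paths are assumptions.
    - name: Copying over a merged yml config
      merge_yaml:
        sources:
          - "{{ role_path }}/templates/example.yml.j2"
          - "{{ node_custom_config }}/example.yml"
        dest: "{{ node_config_directory }}/example/example.yml"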
    

    ansible/inventory/all-in-one

    # the control group contains the local node, connected via ansible_connection=local
    [control]
    localhost       ansible_connection=local
    
    [network]
    localhost       ansible_connection=local
    
    # the neutron group contains all hosts in the network group
    [neutron:children]
    network
    
    # Neutron
    # the neutron-server group contains all hosts in the control group
    [neutron-server:children]
    control
    
    # the neutron-dhcp-agent group contains all hosts in the neutron group
    [neutron-dhcp-agent:children]
    neutron
    
    [neutron-l3-agent:children]
    neutron
    
    [neutron-lbaas-agent:children]
    neutron
    
    [neutron-metadata-agent:children]
    neutron
    
    [neutron-vpnaas-agent:children]
    neutron
    
    [neutron-bgp-dragent:children]
    neutron
    

    ansible/site.yml

    # use ansible's setup module to gather facts for all hosts; gather_facts is set to false so that ansible does not gather them a second time
    - name: Gather facts for all hosts
      hosts: all
      serial: '{{ serial|default("0") }}'
      gather_facts: false
      tasks:
        - setup:
      tags: always
    
    # NOTE(pbourke): This case covers deploying subsets of hosts using --limit. The
    # limit arg will cause the first play to gather facts only about that node,
    # meaning facts such as IP addresses for rabbitmq nodes etc. will be undefined
    # in the case of adding a single compute node.
    # We don't want to add the delegate parameters to the above play as it will
    # result in ((num_nodes-1)^2) number of SSHs when running for all nodes
    # which can be very inefficient.
    
    - name: Gather facts for all hosts (if using --limit)
      hosts: all
      serial: '{{ serial|default("0") }}'
      gather_facts: false
      tasks:
        - setup:
          delegate_facts: True
          delegate_to: "{{ item }}"
          with_items: "{{ groups['all'] }}"
          when:
            - (play_hosts | length) != (groups['all'] | length)
    
    # Check the openstack_release variable. It is left unset in globals.yml by default,
    # and ansible/group_vars/all.yml defaults it to auto. When it is auto, these two
    # tasks use python's pbr package to detect the installed kolla-ansible version and
    # assign that version number to openstack_release, using local_action and register.
    - name: Detect openstack_release variable
      hosts: all
      gather_facts: false
      tasks:
        - name: Get current kolla-ansible version number
          local_action: command python -c "import pbr.version; print(pbr.version.VersionInfo('kolla-ansible'))"
          register: kolla_ansible_version
          changed_when: false
          when: openstack_release == "auto"
    
        - name: Set openstack_release variable
          set_fact:
            openstack_release: "{{ kolla_ansible_version.stdout }}"
          when: openstack_release == "auto"
      tags: always
    
    # run the prechecks role on all nodes, but only when the action passed on the
    # ansible-playbook command line is precheck
    - name: Apply role prechecks
      gather_facts: false
      hosts:
        - all
      roles:
        - role: prechecks
          when: action == "precheck"
    
    # deploy the chrony (NTP time sync) role on the chrony-server and chrony groups,
    # but only if enable_chrony is true; it can be set in etc/kolla/globals.yml and
    # defaults to no
    - name: Apply role chrony
      gather_facts: false
      hosts:
        - chrony-server
        - chrony
      serial: '{{ serial|default("0") }}'
      roles:
        - { role: chrony,
            tags: chrony,
            when: enable_chrony | bool }
    
    # deploy the neutron role; besides the neutron-related host groups, the compute and
    # manila-share (OpenStack's shared file system service) groups are also targeted
    - name: Apply role neutron
      gather_facts: false
      hosts:
        - neutron-server
        - neutron-dhcp-agent
        - neutron-l3-agent
        - neutron-lbaas-agent
        - neutron-metadata-agent
        - neutron-vpnaas-agent
        - compute
        - manila-share
      serial: '{{ serial|default("0") }}'
      roles:
        - { role: neutron,
            tags: neutron,
            when: enable_neutron | bool }
    

    ansible/roles/neutron/tasks: the precheck scenario

    In this scenario the action is precheck; tasks/main.yml includes precheck.yml.

    ---
    # kolla_container_facts is the custom library module analyzed above; it collects
    # facts about the container named neutron_server and registers them in container_facts
    - name: Get container facts
      kolla_container_facts:
        name:
          - neutron_server
      register: container_facts
    
    # if container_facts has no neutron_server key and this host is in the neutron-server
    # group, verify that neutron_server_port is free (state: stopped)
    - name: Checking free port for Neutron Server
      wait_for:
        host: "{{ hostvars[inventory_hostname]['ansible_' + api_interface]['ipv4']['address'] }}"
        port: "{{ neutron_server_port }}"
        connect_timeout: 1
        timeout: 1
        state: stopped
      when:
        - container_facts['neutron_server'] is not defined
        - inventory_hostname in groups['neutron-server']
    
    # fail if enable_neutron_agent_ha is true but fewer than two dhcp or l3 agent nodes are planned
    - name: Checking number of network agents
      local_action: fail msg="Number of network agents are less than two when enabling agent ha"
      changed_when: false
      when:
        - enable_neutron_agent_ha | bool
        - groups['neutron-dhcp-agent'] | length < 2
          or groups['neutron-l3-agent'] | length < 2
    
    # When MountFlags is set to shared, a signal bit configured on 20th bit of a number
    # We need to check the 20th bit. 2^20 = 1048576. So we are validating against it.
    # check whether MountFlags for the docker service is set to shared
    - name: Checking if 'MountFlags' for docker service is set to 'shared'
      command: systemctl show docker
      register: result
      changed_when: false
      failed_when: result.stdout.find('MountFlags=1048576') == -1
      when:
        - (inventory_hostname in groups['neutron-dhcp-agent']
           or inventory_hostname in groups['neutron-l3-agent']
           or inventory_hostname in groups['neutron-metadata-agent'])
        - ansible_os_family == 'RedHat' or ansible_distribution == 'Ubuntu'
    

    ansible/roles/neutron/tasks: the deploy scenario

    In this scenario the action is deploy; tasks/main.yml includes deploy.yml.

    # enforce ironic usage only with openvswitch
    # bare-metal check: ironic must be enabled and neutron must use the openvswitch plugin
    
    - include: ironic-check.yml
    
    # run the keystone registration on the neutron-server nodes
    - include: register.yml
      when: inventory_hostname in groups['neutron-server']
    
    # generate and copy the configuration files; starting the component containers is mostly driven from here
    - include: config.yml
    
    # with the nova fake driver, the compute nodes run config-neutron-fake.yml (not analyzed in detail);
    # the fake driver runs several nova-compute docker containers on a single compute node.
    #Nova fake driver can not work with all-in-one deployment. This is because the fake
    #neutron-openvswitch-agent for the fake nova-compute container conflicts with
    #neutron-openvswitch-agent on the compute nodes. Therefore, in the inventory
    #the network node must be different than the compute node.
    - include: config-neutron-fake.yml
      when:
        - enable_nova_fake | bool
        - inventory_hostname in groups['compute']
    
    # on the neutron-server nodes, create the database and the bootstrap containers:
    # bootstrap.yml creates the database objects and then includes bootstrap_service.yml,
    # which creates the bootstrap containers on the server nodes.
    - include: bootstrap.yml
      when: inventory_hostname in groups['neutron-server']
    
    # flush the handlers, i.e. run the notified tasks from the handlers directory now
    - name: Flush Handlers
      meta: flush_handlers
    

    register.yml registers neutron's service, endpoints and credentials in keystone.

    ---
    # create the neutron service and endpoints in keystone
    # kolla_toolbox (see the library analysis) runs ansible commands inside the toolbox container
    # the kolla_keystone_service module comes from kolla-ansible's parent project kolla and is already a callable ansible module
    # the service is named neutron, with internal, admin and public endpoints
    # the variables mostly come from ansible/roles/neutron/defaults/main.yml and ansible/group_vars/all.yml
    - name: Creating the Neutron service and endpoint
      kolla_toolbox:
        module_name: "kolla_keystone_service"
        module_args:
          service_name: "neutron"
          service_type: "network"
          description: "Openstack Networking"
          endpoint_region: "{{ openstack_region_name }}"
          url: "{{ item.url }}"
          interface: "{{ item.interface }}"
          region_name: "{{ openstack_region_name }}"
          auth: "{{ '{{ openstack_neutron_auth }}' }}"
        module_extra_vars:
          openstack_neutron_auth: "{{ openstack_neutron_auth }}"
      run_once: True
      with_items:
        - {'interface': 'admin', 'url': '{{ neutron_admin_endpoint }}'}
        - {'interface': 'internal', 'url': '{{ neutron_internal_endpoint }}'}
        - {'interface': 'public', 'url': '{{ neutron_public_endpoint }}'}
        
    # as above, create the project, user and role; openstack_neutron_auth turns out to be the OpenStack admin's auth
    - name: Creating the Neutron project, user, and role
      kolla_toolbox:
        module_name: "kolla_keystone_user"
        module_args:
          project: "service"
          user: "{{ neutron_keystone_user }}"
          password: "{{ neutron_keystone_password }}"
          role: "admin"
          region_name: "{{ openstack_region_name }}"
          auth: "{{ '{{ openstack_neutron_auth }}' }}"
        module_extra_vars:
          openstack_neutron_auth: "{{ openstack_neutron_auth }}"
      run_once: True
    

    config.yml merges and distributes the configuration files, then creates or restarts the containers.

    # use the sysctl module to configure the IP-forwarding related settings
    - name: Setting sysctl values
      vars:
        neutron_l3_agent: "{{ neutron_services['neutron-l3-agent'] }}"
        neutron_vpnaas_agent: "{{ neutron_services['neutron-vpnaas-agent'] }}"
      sysctl: name={{ item.name }} value={{ item.value }} sysctl_set=yes
      with_items:
        - { name: "net.ipv4.ip_forward", value: 1}
        - { name: "net.ipv4.conf.all.rp_filter", value: 0}
        - { name: "net.ipv4.conf.default.rp_filter", value: 0}
      when:
        - set_sysctl | bool
        - (neutron_l3_agent.enabled | bool and neutron_l3_agent.host_in_groups | bool)
          or (neutron_vpnaas_agent.enabled | bool and  neutron_vpnaas_agent.host_in_groups | bool)
    
    # create the configuration directory for each neutron service; the key condition is the
    # host_in_groups variable, which is defined in detail in ansible/roles/neutron/defaults/main.yml
    - name: Ensuring config directories exist
      file:
        path: "{{ node_config_directory }}/{{ item.key }}"
        state: "directory"
        recurse: yes
      when:
        - item.value.enabled | bool
        - item.value.host_in_groups | bool
      with_dict: "{{ neutron_services }}"
    
    ....
    ....
    
    # merge three config files into one and write it to the target directory (the merge_configs
    # module was analyzed in the action plugins section); once the file is delivered, notify the
    # corresponding container restart handler in the handlers directory. The restart uses the
    # recreate_or_restart_container action, which creates the container on the first run.
    - name: Copying over neutron_lbaas.conf
      vars:
        service_name: "{{ item.key }}"
        services_need_neutron_lbaas_conf:
          - "neutron-server"
          - "neutron-lbaas-agent"
      merge_configs:
        sources:
          - "{{ role_path }}/templates/neutron_lbaas.conf.j2"
          - "{{ node_custom_config }}/neutron/neutron_lbaas.conf"
          - "{{ node_custom_config }}/neutron/{{ inventory_hostname }}/neutron_lbaas.conf"
        dest: "{{ node_config_directory }}/{{ item.key }}/neutron_lbaas.conf"
      register: neutron_lbaas_confs
      when:
        - item.value.enabled | bool
        - item.value.host_in_groups | bool
        - item.key in services_need_neutron_lbaas_conf
      with_dict: "{{ neutron_services }}"
      notify:
        - "Restart {{ item.key }} container"
    
    ....
    ....
    # kolla_docker is the custom module; the compare_container action compares every neutron service container on this node with the desired configuration
    - name: Check neutron containers
      kolla_docker:
        action: "compare_container"
        common_options: "{{ docker_common_options }}"
        name: "{{ item.value.container_name }}"
        image: "{{ item.value.image }}"
        privileged: "{{ item.value.privileged | default(False) }}"
        volumes: "{{ item.value.volumes }}"
      register: check_neutron_containers
      when:
        - action != "config"
        - item.value.enabled | bool
        - item.value.host_in_groups | bool
      with_dict: "{{ neutron_services }}"
      notify:
        - "Restart {{ item.key }} container"
    

    bootstrap.yml creates the neutron database objects.

    ---
    # kolla_toolbox custom module: run the mysql_db ansible module inside the toolbox container to create the database
    # delegate_to runs it on the first neutron-server node; run_once runs it only once
    - name: Creating Neutron database
      kolla_toolbox:
        module_name: mysql_db
        module_args:
          login_host: "{{ database_address }}"
          login_port: "{{ database_port }}"
          login_user: "{{ database_user }}"
          login_password: "{{ database_password }}"
          name: "{{ neutron_database_name }}"
      register: database
      run_once: True
      delegate_to: "{{ groups['neutron-server'][0] }}"

    # create the neutron database user and grant privileges
    - name: Creating Neutron database user and setting permissions
      kolla_toolbox:
        module_name: mysql_user
        module_args:
          login_host: "{{ database_address }}"
          login_port: "{{ database_port }}"
          login_user: "{{ database_user }}"
          login_password: "{{ database_password }}"
          name: "{{ neutron_database_name }}"
          password: "{{ neutron_database_password }}"
          host: "%"
          priv: "{{ neutron_database_name }}.*:ALL"
          append_privs: "yes"
      run_once: True
      delegate_to: "{{ groups['neutron-server'][0] }}"
    
    # if the database changed, include bootstrap_service.yml
    - include: bootstrap_service.yml
      when: database.changed
    

    bootstrap_service.yml creates the bootstrap_neutron, bootstrap_neutron_lbaas_agent and bootstrap_neutron_vpnaas_agent containers.
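
    The file itself is not shown above. As a hedged sketch, each of those bootstrap tasks follows the same kolla_docker pattern seen elsewhere in the role: the container is run once with KOLLA_BOOTSTRAP set so that the image performs its bootstrap work (the database schema setup) and then exits. The image and volume variable names below are assumed by analogy with the other neutron tasks, not copied from the real file:

    # Illustrative sketch only: image, volume and variable names are assumptions.
    - name: Running Neutron bootstrap container
      kolla_docker:
        action: "start_container"
        common_options: "{{ docker_common_options }}"
        detach: False                  # wait for the bootstrap container to exit
        environment:
          KOLLA_BOOTSTRAP:             # switches the image into its bootstrap path
          KOLLA_CONFIG_STRATEGY: "{{ config_strategy }}"
        image: "{{ neutron_server_image_full }}"
        labels:
          BOOTSTRAP:
        name: "bootstrap_neutron"
        restart_policy: "never"
        volumes:
          - "{{ node_config_directory }}/neutron-server/:{{ container_config_directory }}/:ro"
          - "kolla_logs:/var/log/kolla/"
      run_once: True
      delegate_to: "{{ groups['neutron-server'][0] }}"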


    ansible/roles/neutron/handlers/main.yml recreates or restarts the neutron-related containers. One handler is listed and analyzed below.

    # Example: neutron_lbaas_confs is the variable registered by the earlier config task; the
    # task-level variable neutron_lbaas_conf picks the entry matching this service out of those
    # results and is used in the when condition below.
    # The volumes parameter of kolla_docker comes from roles/neutron/defaults/main.yml, e.g.:
    # volumes:
    #   - "{{ node_config_directory }}/neutron-lbaas-agent/:{{ container_config_directory }}/:ro"
    #   - "/etc/localtime:/etc/localtime:ro"
    #   - "/run:/run:shared"
    #   - "kolla_logs:/var/log/kolla/"
    # config.yml has already delivered the neutron_lbaas configuration file into
    # {{ node_config_directory }}/neutron-lbaas-agent/. The in-container path variable
    # container_config_directory is defined in group_vars with the value /var/lib/kolla/config_files.
    - name: Restart neutron-server container
      vars:
        service_name: "neutron-server"
        service: "{{ neutron_services[service_name] }}"
        config_json: "{{ neutron_config_jsons.results|selectattr('item.key', 'equalto', service_name)|first }}"
        neutron_conf: "{{ neutron_confs.results|selectattr('item.key', 'equalto', service_name)|first }}"
        neutron_lbaas_conf: "{{ neutron_lbaas_confs.results|selectattr('item.key', 'equalto', service_name)|first }}"
        neutron_ml2_conf: "{{ neutron_ml2_confs.results|selectattr('item.key', 'equalto', service_name)|first }}"
        policy_json: "{{ policy_jsons.results|selectattr('item.key', 'equalto', service_name)|first }}"
        neutron_server_container: "{{ check_neutron_containers.results|selectattr('item.key', 'equalto', service_name)|first }}"
      kolla_docker:
        action: "recreate_or_restart_container"
        common_options: "{{ docker_common_options }}"
        name: "{{ service.container_name }}"
        image: "{{ service.image }}"
        volumes: "{{ service.volumes }}"
        privileged: "{{ service.privileged | default(False) }}"
      when:
        - action != "config"
        - service.enabled | bool
        - service.host_in_groups | bool
        - config_json | changed
          or neutron_conf | changed
          or neutron_lbaas_conf | changed
          or neutron_vpnaas_conf | changed
          or neutron_ml2_conf | changed
          or policy_json | changed
          or neutron_server_container | changed
    

    ansible/roles/neutron/tasks: pulling images

    pull.yml pulls the images down
    # use kolla_docker's pull_image action to pull the images
    - name: Pulling neutron images
      kolla_docker:
        action: "pull_image"
        common_options: "{{ docker_common_options }}"
        image: "{{ item.value.image }}"
      when:
        - item.value.enabled | bool
        - item.value.host_in_groups | bool
      with_dict: "{{ neutron_services }}"

    Process Overview

    Image Path

    A snippet from kolla-ansible/ansible/roles/chrony/defaults/main.yml:

    docker_registry is the address of the local registry and is configured in globals.yml (the
    registry container's port 5000 is mapped to port 4000 on the host). docker_namespace is also
    defined in globals.yml; when using the images published by the project it is set to lokolla.
    kolla_base_distro defaults to centos and can also be set to ubuntu, and kolla_install_type
    defaults to binary (configured as source in globals.yml here).
    For example, with:
    docker_registry: 192.168.102.15:4000
    docker_namespace: lokolla
    kolla_base_distro: "centos"
    kolla_install_type: source
    openstack_release: auto   (auto-detected, as described earlier)
    the resulting chrony_image_full is 192.168.102.15:4000/lokolla/centos-source-chrony:4.0.2
    
    chrony_image: "{{ docker_registry ~ '/' if docker_registry else '' }}{{ docker_namespace }}/{{ kolla_base_distro }}-{{ kolla_install_type }}-chrony"
    chrony_tag: "{{ openstack_release }}"
    chrony_image_full: "{{ chrony_image }}:{{ chrony_tag }}"
