A Detailed Analysis of the Interaction Between nova and neutron

Author: 刘力思 | Published: 2018-08-04 23:17 | Views: 212


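Preface

From the available material, most of us have a rough picture of how nova creates a virtual machine: horizon sends the request to nova-api, keystone authenticates it, nova-conductor queries the database, nova-scheduler selects a physical host, the image is downloaded from glance, network resources are created through neutron, and finally nova-compute creates the VM. This article records, at the code level, the more important details of how nova and neutron interact during that process.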

nova processing flow

nova-conductor tells nova-compute to create the VM over RPC. The entry point on the nova-compute side is build_and_run_instance, in /nova/compute/manager.py:

    def build_and_run_instance(self, context, instance, image, request_spec,
                               filter_properties, admin_password=None,
                               injected_files=None, requested_networks=None,
                               security_groups=None, block_device_mapping=None,
                               node=None, limits=None):

        @utils.synchronized(instance.uuid)
        def _locked_do_build_and_run_instance(*args, **kwargs):
            # ... (elided)
            with self._build_semaphore:
                try:
                    result = self._do_build_and_run_instance(*args, **kwargs)
                except Exception:
                    # ... (exception handling elided in this excerpt)

The method that actually carries out the build is _do_build_and_run_instance. Stepping into it:

    def _do_build_and_run_instance(self, context, instance, image,
            request_spec, filter_properties, admin_password, injected_files,
            requested_networks, security_groups, block_device_mapping,
            node=None, limits=None):
        # ... (elided)
        try:
            with timeutils.StopWatch() as timer:
                self._build_and_run_instance(context, instance, image,
                        decoded_files, admin_password, requested_networks,
                        security_groups, block_device_mapping, node, limits,
                        filter_properties)
        # ... (exception handlers elided)

In this method, the call to watch is _build_and_run_instance. Its body looks like this:

    def _build_and_run_instance(self, context, instance, image, injected_files,
            admin_password, requested_networks, security_groups,
            block_device_mapping, node, limits, filter_properties):
        # ... (elided)
        try:
                # ... (elided)
                with self._build_resources(context, instance,
                        requested_networks, security_groups, image_meta,
                        block_device_mapping) as resources:
                    # ... (elided)
                    instance.save(
                        expected_task_state=task_states.BLOCK_DEVICE_MAPPING)
                    block_device_info = resources['block_device_info']
                    network_info = resources['network_info']
                    LOG.debug('Start spawning the instance on the hypervisor.',
                              instance=instance)
                    with timeutils.StopWatch() as timer:
                        self.driver.spawn(context, instance, image_meta,
                                          injected_files, admin_password,
                                          network_info=network_info,
                                          block_device_info=block_device_info)
        # ... (exception handlers elided)

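_build_resources creates the resources the VM needs, including its network resources. self.driver.spawn then calls the driver of the specific hypervisor to create the VM. With libvirt, for example, the driver generates the VM's XML definition from the resource information and boots the VM from that XML.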
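
The "prepare resources, then spawn inside a with block" shape can be illustrated with contextlib. The sketch below is not nova's code: `allocate_network`, `deallocate_network`, `spawn` and the `events` list are hypothetical stand-ins, and the rollback on failure is an assumption made for the example.

```python
from contextlib import contextmanager

events = []  # records the order of operations for illustration


def allocate_network(instance):
    events.append("allocate network")
    return ["vif-for-" + instance]


def deallocate_network(instance):
    events.append("deallocate network")


@contextmanager
def build_resources(instance):
    """Prepare resources, hand them to the caller, clean up on failure."""
    resources = {"network_info": allocate_network(instance)}
    try:
        yield resources
    except Exception:
        deallocate_network(instance)  # roll back if the spawn step fails
        raise


def spawn(instance, network_info, fail=False):
    events.append("spawn with %s" % network_info)
    if fail:
        raise RuntimeError("hypervisor error")


def build_and_run(instance, fail=False):
    with build_resources(instance) as resources:
        spawn(instance, resources["network_info"], fail=fail)
```

Calling `build_and_run("vm1")` allocates and then spawns. With `fail=True` the exception raised by `spawn` propagates through the context manager, which releases the network before re-raising.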

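Interaction between nova and neutron

The interaction boils down to nova asking the neutron server to create network resources and then retrieving information about the resources that were created.
Start with the _build_resources method: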

    def _build_resources(self, context, instance, requested_networks,
                         security_groups, image_meta, block_device_mapping):
        resources = {}
        network_info = None
        try:
            LOG.debug('Start building networks asynchronously for instance.',
                      instance=instance)
            network_info = self._build_networks_for_instance(context, instance,
                    requested_networks, security_groups)
            resources['network_info'] = network_info
        # ... (rest of the method elided in this excerpt)

Going one call deeper, the _build_networks_for_instance method:

    def _build_networks_for_instance(self, context, instance,
            requested_networks, security_groups):
        # ... (elided)
        network_info = self._allocate_network(context, instance,
                requested_networks, macs, security_groups, dhcp_options)

The _allocate_network method:

    def _allocate_network(self, context, instance, requested_networks, macs,
                          security_groups, dhcp_options):
        # ... (elided, including where is_vpn is set)
        return network_model.NetworkInfoAsyncWrapper(
                self._allocate_network_async, context, instance,
                requested_networks, macs, security_groups, is_vpn, dhcp_options)
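
The excerpt above hands the real work to an asynchronous wrapper, so the build can continue while the network is being allocated. The idea can be sketched as follows. This is an assumption-laden illustration, not NetworkInfoAsyncWrapper itself: a thread pool from concurrent.futures is swapped in for whatever concurrency mechanism nova uses, and `AsyncNetworkInfo` and `slow_allocate` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=4)


class AsyncNetworkInfo:
    """Start allocation in the background; block only when the result is read."""

    def __init__(self, async_method, *args, **kwargs):
        self._future = _pool.submit(async_method, *args, **kwargs)

    def wait(self):
        # result() blocks until the worker finishes and re-raises its exception
        return self._future.result()

    def __iter__(self):
        return iter(self.wait())

    def __len__(self):
        return len(self.wait())


def slow_allocate(instance):
    # Hypothetical allocation routine returning a list of VIF-like dicts.
    return [{"id": "port-1", "instance": instance}]
```

The constructor returns immediately. The first access to the contents (iteration, `len`, or an explicit `wait()`) is where the caller blocks and where an allocation error would surface.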

The _allocate_network_async method:

    def _allocate_network_async(self, context, instance, requested_networks,
                                macs, security_groups, is_vpn, dhcp_options):
        # ... (elided)
        bind_host_id = self.driver.network_binding_host_id(context, instance)
        for attempt in range(1, attempts + 1):
            try:
                nwinfo = self.network_api.allocate_for_instance(
                        context, instance, vpn=is_vpn,
                        requested_networks=requested_networks,
                        macs=macs, security_groups=security_groups,
                        dhcp_options=dhcp_options, bind_host_id=bind_host_id)
            # ... (rest of the loop elided in this excerpt)

This method first calls self.driver.network_binding_host_id to get the host_id of the host the VM will run on, and then creates the VM's network resources through allocate_for_instance:

    def allocate_for_instance(self, context, instance, vpn, requested_networks,
                              macs=None, security_groups=None,
                              dhcp_options=None, bind_host_id=None):
        # ... (elided)
        # We do not want to create a new neutron session for each call
        neutron = get_client(context)
        # ... (elided)
        # Create any ports that might be required,
        # after validating requested security groups
        security_groups = self._clean_security_groups(security_groups)
        security_group_ids = self._process_security_groups(
            instance, neutron, security_groups)

        requests_and_created_ports = self._create_ports_for_instance(
            context, instance, ordered_networks, nets, neutron,
            security_group_ids)

        # Update existing and newly created ports
        available_macs = _filter_hypervisor_macs(instance, ports, macs)
        admin_client = get_client(context, admin=True)

        ordered_nets, ordered_ports, preexisting_port_ids, \
            created_port_ids = self._update_ports_for_instance(
                context, instance, neutron, admin_client,
                requests_and_created_ports, nets, bind_host_id,
                dhcp_options, available_macs)

        nw_info = self.get_instance_nw_info(
            context, instance, networks=ordered_nets, port_ids=ordered_ports,
            admin_client=admin_client,
            preexisting_port_ids=preexisting_port_ids, update_cells=True)

        return network_model.NetworkInfo(
            [vif for vif in nw_info
             if vif['id'] in created_port_ids + preexisting_port_ids])

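As the implementation shows, the method first processes the requested security groups and creates the VM's ports. Note that a port at this stage has only a minimal configuration: extensions such as binding and QoS policy have not been applied yet, and the port's binding state is UNBOUND. Once the ports exist, _update_ports_for_instance passes in bind_host_id so that the port gets bound (bind_port). All of these operations go through the neutron API. The methods to focus on are: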

_process_security_groups
_create_ports_for_instance, which calls neutron's create_port
_update_ports_for_instance, which calls neutron's update_port

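On the neutron side, creating a port mainly involves allocating IP addresses and applying security groups. Port binding (bind_port) is analyzed separately below.

The call stack for the two parts above can be summarized as: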
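
The two-step pattern (create a minimal port, then update it with the host so that it gets bound) can be sketched against an in-memory stand-in. Everything here is illustrative: `FakeNeutron` is not the real neutron client, it only mimics the create_port/update_port request shape, and the assumption that setting binding:host_id immediately yields vif_type "ovs" is a simplification of what the server does.

```python
import itertools


class FakeNeutron:
    """In-memory stand-in for a neutron client (not the real client)."""

    def __init__(self):
        self._ports = {}
        self._ids = itertools.count(1)

    def create_port(self, body):
        port = dict(body["port"])
        port["id"] = "port-%d" % next(self._ids)
        port.setdefault("binding:host_id", "")
        port["binding:vif_type"] = "unbound"   # no host yet, nothing to bind
        self._ports[port["id"]] = port
        return {"port": dict(port)}

    def update_port(self, port_id, body):
        port = self._ports[port_id]
        port.update(body["port"])
        if port.get("binding:host_id"):
            port["binding:vif_type"] = "ovs"   # pretend an OVS agent bound it
        return {"port": dict(port)}


def allocate_for_instance(neutron, instance_id, network_id,
                          security_group_ids, host):
    # Step 1: minimal port with only network, device and security groups.
    created = neutron.create_port({"port": {
        "network_id": network_id,
        "device_id": instance_id,
        "security_groups": security_group_ids,
    }})["port"]
    # Step 2: add the host, which is what triggers binding on the server.
    updated = neutron.update_port(created["id"], {"port": {
        "binding:host_id": host,
        "device_owner": "compute:nova",
    }})["port"]
    return created, updated
```

The port returned by the first call is still unbound. Only the second call, which carries the host, produces the vif_type that the hypervisor driver later consumes.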

build_and_run_instance
--->_locked_do_build_and_run_instance
|   --->_do_build_and_run_instance
|   |   --->_build_and_run_instance
|   |   |   --->_build_resources
|   |   |   |   --->_build_networks_for_instance
|   |   |   |   |   --->_allocate_network
|   |   |   |   |   |   --->_allocate_network_async
|   |   |   |   |   |   |   --->allocate_for_instance
|   |   |   |   |   |   |   |   --->_process_security_groups
|   |   |   |   |   |   |   |   --->_create_ports_for_instance
|   |   |   |   |   |   |   |   |   --->_create_port_minimal
|   |   |   |   |   |   |   |   --->_update_ports_for_instance
|   |   |   |   |   |   |   |   |   --->_populate_neutron_extension_values
|   |   |   |   |   |   |   |   |   |   --->QOS_QUEUE,BINDING_HOST_ID,DNS_INTEGRATION
|   |   |   |   |   |   |   |   |   |   --->_populate_pci_mac_address
|   |   |   |   |   |   |   |   |   |   --->_populate_mac_address
|   |   |   |   |   |   |   |   |   |   --->extra_dhcp_opts
|   |   |   |   |   |   |   |   |
|   |   |   |   |   |   |   |   | 
|   |   |   --->self.driver.spawn

neutron port binding

The port binding code lives in neutron's bind_port method, in neutron/plugins/ml2/managers.py:

    def bind_port(self, context):
        binding = context._binding
        LOG.debug("Attempting to bind port %(port)s on host %(host)s "
                  "for vnic_type %(vnic_type)s with profile %(profile)s",
                  {'port': context.current['id'], 'host': context.host,
                   'vnic_type': binding.vnic_type,
                   'profile': binding.profile})
        context._clear_binding_levels()
        if not self._bind_port_level(context, 0,
                                     context.network.network_segments):
            binding.vif_type = portbindings.VIF_TYPE_BINDING_FAILED

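The method first reads the binding that is about to be applied to the port, clears the port's existing binding levels, and then calls _bind_port_level to bind level by level. One point worth noting: _bind_port_level currently handles both single-level binding and hierarchical (multi-level) binding. For hierarchical binding, see the corresponding blueprint (BP).

Next, _bind_port_level: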

    def _bind_port_level(self, context, level, segments_to_bind):
        # ... (elided, including where port_id is set)
        for driver in self.ordered_mech_drivers:
            if not self._check_driver_to_bind(driver, segments_to_bind,
                                              context._binding_levels):
                continue
            try:
                context._prepare_to_bind(segments_to_bind)
                driver.obj.bind_port(context)
                segment = context._new_bound_segment
                if segment:
                    context._push_binding_level(
                        models.PortBindingLevel(port_id=port_id,
                                                host=context.host,
                                                level=level,
                                                driver=driver.name,
                                                segment_id=segment))
                    next_segments = context._next_segments_to_bind
                    if next_segments:
                        # Continue binding another level.
                        if self._bind_port_level(context, level + 1,
                                                 next_segments):
                            return True
                        else:
                            LOG.warning(_LW("Failed to bind port %(port)s on "
                                            "host %(host)s at level %(lvl)s"),
                                        {'port': context.current['id'],
                                         'host': context.host,
                                         'lvl': level + 1})
                            context._pop_binding_level()
                    else:
                        # Binding complete.
                        # ... (elided)
                        return True
            # ... (exception handling elided in this excerpt)

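The method iterates over the ML2 mechanism drivers (for example openvswitch or networking-huawei) and calls _check_driver_to_bind for each. That check walks every level in context._binding_levels: if a level's segment_id is among segments_to_bind and the level's driver matches the current driver, the current driver is skipped and binding moves on to the next driver. In short, the same driver cannot bind the same segment_id twice.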
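
The check described above is small enough to sketch directly. This is a simplified model, not the neutron implementation: `BindingLevel` is a hypothetical namedtuple standing in for the binding level records, and segments are plain dicts keyed by "id".

```python
from collections import namedtuple

# Hypothetical stand-in for a stored binding level.
BindingLevel = namedtuple("BindingLevel", ["driver", "segment_id"])


def check_driver_to_bind(driver_name, segments_to_bind, binding_levels):
    """Return False if this driver already bound one of these segments."""
    segment_ids = {segment["id"] for segment in segments_to_bind}
    for level in binding_levels:
        if level.driver == driver_name and level.segment_id in segment_ids:
            return False   # skip this driver to avoid a binding loop
    return True
```

The same driver may still bind at several levels as long as each level uses a different segment, and a different driver may bind a segment that another driver handled at a higher level.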

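With hierarchical binding, the higher-level driver has to pass the segment_id it allocated down to the lower-level driver through continue_binding(). The binding logic of individual drivers is not analyzed in detail here. In essence, a driver looks up the alive agents running on the host, and if an agent satisfies the binding requirements (for example, the network_type check), the driver writes that agent's information into the port binding (vif_type: ovs, vif_details). nova needs the vif_type (ovs here) and vif_details to create the VM.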
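
A toy model of two-level binding may make the hand-off concrete. All classes below are hypothetical simplifications: `BindContext` keeps only the fields needed here, `set_binding` drops vif_details, `TopDriver` invents a dynamic VLAN segment, and `AgentDriver` represents agent liveness as a plain dict from host to supported network types.

```python
class BindContext:
    """Minimal stand-in for the ML2 port context (hypothetical)."""

    def __init__(self, host, segments):
        self.host = host
        self.segments_to_bind = segments
        self.levels = []          # (level, driver name, segment id)
        self.vif_type = None
        self.bound_segment = None
        self.next_segments = None

    def prepare(self, segments):
        self.segments_to_bind = segments
        self.bound_segment = self.next_segments = None

    def continue_binding(self, segment_id, next_segments):
        # Partial binding: a lower-level driver must finish the job.
        self.bound_segment, self.next_segments = segment_id, next_segments

    def set_binding(self, segment_id, vif_type):
        # Final binding: record the VIF type that nova will consume.
        self.bound_segment, self.vif_type = segment_id, vif_type


class TopDriver:
    name = "top"

    def bind_port(self, ctx):
        for seg in ctx.segments_to_bind:
            if seg["network_type"] == "vxlan":
                vlan = {"id": "dyn-vlan", "network_type": "vlan"}
                ctx.continue_binding(seg["id"], [vlan])
                return


class AgentDriver:
    name = "ovs"

    def __init__(self, alive_agents):
        self.alive_agents = alive_agents   # host -> supported network types

    def bind_port(self, ctx):
        supported = self.alive_agents.get(ctx.host, ())
        for seg in ctx.segments_to_bind:
            if seg["network_type"] in supported:
                ctx.set_binding(seg["id"], "ovs")
                return


def bind_port_level(ctx, drivers, level, segments):
    for driver in drivers:
        ctx.prepare(segments)
        driver.bind_port(ctx)
        if ctx.bound_segment is None:
            continue                       # this driver could not bind
        ctx.levels.append((level, driver.name, ctx.bound_segment))
        next_segments = ctx.next_segments
        if not next_segments:
            return True                    # binding complete
        if bind_port_level(ctx, drivers, level + 1, next_segments):
            return True
        ctx.levels.pop()                   # lower level failed, try next driver
    return False
```

For a vxlan segment on a host with an alive agent that supports vlan, the top driver binds level 0 and passes a vlan segment down, and the agent driver completes level 1 with vif_type "ovs". On a host with no matching agent the recursion unwinds and the function returns False.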

This is my own analysis; corrections are welcome. If you repost it, please credit the source!
