ipvlan L3S mode

Author: 苏苏林 | Published 2021-03-12 17:19

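ipvlan has three modes: L2, L3, and L3S. There is plenty of material online about the first two but very little about the third, so I read through the code and am writing down what I found.
Why look at ipvlan at all? It is a good fit for multi-tenant NAT. In that scenario tenants' private VPC subnets may overlap, so routing has to be isolated with network namespaces, VRFs, or similar mechanisms. At the same time there is usually only one public NIC and one public gateway, and the gateway address is usually not in the same subnet as the tenants' public addresses, so they cannot all sit in one L2 segment. ipvlan L3 mode solves this problem. I will leave that topic here and write up the usage when I have time; on to L3S mode.
For reference, the English descriptions of the modes from the kernel's ipvlan documentation: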

4.1 L2 mode:
    In this mode TX processing happens on the stack instance attached to the
slave device and packets are switched and queued to the master device to send
out. In this mode the slaves will RX/TX multicast and broadcast (if applicable)
as well.

4.2 L3 mode:
    In this mode TX processing up to L3 happens on the stack instance attached
to the slave device and packets are switched to the stack instance of the
master device for the L2 processing and routing from that instance will be
used before packets are queued on the outbound device. In this mode the slaves
will not receive nor can send multicast / broadcast traffic.

4.3 L3S mode:
    This is very similar to the L3 mode except that iptables (conn-tracking)
works in this mode and hence it is L3-symmetric (L3s). This will have slightly less
performance but that shouldn't matter since you are choosing this mode over plain-L3
mode to make conn-tracking work.

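Although the documentation says L3S is "very similar to the L3 mode", the code paths are completely different.

When the physical (master) interface receives a packet, its rx_handler, ipvlan_handle_frame, is called. In IPVLAN_MODE_L3S the handler returns RX_HANDLER_PASS immediately without doing anything, so the packet continues through the normal kernel stack: ip_rcv runs the PREROUTING hook and then calls ip_rcv_finish.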


rx_handler_result_t ipvlan_handle_frame(struct sk_buff **pskb)
{
    struct sk_buff *skb = *pskb;
    struct ipvl_port *port = ipvlan_port_get_rcu(skb->dev);

    if (!port)
        return RX_HANDLER_PASS;

    switch (port->mode) {
    case IPVLAN_MODE_L2:
        return ipvlan_handle_mode_l2(pskb, port);
    case IPVLAN_MODE_L3:
        return ipvlan_handle_mode_l3(pskb, port);
    case IPVLAN_MODE_L3S:
        return RX_HANDLER_PASS;
    }

    /* Should not reach here */
    WARN_ONCE(true, "ipvlan_handle_frame() called for mode = [%hx]\n",
              port->mode);
    kfree_skb(skb);
    return RX_HANDLER_CONSUMED;
}

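ip_rcv_finish calls l3mdev_ip_rcv, and this is the key step.

When an ipvlan interface is created (ipvlan_link_new, which calls ipvlan_set_port_mode), IPVLAN_MODE_L3S attaches l3mdev_ops = &ipvl_l3mdev_ops to the physical interface and registers a netfilter hook function.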


static int ipvlan_set_port_mode(struct ipvl_port *port, u16 nval)
{
    struct ipvl_dev *ipvlan;
    struct net_device *mdev = port->dev;
    int err = 0;

    ASSERT_RTNL();
    if (port->mode != nval) {
        if (nval == IPVLAN_MODE_L3S) {
            /* New mode is L3S */
            err = ipvlan_register_nf_hook();
            if (!err) {
                mdev->l3mdev_ops = &ipvl_l3mdev_ops;
                mdev->priv_flags |= IFF_L3MDEV_MASTER;
            } else
                return err;
        } else if (port->mode == IPVLAN_MODE_L3S) {
            /* Old mode was L3S */
            mdev->priv_flags &= ~IFF_L3MDEV_MASTER;
            ipvlan_unregister_nf_hook();
            mdev->l3mdev_ops = NULL;
        }
        list_for_each_entry(ipvlan, &port->ipvlans, pnode) {
            if (nval == IPVLAN_MODE_L3 || nval == IPVLAN_MODE_L3S)
                ipvlan->dev->flags |= IFF_NOARP;
            else
                ipvlan->dev->flags &= ~IFF_NOARP;
        }
        port->mode = nval;
    }
    return err;
}

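l3mdev_ops is normally used to define routing lookup logic; the VRF implementation, for example, uses it to look up routes in a specific routing table. In ipvlan L3S mode the l3mdev_l3_rcv callback (ipvlan_l3_rcv) implements its own lookup: it finds the slave interface from the packet's destination address (or the ARP target IP), treats that slave as the input interface, and looks up the route in the slave's netns. Because the destination is a local address there, the packet reaches ip_local_deliver and is handled as locally destined traffic.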

static struct nf_hook_ops ipvl_nfops[] __read_mostly = {
    {
        .hook     = ipvlan_nf_input,
        .pf       = NFPROTO_IPV4,
        .hooknum  = NF_INET_LOCAL_IN,
        .priority = INT_MAX,
    },
    {
        .hook     = ipvlan_nf_input,
        .pf       = NFPROTO_IPV6,
        .hooknum  = NF_INET_LOCAL_IN,
        .priority = INT_MAX,
    },
};


static struct l3mdev_ops ipvl_l3mdev_ops __read_mostly = {
    .l3mdev_l3_rcv = ipvlan_l3_rcv,
};


struct sk_buff *ipvlan_l3_rcv(struct net_device *dev, struct sk_buff *skb,
                  u16 proto)
{
    struct ipvl_addr *addr;
    struct net_device *sdev;

    addr = ipvlan_skb_to_addr(skb, dev);
    if (!addr)
        goto out;

    sdev = addr->master->dev;
    switch (proto) {
    case AF_INET:
    {
        int err;
        struct iphdr *ip4h = ip_hdr(skb);

        err = ip_route_input_noref(skb, ip4h->daddr, ip4h->saddr,
                       ip4h->tos, sdev);
        if (unlikely(err))
            goto out;
        break;
    }
    case AF_INET6:
    {
        struct dst_entry *dst;
        struct ipv6hdr *ip6h = ipv6_hdr(skb);
        int flags = RT6_LOOKUP_F_HAS_SADDR;
        struct flowi6 fl6 = {
            .flowi6_iif   = sdev->ifindex,
            .daddr        = ip6h->daddr,
            .saddr        = ip6h->saddr,
            .flowlabel    = ip6_flowinfo(ip6h),
            .flowi6_mark  = skb->mark,
            .flowi6_proto = ip6h->nexthdr,
        };

        skb_dst_drop(skb);
        dst = ip6_route_input_lookup(dev_net(sdev), sdev, &fl6, flags);
        skb_dst_set(skb, dst);
        break;
    }
    default:
        break;
    }

out:
    return skb;
}
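
The lookup that ipvlan_l3_rcv performs can be sketched in userspace. This is only a toy model under stated assumptions: every `toy_` name is hypothetical and does not exist in the kernel, addresses are plain host-order integers, and a network namespace is reduced to an integer id. It shows one idea only: the destination address selects the slave, and the slave's namespace is where the route lookup runs. If no slave owns the address, the packet stays on the master's normal path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these types exist in the kernel. */
struct toy_slave {
    const char *name;  /* e.g. "ipvl0" */
    int netns_id;      /* stand-in for the slave's network namespace */
};

struct toy_addr {
    uint32_t ip;              /* IPv4 address, host byte order */
    struct toy_slave *owner;  /* slave that configured this address */
};

/* Models ipvlan_skb_to_addr(): map a destination IP to the owning slave. */
struct toy_slave *toy_lookup_slave(const struct toy_addr *tbl, size_t n,
                                   uint32_t daddr)
{
    assert(tbl != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (tbl[i].ip == daddr)
            return tbl[i].owner;
    return NULL;
}

/*
 * Models the decision in ipvlan_l3_rcv(): returns the namespace in which
 * the route lookup runs. On a hit that is the slave's namespace; on a
 * miss the packet is left alone and the master's namespace is used.
 */
int toy_l3_rcv_netns(const struct toy_addr *tbl, size_t n,
                     uint32_t daddr, int master_netns)
{
    struct toy_slave *s = toy_lookup_slave(tbl, n, daddr);

    return s ? s->netns_id : master_netns;
}
```

With two slaves holding 10.0.0.2 and 10.0.0.3 in namespaces 1 and 2, a packet for 10.0.0.3 is routed in namespace 2, while a packet for an address no slave owns stays in the master's namespace.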

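As you can see, the code flow differs substantially from L3 mode: in L3S the packet passes the PREROUTING hook first, the slave interface is then found, and only after that does the packet pass the LOCAL_IN hook.
For NAT, L3 mode calls netif_rx_internal once the slave interface has been found, so the packet traverses the whole protocol stack a second time and one-to-one outside-to-inside NAT is possible. In L3S mode, once the slave is found, ip_local_deliver takes the packet straight to the LOCAL_IN hook and then up to the upper-layer protocols, so there is no further opportunity to apply DNAT.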
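
The ordering argument can be written down as a small userspace model. It is a sketch of the reasoning in this post, not kernel code: the `toy_` names are hypothetical, and the receive path is reduced to three steps in the order described above. DNAT is applied at PREROUTING, so it can still act on a packet after slave selection only if a PREROUTING step follows the slave lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout: a userspace model, not kernel code. */
enum toy_mode { TOY_MODE_L3, TOY_MODE_L3S };

enum toy_step {
    TOY_PREROUTING,    /* netfilter PREROUTING hook */
    TOY_SLAVE_LOOKUP,  /* destination address -> slave interface */
    TOY_LOCAL_IN,      /* netfilter LOCAL_IN hook */
};

/* Copies the ordered receive steps for `mode` into `out`; returns the
 * number of steps written (at most `cap`). */
size_t toy_rx_path(enum toy_mode mode, enum toy_step *out, size_t cap)
{
    /* L3: the rx_handler picks the slave, then the stack runs on it. */
    static const enum toy_step l3[] = {
        TOY_SLAVE_LOOKUP, TOY_PREROUTING, TOY_LOCAL_IN
    };
    /* L3S: PREROUTING on the master, then the l3mdev slave lookup. */
    static const enum toy_step l3s[] = {
        TOY_PREROUTING, TOY_SLAVE_LOOKUP, TOY_LOCAL_IN
    };
    const enum toy_step *src = (mode == TOY_MODE_L3) ? l3 : l3s;
    size_t n = cap < 3 ? cap : 3;

    assert(out != NULL || cap == 0);
    for (size_t i = 0; i < n; i++)
        out[i] = src[i];
    return n;
}

/* True if a PREROUTING step (where DNAT happens) follows slave lookup. */
bool toy_dnat_after_slave_lookup(enum toy_mode mode)
{
    enum toy_step path[3];
    size_t n = toy_rx_path(mode, path, 3);
    bool slave_found = false;

    for (size_t i = 0; i < n; i++) {
        if (path[i] == TOY_SLAVE_LOOKUP)
            slave_found = true;
        else if (path[i] == TOY_PREROUTING && slave_found)
            return true;
    }
    return false;
}
```

In this model `toy_dnat_after_slave_lookup` is true for L3 and false for L3S, which is the difference that matters for the one-to-one NAT scenario.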

Article link: https://www.haomeiwen.com/subject/qauzqltx.html