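TL;DR Network isolation is an important concept, used everywhere from cloud hosts down to docker. These study notes walk through an experiment: how to isolate a network by hand and still let it communicate.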
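docker network architecture
[Figure: docker network]
The figure shows docker's classic network structure:
- eth0 is the host's network interface. docker0 is a virtual bridge device on the host; it is not attached to the eth0 NIC, and traffic between the two is routed by the host.
- Each container has a veth pair: one end is placed inside the container as its eth0, the other end is attached to the host bridge for communication.
- iptables performs SNAT on the host for outbound container traffic and maps host ports to container ports.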
Doing it by hand
1. Create a new net ns
docker puts each container's network in its own net namespace, so the first step is to create a new net ns:
root@myali1:~# ip netns add ns1
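The new net ns now exists. For the later tests we also give it its own DNS configuration; ip netns exec bind-mounts files found under /etc/netns/ns1/ over their counterparts in /etc for commands run inside the namespace.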
root@myali1:~# mkdir -p /etc/netns/ns1
root@myali1:~# echo "nameserver 8.8.8.8" > /etc/netns/ns1/resolv.conf
root@myali1:~# cat /etc/netns/ns1/resolv.conf
nameserver 8.8.8.8
Use list to check that the net ns was created:
root@myali1:~# ip netns list
ns1
Then bring up the loopback interface inside ns1:
root@myali1:~# ip netns exec ns1 ip link set dev lo up
root@myali1:~# ip netns exec ns1 ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
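The steps in this section can be wrapped in a small helper so the experiment is repeatable. The following is a minimal sketch, not part of the original session: the function names and the DRY_RUN and NETNS_DIR variables are made up for illustration. It relies on the iproute2 convention that a named namespace is an object at /var/run/netns/NAME, and by default it only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical helper, not from the original session.
# DRY_RUN defaults to 1: commands are printed, not executed.
# Set DRY_RUN=0 and run as root to apply them.
DRY_RUN="${DRY_RUN:-1}"
# iproute2 keeps named namespaces under /var/run/netns by convention.
NETNS_DIR="${NETNS_DIR:-/var/run/netns}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Succeeds if a named net namespace already exists.
netns_exists() {
    [ -e "$NETNS_DIR/$1" ]
}

# Create the namespace only if it is missing, then bring up its loopback.
ensure_netns() {
    ns="$1"
    if netns_exists "$ns"; then
        echo "net ns $ns already exists"
    else
        run ip netns add "$ns"
    fi
    run ip netns exec "$ns" ip link set dev lo up
}

ensure_netns ns1
```

Checking for the namespace first makes the helper safe to re-run, since ip netns add fails when the name is already taken.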
2. Create a bridge
root@myali1:~# brctl addbr lxcbr0
root@myali1:~# brctl stp lxcbr0 off
Next, give the new bridge an IP address. This test uses 192.168.10.1/24; any subnet that does not clash with the machine's existing networks will do.
root@myali1:~# ifconfig lxcbr0 192.168.10.1/24 up
root@myali1:~# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 00:16:3e:00:59:f2 brd ff:ff:ff:ff:ff:ff
inet 172.24.213.39/20 brd 172.24.223.255 scope global dynamic eth0
valid_lft 315359423sec preferred_lft 315359423sec
4: lxcbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether 0a:ac:41:a6:88:4e brd ff:ff:ff:ff:ff:ff
inet 192.168.10.1/24 brd 192.168.10.255 scope global lxcbr0
valid_lft forever preferred_lft forever
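brctl and ifconfig come from the older bridge-utils and net-tools packages, which are not installed everywhere. The same bridge can be built with iproute2 alone. The sketch below is an assumed equivalent, not what the original session ran; setup_bridge and DRY_RUN are illustrative names, and by default the commands are only printed.

```shell
#!/bin/sh
# Assumed iproute2 equivalent of the brctl/ifconfig steps above.
# DRY_RUN defaults to 1 (print only); set DRY_RUN=0 and run as root to apply.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_bridge() {
    br="$1"
    cidr="$2"
    # brctl addbr + brctl stp off
    run ip link add name "$br" type bridge stp_state 0
    # ifconfig <br> <cidr> up, split into address and link state
    run ip addr add "$cidr" dev "$br"
    run ip link set dev "$br" up
}

setup_bridge lxcbr0 192.168.10.1/24
```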
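3. Add a veth pair
A veth is also a virtual network device, and it always comes as a pair: one end goes into the container and the other onto the bridge, so whatever is sent into one end comes out of the other.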
root@myali1:~# ip link add veth-ns1 type veth peer name lxcbr0.1
root@myali1:~# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 00:16:3e:00:59:f2 brd ff:ff:ff:ff:ff:ff
inet 172.24.213.39/20 brd 172.24.223.255 scope global dynamic eth0
valid_lft 315358965sec preferred_lft 315358965sec
4: lxcbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether 0a:ac:41:a6:88:4e brd ff:ff:ff:ff:ff:ff
inet 192.168.10.1/24 brd 192.168.10.255 scope global lxcbr0
valid_lft forever preferred_lft forever
5: lxcbr0.1@veth-ns1: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether ee:2a:c0:d5:df:04 brd ff:ff:ff:ff:ff:ff
6: veth-ns1@lxcbr0.1: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 46:cc:62:01:7a:ef brd ff:ff:ff:ff:ff:ff
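Note that adding the veth pair produced two network interfaces, lxcbr0.1@veth-ns1 and veth-ns1@lxcbr0.1. These are the two ends of the pair, each listed with its peer after the @. Next, move veth-ns1 into the container, that is into net ns1, and rename it to eth0: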
root@myali1:~# ip link set veth-ns1 netns ns1
root@myali1:~# ip netns exec ns1 ip link set dev veth-ns1 name eth0
root@myali1:~# ip netns exec ns1 ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
6: eth0@if5: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 46:cc:62:01:7a:ef brd ff:ff:ff:ff:ff:ff link-netnsid 0
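ns1 now has an eth0 interface. Next, assign it an IP in the bridge's subnet and bring it up. The session does not show the command that did this, only the resulting state: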
root@myali1:~# ip netns exec ns1 ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
6: eth0@if5: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state LOWERLAYERDOWN group default qlen 1000
link/ether 46:cc:62:01:7a:ef brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 192.168.10.11/24 brd 192.168.10.255 scope global eth0
valid_lft forever preferred_lft forever
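The output above shows eth0 up with 192.168.10.11/24, but the command that configured it is missing from the transcript. One way to reach that state with iproute2 is sketched below; this is an assumption about what was run, not a record of it, and the helper names are illustrative. By default it only prints the commands.

```shell
#!/bin/sh
# Assumed reconstruction of the missing step; not from the original session.
# DRY_RUN defaults to 1 (print only); set DRY_RUN=0 and run as root to apply.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Assign an address to a device inside a namespace and bring the device up.
assign_ns_addr() {
    ns="$1"
    dev="$2"
    cidr="$3"
    # "broadcast +" derives the broadcast address from the prefix
    run ip netns exec "$ns" ip addr add "$cidr" broadcast + dev "$dev"
    run ip netns exec "$ns" ip link set dev "$dev" up
}

assign_ns_addr ns1 eth0 192.168.10.11/24
```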
Then attach the other end of the veth pair to the bridge:
root@myali1:~# brctl addif lxcbr0 lxcbr0.1
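Testing the network
1. Test connectivity to the bridge
The steps above complete the configuration. Now test connectivity between the container and the host.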
root@myali1:~# ip netns exec ns1 ping 192.168.10.1
If the ping fails, the bridge or the host end of the veth may not be up:
root@myali1:~# ifconfig lxcbr0 up
root@myali1:~# ifconfig lxcbr0.1 up
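Rather than guessing which device is down, the link state can be read from sysfs, where each interface exposes an operstate file. A small sketch follows; link_states and the SYSFS_NET override are illustrative names, the override existing only so the function can be tried against a fake directory.

```shell
#!/bin/sh
# Read-only check of host-side link state; needs no root.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Print "<iface>: <state>" for each interface name given.
link_states() {
    for ifc in "$@"; do
        if [ -r "$SYSFS_NET/$ifc/operstate" ]; then
            echo "$ifc: $(cat "$SYSFS_NET/$ifc/operstate")"
        else
            echo "$ifc: missing"
        fi
    done
}

link_states lxcbr0 lxcbr0.1
```

Only host-side devices are visible this way; the container's eth0 lives in ns1 and has to be inspected with ip netns exec ns1 ip link. A veth end whose peer is still down reports lowerlayerdown, which matches the LOWERLAYERDOWN state shown for eth0 earlier.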
2. Test connectivity to the host
Using the test machine's own IP, 172.24.213.39, as the target:
root@myali1:~# ip netns exec ns1 ping 172.24.213.39
connect: Network is unreachable
It is unreachable. Check the routing table inside the container:
root@myali1:~# ip netns exec ns1 route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
192.168.10.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
There is only the directly connected route. Add a default route via the bridge, then try again:
root@myali1:~# ip netns exec ns1 route add default gw 192.168.10.1 dev eth0
root@myali1:~# ip netns exec ns1 ping 172.24.213.39
PING 172.24.213.39 (172.24.213.39) 56(84) bytes of data.
64 bytes from 172.24.213.39: icmp_seq=1 ttl=64 time=0.040 ms
--- 172.24.213.39 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1006ms
rtt min/avg/max/mdev = 0.040/0.048/0.057/0.011 ms
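Why the first ping failed with "Network is unreachable" and the second worked can be illustrated with a toy route lookup. The sketch below is a deliberately simplified model, not how the kernel implements routing: it knows only the one connected route from the table above plus an optional default gateway, and all function names are made up for illustration.

```shell
#!/bin/sh
# Toy model of the route lookup in ns1 (illustration only).

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs="$IFS"
    IFS=.
    set -- $1          # split the address into its four octets
    IFS="$old_ifs"
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeeds if address $1 lies inside network $2 with prefix length $3.
in_subnet() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# Print how ns1 would reach destination $1; $2 is the default gateway, if any.
route_for() {
    if in_subnet "$1" 192.168.10.0 24; then
        echo "direct via eth0"
    elif [ -n "$2" ]; then
        echo "via gateway $2"
    else
        echo "unreachable"
    fi
}

route_for 192.168.10.1                  # direct via eth0
route_for 172.24.213.39                 # unreachable
route_for 172.24.213.39 192.168.10.1    # via gateway 192.168.10.1
```

The host address 172.24.213.39 falls outside 192.168.10.0/24, so without a default route there is nowhere to send the packet; with the bridge as gateway, the host receives it on lxcbr0 and answers locally.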
3. Test external connectivity
Using the DNS server 8.8.8.8 as the target:
root@myali1:~# ip netns exec ns1 ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
^C
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2042ms
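By default this fails, as expected: packets leaving with a private 192.168.10.0/24 source address cannot be answered from outside. The host has to allow forwarding for the bridge and NAT the traffic with iptables: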
root@myali1:~# iptables -t filter -A FORWARD -i lxcbr0 ! -o lxcbr0 -j ACCEPT
root@myali1:~# iptables -t filter -A FORWARD -i lxcbr0 -o lxcbr0 -j ACCEPT
root@myali1:~# iptables -t filter -A FORWARD -o lxcbr0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
root@myali1:~# iptables -t nat -A POSTROUTING -s 192.168.10.0/24 ! -o lxcbr0 -j MASQUERADE
root@myali1:~# ip netns exec ns1 ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=2 ttl=46 time=46.2 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=46 time=46.2 ms
^C
--- 8.8.8.8 ping statistics ---
4 packets transmitted, 2 received, 50% packet loss, time 3022ms
rtt min/avg/max/mdev = 46.238/46.254/46.270/0.016 ms
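The FORWARD and MASQUERADE rules only take effect if the kernel forwards packets between interfaces at all, which is controlled by net.ipv4.ip_forward. The session does not show this setting; since the ping succeeded, it was presumably already 1 on this host. A hedged sketch for checking it on another machine (forwarding_enabled and IP_FORWARD_FILE are illustrative names):

```shell
#!/bin/sh
# Read-only check; the path can be overridden for experimentation.
IP_FORWARD_FILE="${IP_FORWARD_FILE:-/proc/sys/net/ipv4/ip_forward}"

# Succeeds only if the kernel reports IPv4 forwarding as enabled.
forwarding_enabled() {
    [ "$(cat "$IP_FORWARD_FILE" 2>/dev/null)" = "1" ]
}

if forwarding_enabled; then
    echo "IPv4 forwarding is enabled"
else
    echo "IPv4 forwarding is off or unreadable; as root: sysctl -w net.ipv4.ip_forward=1"
fi
```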
Port mapping
This is also done with iptables. First start a test service inside ns1:
root@myali1:~# ip netns exec ns1 python -m SimpleHTTPServer
Serving HTTP on 0.0.0.0 port 8000 ...
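This python HTTP server runs in the ns1 namespace (SimpleHTTPServer is the Python 2 module; on Python 3 the equivalent is http.server). First test it from the host using the container IP: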
root@myali1:~# curl -I http://192.168.10.11:8000
HTTP/1.0 200 OK
Server: SimpleHTTP/0.6 Python/2.7.15+
Date: Tue, 05 Nov 2019 09:27:19 GMT
Content-type: text/html; charset=UTF-8
Content-Length: 1160
Now set up a port mapping so that other machines can reach it:
root@myali1:~# iptables -t nat -A PREROUTING -p tcp -m tcp --dport 80 -j DNAT --to-destination 192.168.10.11:8000
root@myali1:~# iptables -t filter -A FORWARD -p tcp -m tcp --dport 8000 -j ACCEPT
Then test from another machine; this host's real IP is 172.24.213.39:
root@worker1:~# curl -I http://172.24.213.39:80
HTTP/1.0 200 OK
Server: SimpleHTTP/0.6 Python/2.7.15+
Date: Tue, 05 Nov 2019 09:38:19 GMT
Content-type: text/html; charset=UTF-8
Content-Length: 1160
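Next, set up the mapping so that the host itself can use it too. Locally generated packets do not pass through PREROUTING, so the DNAT rule has to be repeated in the OUTPUT chain: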
root@myali1:~# iptables -t nat -A OUTPUT -p tcp -m tcp --dport 80 -j DNAT --to-destination 192.168.10.11:8000
root@myali1:~# iptables -t nat -A POSTROUTING -p tcp -m tcp --dport 8000 -j MASQUERADE
Then test against both 172.24.213.39 and the bridge address 192.168.10.1:
root@myali1:~# curl -I http://192.168.10.1:80
HTTP/1.0 200 OK
Server: SimpleHTTP/0.6 Python/2.7.15+
Date: Tue, 05 Nov 2019 09:42:03 GMT
Content-type: text/html; charset=UTF-8
Content-Length: 1160
root@myali1:~# curl -I http://172.24.213.39
HTTP/1.0 200 OK
Server: SimpleHTTP/0.6 Python/2.7.15+
Date: Tue, 05 Nov 2019 11:04:26 GMT
Content-type: text/html; charset=UTF-8
Content-Length: 1160
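One caveat about the DNAT rules as written: they match on destination port only. The OUTPUT rule therefore also rewrites the host's own outbound connections to port 80 on any remote address, and the PREROUTING rule catches traffic from the container to port 80 elsewhere. A narrower variant, sketched below under the assumption that the addrtype match is available, limits both rules to packets addressed to one of the host's own IPs. publish_port and DRY_RUN are illustrative names, and by default the commands are only printed.

```shell
#!/bin/sh
# Hedged variant of the port-mapping rules; not from the original session.
# DRY_RUN defaults to 1 (print only); set DRY_RUN=0 and run as root to apply.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Map a host TCP port to <container-ip>:<port>, but only for packets whose
# destination is an address configured on this host.
publish_port() {
    host_port="$1"
    target="$2"
    for chain in PREROUTING OUTPUT; do
        run iptables -t nat -A "$chain" -p tcp -m tcp --dport "$host_port" \
            -m addrtype --dst-type LOCAL \
            -j DNAT --to-destination "$target"
    done
}

publish_port 80 192.168.10.11:8000
```

Both 172.24.213.39 and 192.168.10.1 are local addresses of the host, so the two curl tests above would still be covered by these rules.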
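Summary
Networking is still a fairly complex area. Also, if port mapping is done with iptables NAT, performance in production is bound to suffer badly.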