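This post walks through an example to show how docker swarm routes network traffic.
Assume a 2-node environment in which the swarm has already been set up.
- Create a service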
$ docker service create --detach=true --name <myservice> --replicas=2 --publish 18080:8080 <myimage>
With two replicas, <myservice> will normally get one container on each of the two nodes.
- How a request made on the host is routed
On the host, run:
$ curl -v $(hostname):18080/hello
Step 1: inspect the host's iptables NAT rules
What is the nat table? It is the iptables table where rules are set to implement network address translation:
$ sudo iptables --list --table nat
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
DOCKER-INGRESS all -- anywhere anywhere ADDRTYPE match dst-type LOCAL
DOCKER all -- anywhere anywhere ADDRTYPE match dst-type LOCAL
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
DOCKER-INGRESS all -- anywhere anywhere ADDRTYPE match dst-type LOCAL
DOCKER all -- anywhere !loopback/8 ADDRTYPE match dst-type LOCAL
Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
MASQUERADE all -- anywhere anywhere ADDRTYPE match src-type LOCAL
MASQUERADE all -- 172.17.0.0/16 anywhere
MASQUERADE all -- 172.18.0.0/16 anywhere
Chain DOCKER (2 references)
target prot opt source destination
RETURN all -- anywhere anywhere
RETURN all -- anywhere anywhere
Chain DOCKER-INGRESS (2 references)
target prot opt source destination
DNAT tcp -- anywhere anywhere tcp dpt:18080 to:172.18.0.2:18080
RETURN all -- anywhere anywhere
The DOCKER-INGRESS chain contains this rule:
DNAT tcp -- anywhere anywhere tcp dpt:18080 to:172.18.0.2:18080
It redirects every TCP packet whose destination port is 18080 to 172.18.0.2:18080.
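The effect of that rule can be sketched in a few lines of Python. This is only a toy model of destination NAT, not how the kernel implements it; the rule text is copied from the output above, while the function names and the host address 192.168.0.10 are made up for illustration.

```python
import re

# Rule text copied from the DOCKER-INGRESS chain shown above.
RULE = "DNAT tcp -- anywhere anywhere tcp dpt:18080 to:172.18.0.2:18080"


def parse_dnat(rule):
    """Extract (matched port, new ip, new port) from a DNAT line of `iptables --list`."""
    m = re.search(r"dpt:(\d+) to:([\d.]+):(\d+)", rule)
    if m is None:
        raise ValueError("not a DNAT rule with dpt/to fields")
    return int(m.group(1)), m.group(2), int(m.group(3))


def apply_dnat(dst_ip, dst_port, rule=RULE):
    """Return the destination after the rule: rewritten on a port match, unchanged otherwise."""
    match_port, new_ip, new_port = parse_dnat(rule)
    if dst_port == match_port:
        return new_ip, new_port
    return dst_ip, dst_port


# 192.168.0.10 is a hypothetical host address, used only for this example.
print(apply_dnat("192.168.0.10", 18080))  # rewritten to the DNAT target
print(apply_dnat("192.168.0.10", 22))     # not matched, left unchanged
```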
Step 2: who is 172.18.0.2?
ifconfig shows that 172.17.X.X belongs to docker0 and 172.18.X.X belongs to docker_gwbridge:
$ ifconfig
docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.17.0.1 netmask 255.255.0.0 broadcast 172.17.255.255
...
docker_gwbridge: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.18.0.1 netmask 255.255.0.0 broadcast 172.18.255.255
...
These two interfaces correspond to the bridge and docker_gwbridge networks listed by docker network ls.
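The subnet lookup done by eye above can also be checked with Python's standard ipaddress module. This is a minimal sketch: the two interface addresses and /16 masks come from the ifconfig output, and the helper name bridge_for is made up for this example.

```python
import ipaddress

# Interface addresses taken from the ifconfig output (netmask 255.255.0.0 is /16).
BRIDGES = {
    "docker0": ipaddress.ip_interface("172.17.0.1/16").network,
    "docker_gwbridge": ipaddress.ip_interface("172.18.0.1/16").network,
}


def bridge_for(addr):
    """Return the name of the bridge whose subnet contains addr, or None."""
    ip = ipaddress.ip_address(addr)
    for name, net in BRIDGES.items():
        if ip in net:
            return name
    return None


print(bridge_for("172.18.0.2"))  # the DNAT target from step 1
```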
Next, list the containers attached to docker_gwbridge:
$ docker network inspect --format '{{ json .Containers }}' docker_gwbridge | jq
{
"81cd73da4eb0a9607d66df0e23dd2ba04d74551e69dacad96fc3050cfb6083aa": {
"Name": "gateway_ee21c8c2c0d3",
"EndpointID": "7679eccd97314dd634f9b5440b79f0039901efdf891c7ccaac068753d5626677",
"IPv4Address": "172.18.0.3/16",
"IPv6Address": ""
},
"ingress-sbox": {
"Name": "gateway_ingress-sbox",
"EndpointID": "f1a6ee1de7b9d0dedbf72fd3ba3937f6d2029c37da0f17be34b133b3397880e4",
"IPv4Address": "172.18.0.2/16",
"IPv6Address": ""
}
}
On the other node:
$ docker network inspect --format '{{ json .Containers }}' docker_gwbridge | jq
{
"37008339f27a6e3ce0f8ff3ab4b20f1b76cac929fee04a4cb913209fb5dd5391": {
"Name": "gateway_f32045a92934",
"EndpointID": "12ec3275ee8fdfe5e16bcbb2a0ecbb510ba1e110d3d3772712324e85c1b87413",
"IPv4Address": "172.18.0.3/16",
"IPv6Address": ""
},
"ingress-sbox": {
"Name": "gateway_ingress-sbox",
"EndpointID": "b69304de53627027a455050c9160fc24e3bd3dbab62b9f077a6d514203bc4f9f",
"IPv4Address": "172.18.0.2/16",
"IPv6Address": ""
}
}
So 172.18.0.2 is the ingress-sbox on each node.
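Finding the owner of an address in that JSON can be scripted as well. The sketch below embeds the first node's output, trimmed to the Name and IPv4Address fields; the helper name owner_of is made up for this example.

```python
import json

# Output of `docker network inspect ... docker_gwbridge` on the first node,
# trimmed to the two fields this lookup needs.
CONTAINERS_JSON = """
{
  "81cd73da4eb0a9607d66df0e23dd2ba04d74551e69dacad96fc3050cfb6083aa": {
    "Name": "gateway_ee21c8c2c0d3",
    "IPv4Address": "172.18.0.3/16"
  },
  "ingress-sbox": {
    "Name": "gateway_ingress-sbox",
    "IPv4Address": "172.18.0.2/16"
  }
}
"""


def owner_of(ip, containers):
    """Return the key of the endpoint whose IPv4Address (without prefix length) equals ip."""
    for key, info in containers.items():
        if info["IPv4Address"].split("/")[0] == ip:
            return key
    return None


containers = json.loads(CONTAINERS_JSON)
print(owner_of("172.18.0.2", containers))
```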
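Step 3: what on earth is ingress-sbox?
It lives in its own network namespace, so enter that namespace with nsenter and look at what it contains: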
$ sudo nsenter --net=/var/run/docker/netns/ingress_sbox ipvsadm --list --numeric
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
FWM 259 rr
-> 10.0.0.24:0 Masq 1 0 0
-> 10.0.0.25:0 Masq 1 0 0
FWM stands for firewall mark.
Packets carrying firewall mark 259 are load-balanced round-robin (rr) between 10.0.0.24 and 10.0.0.25.
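Round-robin over two equally weighted backends simply alternates between them. The following is a simplified model of that behaviour, not IPVS itself; the backend addresses come from the ipvsadm output and the function name round_robin is made up for this example.

```python
from itertools import cycle


def round_robin(backends):
    """Return a picker that hands out the backends in turn, wrapping around at the end."""
    ring = cycle(backends)
    return lambda: next(ring)


# The two real servers listed under "FWM 259 rr", both with weight 1.
pick = round_robin(["10.0.0.24", "10.0.0.25"])
picks = [pick() for _ in range(4)]
print(picks)
```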
Where is this mark set? In the mangle table of the same namespace:
$ sudo nsenter --net=/var/run/docker/netns/ingress_sbox iptables --list --table mangle
Chain PREROUTING (policy ACCEPT)
target prot opt source destination
MARK tcp -- anywhere anywhere tcp dpt:18080 MARK set 0x103
Chain INPUT (policy ACCEPT)
target prot opt source destination
MARK all -- anywhere 10.0.0.23 MARK set 0x103
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
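Two rules set mark 0x103, which is 259 in decimal, the FWM value reported by ipvsadm:
- one matches TCP packets whose destination port is 18080
- one matches packets whose destination address is 10.0.0.23
Who is 10.0.0.23? It is the virtual IP (VIP) of <myservice>: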
$ docker service inspect <myservice> --format '{{ json .Endpoint.VirtualIPs }}' | jq .
[
{
"NetworkID": "srqkemty80sdtczb5gl8io45i",
"Addr": "10.0.0.23/24"
}
]
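The link between the mangle rules and the IPVS entry is the mark value itself: iptables prints it in hex and ipvsadm in decimal. The sketch below checks that conversion and models the two match conditions. It deliberately ignores which chain each rule sits in, and the function name fwmark is made up for this example.

```python
MARK_HEX = "0x103"          # as printed by iptables ("MARK set 0x103")
MARK = int(MARK_HEX, 16)    # as printed by ipvsadm ("FWM 259")

PUBLISHED_PORT = 18080      # from the PREROUTING rule
SERVICE_VIP = "10.0.0.23"   # from the INPUT rule and `docker service inspect`


def fwmark(dst_ip, dst_port):
    """Return the mark a packet would get under the two rules above, or 0 if neither matches."""
    if dst_port == PUBLISHED_PORT or dst_ip == SERVICE_VIP:
        return MARK
    return 0


print(MARK)
print(fwmark("172.18.0.2", 18080))
```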
Step 4: who are 10.0.0.24 and 10.0.0.25?
Check the IP addresses of the two service containers:
$ docker exec <myservice_container1> ip addr show eth0
17986: eth0@if17987: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
inet 10.0.0.25/24 brd 10.0.0.255 scope global eth0
$ docker exec <myservice_container1> ip addr show eth1
17988: eth1@if17989: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
inet 172.18.0.3/16 brd 172.18.255.255 scope global eth1
On the other node:
$ docker exec <myservice_container2> ip addr show eth0
30: eth0@if31: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
inet 10.0.0.24/24 brd 10.0.0.255 scope global eth0
$ docker exec <myservice_container2> ip addr show eth1
32: eth1@if33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
inet 172.18.0.3/16 brd 172.18.255.255 scope global eth1
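Now it is clear: 10.0.0.24 and 10.0.0.25 are the eth0 addresses of the service's two containers. (In addition, each container of this service has two interfaces, eth0 and eth1: eth0 is on the ingress network and eth1 is on the docker_gwbridge network.)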
- Summary
The network layout of the whole 2-node environment looks like this:
-----------------node1-----------------+-----------------node2-----------------
                                       |
 172.18.0.2----------172.18.0.3        |  172.18.0.2----------172.18.0.3  -- docker_gwbridge (local to each node)
     |                   |             |      |                   |
     |                   |             |      |                   |
 ingress-sbox <myservice-container1>   |  ingress-sbox <myservice-container2>
     |                   |             |      |                   |
     |                   |             |      |                   |
 10.0.0.2------------10.0.0.25---------+--10.0.0.3------------10.0.0.24   -- ingress (overlay, spans both nodes)
                                                                          -- myservice vip (10.0.0.23)
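The four steps can be strung together into one small trace. This is a toy model under stated assumptions: it only covers the hops observed in the outputs above (host DNAT, firewall mark, IPVS round-robin), all addresses and the mark come from those outputs, and the names make_tracer and trace are made up for this example.

```python
from itertools import cycle

INGRESS_SBOX = "172.18.0.2"            # DNAT target on docker_gwbridge (step 1 and 2)
PUBLISHED_PORT = 18080                 # port published by the service
FWMARK = 0x103                         # mark set in the ingress_sbox mangle table (step 3)
BACKENDS = ["10.0.0.24", "10.0.0.25"]  # eth0 addresses of the two containers (step 4)


def make_tracer(backends=BACKENDS):
    """Build a trace function with its own round-robin state."""
    ring = cycle(backends)

    def trace(dst_port):
        """List the hops for a request to the host on dst_port; empty if the port is not published."""
        if dst_port != PUBLISHED_PORT:
            return []
        backend = next(ring)
        return [
            f"host nat DOCKER-INGRESS: DNAT to {INGRESS_SBOX}:{PUBLISHED_PORT}",
            f"ingress_sbox mangle PREROUTING: MARK set {FWMARK:#x}",
            f"ingress_sbox IPVS FWM {FWMARK} rr: forward to {backend}",
        ]

    return trace


trace = make_tracer()
for hop in trace(18080):
    print(hop)
```

Calling trace(18080) repeatedly alternates the last hop between the two container addresses, whichever node received the request, because both containers sit on the same ingress overlay network.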