Docker Swarm: changing the cluster IP

Author: bboysoul | Published 2018-07-04 21:04

    Overview

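    Here is what happened. I run a Docker Swarm cluster on Raspberry Pis, and its original IP was 192.168.0.113. After I moved house the IP became 192.168.11.113, and as you can guess, the node and the manager could no longer reach each other. So I removed the node from the cluster and tried to rejoin it, but at that point the join-token command still printed this: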

    ╰─➤  docker swarm join-token worker 
    To add a worker to this swarm, run the following command:
    
        docker swarm join --token SWMTKN-1-0shhd0b7uwajhgymxgpp1nv5u17jvcup9vvmhnqkg77ds57e5h-57e7hvjaxaagxtxddz416q5z2 192.168.0.113:2377
    

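    I simply changed the IP at the end to 192.168.11.113 and joined the node that way. The join itself succeeded, but for some reason the node kept trying to connect to 192.168.0.113, which produced these errors: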

    ╰─➤  tail -f daemon.log
    Jul  4 08:07:16 pi-slave dockerd[21221]: time="2018-07-04T08:07:16.316088897Z" level=info msg="grpc: addrConn.resetTransport failed to create client transport: connection error: desc = \"transport: dial tcp 192.168.0.113:2377: getsockopt: no route to host\"; Reconnecting to {192.168.0.113:2377 <nil>}" module=grpc
    Jul  4 08:07:16 pi-slave dockerd[21221]: time="2018-07-04T08:07:16.317349637Z" level=info msg="Failed to dial 192.168.0.113:2377: grpc: the connection is closing; please retry." module=grpc
    Jul  4 08:07:19 pi-slave dockerd[21221]: time="2018-07-04T08:07:19.316151478Z" level=info msg="grpc: addrConn.resetTransport failed to create client transport: connection error: desc = \"transport: dial tcp 192.168.0.113:2377: getsockopt: no route to host\"; Reconnecting to {192.168.0.113:2377 <nil>}" module=grpc
    Jul  4 08:07:22 pi-slave dockerd[21221]: time="2018-07-04T08:07:22.318750223Z" level=info msg="grpc: addrConn.resetTransport failed to create client transport: connection error: desc = \"transport: dial tcp 192.168.0.113:2377: getsockopt: no route to host\"; Reconnecting to {192.168.0.113:2377 <nil>}" module=grpc
    Jul  4 08:07:22 pi-slave dockerd[21221]: time="2018-07-04T08:07:22.319398354Z" level=error msg="agent: session failed" backoff=8s error="rpc error: code = Unavailable desc = grpc: the connection is unavailable" module=node/agent node.id=yi1u62gxyockd23h2z4nm3of1
    Jul  4 08:07:22 pi-slave dockerd[21221]: time="2018-07-04T08:07:22.319955650Z" level=info msg="manager selected by agent for new session: {h6i2x0hals6za31uya16zlmli 192.168.0.113:2377}" module=node/agent node.id=yi1u62gxyockd23h2z4nm3of1
    Jul  4 08:07:22 pi-slave dockerd[21221]: time="2018-07-04T08:07:22.320147058Z" level=info msg="waiting 7.379239722s before registering session" module=node/agent node.id=yi1u62gxyockd23h2z4nm3of1
    Jul  4 08:07:25 pi-slave dockerd[21221]: time="2018-07-04T08:07:25.315780534Z" level=info msg="grpc: addrConn.resetTransport failed to create client transport: connection error: desc = \"transport: dial tcp 192.168.0.113:2377: getsockopt: no route to host\"; Reconnecting to {192.168.0.113:2377 <nil>}" module=grpc
    Jul  4 08:07:25 pi-slave dockerd[21221]: time="2018-07-04T08:07:25.318872383Z" level=info msg="Failed to dial 192.168.0.113:2377: grpc: the connection is closing; please retry." module=grpc
    Jul  4 08:07:25 pi-slave dockerd[21221]: time="2018-07-04T08:07:25.318573109Z" level=info msg="grpc: addrConn.resetTransport failed to create client transport: connection error: desc = \"transport: dial tcp 192.168.0.113:2377: getsockopt: no route to host\"; Reconnecting to {192.168.0.113:2377 <nil>}" module=grpc
    Jul  4 08:07:28 pi-slave dockerd[21221]: time="2018-07-04T08:07:28.316269050Z" level=info msg="grpc: addrConn.resetTransport failed to create client transport: connection error: desc = \"transport: dial tcp 192.168.0.113:2377: getsockopt: no route to host\"; Reconnecting to {192.168.0.113:2377 <nil>}" module=grpc
    Jul  4 08:07:31 pi-slave dockerd[21221]: time="2018-07-04T08:07:31.331695034Z" level=info msg="grpc: addrConn.resetTransport failed to create client transport: connection error: desc = \"transport: dial tcp 192.168.0.113:2377: getsockopt: no route to host\"; Reconnecting to {192.168.0.113:2377 <nil>}" module=grpc
    Jul  4 08:07:31 pi-slave dockerd[21221]: time="2018-07-04T08:07:31.332270925Z" level=error msg="agent: session failed" backoff=8s error="rpc error: code = Unavailable desc = grpc: the connection is unavailable" module=node/agent node.id=yi1u62gxyockd23h2z4nm3of1
    Jul  4 08:07:31 pi-slave dockerd[21221]: time="2018-07-04T08:07:31.332991816Z" level=info msg="manager selected by agent for new session: {h6i2x0hals6za31uya16zlmli 192.168.0.113:2377}" module=node/agent node.id=yi1u62gxyockd23h2z4nm3of1
    Jul  4 08:07:31 pi-slave dockerd[21221]: time="2018-07-04T08:07:31.333203172Z" level=info msg="waiting 1.698796755s before registering session" module=node/agent node.id=yi1u62gxyockd23h2z4nm3of1
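
    The "manager selected by agent" lines in this log suggest that the worker is being handed the old address by the swarm itself, rather than using the one typed on the command line. As a minimal sketch (the function name selected_managers is made up for illustration, and the message format is assumed to match the excerpt above), the addresses can be pulled out of the log like this:

```shell
#!/bin/sh
# Print the distinct manager addresses that the swarm agent reports selecting.
# Reads dockerd log lines on stdin. The message format is taken from the log
# excerpt in this post and may differ in other Docker versions.
selected_managers() {
    sed -n 's/.*manager selected by agent for new session: {[^ ]* \([^}]*\)}.*/\1/p' | sort -u
}

# Usage on the worker (log path as in the post; adjust for your system):
#   selected_managers < /var/log/daemon.log
```

    On the worker, docker info --format '{{json .Swarm.RemoteManagers}}' should show the manager addresses the node currently knows about, without going through the log.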
    

    I then asked in a QQ group and on Facebook, but nobody could offer a definite fix. Rotating the join token did not help either:

    ╭─root@pi-master /etc/default  
    ╰─➤  docker swarm join-token --rotate worker 
    Successfully rotated worker join token.
    
    To add a worker to this swarm, run the following command:
    
        docker swarm join --token SWMTKN-1-0shhd0b7uwajhgymxgpp1nv5u17jvcup9vvmhnqkg77ds57e5h-34adsf2isqqqnn7gd5hnhumdh 192.168.0.113:2377
    

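    The generated command still shows 192.168.0.113. With no other option, and since this is only a test environment, I went for the brute-force fix: rebuilding the cluster.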
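
    A stale address like this is easy to miss when copying the join command around. The following is a small sketch for checking it first; the helper names join_host and join_addr_matches are hypothetical, and it assumes the single-line, IPv4 host:port form of the join command shown above:

```shell
#!/bin/sh
# Print the host part of the manager address at the end of a
# "docker swarm join --token <token> <host>:<port>" command line.
join_host() {
    # The address is the last whitespace-separated field; strip the :port suffix.
    printf '%s\n' "$1" | awk '{ print $NF }' | sed 's/:[0-9]*$//'
}

# Succeed only if the join command points at the expected IP.
join_addr_matches() {
    [ "$(join_host "$1")" = "$2" ]
}

# Usage on the manager:
#   cmd=$(docker swarm join-token worker | grep 'docker swarm join')
#   join_addr_matches "$cmd" 192.168.11.113 || echo "join command advertises a different address"
```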

    Rebuilding the cluster

    First, make both the worker and the manager leave the swarm. Start with the worker node:

    docker swarm leave --force

    Then the manager:

    docker swarm leave --force

    None of my machines run any important containers, so I could do this without worrying.

    Next, initialize the swarm again:

    ╭─root@pi-master /etc/default  
    ╰─➤  docker swarm init
    Swarm initialized: current node (wylbqh0q0p0yromn6hh44jrvx) is now a manager.
    
    To add a worker to this swarm, run the following command:
    
        docker swarm join --token SWMTKN-1-0n8n2vbtnuksw78opuex5hhtgffzn36dbii7u1dp5rrul8z85p-3buc8f7h2n8st570hcx9lwthk 192.168.11.113:2377
    
    To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
    

    The IP in the output above is now 192.168.11.113.

    Join the node to the cluster:

    docker swarm join --token SWMTKN-1-0n8n2vbtnuksw78opuex5hhtgffzn36dbii7u1dp5rrul8z85p-3buc8f7h2n8st570hcx9lwthk 192.168.11.113:2377
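
    The whole rebuild can be summarized as a short script. This is only a sketch: the function names and the DRY_RUN switch are invented for illustration, and passing --advertise-addr is an addition (the post used plain docker swarm init and let Docker pick the address). By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the rebuild sequence from this post. By default it only prints
# the commands (DRY_RUN=1); set DRY_RUN=0 to actually execute them.
# WARNING: leaving the swarm with --force discards its services and state.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Run on the worker first, then on the manager.
leave_swarm() {
    run docker swarm leave --force
}

# Run on the manager. $1 is the manager's new IP. --advertise-addr pins the
# advertised address instead of relying on auto-detection (not in the post).
init_swarm() {
    run docker swarm init --advertise-addr "$1"
}

# Run on the worker. $1 is the worker token printed by "docker swarm init",
# $2 is the manager's new IP; 2377 is the port shown in the post.
join_swarm() {
    run docker swarm join --token "$1" "$2:2377"
}
```

    For example, init_swarm 192.168.11.113 on the manager, then join_swarm with the printed token on the worker.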

    Finally, check the Docker log:

    ╭─root@pi-slave /var/log  
    ╰─➤  tail -f daemon.log                                                                                                                 
    Jul  4 08:34:35 pi-slave dhcpcd[517]: docker_gwbridge: using IPv4LL address 169.254.160.155
    Jul  4 08:34:35 pi-slave avahi-daemon[306]: Registering new address record for 169.254.160.155 on docker_gwbridge.IPv4.
    Jul  4 08:34:35 pi-slave dhcpcd[517]: docker_gwbridge: adding route to 169.254.0.0/16
    Jul  4 08:34:35 pi-slave dhcpcd[517]: veth06a7c80: using IPv4LL address 169.254.142.194
    Jul  4 08:34:35 pi-slave avahi-daemon[306]: Joining mDNS multicast group on interface veth06a7c80.IPv4 with address 169.254.142.194.
    Jul  4 08:34:35 pi-slave avahi-daemon[306]: New relevant interface veth06a7c80.IPv4 for mDNS.
    Jul  4 08:34:35 pi-slave avahi-daemon[306]: Registering new address record for 169.254.142.194 on veth06a7c80.IPv4.
    Jul  4 08:34:35 pi-slave dhcpcd[517]: veth06a7c80: adding route to 169.254.0.0/16
    Jul  4 08:34:38 pi-slave dhcpcd[517]: veth06a7c80: no IPv6 Routers available
    Jul  4 08:34:38 pi-slave dhcpcd[517]: docker_gwbridge: no IPv6 Routers available
    

    OK, everything looks normal.

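    Note that after doing this all the containers are gone. It is practically a fresh install of the cluster, so it is certainly not the best solution. If you know a better way to handle this, please get in touch and let me know.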
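
    One option that was not tried here, and is offered only as an untested assumption for this setup, is docker swarm init with the --force-new-cluster flag. Docker documents it for recovering a swarm from a single manager while keeping the existing service definitions, and it can be combined with --advertise-addr. The sketch below only prints the command (the function name recover_cmd is hypothetical); workers may still need to rejoin afterwards.

```shell
#!/bin/sh
# Print (do not run) a possible recovery command for a single-manager swarm
# whose IP has changed. Untested against the setup in this post; try it on a
# disposable swarm before relying on it.
recover_cmd() {
    # $1: the manager's new IP address
    printf 'docker swarm init --force-new-cluster --advertise-addr %s\n' "$1"
}

recover_cmd 192.168.11.113
```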

    Now containers can be created again. First, create an overlay network:

    docker network create --driver=overlay --subnet=192.168.12.0/24 visualizer

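    One thing I only just noticed: after you create this overlay network on the manager, you will not see it on a worker node until you create a container on that network or do something else that affects the worker.

    Create the service: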

    docker service create --name visualizer --replicas=3 --publish=8088:8080 --network=visualizer --mount=type=bind,src=/var/run/docker.sock,dst=/var/run/docker.sock alexellis2/visualizer-arm
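
    Once at least one task of this service has been scheduled on the worker, the overlay network should become visible there. A minimal sketch for checking that, assuming the network name visualizer used above (the helper has_network is made up for illustration):

```shell
#!/bin/sh
# Succeed if the given network name appears in a list of names on stdin,
# one per line, for example from: docker network ls --format '{{.Name}}'
has_network() {
    grep -Fqx -- "$1"
}

# Usage on the worker:
#   docker network ls --format '{{.Name}}' | has_network visualizer && echo "visible"
# On the manager, to see which nodes the tasks landed on:
#   docker service ps visualizer
```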

    You are welcome to follow Bboysoul's blog at www.bboysoul.com
    Have Fun
