Docker swarm closes instantly after "docker swarm init"

  docker, tcp

I created three VMs [vm1 (ip1), vm2 (ip2), vm3 (ip3)] on one machine and tried to run "docker swarm init --advertise-addr ip1". However, vm1 left the swarm just a few seconds later, which leads to "connection refused" when I run "docker swarm join --token <init_token> ip1:2377" on vm2 or vm3.
Here is the Docker log on vm1:

May 28 18:11:52 Rsubuntu dockerd[22203]: time="2021-05-28T18:11:52.910962868+08:00" level=info msg="ClientConn switching balancer to "pick_first"" module=grpc
May 28 18:11:52 Rsubuntu dockerd[22203]: time="2021-05-28T18:11:52.911042759+08:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc42116c920, CONNECTING" module=grpc
May 28 18:11:52 Rsubuntu dockerd[22203]: time="2021-05-28T18:11:52.911821825+08:00" level=warning msg="grpc: addrConn.createTransport failed to connect to {ip1:2377 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp ip1:2377: connect: connection refused". Reconnecting..." module=grpc
May 28 18:11:52 Rsubuntu dockerd[22203]: time="2021-05-28T18:11:52.911902997+08:00" level=info msg="pickfirstBalancer: HandleSubConnStateChange: 0xc42116c920, TRANSIENT_FAILURE" module=grpc
May 28 18:11:52 Rsubuntu dockerd[22203]: time="2021-05-28T18:11:52.911978717+08:00" level=error msg="failed to retrieve remote root CA certificate" error="rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 150.242.170.130:2377: connect: connection refused"" module=node
May 28 18:11:52 Rsubuntu dockerd[22203]: time="2021-05-28T18:11:52.912019788+08:00" level=warning msg="Failed to dial ip1:2377: context canceled; please retry." module=grpc
May 28 18:11:54 Rsubuntu dockerd[22203]: time="2021-05-28T18:11:54.912335978+08:00" level=error msg="cluster exited with error: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 150.242.170.130:2377: connect: connection refused""
May 28 18:11:54 Rsubuntu dockerd[22203]: time="2021-05-28T18:11:54.912782313+08:00" level=error msg="Handler for POST /v1.39/swarm/join returned error: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp 150.242.170.130:2377: connect: connection refused""
May 28 18:12:28 Rsubuntu dockerd[22203]: time="2021-05-28T18:12:28.310688663+08:00" level=error msg="Error getting nodes: This node is not a swarm manager. Use "docker swarm init" or "docker swarm join" to connect this node to swarm and try again."
May 28 18:12:28 Rsubuntu dockerd[22203]: time="2021-05-28T18:12:28.310759132+08:00" level=error msg="Handler for GET /v1.39/nodes returned error: This node is not a swarm manager. Use "docker swarm init" or "docker swarm join" to connect this node to swarm and try again."

Next is the result of netstat on vm1:

Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 127.0.0.53:53           0.0.0.0:*               LISTEN      351/systemd-resolve 
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      3967/sshd           
tcp        0      0 127.0.0.1:631           0.0.0.0:*               LISTEN      16672/cupsd         
tcp        0      0 127.0.0.1:6011          0.0.0.0:*               LISTEN      5992/sshd: [email protected]/7 
tcp        0      0 127.0.0.1:6012          0.0.0.0:*               LISTEN      6078/sshd: [email protected]/8 
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      26417/nginx: master 
tcp        0      0 150.242.170.131:22      111.42.148.30:5483      ESTABLISHED 6011/sshd: rs [priv 
tcp        0      0 150.242.170.130:22      223.72.95.216:14168     ESTABLISHED 30888/sshd: rs [pri 
tcp        0      0 150.242.170.131:33170   34.122.121.32:80        TIME_WAIT   -                   
tcp        0    232 150.242.170.130:22      111.42.148.30:5800      ESTABLISHED 5919/sshd: rs [priv 
tcp        0      0 150.242.170.132:48710   34.122.121.32:80        TIME_WAIT   -                   
tcp6       0      0 :::22                   :::*                    LISTEN      3967/sshd           
tcp6       0      0 ::1:631                 :::*                    LISTEN      16672/cupsd         
tcp6       0      0 ::1:6011                :::*                    LISTEN      5992/sshd: [email protected]/7 
tcp6       0      0 ::1:6012                :::*                    LISTEN      6078/sshd: [email protected]/8 
tcp6       0      0 :::80                   :::*                    LISTEN      26417/nginx: master 
udp        0      0 0.0.0.0:631             0.0.0.0:*                           16673/cups-browsed  

Port 2377 opens after executing "docker swarm init --advertise-addr ip1" but closes again after a few seconds.
By the way, I have turned off the firewall on vm1.
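
To make the "opens, then closes" observation repeatable, the port check can be scripted. This is only a minimal sketch: port_listening is a hypothetical helper name (not a Docker or netstat feature), and it assumes netstat-style lines on stdin with the same column layout as the output above (local address in column 4, state in column 6).

```shell
# Hypothetical helper: reads netstat-style lines on stdin and succeeds
# (exit status 0) if some TCP socket is in LISTEN state on the given port.
# Assumes the column layout shown above: $1 = Proto, $4 = Local Address,
# $6 = State.
port_listening() {
  port="$1"
  awk -v p="$port" '
    $1 ~ /^tcp/ && $6 == "LISTEN" {
      # The port is whatever follows the last ":" (also works for ":::22").
      n = split($4, parts, ":")
      if (parts[n] == p) found = 1
    }
    END { exit !found }
  '
}
```

On vm1 this could be used right after the init command, for example: netstat -tln | port_listening 2377 && echo "2377 open" || echo "2377 closed". Repeating it every second or so would show when the listener disappears.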

Source: Docker Questions
