I have a Docker swarm node running a set of Docker services connected by an overlay network. When needed, I dynamically add another Docker node via Terraform: a separate EC2 instance is set up and joined to the existing swarm as a worker node.
I run a container from my manager, and the running container needs to talk to the existing services on the manager node. For example, it connects to the Postgres service and runs a few queries.
docker -H <node ip> run --network <overlay network where services are running> <some image> <command>
The script running in the container fails with a “Name or service not known” error. When I open a shell in the container and ping the service manually, the ping succeeds only after about 4 or 5 seconds. I have tried this hundreds of times and always get the same result. It also doesn’t matter when the node joined the swarm: every time I run the above command, I hit the same issue.
Also, I don’t have control over the script that runs in the container, so I cannot add retries.
One more thing: sometimes some services can be reached immediately. For example, Postgres will fail while another service exposing REST endpoints can be reached, but that is not always the case.
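To put numbers on the delay and on the per-service difference described above, a small diagnostic along these lines could be run inside the container. This is only a sketch for measuring, not a fix: it assumes the image has Python 3, the function name `time_until_resolvable` is made up for illustration, and the service names passed on the command line (for example `postgres`) are placeholders for whatever the services are actually called.

```python
import socket
import time


def time_until_resolvable(host, timeout=30.0, interval=0.5,
                          resolve=socket.getaddrinfo,
                          clock=time.monotonic, sleep=time.sleep):
    """Return the seconds elapsed until `host` resolves, or None on timeout.

    `resolve`, `clock` and `sleep` are injectable only so the function
    can be exercised without a real network.
    """
    start = clock()
    while True:
        try:
            resolve(host, None)
            return clock() - start
        except socket.gaierror:
            # "Name or service not known" surfaces in Python as socket.gaierror.
            if clock() - start >= timeout:
                return None
            sleep(interval)


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python probe.py postgres some-rest-service
    for name in sys.argv[1:]:
        elapsed = time_until_resolvable(name)
        if elapsed is None:
            print(f"{name}: not resolvable within timeout")
        else:
            print(f"{name}: resolvable after {elapsed:.2f}s")
```

Running it immediately after the container starts, once per service name, would show whether each name becomes resolvable after a similar delay or whether some resolve at once.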