GitLab-CI: Connecting to dind service fails with docker/compose

I am using GitLab CI to build images with docker-compose, and gitlab-runner itself runs as a Docker service. To keep the jobs independent of the host, I use the docker/compose image as the job image with docker:dind as a service.

Here is my .gitlab-ci.yml file:

image: 
  name: docker/compose:1.25.0
  entrypoint: ["/bin/sh", "-c"]

stages:
  - build
  - deploy

build-prod:
  stage: build
  tags:
    - docker
    - build
  services:
    - docker:18.09-dind
  variables:
    IMAGE_HOME: $CI_REGISTRY_IMAGE
    DOCKER_HOST: tcp://docker:2375
    DOCKER_DRIVER: overlay
  before_script:
    - docker version
    - docker-compose version
    - docker login -u gitlab-ci-token -p $CI_JOB_TOKEN $CI_REGISTRY
  script:
    - docker-compose pull -q --ignore-pull-failures
    - docker-compose build --parallel --pull
    - docker image ls
    - docker-compose push

and the config.toml (<GITLAB_URL> and <RUNNER_TOKEN> are placeholders for the real values):

[[runners]]
  name = "docker-builder"
  url = "<GITLAB_URL>"
  token = "<RUNNER_TOKEN>"
  executor = "docker"
  [runners.custom_build_dir]
  [runners.docker]
    tls_verify = false
    image = "docker:latest"
    privileged = true
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/cache"]
    shm_size = 0
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]

The first command in the before_script section produces the following output (where <MY_HOST_IP> is the IP address of my server):

$ docker version

 Client: Docker Engine - Community
  Version:           18.09.7
  API version:       1.39
  Go version:        go1.10.8
  Git commit:        2d0083d
  Built:             Thu Jun 27 17:54:15 2019
  OS/Arch:           linux/amd64
  Experimental:      false
 error during connect: Get http://docker:2375/v1.39/version: dial tcp <MY_HOST_IP>:2375: connect: no route to host
ERROR: Job failed: exit code 1

It seems that the container running the docker/compose image resolves docker to the host instead of the dind service, but I don’t know why.

It also does not seem to be caused by the runner container itself, because the content of /etc/hosts looks fine and pinging the docker service works:

$ cat /etc/hosts
  127.0.0.1 localhost
  ::1   localhost ip6-localhost ip6-loopback
  fe00::0   ip6-localnet
  ff00::0   ip6-mcastprefix
  ff02::1   ip6-allnodes
  ff02::2   ip6-allrouters
  172.17.0.2    docker b6bbd7a43b73 runner-G1yqwPKG-project-9-concurrent-0-docker-0
  172.17.0.3    runner-G1yqwPKG-project-9-concurrent-0
$ ping -c 3 docker
  PING docker (172.17.0.2): 56 data bytes
  64 bytes from 172.17.0.2: seq=0 ttl=64 time=0.216 ms
  64 bytes from 172.17.0.2: seq=1 ttl=64 time=0.154 ms
  64 bytes from 172.17.0.2: seq=2 ttl=64 time=0.158 ms
  --- docker ping statistics ---
  3 packets transmitted, 3 packets received, 0% packet loss
  round-trip min/avg/max = 0.154/0.176/0.216 ms

When I run the build with DOCKER_HOST: tcp://172.17.0.2:2375, everything works and the build succeeds. However, I can’t rely on this approach, because other containers might be created on the default Docker bridge or multiple builds might run at the same time, so the address is not guaranteed to stay the same.
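A less brittle variant of that workaround would be to read the address from /etc/hosts at job time instead of hardcoding it. The following is only a rough sketch and not a fix for the underlying resolution problem: hosts_lookup is a helper name made up for illustration, and it assumes that awk is available in the job image and that the runner keeps writing the docker alias into /etc/hosts as shown above.

```shell
# hosts_lookup NAME [FILE]
# Hypothetical helper: print the first address that a hosts-style file
# maps to NAME (FILE defaults to /etc/hosts). Prints nothing if NAME is absent.
hosts_lookup() {
  name=$1
  file=${2:-/etc/hosts}
  awk -v n="$name" '
    { sub(/#.*/, "") }                  # drop comments before matching
    {
      for (i = 2; i <= NF; i++)         # field 1 is the address, the rest are names
        if ($i == n) { print $1; exit }
    }
  ' "$file"
}

# Possible use in before_script (assumption: dind listens on port 2375 as above):
#   export DOCKER_HOST="tcp://$(hosts_lookup docker):2375"
#   docker version
```

Comparing the output of hosts_lookup docker with the address in the "dial tcp" error would also show whether the docker client is getting its answer from somewhere other than /etc/hosts.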

I am also completely confused because I use the same setup on another GitLab instance, where everything is configured almost exactly the same. The only difference is the host: the working one runs standard Docker on ubuntu:16.04 in a VM, while the failing one uses Docker Swarm on debian:10 natively on the server.

It would be very helpful if someone could point out the (probably obvious) mistake.

Source: StackOverflow