Intermittent network timeout in docker

I am experiencing a network/HTTP timeout issue with a docker-in-docker app that’s running in a Kubernetes cluster, and I need help figuring out what may be happening.

I am running a docker container within docker (it’s a build tool). In the innermost container, the docker build hangs on executing this line in the Dockerfile:
apk add --no-cache tzdata

The console output says:
fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/main/x86_64/APKINDEX.tar.gz

I have tried a simple curl with this URL and it works about 50% of the time; the rest of the time it times out. The issue is also limited to the Alpine CDN URL. For example, I can download an image from flickr.com 100% of the time. The same Alpine URL also downloads 100% of the time in a different cluster in a different VPC. Therefore, there is something particular to this specific Kubernetes stack, and to this particular URL, that’s causing the issue. What I need help with is how to dig further to identify the problem.
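To make the curl experiment repeatable, something like the following probe could be run inside the innermost container. It is only a minimal sketch: the names fetch_once and probe are hypothetical, it assumes a plain-HTTP URL and IPv4, and it records which resolved IP each attempt used so that failures can be compared across CDN endpoints.

```python
import http.client
import socket
import time
from collections import Counter
from urllib.parse import urlparse

URL = "http://dl-cdn.alpinelinux.org/alpine/v3.12/main/x86_64/APKINDEX.tar.gz"


def fetch_once(url, timeout):
    """Resolve the host, then GET the URL from that exact IP (plain HTTP only).

    Returns (ip, outcome), where outcome is "ok" or a short error label.
    """
    parts = urlparse(url)
    host, port, path = parts.hostname, parts.port or 80, parts.path or "/"
    try:
        ip = socket.gethostbyname(host)  # IPv4 lookup only
    except OSError as exc:
        return None, f"dns-error: {type(exc).__name__}"
    conn = http.client.HTTPConnection(ip, port, timeout=timeout)
    try:
        # Connect by IP but send the real Host header so the CDN serves the file.
        conn.request("GET", path, headers={"Host": host})
        response = conn.getresponse()
        response.read()
        outcome = "ok" if response.status == 200 else f"http-{response.status}"
        return ip, outcome
    except (OSError, http.client.HTTPException) as exc:
        # Timeouts and connection resets are OSError subclasses.
        return ip, f"error: {type(exc).__name__}"
    finally:
        conn.close()


def probe(url, attempts=20, timeout=10.0, fetch=fetch_once, pause=0.0):
    """Fetch the URL repeatedly and tally outcomes per (ip, outcome) pair."""
    tally = Counter()
    for _ in range(attempts):
        started = time.monotonic()
        ip, outcome = fetch(url, timeout)
        elapsed = time.monotonic() - started
        print(f"{ip} {outcome} {elapsed:.2f}s")
        tally[(ip, outcome)] += 1
        time.sleep(pause)
    return tally
```

Calling probe(URL) in the inner container, in the outer container, and on the node itself would show whether the failures follow a particular resolved IP or a particular network layer.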

I have stripped the app down to the bare essence that highlights the problem. Here is the project structure:

[image: project file structure]

Here is app.py:

from time import sleep

while True:
    sleep(60)

This is the Dockerfile:

FROM python:3.7-alpine3.11

RUN apk add --no-cache \
    docker

COPY entrypoint.sh /
RUN chmod 0700 /entrypoint.sh

RUN mkdir /app
WORKDIR /app/
COPY app /app/

ENTRYPOINT [ "/entrypoint.sh" ]

This is entrypoint.sh:

#!/bin/sh
set -e

echo 'Starting dockerd...'
# check if docker pid file exists (can linger from docker stop or unclean shutdown of container)
if [ -f /var/run/docker.pid ]; then
  rm -f /var/run/docker.pid
fi
mkdir -p /etc/docker
echo '{ "storage-driver": "vfs" }' > /etc/docker/daemon.json
nohup dockerd > /var/log/dockerd.log &

# The following command does not spawn execution to the background as
#     we need to leave something holding the container in run state.
echo "Starting canary app..."
exec python3 app.py

And this is service.yml:

apiVersion: v1
kind: List
items:
- apiVersion: apps/v1
  kind: Deployment
  metadata:
    labels:
      run: canary
    name: canary
  spec:
    replicas: 1
    selector:
      matchLabels:
        run: canary
    template:
      metadata:
        labels:
          run: canary
      spec:
        containers:
          - image: canary
            imagePullPolicy: IfNotPresent
            name: canary
            securityContext:
              capabilities:
                add:
                  - SYS_ADMIN
              privileged: true
        dnsPolicy: ClusterFirst
- apiVersion: v1
  kind: Service
  metadata:
    name: canary
    labels:
      run: canary
  spec:
    ports:
      - port: 80
        protocol: TCP
    selector:
      run: canary
    sessionAffinity: None
    type: ClusterIP
