How do I use different types of GPUs (e.g. 1080Ti vs 2080Ti) with the same Docker image without needing to re-run `python setup.py develop`?

  docker, gpu, nvidia, python, pytorch

I’m using a PyTorch-based repository whose installation instructions say to run `python setup.py develop` using the repository's setup.py file. I have been running the repository fine on 1080Ti and 1080 GPUs using a Docker image that clones the repo and runs the setup.py script during the build. The following lines are copied from my Dockerfile.

RUN git clone https://github.com/CVMI-Lab/ST3D.git
WORKDIR /ST3D
RUN nvidia-smi
RUN pip install -r requirements.txt
RUN python setup.py develop

When starting the container, I mount only specific folders from the repo, as follows:

GPU_ID=0

ENVS="  --env=NVIDIA_VISIBLE_DEVICES=$GPU_ID
        --env=CUDA_VISIBLE_DEVICES=$GPU_ID
        --env=NVIDIA_DRIVER_CAPABILITIES=all"

VOLUMES="       --volume=$DATA_PATH:/ST3D/data
                --volume=$CODE_PATH/pcdet:/ST3D/pcdet
                --volume=$CODE_PATH/tools:/ST3D/tools
                --volume=$CODE_PATH/output:/ST3D/output"

docker  run -d -it --rm \
        $VOLUMES \
        $ENVS \
        --runtime=nvidia \
        --gpus $GPU_ID \
        --privileged \
        --net=host \
        --workdir=/ST3D \
        darrenjkt/st3d:v0.3.0

Recently we installed a 2080Ti in the same computer. When I start a container from the same image with only the 2080Ti GPU visible and run the same Python script, I get the following error:

RuntimeError: CUDA error: no kernel image is available for execution on the device

The error is raised by one of the C++/CUDA extension modules that setup.py builds and installs.

I can solve this by running `python setup.py develop` again inside the container, after which the code works on the 2080Ti. I then tried committing that container to a 2080Ti-specific image, and a 1080 container to a 1080-specific image. However, once I run `python setup.py develop` in the 2080Ti container, containers started from the 1080 image give me the CUDA error. If I then run setup.py on the 1080 GPU again, the CUDA error comes back for the 2080Ti image. This baffles me, because as far as I can tell I have not mounted the build files; I kept them solely in the container and committed it to a new image.
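
One thing that may be worth checking here: `python setup.py develop` builds extension modules in place, so the compiled `.so` files end up inside the package directory. Since `/ST3D/pcdet` is bind-mounted from `$CODE_PATH/pcdet`, those binaries might live on the host and be shared by every container, whichever image it was started from. This is only a guess based on the mounts shown above. A minimal sketch for checking it from the host follows; `list_built_extensions` is a hypothetical helper name, not part of the repository.

```shell
#!/bin/sh
# Hypothetical helper: list compiled extension modules (*.so) under a directory.
list_built_extensions() {
    [ -d "$1" ] || { echo "no such directory: $1" >&2; return 1; }
    find "$1" -type f -name '*.so' | sort
}

# CODE_PATH is the host checkout used in the --volume flags above;
# it falls back to the current directory if unset (an assumption).
list_built_extensions "${CODE_PATH:-.}/pcdet" || true
```

If this prints `.so` files in the host checkout, rebuilding in one container overwrites the binaries that all other containers load through the same mount.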

So my question is, how can I set up my environment/docker image such that it doesn’t require a rebuild of setup.py each time?
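
A possible direction, sketched under the assumption that setup.py builds its CUDA extensions through `torch.utils.cpp_extension` (which reads the `TORCH_CUDA_ARCH_LIST` environment variable): compile once for both architectures instead of only for the GPU that happens to be visible at build time. The GTX 1080 and 1080 Ti have compute capability 6.1 and the RTX 2080 Ti has 7.5. The helpers `gpu_arch` and `build_arch_list` below are hypothetical names used for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: map a GPU model name to its CUDA compute capability.
# Only the cards mentioned in this question are covered.
gpu_arch() {
    case "$1" in
        *1080*) echo "6.1" ;;   # GTX 1080 / 1080 Ti (Pascal)
        *2080*) echo "7.5" ;;   # RTX 2080 Ti (Turing)
        *) echo "unknown GPU: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: build a semicolon-separated, de-duplicated arch list.
build_arch_list() {
    list=""
    for gpu in "$@"; do
        arch=$(gpu_arch "$gpu") || return 1
        case ";$list;" in
            *";$arch;"*) ;;                       # already present, skip
            *) list="${list:+$list;}$arch" ;;     # append with ';' separator
        esac
    done
    echo "$list"
}

TORCH_CUDA_ARCH_LIST=$(build_arch_list "GTX 1080 Ti" "GTX 1080" "RTX 2080 Ti")
export TORCH_CUDA_ARCH_LIST
echo "TORCH_CUDA_ARCH_LIST=$TORCH_CUDA_ARCH_LIST"
# With the variable exported, a single `python setup.py develop` should
# compile kernels for every listed architecture.
```

In the Dockerfile, the equivalent would be a line such as `ENV TORCH_CUDA_ARCH_LIST="6.1;7.5"` placed before `RUN python setup.py develop`, so the image no longer depends on which GPU is present when the extensions are built.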

Source: Docker Questions
