Latency of TensorFlow Serving with Docker for video analysis

I would like to ask a question about the response time of a TensorFlow model served in Docker.
I converted a YOLO model from .h5 to the SavedModel format using the call shown below.

I am using an Ubuntu machine with an i9 CPU and an RTX 2080 Ti graphics card.

model.save(
    filepath,
    overwrite=True,
    include_optimizer=True,
    save_format=None,
    signatures=None,
    options=None,
)

I built a Docker image from tensorflow/serving:latest-gpu, into which I copied my models as follows:

FROM tensorflow/serving:latest-gpu
COPY models /apps/
EXPOSE 8500
EXPOSE 8501
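For context, a container built from this Dockerfile is typically started along these lines. The model name vetements and the base path /apps are assumptions inferred from the COPY destination above and the request URL below; adjust them to match the actual layout:

```shell
# Build the image from the Dockerfile above.
docker build -t vetements-serving .

# Run with GPU access; MODEL_NAME and MODEL_BASE_PATH are environment
# variables understood by the official tensorflow/serving entrypoint.
# The values here are assumptions based on the COPY path and request URL.
docker run --gpus all -p 8500:8500 -p 8501:8501 \
  -e MODEL_NAME=vetements \
  -e MODEL_BASE_PATH=/apps \
  vetements-serving
```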

I wrote the following script to analyze a video:

import json
import time

import requests

# Normalize the frame to float32 in [0, 1] before sending it to the model.
img = img.astype('float32')
img = img / 255.0

data = json.dumps({"signature_name": "serving_default", "instances": img.tolist()})

headers = {"content-type": "application/json"}
start_time = time.time()
json_response = requests.post('http://172.27.240.5:8501/v1/models/vetements:predict',
                              data=data, headers=headers)
print("--- %s seconds ---" % (time.time() - start_time))

predictions = json.loads(json_response.text)['predictions']
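Since the REST API's instances field accepts a batch, several video frames can be packed into a single request to amortize the per-request overhead. A minimal sketch of building such a payload (the helper name and the tiny 4x4 frame size are illustrative assumptions):

```python
import json

import numpy as np


def build_batch_payload(frames):
    """Pack a list of HxWx3 uint8 frames into one TF Serving REST request body."""
    # Stack into a (batch, H, W, 3) array and normalize, mirroring the
    # single-frame preprocessing in the script above.
    batch = np.stack(frames).astype("float32") / 255.0
    return json.dumps({"signature_name": "serving_default",
                       "instances": batch.tolist()})


# Example: two dummy 4x4 RGB frames (tiny size just for illustration).
frames = [np.zeros((4, 4, 3), dtype="uint8"), np.ones((4, 4, 3), dtype="uint8")]
payload = build_batch_payload(frames)
parsed = json.loads(payload)
print(len(parsed["instances"]))  # 2 instances in one request
```

The response's predictions list then contains one entry per frame in the batch.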

My question is as follows: the response time to analyze a single image is 0.5 seconds. Is there a way to speed up the analysis?
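When profiling this, it can help to time the JSON serialization separately from the HTTP round trip, since tolist() plus json.dumps on a full-resolution float image is itself expensive and inflates the measured latency. A sketch, assuming a 416x416 input (the exact YOLO input size is an assumption):

```python
import json
import time

import numpy as np

# A dummy frame standing in for a real video frame (size is an assumption).
img = np.random.rand(416, 416, 3).astype("float32")

t0 = time.time()
payload = json.dumps({"signature_name": "serving_default",
                      "instances": img.tolist()})
serialize_s = time.time() - t0

# The JSON text encodes every float as a decimal string, so the payload is
# many times larger than the raw 416*416*3*4-byte tensor.
print(f"serialization: {serialize_s:.3f}s, payload: {len(payload) / 1e6:.1f} MB")
```

If serialization dominates, the gRPC endpoint on port 8500 (which sends a binary protobuf instead of JSON text) is the usual alternative to the REST API.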

Thank you.

Source: Docker Questions