I have deployed an AWS SageMaker endpoint using a Docker container (I followed https://docs.aws.amazon.com/sagemaker/latest/dg/docker-containers.html).
Everything is working perfectly, but now I need to put it in production and define an auto scaling strategy.
I tried two things.
First, I tried the AWS console, but the auto scaling button is greyed out.
Then I tried the method described here: https://docs.aws.amazon.com/sagemaker/latest/dg/endpoint-auto-scaling-add-code-apply.html. My endpoint name is EmbeddingEndpoint and my variant name is SimpleVariant, so my final command is:
aws application-autoscaling put-scaling-policy --policy-name scalable_policy_for_embedding --policy-type TargetTrackingScaling --resource-id endpoint/EmbeddingEndpoint/variant/SimpleVariant --service-namespace sagemaker --scalable-dimension sagemaker:variant:DesiredInstanceCount --target-tracking-scaling-policy-configuration file://policy_config.json
But I get this result:
An error occurred (ObjectNotFoundException) when calling the PutScalingPolicy operation: No scalable target registered for service namespace: sagemaker, resource ID: endpoint/EmbeddingEndpoint/variant/SimpleVariant, scalable dimension: sagemaker:variant:DesiredInstanceCount
Does anyone have another solution, or is it that I didn't set the variables correctly?
Thank you in advance!
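
For context on the error above: Application Auto Scaling only accepts a scaling policy for a resource that has already been registered as a scalable target, and the ObjectNotFoundException says that no such target is registered for this variant. A minimal sketch of the registration step, reusing the endpoint and variant names from the command above, might look like this. The capacity bounds (1 and 4) are assumptions chosen for illustration and are not from the original setup. The leading `echo` makes this a dry run that only prints the command.

```shell
# Names taken from the question.
ENDPOINT_NAME="EmbeddingEndpoint"
VARIANT_NAME="SimpleVariant"
RESOURCE_ID="endpoint/${ENDPOINT_NAME}/variant/${VARIANT_NAME}"

# Assumed capacity bounds (illustrative only, adjust to your workload).
MIN_CAPACITY=1
MAX_CAPACITY=4

# Dry run: prints the command. Remove the leading "echo" to actually
# register the variant as a scalable target.
echo aws application-autoscaling register-scalable-target \
  --service-namespace sagemaker \
  --resource-id "$RESOURCE_ID" \
  --scalable-dimension sagemaker:variant:DesiredInstanceCount \
  --min-capacity "$MIN_CAPACITY" \
  --max-capacity "$MAX_CAPACITY"
```

Once the target is registered, the put-scaling-policy command can be retried with the same resource ID, service namespace, and scalable dimension.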