SageMaker Deployment Issues

Hi Muellerzr, thanks for this great timm support.
I have trained a classification model using fastai, wwf, and timm.
Now I want to deploy it as an endpoint using SageMaker, but I am running into a very strange error. As you know, the script runs inside Docker, so the process simply ends without any clear description of the error.
Could you kindly help me out with this?

Thanks.

@MuhammadAli Does the Docker image work fine locally? Normally, if you don't run your Docker image in --detach mode, you can see the logs directly in the terminal; otherwise you can run docker logs <container-name-or-id> (see the Docker documentation for docker logs) to view them.

You can also go inside your container in interactive mode to experiment with what is not working: docker exec -it <container-name-or-id> bash

Hope it helps


Hi Dhoa, thanks a lot for your time.
Sorry, but I can't quite follow you. Can you explain it a bit more? I need this solved by today by all means.
If it is possible for you, could we have a short live meeting so that you can help me with this?

Sorry for such a strange request, but I have to ask.

Thanks a lot…

In local mode it stops with no error, so I think there may be some problem with the package installations, perhaps?

Can you post the logs here so we can see what's going on? Yes, maybe it's about the package installations; I once ran into an issue where I needed to update the requirements.txt. You can try to create a new virtual environment (with pip or conda), install only the requirements.txt, and then run your code to see if it works.
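To narrow an install problem down once the environment is built, a small stdlib-only check can be dropped into the container or the virtual environment. This is just a sketch; the package names in the example call are the ones mentioned in this thread and should be replaced with whatever your requirements.txt pins:

```python
import importlib

def check_imports(names):
    """Return a dict of package name -> error for every package that fails to import."""
    failures = {}
    for name in names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # ImportError, or anything else broken at import time
            failures[name] = repr(exc)
    return failures

# Illustrative call -- substitute the packages from your own requirements.txt:
print(check_imports(["fastai", "timm", "wwf"]))
```

An empty dict means every package imports cleanly; anything else points at the requirements.txt rather than the inference code.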

The json_serializer has been renamed in sagemaker>=2.

See: Use Version 2.x of the SageMaker Python SDK — sagemaker 2.42.0 documentation for details.
The json_deserializer has been renamed in sagemaker>=2.
See: Use Version 2.x of the SageMaker Python SDK — sagemaker 2.42.0 documentation for details.
vagn6a8863-algo-1-rwt01 | 2021-05-25 15:44:41,534 [INFO ] W-model-2-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Deserializing the input data.
vagn6a8863-algo-1-rwt01 | 2021-05-25 15:44:41,534 [INFO ] W-model-2-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Request body is: "kGAfx2i8jZ.png"
vagn6a8863-algo-1-rwt01 | 2021-05-25 15:44:41,534 [INFO ] W-model-2-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Loaded JSON object: kGAfx2i8jZ.png
vagn6a8863-algo-1-rwt01 | 2021-05-25 15:44:41,535 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 0
vagn6a8863-algo-1-rwt01 | 2021-05-25 15:44:41,535 [INFO ] W-9000-model ACCESS_LOG - /172.18.0.1:40562 "POST /invocations HTTP/1.1" 500 2

JSONDecodeError                           Traceback (most recent call last)
in
----> 1 response = predictor.predict('kGAfx2i8jZ.png')
      2
      3 print(response)

~/anaconda3/envs/pytorch_latest_p36/lib/python3.6/site-packages/sagemaker/predictor.py in predict(self, data, initial_args, target_model, target_variant, inference_id)
    135         )
    136         response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
--> 137         return self._handle_response(response)
    138
    139     def _handle_response(self, response):

~/anaconda3/envs/pytorch_latest_p36/lib/python3.6/site-packages/sagemaker/predictor.py in _handle_response(self, response)
    141         response_body = response["Body"]
    142         content_type = response.get("ContentType", "application/octet-stream")
--> 143         return self.deserializer.deserialize(response_body, content_type)
    144
    145     def _create_request_args(

~/anaconda3/envs/pytorch_latest_p36/lib/python3.6/site-packages/sagemaker/deprecations.py in deprecate(*args, **kwargs)
    120     def deprecate(*args, **kwargs):
    121         renamed_warning(f"The {name}")
--> 122         return func(*args, **kwargs)
    123
    124     return deprecate

~/anaconda3/envs/pytorch_latest_p36/lib/python3.6/site-packages/sagemaker/deserializers.py in deserialize(self, stream, content_type)
    253         """
    254         try:
--> 255             return json.load(codecs.getreader("utf-8")(stream))
    256         finally:
    257             stream.close()

~/anaconda3/envs/pytorch_latest_p36/lib/python3.6/json/__init__.py in load(fp, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    297         cls=cls, object_hook=object_hook,
    298         parse_float=parse_float, parse_int=parse_int,
--> 299         parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
    300
    301

~/anaconda3/envs/pytorch_latest_p36/lib/python3.6/json/__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    352             parse_int is None and parse_float is None and
    353             parse_constant is None and object_pairs_hook is None and not kw):
--> 354         return _default_decoder.decode(s)
    355     if cls is None:
    356         cls = JSONDecoder

~/anaconda3/envs/pytorch_latest_p36/lib/python3.6/json/decoder.py in decode(self, s, _w)
    337
    338         """
--> 339         obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    340         end = _w(s, end).end()
    341         if end != len(s):

~/anaconda3/envs/pytorch_latest_p36/lib/python3.6/json/decoder.py in raw_decode(self, s, idx)
    355             obj, end = self.scan_once(s, idx)
    356         except StopIteration as err:
--> 357             raise JSONDecodeError("Expecting value", s, err.value) from None
    358         return obj, end

JSONDecodeError: Expecting value: line 1 column 1 (char 0)
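A side note on what that traceback means: the access log shows the container answered POST /invocations with a 500, so the response body is an error message rather than JSON, and the (deprecated) JSON deserializer then fails on its very first character. A minimal stdlib reproduction of that failure mode, using a stand-in string for the non-JSON body:

```python
import json

# The client-side deserializer effectively calls json.loads() on the response
# body; any non-JSON body fails at the first character, producing the
# "Expecting value: line 1 column 1 (char 0)" seen in the traceback.
try:
    json.loads("kGAfx2i8jZ.png")  # stand-in for the endpoint's non-JSON reply
except json.JSONDecodeError as exc:
    print(exc)  # -> Expecting value: line 1 column 1 (char 0)
```

So the JSONDecodeError is a symptom: the real problem is whatever made the container return a 500 in the first place.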

Actually, I have included this in requirements.txt:
fastai==2.3.0
timm
wwf
ipykernel

And that's all that is needed…

Thanks. I'm not very familiar with SageMaker, timm, or wwf, so I don't think I can help much here. I guess kGAfx2i8jZ.png is a local file of yours, and predictor.predict needs some kind of HTTP request. Can you try a URL of an image there?

Actually, I have tried the URL…
I also followed along with a simple fastai resnet18 model, and that worked, so I think there may be some library issue…

Thanks for your time.
Actually, I need your help to diagnose the issue, since I am not able to see the logs. Once it is figured out, I think I can solve it myself, as far as the fastai part is concerned.

Thanks…

Also, there was an option for setting the content and accept types, but currently I am not able to set these on the predictor.
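For reference, in SageMaker Python SDK v2 the content and accept types are no longer passed as arguments to predict(); they come from serializer/deserializer objects attached to the predictor. A minimal sketch, assuming an already-deployed predictor and assuming the container's input handler expects raw image bytes (neither is confirmed in this thread):

```python
# Sketch for SageMaker Python SDK >= 2 (nothing here contacts AWS by itself).
# `predictor` is assumed to be an already-deployed sagemaker.predictor.Predictor.
def configure_predictor(predictor):
    from sagemaker.serializers import IdentitySerializer
    from sagemaker.deserializers import JSONDeserializer

    # Content-Type of the request body: raw bytes passed through unchanged.
    predictor.serializer = IdentitySerializer(content_type="application/x-image")
    # Accept header / response parsing: expect JSON back from the endpoint.
    predictor.deserializer = JSONDeserializer()
    return predictor
```

With the predictor configured this way, the call would send the file's bytes rather than its name, e.g. predictor.predict(open("kGAfx2i8jZ.png", "rb").read()); whether application/x-image is the right content type depends on your inference script.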

Dhoa, if you don't mind, could I ask you for a one-hour paid consultation, since you are quite good with Docker and SageMaker deployments?

Thanks a lot…


Hi Team,

I was trying to run training using the PyTorch estimator in script mode on SageMaker. While passing my script to the PyTorch estimator, I got the error below.


I am also attaching the script container configuration.

Let me know if I am missing anything or if I need to provide any extra information.