Deployment Platform: Amazon SageMaker

Thanks Bo. This is really helpful and makes life a lot easier. I especially like the SageMaker deployment part. I will test it and get back to you.

I also found another way to update the fastai version:

Simply supply the PyTorchModel instance with a source_dir parameter.
Put a requirements.txt and your entry point file in the specified folder. In the requirements.txt, list your desired fastai version and you're good to go.
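
(As a rough sketch, the layout would look something like this; the folder and file names here are just examples:)

    source/
        serve.py            # your entry point file
        requirements.txt    # contains e.g. fastai==1.0.52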

Hey @matt.mcclean, when opening the lesson 1 pets notebook on SageMaker it says "kernel not found". The documentation says to choose conda fastai, but it's not one of my options. Which one do I choose, or is there another issue?

I’ve been yak shaving for 2 days and this was the key information I needed, thank you!

Glad my comment could help, I was stuck for some time as well. :slight_smile:

Hi @faib

Let’s say this is the code -

pets_estimator = PyTorch(entry_point='source/pets.py',
                         base_job_name='fastai-pets',
                         role=role,
                         framework_version='1.0.0',
                         train_instance_count=1,
                         train_instance_type='ml.p3.2xlarge') 

You are suggesting adding the source_dir parameter (“source”) and putting the requirements.txt file into that folder, right?

Can you also explain how you would specify the fastai version there, and how that makes it work?

I was going through the SageMaker PyTorch docs (https://sagemaker.readthedocs.io/en/stable/sagemaker.pytorch.html)
and found this explanation for source_dir:

source_dir ( str ) – Path (absolute or relative) to a directory with any other training source code dependencies aside from the entry point file (default: None). Structure within this directory is preserved when training on Amazon SageMaker.

You can instantiate the PyTorchModel class like this:

mail_model=PyTorchModel(model_data=model_artefact,
                        name=name,
                        role=role,
                        framework_version='1.1.0',
                        entry_point='serve.py',
                        predictor_cls=TextPredictor,
                        source_dir='my_src'
                       )

my_src is a folder containing serve.py and requirements.txt.

The requirements.txt file has the usual structure and contains e.g.:
fastai==1.0.52

Hope this helps.

Thanks @faib

I did exactly the same and it worked!

Regards

You need to use the kernel named "Python 3"

The notebook URL is returning a 404. Could you update it, please?

@tbass134 Yes. I will update it; the new link is https://github.com/bentoml/BentoML/tree/master/guides/deployment/deploy-with-sagemaker. I will update my reply as well. Thanks for pointing out the dead link!

We updated the example guide for SageMaker deployment. You can find the new guide at https://github.com/bentoml/BentoML/tree/master/guides/deployment/deploy-with-sagemaker

Hi @faib, just following through with this and I'm stuck on predictor_cls=TextPredictor.

What does your TextPredictor class look like?

My model accepts a single string of text, but I'm not sure how to pass this to the SageMaker model's .predict().
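
(For reference, with the SageMaker Python SDK v1 a TextPredictor along these lines is typically a small RealTimePredictor subclass. The sketch below is just an assumption of what it might look like, not @faib's actual code, and the 'text/plain' content type is a guess:)

    from sagemaker.predictor import RealTimePredictor, json_deserializer

    class TextPredictor(RealTimePredictor):
        def __init__(self, endpoint_name, sagemaker_session):
            # send the raw text string as the request body and parse the JSON response
            super().__init__(endpoint_name, sagemaker_session=sagemaker_session,
                             serializer=None, deserializer=json_deserializer,
                             content_type='text/plain')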

Getting this error from CloudWatch when trying to deploy:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/gunicorn/workers/base_async.py", line 56, in handle
    self.handle_request(listener_name, req, client, addr)
  File "/usr/local/lib/python3.6/dist-packages/gunicorn/workers/ggevent.py", line 160, in handle_request
    addr)
  File "/usr/local/lib/python3.6/dist-packages/gunicorn/workers/base_async.py", line 107, in handle_request
    respiter = self.wsgi(environ, resp.start_response)
  File "/usr/local/lib/python3.6/dist-packages/sagemaker_pytorch_container/serving.py", line 107, in main
    user_module_transformer.initialize()
  File "/usr/local/lib/python3.6/dist-packages/sagemaker_containers/_transformer.py", line 157, in initialize
    self._model = self._model_fn(_env.model_dir)
  File "/usr/local/lib/python3.6/dist-packages/sagemaker_containers/_functions.py", line 87, in wrapper
    six.reraise(error_class, error_class(e), sys.exc_info()[2])
  File "/usr/local/lib/python3.6/dist-packages/six.py", line 692, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.6/dist-packages/sagemaker_containers/_functions.py", line 85, in wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/serve.py", line 15, in model_fn
    learn = load_learner(model_dir, fname='export.pkl')

I also tried to print the fastai version during the deployment phase, and it gives me 1.0.39.
Any solution?

Hi @matt.mcclean, I would like to thank you first of all.

I have a model created with Transformers and fastai, and I have tried to deploy it. This is my serve.py file:

import logging, requests, os, io, glob, time
import json

import fastai

from transformers import BertTokenizer
from transformers import PreTrainedModel

from fastai.text import *
print("Fastai version "+str(fastai.__version__))
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)

JSON_CONTENT_TYPE = 'application/json'

#redefining transformers custom model
class FastAiBertTokenizer(BaseTokenizer):
    """Wrapper around BertTokenizer to be compatible with fast.ai"""
    def __init__(self, tokenizer: BertTokenizer, max_seq_len: int=128, **kwargs):
        self._pretrained_tokenizer = tokenizer
        self.max_seq_len = max_seq_len

    def __call__(self, *args, **kwargs):
        return self

    def tokenizer(self, t:str) -> List[str]:
        """Limits the maximum sequence length"""
        return ["[CLS]"] + self._pretrained_tokenizer.tokenize(t)[:self.max_seq_len - 2] + ["[SEP]"]
    
class CustomTransformerModel(nn.Module):
    def __init__(self, transformer_model: PreTrainedModel):
        super(CustomTransformerModel,self).__init__()
        self.transformer = transformer_model
        
    def forward(self, input_ids):
        # Return only the logits from the transformer
        logits = self.transformer(input_ids)[0]
        return logits
    
print(CustomTransformerModel)

# loads the model into memory from disk and returns it
def model_fn(model_dir):
    logger.info('model_fn')
    print("model directory "+str(model_dir))
    path = Path(model_dir)
    learn = load_learner(model_dir, 'fastai.pkl')
    return learn


# Perform prediction on the deserialized object, with the loaded model
def predict_fn(input, model):
    logger.info("Calling model")
    start_time = time.time()
    predict_class,predict_idx,predict_values = model.predict(input)
    print("--- Inference time: %s seconds ---" % (time.time() - start_time))
    print(f'Predicted class is {str(predict_class)}')
    print(f'Predict confidence score is {predict_values[predict_idx.item()].item()}')
    return json.dumps({
        "input": input,
        "pred_class": str(predict_class),
        "pred_idx": predict_idx.item(),
        "predictions": sorted(
            zip(model.data.classes, map(float, predict_values)),
            key=lambda p: p[1],
            reverse=True
        )
    })
#    return dict(class = str(predict_class),confidence = predict_values[predict_idx.item()].item())

# Serialize the prediction result into the desired response content type
def output_fn(prediction, accept=JSON_CONTENT_TYPE):        
    logger.info('Serializing the generated output.')
    if accept == JSON_CONTENT_TYPE: return json.dumps(prediction), accept
    raise Exception('Requested unsupported ContentType in Accept: {}'.format(accept))    

But when trying to deploy the model using the PyTorch container, the problem is that when I try to load the model with load_learner(), the function searches for the CustomTransformerModel class. Since the class is defined in serve.py, its name is
<class 'serve.CustomTransformerModel'>, while the unpickler is looking for
<class '__main__.CustomTransformerModel'>, so it doesn't find it,
and I get this error from the CloudWatch logs:

 Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/sagemaker_containers/_functions.py", line 85, in wrapper
    return fn(*args, **kwargs)
  File "/opt/ml/code/serve.py", line 47, in model_fn
    learn = load_learner(model_dir, 'fastai.pkl')
  File "/usr/local/lib/python3.6/dist-packages/fastai/basic_train.py", line 598, in load_learner
    state = torch.load(source, map_location='cpu') if defaults.device == torch.device('cpu') else torch.load(source)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 387, in load
    return _load(f, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 574, in _load
    result = unpickler.load()
AttributeError: Can't get attribute 'CustomTransformerModel' on <module '__main__' from '/usr/local/bin/gunicorn'>

How could I solve this?

Looks to be a problem with how your model has been serialized into a pickle object rather than a SageMaker issue. Have you tried to export and load the model locally first? You can also try to get the SageMaker inference working locally on your machine by specifying the instance type parameter as 'local'; it will deploy into a Docker container on your SageMaker/Jupyter notebook instance.
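
(A quick sketch of the local-mode idea, assuming the PyTorchModel object from earlier in the thread; variable names are just placeholders:)

    # deploys the inference container via Docker on the notebook instance
    # instead of launching a real SageMaker endpoint
    local_predictor = model.deploy(initial_instance_count=1, instance_type='local')
    print(local_predictor.predict('some sample text'))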

@matt.mcclean Thank you, I have solved this issue by retraining my model: instead of defining the classes in the notebook, I moved them into a module called utils.py, and when deploying the model to SageMaker I used utils.py as the entry point name, so that Python finds the classes when it looks for them.

I have another question: what is the content_type that I must set in the PyTorchModel parameters before deploying when dealing with text? I have used text/plain and it's not supported by the model. I have also tried to use input_fn but didn't succeed. Could you please show how to use the deserialization function input_fn, or how to use content_type with text?
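
(In case it helps, a minimal input_fn for plain text might look roughly like the sketch below; the JSON fallback and the "text" field name are assumptions, so adapt it to your serve.py:)

    import json

    JSON_CONTENT_TYPE = 'application/json'
    TEXT_CONTENT_TYPE = 'text/plain'

    def input_fn(request_body, content_type=TEXT_CONTENT_TYPE):
        # raw text posted with ContentType: text/plain
        if content_type == TEXT_CONTENT_TYPE:
            return request_body.decode('utf-8') if isinstance(request_body, bytes) else request_body
        # JSON payload of the form {"text": "..."}
        if content_type == JSON_CONTENT_TYPE:
            return json.loads(request_body)['text']
        raise ValueError('Unsupported content type: {}'.format(content_type))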

Hi Matt, hope you are fine.
I need your help with deploying a fastai-trained image classifier to SageMaker:

model = PyTorchModel(model_data=model_artefact,
                     name=name_from_base('Julian73ClsEFFB3-model'),
                     role=role,
                     framework_version='1.8.0',
                     py_version='py3',
                     entry_point='my_src/serve.py',
                     predictor_cls=ImagePredictor)

predictor = model.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge')

This is my serve.py:

import logging, requests, os, io, glob, time
import json

from fastai.vision.all import *
from fastai.basics import *

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)

JSON_CONTENT_TYPE = 'application/json'
JPEG_CONTENT_TYPE = 'image/jpeg'

# loads the model into memory from disk and returns it
def model_fn(model_dir):
    logger.info('model_fn')
    path = Path(model_dir)
    learn = load_learner(model_dir, fname='EFFB3-73CLS-24May.pkl')
    return learn

# Deserialize the Invoke request body into an object we can perform prediction on
def input_fn(request_body, content_type=JPEG_CONTENT_TYPE):
    logger.info('Deserializing the input data.')
    # process an image uploaded to the endpoint
    if content_type == JPEG_CONTENT_TYPE:
        return open_image(io.BytesIO(request_body))
    # process a URL submitted to the endpoint
    if content_type == JSON_CONTENT_TYPE:
        img_request = requests.get(request_body['url'], stream=True)
        return open_image(io.BytesIO(img_request.content))
    raise Exception('Requested unsupported ContentType in content_type: {}'.format(content_type))

# Perform prediction on the deserialized object, with the loaded model
def predict_fn(input_object, model):
    logger.info("Calling model")
    start_time = time.time()
    predict_class, predict_idx, predict_values = model.predict(input_object)
    print("--- Inference time: %s seconds ---" % (time.time() - start_time))
    print(f'Predicted class is {str(predict_class)}')
    print(f'Predict confidence score is {predict_values[predict_idx.item()].item()}')
    return dict(class_name=str(predict_class),
                confidence=predict_values[predict_idx.item()].item())

# Serialize the prediction result into the desired response content type
def output_fn(prediction, accept=JSON_CONTENT_TYPE):
    logger.info('Serializing the generated output.')
    if accept == JSON_CONTENT_TYPE:
        return json.dumps(prediction), accept
    raise Exception('Requested unsupported ContentType in Accept: {}'.format(accept))
I am facing an error: load_learner is not defined.
Thanks for any help.

Now my error is:
FileNotFoundError: [Errno 2] No such file or directory: '/opt/ml/model/model.pth'

load_learner has changed if you're using the latest fastai; the only argument is fname, so you need to give the full path there, like:

def model_fn(model_dir):
    path = Path(model_dir)
    return load_learner(path/'export.pkl')