Hi Folks,
My app is running on Binder. However, the model is stored in GitHub LFS, which has some file size restrictions. Instead, I was hoping to save the model (pkl) file to some remote bucket, say S3 / Google Drive, and load the model from there. I don't think load_learner works with URLs – it looks like the first argument should be a file / path. I did a little bit of hunting and found torch.utils.model_zoo.load_url(model_url) to be a good alternative.
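For reference, this is roughly how that would look (a sketch, with model_url standing in for the raw S3 / Drive download link):

import torch.utils.model_zoo as model_zoo

# load_url downloads the file into a local cache dir (~/.cache/torch/...)
# and then calls torch.load on the downloaded file
model = model_zoo.load_url(model_url, map_location='cpu')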
However, this downloads the file to the local fs first and then loads the model. I am wondering if there is a way to just stream the bytes and create the model in memory. Typically, for all files, I use a requests response and wrap it in an io.BytesIO object. However, load_learner, or torch.load() which it internally calls, seems to have some problems with that. It looks like there are some issues with reading pickle files this way. I am doing the following:
import io
import requests
import torch

response = requests.get(model_url)       # fetch the pkl from the remote bucket
buffer = io.BytesIO(response.content)    # wrap the raw bytes in a file-like object
torch.load(buffer)                       # or use load_learner(buffer)
I am seeing this error:

UnpicklingError: invalid load key, '\xff'.
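For what it's worth, a quick sanity check on the response might narrow it down (untested sketch): a pickle written with protocol 2+ should start with b'\x80', and a torch zip-format checkpoint with b'PK', so a leading '\xff' makes me suspect the URL isn't returning the raw file at all:

print(response.status_code)                   # should be 200
print(response.headers.get('Content-Type'))   # html would suggest an error / confirmation page
print(response.content[:8])                   # expect b'\x80...' (pickle) or b'PK\x03\x04' (torch zip)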
Any thoughts / pointers on how to load pickle objects from bytes and then finally into torch.load()?