Deployment Platform: Render ✅

The logs window is empty.

The model file is 470 MB, so I could see that leading to memory issues. It’s a super-resolution model, so you’ve got the whole U-Net structure. What would be a way forward from here?

We’re working to surface OOMKilled logs in the UI logs. As for what to do next: we’ll introduce higher memory tiers soon, but you won’t be able to use the current model unless it’s somehow modified to take less than 1 GB of RAM. What happens when you run it locally? Can you monitor the memory to figure out how much it ends up taking?
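
One way to monitor that locally is a peak-RSS check around the model load. This is a minimal sketch using only the standard library (resource is Unix-only, and ru_maxrss units differ by platform); the model-loading step is left as a placeholder since the actual loading code isn’t shown here:

```python
import resource

def peak_rss_mb():
    # ru_maxrss is reported in kilobytes on Linux (bytes on macOS)
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024

before = peak_rss_mb()
# ... load the model here, e.g. learn = load_learner(path) ...
after = peak_rss_mb()
print(f"peak RSS: {before:.0f} MB before load, {after:.0f} MB after")
```

If the “after” figure exceeds the instance’s memory limit, the process will be OOM-killed on deploy even though it runs fine locally.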

@KarlH, for now, we’ve increased the memory limit for all Docker deploys to 1.5GB. We’ll eventually create a priced tier where you’d end up paying more than $5, but you don’t have to until we do that. Try deploying your service again?


https://enhance-u2e7.app.render.com/ is up after the limit increase.


Thanks for working on this Anurag, I appreciate it!


@anurag
I am using the new API with the pickle file but end up getting the following error:

 File "/usr/local/lib/python3.6/site-packages/torch/serialization.py", line 78, in validate_cuda_device
    raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location='cpu' to map your storages to the CPU.
Jan 21 01:23:00 PM  error building image: error building stage: waiting for process to exit: exit status 1
Jan 21 01:23:00 PM  error: exit status 1
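
The fix the traceback suggests is to map storages to the CPU at load time. A minimal sketch of the idea with plain torch.load (an in-memory buffer stands in for a real checkpoint file saved on a CUDA machine):

```python
import io
import torch

# Round-trip a tensor through a buffer to stand in for a saved checkpoint.
buf = io.BytesIO()
torch.save(torch.zeros(2, 2), buf)
buf.seek(0)

# map_location='cpu' remaps any CUDA storages so CPU-only machines can load them.
t = torch.load(buf, map_location='cpu')
print(t.device)  # cpu
```

When the loading goes through fastai rather than torch.load directly, the equivalent is telling fastai to use the CPU device, as discussed below.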

Here is how I modified your server.py file:

async def setup_learner():
    await download_file(model_file_url, path/'models'/f'{model_file_name}.pkl')
    defaults.device = torch.device('cpu')
    learn = load_learner(path/'models')
    return learn

Could you also update or create an alternate example file with load_learner() from fastai 1.40, please?

Thanks for Render!

Done. Thanks for letting me know!

FYI, uvicorn changed their API so you might have to get my latest changes from origin. Specifically, this commit:

Due to the deprecation of the single_from_class method of DataBunch, please note that:

  • You now need to give the app the export.pkl file download link given by learner.export() rather than the .pth file given by learner.save()
  • No need to create an empty DataBunch; simply create a learner with load_learner. Here’s what setup_learner should look like in server.py:
async def setup_learner():
    await download_file(model_file_url, path/'models'/f'{model_file_name}.pkl')
    defaults.device = torch.device('cpu')   
    learn = load_learner(path/'models', f'{model_file_name}.pkl')
    return learn
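
For reference, the path the snippet above assembles is where the downloaded export.pkl must land for load_learner to find it. A tiny sketch with hypothetical values mirroring server.py’s naming scheme (path and model_file_name are placeholders, not the repo’s actual values):

```python
from pathlib import Path

# Hypothetical names mirroring the snippet above: the downloaded file must
# land where load_learner(path/'models', f'{model_file_name}.pkl') will look.
path = Path('app')
model_file_name = 'export'

pkl_path = path / 'models' / f'{model_file_name}.pkl'
print(pkl_path)  # app/models/export.pkl
```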

This is written mainly so that @anurag can update the GitHub repo and the tutorial, but if others are still using the old version, be careful: it won’t work properly. It doesn’t throw an error, but the model isn’t working as it should.


The error you pointed out was due to how the export and load_learner methods were implemented in fastai. I’ve submitted a PR that fixes this; the error should go away when (and if) the PR is merged.


I’ve updated https://github.com/render-examples/fastai-v3 but it’ll need the PR to be merged to work.

Just saw that I forgot something needed to make it work once the PR is merged: telling load_learner to use the CPU.
The setup_learner function should be:

async def setup_learner():
    await download_file(model_file_url, path/'models'/f'{model_file_name}.pkl')
    learn = load_learner(path/'models', f'{model_file_name}.pkl', device='cpu')
    return learn

You can also delete defaults.device = torch.device('cpu'), I think.


We’ll need to create a new release of fastai to get this to work with the default notebook instructions cc @jeremy


Please note I have changed it to a flag, so it should be
learn = load_learner(path/'models', f'{model_file_name}.pkl', cpu=True)
since it didn’t support arbitrary devices (just CPU). There will be a release today.


I’m updating the repo at https://github.com/render-examples/fastai-v3 to make sure everything works as intended. Will post on this thread when it’s ready.

FYI, thanks to @sgugger making some quick and important updates to fastai, https://github.com/render-examples/fastai-v3 is working again. Everyone who’s forked the sample repo may want to update their forks. See instructions here: https://robots.thoughtbot.com/keeping-a-github-fork-updated

cc @PierreO


Great job from you both, thanks!


For anyone running into this error on deployment:

raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location='cpu' to map your storages to the CPU.

Make sure to update the version of fastai to 1.0.42 (or the latest) in your notebook environment, restart the kernel, and export the .pkl file again.
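
A small sketch of checking the notebook’s version before re-exporting (in the notebook you’d pass fastai.__version__ as the installed value; the tuple compare matters because plain string comparison would report '1.0.42' < '1.0.9'):

```python
def at_least(installed: str, required: str) -> bool:
    # Compare numerically, component by component: as strings
    # '1.0.42' < '1.0.9', but (1, 0, 42) > (1, 0, 9) as tuples.
    def as_tuple(v):
        return tuple(int(part) for part in v.split('.'))
    return as_tuple(installed) >= as_tuple(required)

print(at_least('1.0.42', '1.0.42'))  # True
print(at_least('1.0.40', '1.0.42'))  # False
```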

Just FYI, I’ve made a few changes to the sample repo. Please update your forks accordingly.

@anurag, I am on fastai 1.0.42, and using all the latest files, including your latest server.py forked this morning.

sudo /opt/anaconda3/bin/conda list fastai

# packages in environment at /opt/anaconda3:
#
# Name                    Version                   Build  Channel
fastai                    1.0.42                        1    fastai

The export.pkl file that I generated is still failing with the same error listed above on deployment on Render. I have repeated the process with the same result. In my modified course-v3/lesson2-download Jupyter notebook, the following works just fine:

learn.export()
img = open_image(path/'samoyed'/'00000057.jpg')
learn = load_learner(path)
pred_class, pred_idx, outputs = learn.predict(img)
pred_class

Am I overlooking something? It seems like I have done everything as required.

Thanks,
Jeff