The logs window is empty.
The model file is 470 MB, so I could see that leading to memory issues. It’s a super-resolution model, so you’ve got the whole U-Net structure. What would be a way forward from here?
We’re working on surfacing OOMKilled events in the UI logs. In terms of what to do next: we’re going to introduce higher memory tiers soon, but you won’t be able to use the current model unless it’s somehow modified to take less than 1 GB of RAM. What happens when you run it locally? Can you monitor the memory to figure out how much it ends up taking?
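One quick way to check locally is to watch the process’s resident memory around the model load. A minimal sketch, assuming psutil is installed (the model path here is hypothetical):

import os
import psutil
import torch

proc = psutil.Process(os.getpid())
print(f"RSS before load: {proc.memory_info().rss / 2**20:.0f} MiB")

# fastai must be importable for the pickled Learner to deserialize.
model = torch.load('models/export.pkl', map_location='cpu')

print(f"RSS after load: {proc.memory_info().rss / 2**20:.0f} MiB")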
@KarlH, for now we’ve increased the memory limit for all Docker deploys to 1.5 GB. We’ll eventually create a priced tier where you’d end up paying more than $5, but you won’t have to until we do. Try deploying your service again?
https://enhance-u2e7.app.render.com/ is up after the limit increase.
Thanks for working on this Anurag, I appreciate it!
@anurag
I am using the new API with a pickle file but end up getting the following error:
File "/usr/local/lib/python3.6/site-packages/torch/serialization.py", line 78, in validate_cuda_device
raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location='cpu' to map your storages to the CPU.
Jan 21 01:23:00 PM error building image: error building stage: waiting for process to exit: exit status 1
Jan 21 01:23:00 PM error: exit status 1
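The message points at torch.load’s map_location argument; a minimal sketch of that call, with a hypothetical path:

import torch

# Remap tensors saved on a GPU onto the CPU while unpickling.
state = torch.load('models/model.pth', map_location='cpu')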
Here is how I modified your server.py file:
async def setup_learner():
    await download_file(model_file_url, path/'models'/f'{model_file_name}.pkl')
    defaults.device = torch.device('cpu')
    learn = load_learner(path/'models')
    return learn
Could you also update or create an alternate example file with load_learner() from fastai 1.0.40, please?
Thanks for Render!
Done. Thanks for letting me know!
FYI, uvicorn changed their API, so you might have to get my latest changes from origin. Specifically, this commit:
Due to the deprecation of the single_from_classes method of DataBunch, please note that you now need to use the .pkl file given by learner.export() rather than the .pth file given by learner.save(), and load it with load_learner. Here’s what the setup_learner function should look like in server.py:

async def setup_learner():
    await download_file(model_file_url, path/'models'/f'{model_file_name}.pkl')
    defaults.device = torch.device('cpu')
    learn = load_learner(path/'models', f'{model_file_name}.pkl')
    return learn
This is written mainly so that @anurag can update the GitHub repo and the tutorial, but if others are still using the old version, be careful: it won’t work properly. It doesn’t throw an error, but the model isn’t working as it should.
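For completeness, the export side in the training notebook is just the following (the Learner variable name and its path are whatever you trained with):

# Serializes the whole Learner (model, transforms, classes) into export.pkl
# under learn.path, instead of saving only the weights with learn.save().
learn.export()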
The error you pointed out was due to how the export and load_learner methods were implemented in fastai. I’ve submitted a PR that fixes this; the error should go away when (and if) the PR is merged.
I’ve updated https://github.com/render-examples/fastai-v3 but it’ll need the PR to be merged to work.
Just saw that I forgot something needed to make it work once the PR is merged (telling load_learner to use the CPU). The setup_learner function should be:
async def setup_learner():
    await download_file(model_file_url, path/'models'/f'{model_file_name}.pkl')
    learn = load_learner(path/'models', f'{model_file_name}.pkl', device='cpu')
    return learn
You can also delete defaults.device = torch.device('cpu'), I think.
We’ll need to create a new release of fastai to get this to work with the default notebook instructions. cc @jeremy
Please note I have changed it to a flag, so it should be
learn = load_learner(path/'models', f'{model_file_name}.pkl', cpu=True)
since it didn’t support arbitrary devices (just CPU). There will be a release today.
I’m updating the repo at https://github.com/render-examples/fastai-v3 to make sure everything works as intended. Will post on this thread when it’s ready.
FYI, thanks to @sgugger making some quick and important updates to fastai, https://github.com/render-examples/fastai-v3 is working again. Everyone who’s forked the sample repo may want to update their forks. See instructions here: https://robots.thoughtbot.com/keeping-a-github-fork-updated
cc @PierreO
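In short, the steps from that link (assuming your fork’s default branch is master and you name the remote upstream):

git remote add upstream https://github.com/render-examples/fastai-v3.git
git fetch upstream
git checkout master
git merge upstream/master
git push origin master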
Great job from you both, thanks!
For anyone running into this error on deployment:
raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location='cpu' to map your storages to the CPU.
Make sure to update the version of fastai to 1.0.42 (or the latest) in your notebook environment, restart the kernel, and export the .pkl file again.
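In a notebook, the update can be as simple as the following (a sketch; conda users should update through conda instead):

!pip install --upgrade "fastai>=1.0.42"

then restart the kernel and re-run learn.export().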
Just FYI, I’ve made a few changes to the sample repo. Please update your forks accordingly.
@anurag, I am on fastai 1.0.42, and using all the latest files, including your latest server.py forked this morning.
sudo /opt/anaconda3/bin/conda list fastai
fastai 1.0.42 1 fastai
The export.pkl file that I generated still fails with the same error listed above when deployed on Render. I have repeated the process with the same result. In my modified course-v3/lesson2-download Jupyter notebook, the following works just fine:
learn.export()
img = open_image(path/'samoyed'/'00000057.jpg')
learn = load_learner(path)
pred_class, pred_idx, outputs = learn.predict(img)
pred_class
Am I overlooking something? It seems like I have done everything as required.
Thanks,
Jeff