Deployment Platform: Render ✅

I was wondering something, but maybe it’s complete nonsense because of my lack of knowledge. It seems very inefficient to do a whole build with Render if just a few things were changed in the server.py file, for example. If there’s no change in requirements.txt or in the Dockerfile (or in other files that would require another build), wouldn’t it be possible to only update server.py (or other .py files) without doing another lengthy build?


Hello, my web app built with fastai v1 and the Render template files worked locally (I pip-installed starlette in my fastai v1 environment, fastai version 1.0.42) but failed to deploy on Render.com (see the error log below). Any advice?

Jan 30 07:45:31 PM  INFO[0292] COPY app app/
Jan 30 07:45:31 PM  INFO[0292] RUN python app/server.py
Jan 30 07:45:31 PM  INFO[0292] cmd: /bin/sh
Jan 30 07:45:31 PM  INFO[0292] args: [-c python app/server.py]
Jan 30 07:45:36 PM  Traceback (most recent call last):
  File "app/server.py", line 43, in <module>
    learn = loop.run_until_complete(asyncio.gather(*tasks))[0]
  File "/usr/local/lib/python3.7/asyncio/base_events.py", line 584, in run_until_complete
    return future.result()
  File "app/server.py", line 31, in setup_learner
    learn = load_learner(path, export_file_name)
  File "/usr/local/lib/python3.7/site-packages/fastai/basic_train.py", line 469, in load_learner
    state = torch.load(open(Path(path)/fname, 'rb'))
  File "/usr/local/lib/python3.7/site-packages/torch/serialization.py", line 367, in load
    return _load(f, map_location, pickle_module)
  File "/usr/local/lib/python3.7/site-packages/torch/serialization.py", line 528, in _load
    magic_number = pickle_module.load(f)
_pickle.UnpicklingError: invalid load key, '<'.
Jan 30 07:45:36 PM  error building image: error building stage: waiting for process to exit: exit status 1
Jan 30 07:45:36 PM  error: exit status 1

I think what’s happening is that torch checks that the object you are loading with torch.load (the method used by load_learner) was saved with torch.save (the method used by learner.export). So it seems the learner you are trying to load wasn’t saved with learner.export. Please try exporting the learner again and loading it back; I think it should work.
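To make this concrete, here is a minimal sketch (with made-up file names) of why a file that was never written by torch.save fails with exactly the error in the log above:

```python
import torch

# A file written by torch.save loads back fine with torch.load:
torch.save({'w': torch.ones(2)}, 'ok.pth')
print(torch.load('ok.pth'))

# A file torch.save never wrote -- for example an HTML page stored under a
# .pkl name -- makes the unpickler choke on its first byte:
with open('bogus.pkl', 'w') as f:
    f.write('<html>not a model</html>')
torch.load(open('bogus.pkl', 'rb'))  # _pickle.UnpicklingError: invalid load key, '<'
```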

Hello Pierre. Thanks for your answer, but my model was saved with learner.export().
Let me explain the steps I followed:

  1. My environment:

    python : 3.6.8
    fastai : 1.0.42
    fastprogress : 0.1.18
    torch : 1.0.0
    torch cuda : 9.0 / is available
    torch cudnn : 7005 / is enabled
    platform : Windows-10

  2. I created resnet34, resnet50, resnet101 and resnet152 learners (xxxx.pkl files) through learner.export() (the code is in my Jupyter notebook “Pretrained ImageNet Classifier with fastai v1”).
    The sizes of these pkl files are: 85 MB (resnet34), 97.8 MB (resnet50), 170 MB (resnet101), 230 MB (resnet152).

  3. I tested my 4 learners by loading them in the same notebook with load_learner() and then got predictions with learn.predict(). Everything worked fine.

  4. I installed the starlette server locally in my fastai environment (1.0.42).

  5. I tested my 4 resnet learners locally in a web app (using the latest version of the Render web app template) launched with python app/server.py serve. The web app worked well with 3 of the resnet learners (resnet34, resnet50 and resnet152) but not with resnet101 (see the error below).

  6. I deployed the same web app for each of my 4 learners, one by one, on the Render.com web service (following the Render online procedure), and only 2 of the resnet learners worked: resnet34 and resnet50. For resnet101 and resnet152, I got the same error in the Render terminal as I did for resnet101 in my local terminal.

Because of these differences in behavior between my local Jupyter notebook, my local web app and the web app on Render.com, it is hard to pin down the problem. Any idea how to solve my issue?

(fastai_v1) (...)>python app/server.py serve
Traceback (most recent call last):
  File "app/server.py", line 56, in <module>
    learn = loop.run_until_complete(asyncio.gather(*tasks))[0]
  File "(...)\Anaconda3\envs\fastai_v1\lib\asyncio\base_events.py", line 484, in run_until_complete
    return future.result()
  File "app/server.py", line 44, in setup_learner
    learn = load_learner(path, export_file_name)
  File "(...)\Anaconda3\envs\fastai_v1\lib\site-packages\fastai\basic_train.py", line 469, in load_learner
    state = torch.load(open(Path(path)/fname, 'rb'))
  File "(...)\Anaconda3\envs\fastai_v1\lib\site-packages\torch\serialization.py", line 367, in load
    return _load(f, map_location, pickle_module)
  File "(...)\Anaconda3\envs\fastai_v1\lib\site-packages\torch\serialization.py", line 528, in _load
    magic_number = pickle_module.load(f)
_pickle.UnpicklingError: invalid load key, '<'.

Very weird that it only happens for some of the networks… I don’t really see why that would be the case. I’ll try the same thing as you later today and report back.
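In the meantime, one thing worth checking (a rough sketch; the file names below are examples, so adjust them to wherever your app stores the exports): the invalid load key ‘<’ means the very first byte of the file is <, which is how an HTML page starts, not a pickled model.

```python
# Peek at the first bytes of each export file (names below are examples).
# A file written by torch.save starts with pickle bytes like b'\x80\x02';
# b'<!DOCTYPE html' or b'<html' means the file is actually a web page.
for name in ['resnet34.pkl', 'resnet50.pkl', 'resnet101.pkl', 'resnet152.pkl']:
    with open(name, 'rb') as f:
        print(name, f.read(16))
```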

Thanks Pierre. I appreciate it.

If you also have any idea how to adapt the css/js files to make the web app Apple-friendly, that would be great as well 🙂 (my post on that issue)

I’m afraid I’m way worse at web dev than I am at deep learning, and I’m still a beginner in deep learning, so you can imagine my level in web dev 😅 For what it’s worth, the example web app with bears works well on a MacBook, so I’m not sure it’s an Apple issue.

I guess this is a Safari version issue. My (old) MacBook runs Safari 8.0.8 (2015) and my (old) iPad is the same age (gulp) 🙂

I am also having the same error pop up. I forked the bears GitHub repo and updated it with a Google Drive link to my export (a resnet50 model). Maybe it is something related to the export? I did use learn.export(), though.

@pierreguillou and @abduissa: are you exporting the models on a Windows environment? What happens if you try to export them with Jupyter on Linux (like on Crestle)?

In the end I didn’t have time to run the tests today, sorry about that. I’m not sure I’ll be able to do it tomorrow, but I’ll try 🙂

I am exporting from Jupyter on my Paperspace Linux machine, with the latest version of fastai installed.

Can you verify that the download URL is the raw link? (Putting it in the browser address bar should start downloading the file.)

Also, are you able to run the server locally (with python app/server.py)?
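A quick way to check both the link and the file from Python (a rough sketch; the URL below is a placeholder for your direct-download link):

```python
import requests

# Placeholder URL: substitute your own direct-download link.
url = 'https://example.com/your-direct-download-link'
r = requests.get(url, stream=True)

print(r.status_code, r.headers.get('Content-Type'))  # should not be text/html
print(next(r.iter_content(16)))  # b'\x80\x02...' looks like a torch file; b'<' means HTML
```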

I can verify that the Google Drive link is the right one (it starts downloading immediately).

The weird thing is that the server will not run locally. I tried restoring the server.py file in my GitHub fork back to the bear vs. teddy classifier from the original repo, but that version also will not run locally or on Render.

Update to this: I deleted the repo and re-forked it. This time I didn’t change the name of the repo in my fork, and now it works great. Very cool service!

Wonder if forking the fastai template means I can’t rename the repo…?


I had updated fastai on my GCP VM instance and made sure it was on version 1.0.42. For some reason, I am still stuck with the same error. I even tried using a different server on Gradient instead of GCP. Again, I followed the Back to work instructions to update fastai to 1.0.42 and generated a new export.pkl.

I verified that the download link worked correctly, but again got the out-of-date error.

Here’s my notebook output on Gradient with fastai 1.0.42. Am I not using learn.export() correctly to output the pkl file needed for Render?
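For reference, as far as I can tell from the docs, the minimal export/reload round trip in fastai 1.0.42 looks like this (a sketch using the small MNIST sample instead of my actual data):

```python
from fastai.vision import *

# Round-trip sketch on the MNIST sample dataset:
path = untar_data(URLs.MNIST_SAMPLE)
data = ImageDataBunch.from_folder(path)
learn = create_cnn(data, models.resnet34)

learn.export()               # writes path/'export.pkl' via torch.save
learn2 = load_learner(path)  # default fname='export.pkl'; load_learner is what server.py calls
print(learn2.predict(data.valid_ds[0][0]))
```

If this round trip works in the notebook but the deployed app still reports the out-of-date error, then presumably the export call itself is fine.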

@anurag: yes. My computer runs Windows 10, and I use the Anaconda Prompt as my terminal.
My environment:

python : 3.6.8
fastai : 1.0.42
fastprogress : 0.1.18
torch : 1.0.0
torch cuda : 9.0 / is available
torch cudnn : 7005 / is enabled
platform : Windows-10

I do not use Jupyter on Linux.

It looks like there might still be some unresolved issues in fastai 1.0.42. See this comment and open issue: https://github.com/fastai/fastai/pull/1502#issuecomment-459165078

Thanks, @anurag. I will step away from this for a while and come back later then. It’s good to know that I am not going crazy. Haha!

You’re absolutely right! We’ve now enabled Docker caching, so it shouldn’t build the whole thing every time. This halved the build time for the example repo.
