Are the libraries in your requirements.txt in your repo the same versions as the Gradient/Jupyter notebook libraries you trained your model on?
If not, the model generally does not work. Search this thread for ‘pip list’ and change the version numbers in requirements.txt in your repo to match the Gradient/Jupyter notebook versions.
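One way to read off the exact installed versions so you can pin them is the stdlib importlib.metadata module. This is only a sketch: the package names below are placeholders, substitute the ones your repo actually uses (e.g. torch, torchvision, fastai).

```python
# Print the installed version of each package in requirements.txt-style
# "name==version" form, so you can copy the numbers straight into the file.
from importlib.metadata import version, PackageNotFoundError

for pkg in ['pip', 'setuptools']:  # example names; use your app's packages
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg} is not installed")
```

Running this inside the Gradient/Jupyter notebook gives you the training-side versions to pin in the repo.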
Also, I would recommend just creating a virtual environment on your desktop and getting the model running there, as it's much easier to debug.
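A minimal sketch of that, using the stdlib venv module (the command-line equivalent is `python -m venv .venv` followed by activating it and `pip install -r requirements.txt`); the directory names here are just examples:

```python
# Create an isolated environment programmatically with the stdlib venv module.
import os
import tempfile
import venv

env_dir = os.path.join(tempfile.mkdtemp(), 'appenv')
venv.create(env_dir, with_pip=False)  # with_pip=True if you want pip available inside it

# A freshly created environment always contains a pyvenv.cfg marker file.
print(os.path.isfile(os.path.join(env_dir, 'pyvenv.cfg')))  # → True
```

Once the environment is active, install the pinned requirements there and run the app locally before touching render.com.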
Thanks! I’m confused, though, because there is not a one-to-one correspondence between the contents of !pip list from Gradient/Jupyter and what’s in the GitHub repo now. In fact, there are certain entries in the GitHub repo's requirements file that are nowhere to be found in the pip list.
No need to be confused: Python on Gradient comes with many libraries preinstalled, so whenever a newbie or an experienced person logs in, they do not have to install every library themselves.
You only need to be concerned with the libraries in the repo, plus any additional libraries your app uses. For example, your repo has incorrect versions for the two libraries below.
As mentioned in other posts, you need to change all the repo library versions to match the Gradient versions you are using. Replace the two PyTorch libraries, as they are not the versions you trained your model on in Gradient (see your pip list). Also change any other libraries that do not match.
If your model is old, I would train it again, as I consider the pip list and the model (.pkl file) a pair. Generate the two together, just in case the library owners have made changes.
What problem did you have when you tried a standalone virtual env?
Problems are good, which is why Jeremy wants us to go from model to application: it makes you learn loads of stuff. I had about forty problems on my first classifier, from docker.com to github.com, and it took about 2-4 weeks from watching the videos to deployment. Lol!
Now I can deploy a simple classifier in about 15 - 30 mins.
However, using these two links I can get the bear classifier to work but not the jay bird classifier; the jay bird classifier gives the following error: _pickle.UnpicklingError: invalid load key, ‘<’.
The error above nearly always means there is a problem with the share link, the model, or the network the model is on. In this case the model works if I download it to the app.
I suggest you check the Dropbox share link, use the Google share link, or, if you have enough space, use the model locally by saving it to the app directory.
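The ‘<’ in that UnpicklingError is itself a clue: ‘<’ is the first character of an HTML page, so the "model" that was downloaded is almost certainly a sharing or error page rather than the raw .pkl bytes. A quick check you can run on the downloaded file (the filenames below are throwaway examples, not real models):

```python
# Peek at the first bytes of the downloaded file: a real fastai/torch export is
# binary (typically starting with b'PK' or b'\x80'), while a sharing or error
# page starts with '<' -- exactly the load key the UnpicklingError complains about.
def looks_like_html(path):
    with open(path, 'rb') as f:
        head = f.read(64).lstrip()
    return head.startswith(b'<')

# Throwaway files standing in for a good and a bad download:
with open('good.pkl', 'wb') as f:
    f.write(b'\x80\x02not-a-real-model')
with open('bad.pkl', 'wb') as f:
    f.write(b'<!DOCTYPE html><html>...</html>')

print(looks_like_html('good.pkl'))  # → False
print(looks_like_html('bad.pkl'))   # → True
```

If the check returns True for your downloaded file, the share link is serving a web page instead of the file, so fix the link before debugging anything else.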
Thank you so much for taking all this time to troubleshoot. I tried moving the .pkl file to Google Drive and using a universal sharing link, but I got pretty much the same error:
File "/usr/local/lib/python3.7/site-packages/fastai/basic_train.py", line 618, in load_learner
  state = torch.load(source, map_location='cpu') if defaults.device == torch.device('cpu') else torch.load(source)
File "/usr/local/lib/python3.7/site-packages/torch/serialization.py", line 529, in load
  return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/usr/local/lib/python3.7/site-packages/torch/serialization.py", line 692, in _legacy_load
  magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '<'.
ERROR: executor failed running [/bin/sh -c python app/server.py]: buildkit-runc did not terminate successfully
------
> [6/6] RUN python app/server.py:
------
error: failed to solve: rpc error: code = Unknown desc = executor failed running [/bin/sh -c python app/server.py]: buildkit-runc did not terminate successfully
Since I have not yet been able to set up my virtual environment, and since GitHub won't allow me to upload a file larger than 25MB, I seem to still be stuck on this one.
Thanks again for all your help, even though it hasn't solved my problem. I'm moving on to the next lesson, and maybe when I figure out how to get my virtual environment set up I will come back to this.
I did try to upload the .pkl file to the app directory on GitHub, but it won't let me upload anything bigger than 25MB and the file is 80MB. I even tried upgrading my GitHub account, but the 25MB limit remains.
I’m giving up for now. But I do think that Fast.ai should remove Render as a recommended platform for this purpose, don’t you?
Hi hodanajan, hope all is well!
I would suggest you get the app working on your desktop first, then deploy it on render.com.
Unfortunately, in some cases the errors on render.com do not reflect the actual problem very well.
I’ve just been struggling with this exact same problem as you two (@LessW2020) for the entire day: _pickle.UnpicklingError: invalid load key, ‘<’.
I was using Google Colab to train the model, and strangely, I think the issue was that I right-click downloaded the exported .pkl file from Colab and then re-uploaded it to Google Drive from my desktop. When I dragged and dropped the file directly into the mounted Google Drive folder using the Colab interface and shared that file afterwards, the render build worked.
Perhaps downloading/uploading to Google Drive in that way corrupted the pickle file, which gave this error.
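One way to confirm that kind of corruption is to compare checksums of the original export and the copy that made the round trip: if the hashes differ, the bytes were altered in transit. A small sketch (the filenames in the usage comment are examples):

```python
# Hash a file in chunks so even an 80MB model is cheap to check.
import hashlib

def sha256_of(path):
    h = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(8192), b''):
            h.update(chunk)
    return h.hexdigest()

# Usage sketch:
#   sha256_of('export.pkl') == sha256_of('export_redownloaded.pkl')
# True  -> the copies are byte-identical
# False -> the upload/download path changed the file
```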
Thanks Mike! Do you mind sharing your Python file code? I tried Colab but I couldn’t figure out the proper file paths. (And I assume you put your .pkl file in the Colab “sample_data” folder?)
Which Python code do you need, from Colab or the render deployment? In Colab I just exported the learner without giving it a specific name using learn.export(), which places the “export.pkl” file in the working directory (by default the /content folder). Then I dragged and dropped the “export.pkl” file from there to the folder I wanted in my Google Drive, which I had already mounted in the normal way, as below (i.e. to the project folder I had already created in Google Drive):
from google.colab import drive
drive.mount('/content/drive')
In this case yes, as I’m using fastai-v2. So to get the render deployment working (in addition to the drag-and-drop of export.pkl into Google Drive within Colab, rather than downloading, as mentioned above), I updated the files as follows.
requirements.txt - dependencies must match those in the Colab environment you trained the model in, as stated earlier in the thread by others. In my case:
server.py - a few small changes to the imported libraries and the prediction function (in addition to the Google Drive direct download URL, the labels, and the “export.pkl” filename, as necessary):
import aiohttp
import asyncio
import os   # imported explicitly rather than relying on fastai's star imports
import sys  # used below to check the 'serve' argument
import uvicorn
from fastai2 import *
from fastai2.vision.all import *
from io import BytesIO
from starlette.applications import Starlette
from starlette.middleware.cors import CORSMiddleware
from starlette.responses import HTMLResponse, JSONResponse
from starlette.staticfiles import StaticFiles

export_file_url = 'GOOGLE DRIVE URL'
export_file_name = 'export.pkl'

classes = ['YOURLABEL1', 'YOURLABEL2', 'ETC.']
path = Path(__file__).parent

app = Starlette()
app.add_middleware(CORSMiddleware, allow_origins=['*'], allow_headers=['X-Requested-With', 'Content-Type'])
app.mount('/static', StaticFiles(directory='app/static'))


async def download_file(url, dest):
    if dest.exists():
        return
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            data = await response.read()
            with open(dest, 'wb') as f:
                f.write(data)


async def setup_learner():
    await download_file(export_file_url, path / export_file_name)
    try:
        print("File exists?:", os.path.exists(path / export_file_name))
        learn = load_learner(path / export_file_name)
        return learn
    except RuntimeError as e:
        if len(e.args) > 0 and 'CPU-only machine' in e.args[0]:
            print(e)
            message = "\n\nThis model was trained with an old version of fastai and will not work in a CPU environment.\n\nPlease update the fastai library in your training environment and export your model again.\n\nSee instructions for 'Returning to work' at https://course.fast.ai."
            raise RuntimeError(message)
        else:
            raise


loop = asyncio.get_event_loop()
tasks = [asyncio.ensure_future(setup_learner())]
learn = loop.run_until_complete(asyncio.gather(*tasks))[0]
loop.close()


@app.route('/')
async def homepage(request):
    html_file = path / 'view' / 'index.html'
    return HTMLResponse(html_file.open().read())


@app.route('/analyze', methods=['POST'])
async def analyze(request):
    img_data = await request.form()
    img_bytes = await (img_data['file'].read())
    pred = learn.predict(img_bytes)
    return JSONResponse({'result': str(pred[0])})


if __name__ == '__main__':
    if 'serve' in sys.argv:
        uvicorn.run(app=app, host='0.0.0.0', port=5000, log_level="info")