Productionizing models thread


Has anyone tried models which have hooks? It didn’t work for me with a U-Net because apparently it’s not supported in the current PyTorch version.

(vijaysai) #25

I’m not entirely sure if I can post this here. I’ll move it if it belongs in the Class 1 chat.

The work on your repo putting the model into production was really good. I was following your repo, trying to run inference with my model from Class 1 (on my CPU, after downloading the model from Colab). I have two classes:

  1. Driving Licence

  2. Pan Card

I’m a bit confused about how the order of the names in the fnames list affects the results.

Order 1

dl_pan_fnames = [
    ...
    for c in [
        "Driving Licence",
        "Pan Card",
    ]
]
dl_pan_learner = ConvLearner(dl_pan_data, models.resnet34)
torch.load("dl_or_pan.pth", map_location="cpu")


Order 2

dl_pan_images_path = Path("/tmp")
dl_pan_fnames = [
    ...
    for c in [
        "Pan Card",
        "Driving Licence",
    ]
]
dl_pan_learner = ConvLearner(dl_pan_data, models.resnet34)
torch.load("dl_or_pan.pth", map_location="cpu")


For the image I was using, Pan Card is the correct answer. The data.classes during training were as below.


Are the losses printed according to the order of the data.classes?

Also, I’m used to classifier results that finally add up to 1. I understand that we’re calculating the losses here. So what does a strong negative value imply vs. a strong positive one?
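For reference, here’s a toy example of the kind of output I’m used to, where a softmax turns raw scores into probabilities that add up to 1 (the numbers here are made up):

```python
import math

def softmax(scores):
    """Convert raw class scores (logits) into probabilities that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up raw scores: a strong positive value favours that class,
# a strong negative value disfavours it.
scores = [4.2, -3.1]  # e.g. ["Pan Card", "Driving Licence"]
probs = softmax(scores)
print(probs)  # the first class gets almost all of the probability mass
```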

I noticed that your cougar model was 83 MB and my model is also 83 MB. Do all resnet34 models have the same size?
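My back-of-the-envelope guess for why the sizes match (assuming resnet34’s roughly 21.8 million parameters, stored as 4-byte floats — the parameter count is fixed by the architecture, not by the training data):

```python
# Rough size estimate for a resnet34 checkpoint: the parameter count is fixed
# by the architecture, so any two resnet34 models have near-identical file sizes.
params = 21_800_000        # approximate parameter count of resnet34
bytes_per_param = 4        # float32
size_mib = params * bytes_per_param / 2**20
print(f"{size_mib:.0f} MiB")  # ≈ 83 MiB
```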

(vijaysai) #26

Yes Jeremy, this is doable and pretty easy. I was trying to do this with the free account but had issues with space. The free account comes with 512 MB of storage, and the quota is consumed during the fastai installation itself. Most of the fastai prerequisites are pre-installed. I tried installing individual libraries, but could not get past spacy; it ends with a quota error.

So a Hacker account ($5/month), which comes with 1 GB of storage, should be sufficient to start the work. I also noticed that a lot of libraries are pre-installed, including tensorflow, so we run into conflicts if we try to install fastai globally.

I think virtualenv will be the way to go. But if there were a variant that comes with fastai pre-installed, like Paperspace, even the free account would be enough to start with. Will keep you posted if I can get fastai working on the free tier.
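For anyone trying the same, the virtualenv route I have in mind is roughly this setup fragment (untested on the free tier; the environment path is just an example):

```shell
# Create an isolated environment so the pre-installed tensorflow etc. don't conflict
python3 -m venv ~/fastai-env
source ~/fastai-env/bin/activate
pip install --upgrade pip
pip install fastai   # pulls in torch, spacy, and the other prerequisites
```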

(Piotr Żelasko) #27

Hello Jeremy and community!

I’ve been working a bit with ULMFiT and I’m curious what is the preferred way to use the resulting LM or classifier model for inference. From what I’ve gathered, the preprocessing pipeline (tokenizer, vocabulary selection, numericalization) is a part of the DataBunch object, but I’m not yet sure how to extract all this into some kind of inference function which @devforfu mentioned.

I guess my ultimate goal/wish would be to perform something like:

data = ['sentence 1', 'sentence 2', 'yet another sentence', ...]
predictions = inference(data)

which probably is a common use case.

Any suggestions which components are required to build a function like this?
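To make the wish concrete, here is a stand-in sketch of the shape I’m imagining — the tokenizer, vocabulary, and model below are toy placeholders, not fastai’s actual components:

```python
def tokenize(sentence):
    # stand-in for fastai's Tokenizer
    return sentence.lower().split()

def numericalize(tokens, stoi, unk=0):
    # stand-in for the vocabulary lookup (stoi/itos saved from training)
    return [stoi.get(t, unk) for t in tokens]

def inference(data, stoi, model):
    """Apply the saved preprocessing pipeline, then the model, per sentence."""
    preds = []
    for sentence in data:
        ids = numericalize(tokenize(sentence), stoi)
        preds.append(model(ids))
    return preds

# Toy "model": predicts positive if the token id for "good" appears
stoi = {"good": 1, "bad": 2}
model = lambda ids: "pos" if 1 in ids else "neg"
print(inference(["Good movie", "bad movie"], stoi, model))  # ['pos', 'neg']
```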

Also, another thing - I’ve tried to move the model to cpu by doing:

data_lm = TextLMDataBunch.from_csv(path)
learn = RNNLearner.language_model(...)

X, y = next(iter(data_lm.valid_dl))

cpu_model = learn.model.cpu()
preds = cpu_model(X.cpu())

but the last line failed with RuntimeError: Expected object of backend CPU but got backend CUDA for argument #4 'mat1' - do you have any suggestions what steps are needed to amend this?
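In plain PyTorch the pattern below works for me, so I suspect some cached tensor (e.g. the RNN’s stored hidden state) is still on the GPU after `learn.model.cpu()` — for fastai’s RNN models, calling `learn.model.reset()` after moving should re-create that state on the new device. A minimal sketch with a plain `nn.RNN` rather than fastai’s model:

```python
import torch
import torch.nn as nn

model = nn.RNN(input_size=8, hidden_size=16, batch_first=True)

# Move BOTH the model and every input tensor to the same device.
device = torch.device("cpu")
model = model.to(device)
x = torch.randn(2, 5, 8).to(device)

# Any cached state (e.g. an RNN hidden state kept between batches) must be
# re-created on the new device too, or you get the CPU/CUDA backend mismatch.
h0 = torch.zeros(1, 2, 16, device=device)
out, hn = model(x, h0)
print(out.shape)  # torch.Size([2, 5, 16])
```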

Thank you!

(Johannes Lackner) #28

Motivated by @simonw’s great “cougar or not” webapp featured in lesson 2 Part 1 v3, I tried to set up a “Starlette” app on pythonanywhere, but that does not seem to be possible on that platform.
Sadly, pythonanywhere doesn’t support ASGI, so no “asyncio”.

fastai-1.0 and pytorch-1.0 installation worked for me with a paid “Hacker” account - I can confirm @vijaysai’s observation that the free account disk space on pythonanywhere is insufficient.

(Henri Palacci) #29

Hi everyone,

I played a little more with the solutions discussed in this thread and have a webapp up and running; see also the repo.

Some observations:

  • I didn’t spend a lot of time optimizing my DL Docker image (it ended up being close to 1 GB). That said, even if it were smaller, I found having to deploy an image that includes PyTorch super cumbersome; I had to rebuild it a bunch of times and wished it were smaller and more nimble. Even for “exploration”, having to wait 10 minutes for a build was not fun. I’m going to try to deploy something C++-based and see how that goes.
  • The serverless solutions discussed here (like now) are really cool, but somewhat unsuited to deploying large Docker images. Even if there’s a workaround to deploy a 700 MB image to the free tier right now, looking at the pricing page doesn’t make me feel comfortable that this is a “safe” solution. I ended up going with docker-compose and three containers on a DO droplet.
  • The deployment experience is pretty rough in general, and it feels like it shouldn’t be. Even though now looks super easy, if you want to deploy a real webapp there’s still a lot of stuff to take care of (CORS, local vs. deployed envs, port configs, non-standard APIs, Docker caches, …)

Will continue working on this and report back!

(Nissan Dookeran) #30

Changes needed to get it working with the updated fastai library:

  1. In the import, create_cnn should replace ConvLearner
  2. cat_learner = create_cnn(cat_data, models.resnet34) should replace cat_learner = ConvLearner(cat_data, models.resnet34)
  3. pred_class,pred_idx,losses = cat_learner.predict(img) should replace losses = img.predict(cat_learner)

@simonw, PRs for this were submitted by me and another member, if you want to update the repo. Thanks for publishing it.

(Charles Twardy) #31

[Edited after discovering my mistake. -crt]

I was having a problem adapting 104c_single_image_pred.ipynb to my data: the original model performed well, but the loaded model performed badly, always predicting a single class.

The issue was that I had mismatched .normalize(): I forgot it on save, but used it on load. Matching them up solves the issue. I can now use single_from_classes and predict as recommended above.
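A toy illustration of why the mismatch pushes everything into one class: the model learned on normalized inputs, so applying normalization on only one side of save/load shifts every input toward the same side of the decision boundary (all numbers made up):

```python
def normalize(x, mean=0.5, std=0.25):
    # stand-in for fastai's .normalize(imagenet_stats)
    return (x - mean) / std

def toy_classifier(x):
    # "trained" on normalized inputs: decision boundary at 0
    return "class A" if x > 0 else "class B"

raw_pixels = [0.1, 0.4, 0.6, 0.9]

# Matching preprocessing: both classes show up
matched = [toy_classifier(normalize(p)) for p in raw_pixels]
print(matched)     # ['class B', 'class B', 'class A', 'class A']

# Mismatched preprocessing (normalization skipped): one class dominates
mismatched = [toy_classifier(p) for p in raw_pixels]
print(mismatched)  # ['class A', 'class A', 'class A', 'class A']
```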

(Haider Alwasiti) #32

Deploying on Zeit

The tutorial shown by Jeremy is absolutely fantastic and easy.

Here is my toy example:

I used resnet50 (image size 299) instead of Jeremy’s resnet34 (224) example, trained on 200 images for each of the classes ['baby', 'boy', 'man', 'woman'].

It could do better with a larger number of training images, but I was delighted that now, thanks to Jeremy, I have a way to deploy my future DL projects.



I used the Deploying on Zeit guide.

When I run

now scale $ sfo 1

I get an error: Error! Cannot scale a deployment containing builds

(Will) #34

I get this same error following the tutorial. I tried deleting the directory and starting fresh and got the same error. I don’t know anything about web deployments or apps, or Docker for that matter; I spend my time doing data-science-related coding, so I’m flailing a bit.

(base) will@DaltonAI:~/zeit$ now ls
> 1 total deployment found under [249ms]
> To list more deployments for an app run `now ls [app]`

app     url                      inst #    type    state    age
  zeit         0    -       READY    9m

(base) will@DaltonAI:~/zeit$ now scale $ sfo 1
> Fetched deployment "" [187ms]
> Error! Cannot scale a deployment containing builds

(Navjot) #35

@vims11jan, @whamp, noting:

  • this last now scale $ sfo 1 command is only for scaling the deployment such that it never sleeps
  • the url you alias-ed to before this step, should still be working if you try

(Without the scale, if the webapp is not used for some time, the deployment goes to sleep and wakes up when requested; it’s just that waking up takes a few seconds and is not the best user experience.)

That said, the scaling should have still worked - @arunoda any ideas what might have happened?

(Arunoda Susiripala) #36

Hey, I’m not exactly sure about the question.

now scale $ sfo 1 should be a one-time command. Here we set the scaling rule on the alias,
so the aliased deployment always has a single instance running at all times.

All other past deployments have a 0-1 scale rule; basically, they’ll sleep after some inactive time.

(Will) #37

Going to that URL gives me a directory listing:

(Navjot) #38

@arunoda: The question was related to the above error message, which two users in this thread got yesterday.

(Arunoda Susiripala) #39

cc @whamp

Add this to your now.json file.

    "features": {
        "cloud": "v1"
    "version": 1,
    "type": "docker"

Then it’ll work as expected.
I’ve also changed the production guide based on these:

(Navjot) #40

@arunoda: Has this come up because of yesterday’s announcement/updates around v2?

(Arunoda Susiripala) #41

Yes. All new accounts are automatically set to the v2 API.

(nok) #42

I tried following the doc to install npm and now, but it did not work. I had to install npm following the Node.js docs and install now with the --unsafe-perm option.

(Arunoda Susiripala) #43

Hmm. You don’t need Node.js or NPM.
Simply download Now Desktop from here:
It’ll download the CLI automatically.

Or you can download the CLI manually too: