Deploying models in production

(alenavkruchkova) #1

@henry.humadi graciously provided me with the following solution:
“Decouple your training from prediction. That is, you can train your model wherever you want, then pickle it and save it on S3. Then consume it in your production environment.
This way you can set up the machine however you want without affecting your production environment.”
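
For anyone reading along, here is a minimal sketch of the pickle round trip described above. It makes some assumptions that are not from the original advice: a plain dict stands in for the trained model, the names `save_model`, `load_model` and `predict` are hypothetical, and a local temp file stands in for the S3 object (the upload and download steps are left out).

```python
import os
import pickle
import tempfile


def save_model(model, path):
    # Training side: serialize the trained model to a file.
    with open(path, "wb") as f:
        pickle.dump(model, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_model(path):
    # Production side: deserialize the model.
    # Only unpickle files that come from a source you trust.
    with open(path, "rb") as f:
        return pickle.load(f)


def predict(model, features):
    # Hypothetical prediction: a weighted sum of the input features.
    return sum(w * x for w, x in zip(model["weights"], features))


# A dict stands in for a real trained model (assumption for illustration).
model = {"weights": [0.5, 2.0]}

# The local path stands in for the S3 object.
model_path = os.path.join(tempfile.mkdtemp(), "model.pkl")
save_model(model, model_path)

restored = load_model(model_path)
prediction = predict(restored, [2.0, 3.0])
print(prediction)
```

The training machine only needs to produce the file, and the production machine only needs to read it, so the two environments can be configured independently.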

The model I am using is very large (about 2GB), and I am concerned about prediction speed.

I would welcome any suggestions.

(Jeremy Howard) #2

The size need not have any connection to the speed. Try it and see how it goes, and let us know if you run into any issues in practice.

(alenavkruchkova) #3

Got a similar response from @henry.humadi:

The model is big, and it will live in memory at prediction time. Its size isn’t that alarming, to be honest.

Just make sure that you have enough memory left for other tasks.

Prediction performance is a CPU-bound problem and depends on the model’s complexity rather than on the size of the model itself.
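
One way to "try it and see" is to time the one-off load separately from the per-prediction cost. The sketch below is only an illustration under stated assumptions: `load_model` and `predict` are hypothetical stand-ins (a list of floats plays the model), and `best_time` is a small helper written for this example, not part of any library.

```python
import time


def best_time(fn, *args, repeats=5):
    # Run fn several times and return the fastest wall-clock time in seconds.
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best


def load_model(n_weights):
    # Hypothetical loader: builds a list of weights in memory.
    return [0.001 * i for i in range(n_weights)]


def predict(model, features):
    # Hypothetical prediction: one pass over the weights.
    return sum(w * x for w, x in zip(model, features))


n = 200_000
model = load_model(n)
features = [1.0] * n

load_seconds = best_time(load_model, n)
predict_seconds = best_time(predict, model, features)

print(f"load:    {load_seconds:.4f} s (paid once if the model stays in memory)")
print(f"predict: {predict_seconds:.4f} s (paid on every request)")
```

With a real model, the same two measurements show whether the cost that matters is the load or the computation per request.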

(janardhanp22) #4

It would be great to hear the entire step-by-step process for putting a machine learning model into production. @alenavkruchkova @jeremy

(Jeremy Howard) #5

We’ll be looking at productionizing in part 2!

(alenavkruchkova) #6

We have not yet covered this in class, but I’m at a point where I’m trying to save a TF image classification model and deploy it to a server that will act as an API server. The problem I’m finding is that the default saver classes provided by TF reinitialize (and/or reload) the graph on basically every request. Is there a standard method for deploying these models so that there’s no overhead from reloading the model each time?

(Jeremy Howard) #7

Have you tried looking at ?

Regarding the specific issue you mentioned - can’t you just keep the graph in memory? What’s causing it to reinitialize, and what is making it hard to avoid that?
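
To illustrate what keeping it in memory could look like, here is a framework-agnostic sketch, not TF-specific code. The assumptions: `get_model` is a hypothetical loader wrapped in `functools.lru_cache` so it runs only once per process, a dict stands in for the loaded graph, and `handle_request` is a hypothetical request handler.

```python
import functools

# Counts how often the expensive load actually runs (for illustration only).
stats = {"loads": 0}


@functools.lru_cache(maxsize=None)
def get_model():
    # Hypothetical expensive load; with a real framework this is where the
    # saved graph and weights would be restored.
    stats["loads"] += 1
    return {"weights": [0.5, 2.0]}


def handle_request(features):
    # Every request reuses the cached model instead of reloading it.
    model = get_model()
    return sum(w * x for w, x in zip(model["weights"], features))


results = [handle_request([float(i), 1.0]) for i in range(3)]
print(results)
print("model loads:", stats["loads"])
```

The same idea applies to a long-running server process: load the model once at startup (or on first use) and keep a reference to it, rather than restoring it inside the request handler.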


I am trying to build a simple production model (using an Android app) and am following this link
Has anyone tried this before? I am finding it difficult to set up Bazel and the paths for the Android SDK as described in the document.

Steps I completed so far:

  1. Installed Bazel on Ubuntu 16.04 (recommended method)
  2. Downloaded the Android NDK
  3. Downloaded the Android SDK as per the Git page

I am missing the next step though.


I’ve done this. Try building jniLibs/armeabi-v7a/, then integrate tensorflow/contrib/android/ into your project.