@henry.humadi graciously provided me with the following solution:
“Decouple your training from the prediction. Meaning you can train your model wherever you want, then pickle your model and save it on S3. Then consume it in your production environment.
This way you can set up the machine however you want without affecting your production environment.”
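For what it's worth, here is a minimal sketch of that pattern, assuming boto3 credentials are already configured and using a hypothetical bucket name, object key, and a model object that is safe to pickle:

```python
# Sketch: train anywhere, pickle the fitted model, push it to S3,
# then pull it down and unpickle it once in the serving environment.
import pickle
import boto3

BUCKET = "my-models"          # hypothetical bucket name
KEY = "image-classifier.pkl"  # hypothetical object key

def save_model_to_s3(model):
    # Serialize the trained model and upload the bytes to S3.
    payload = pickle.dumps(model)
    boto3.client("s3").put_object(Bucket=BUCKET, Key=KEY, Body=payload)

def load_model_from_s3():
    # Download the pickled bytes once (e.g., at server start-up) and deserialize.
    body = boto3.client("s3").get_object(Bucket=BUCKET, Key=KEY)["Body"].read()
    return pickle.loads(body)
```

The key point is that `load_model_from_s3()` runs once when the serving process starts, not on every request.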
The model I am using is very large (about 2 GB), and I am concerned about prediction speed.
We have not yet covered this in class, but I’m at the point where I’m trying to save a TF image classification model and deploy it to a server that will act more like an API server. The problem I’m finding is that the default saver classes provided by TF reinitialize (and/or reload) the graph on essentially every request. Is there a standard method for deploying these models so that there’s no overhead cost of reloading a given model on each request?
Regarding the specific issue you mentioned - can’t you just keep the graph in memory? What’s causing it to reinitialize, and what is making it hard to avoid that?
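To illustrate keeping the graph in memory, here is a rough sketch of a Flask server that restores a TF 1.x checkpoint once at start-up and reuses the same session for every request. The checkpoint directory, tensor names, and request format are all placeholder assumptions, not something from the original post:

```python
# Sketch: load the graph and weights once when the process starts and keep the
# tf.Session alive between requests, so each request only pays for session.run().
import tensorflow as tf
from flask import Flask, request, jsonify

MODEL_DIR = "/opt/models/classifier"  # hypothetical checkpoint directory

# Restore the graph structure and weights a single time, at import/start-up.
graph = tf.Graph()
with graph.as_default():
    saver = tf.train.import_meta_graph(MODEL_DIR + "/model.meta")

sess = tf.Session(graph=graph)
saver.restore(sess, tf.train.latest_checkpoint(MODEL_DIR))

images = graph.get_tensor_by_name("input:0")         # placeholder tensor name
predictions = graph.get_tensor_by_name("softmax:0")  # placeholder tensor name

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    # Expect already-preprocessed pixel data; only run the forward pass here.
    batch = request.get_json()["images"]
    probs = sess.run(predictions, feed_dict={images: batch})
    return jsonify(probs.tolist())
```

If the graph is being rebuilt per request, it is usually because the load/restore code lives inside the request handler; hoisting it to module level (or to an app start-up hook) avoids that.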
I’ve done this: try building jniLibs/armeabi-v7a/libtensorflow_demo.so, then integrate tensorflow/contrib/android/TensorFlowImageClassifier.java into your project.