Is there a way to instantiate a ConvLearner without specifying a DataBunch or downloading the pre-trained backbone, so that we can simply load previously saved weights into it?
For example, I’m imagining something like this is possible …
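Something along these lines, say (purely hypothetical code, not actual fastai API - the point is that no DataBunch is needed and no weight download is triggered):

```python
# hypothetical API sketch -- imagining a ConvLearner with no DataBunch
learn = ConvLearner(arch=models.resnet34, pretrained=False)
learn.load('my-trained-weights')   # weights saved earlier with learn.save()
pred = learn.predict(img)          # single-image prediction
```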
In a production environment where I may want to do single-image predictions, I don’t want to have to download models.resnet34, since I already have fully trained model weights at my disposal.
EDIT: Now that I read it over again, I realize this is definitely not the answer you were looking for. I’d like to know as well, what the library usage would be for loading up and using a pretrained model without a databunch object.
Somebody please correct me if I’m wrong.
Since resnet34 is imported directly from torchvision, I doubt you could easily change the weight URLs for it. And it’s probably not something you want to mess around with anyway.
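For what it’s worth, in the torchvision releases of this era the download URL just sits in a module-level dict, so it can technically be repointed (untested, and probably unwise):

```python
from torchvision.models import resnet

# torchvision keeps its pretrained-weight URLs in a plain dict
print(resnet.model_urls['resnet34'])
# overriding the entry would redirect the download -- probably not worth it:
# resnet.model_urls['resnet34'] = 'https://my-server/resnet34.pth'
```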
Looking at the code, the only interactions with the learner object are to:

- get the transforms on the valid_ds and valid_dl
- get the model
- get the callbacks
So I’m assuming one can just create a placeholder DataBunch for the code to work properly, without necessarily providing any useful training/validation data. After initializing the model with the transforms that were used during training, one should be able to get predictions.
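An untested sketch of that placeholder idea (the imports and constructor are fastai v1-era assumptions; the file names are fakes that never get read):

```python
from fastai import *
from fastai.vision import *  # fastai v1-era imports, an assumption

# fake file names whose pattern encodes the labels; nothing is read from disk yet
fnames = ['/tmp/labelA_1.jpg', '/tmp/labelB_1.jpg']
data = ImageDataBunch.from_name_re(
    Path('/tmp'), fnames, r'/([^/]+)_\d+.jpg$',
    ds_tfms=get_transforms(), size=224)  # must match the transforms/size used in training
```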
I haven’t tried this out yet, but let us know if you dig further into it.
Still, it would be nice to create a ConvLearner without having to specify a DataBunch or download the weights for whatever resnet model you are using (since we are going to be loading our own trained weights). I really don’t want to deploy this to production and have the resnet weights downloaded first, as happens when you run ConvLearner(data, models.resnet34, ...).
It would be nice if we could get an empty learner, if you will … one with the network architecture in place but not the weights.
If you follow the torchvision link above, you’ll see that it’s possible to create your own arch+weight combination. resnet34 is just a variation of ResNet, and the weights are available on the internet.
You should be able to build a wgnet34 and have the weights available on the filesystem if you prefer.
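As a sketch of that idea: torchvision’s resnet34 is just ResNet(BasicBlock, [3, 4, 6, 3]) plus a weight download, so a local-weights “wgnet34” could look like this (the file name is a placeholder):

```python
import torch
from torchvision.models.resnet import ResNet, BasicBlock

def wgnet34(pretrained=False, **kwargs):
    """Same architecture as resnet34, but weights come from local disk."""
    model = ResNet(BasicBlock, [3, 4, 6, 3], **kwargs)
    if pretrained:
        # load previously saved weights instead of downloading them
        model.load_state_dict(torch.load('wgnet34.pth', map_location='cpu'))
    return model
```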
I do agree that creating a ...DataBunch feels a bit roundabout when your aim isn’t to train/validate anymore. But all the other attributes (valid_ds, valid_dl) needed for prediction are still referenced via that class, at least for now.
Or maybe @jeremy can chime in, and tell about this magical feature somewhere in the library that we haven’t run into yet.
However, I’d love to be able to use that Image.predict(img, learn), which requires a working Learner, which in turn currently seems to require a valid DataBunch.
Super that you have it working. Yeah, it probably would have been better to keep the dataloaders and transform objects separate in the library design, so we could easily reuse the transforms for post-training use cases.
Maybe there’s a better way to do this in the library, and we are just talking a roundabout way for a one-liner.
But if not, then since the predict function is rather small, we could (see the sketch below):

- save the required Python objects to the filesystem using pickle, and load them back in production
- write a custom predict function that uses those objects
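A rough, untested sketch of that two-step idea (here `data` is the training-time DataBunch, and the attribute names on the fastai v1 objects are assumptions):

```python
import pickle
import torch

# 1. at training time: persist whatever the predict path needs
state = {'classes': data.classes, 'tfms': data.valid_ds.tfms}  # attribute names assumed
with open('predict_state.pkl', 'wb') as f:
    pickle.dump(state, f)

# 2. in production: restore the objects and use a hand-rolled predict
with open('predict_state.pkl', 'rb') as f:
    state = pickle.load(f)

def predict(model, img):
    model.eval()
    # apply the saved validation transforms, then a plain forward pass
    x = img.apply_tfms(state['tfms'], size=224).data.unsqueeze(0)
    with torch.no_grad():
        probs = torch.softmax(model(x), dim=1)[0]
    return state['classes'][probs.argmax().item()]
```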
I’d love to be able to use fastai in production as well, sometime this/next year. So, this would be very useful.
My goal is to train a model on an expensive GPU machine, save that model to disk (as a *.pth file), then load up the saved model on a much cheaper machine and use it to classify new images using the CPU rather than the GPU.
If I can do this it will be trivial to deploy my models to a production API somewhere that I can then evaluate new images against.
The problem is… I need to be able to instantiate a learner such that I can run the following:
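Roughly this sort of thing (my reconstruction of the shape of the snippet, not the original, using the Image.predict(img, learn) call mentioned above):

```python
learn.load('my-model')              # the *.pth file trained on the GPU machine
img = open_image('new_image.jpg')   # a single new image; CPU is fine
preds = Image.predict(img, learn)   # single-image prediction
```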
I can’t figure out how to instantiate the learner. I have 600MB of images, and I don’t want to have to deploy them along with that .pth file just so I can instantiate an ImageDataBunch (which I need to pass to the ConvLearner constructor).
I didn’t quite expect people to be asking to do this right after lesson 1, so apologies, I’m a little unprepared. Let me figure out the best way to handle this…
I’d be happy with a simple way to classify new images on any hardware - expensive GPU or otherwise.
My test case for lesson one is facial recognition - specifically telling apart family members. In my experience with cell phones that use facial recognition to unlock, the phone’s facial recognition model can’t tell the difference between siblings, or between parents and their children…
I basically had to trick the ImageDataBunch constructor by feeding it some fake filenames, but I appear to have managed to deploy a pre-calculated model on a non-GPU machine!
Here’s the rough outline I used. I had to hard-code in my list of labels:
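Something like this (a sketch of the outline rather than the exact code; the labels, paths, regex, and file names are stand-ins, and the weights are explicitly mapped onto the CPU):

```python
from fastai import *
from fastai.vision import *
import torch

classes = ['class_a', 'class_b', 'class_c']  # my hard-coded label list (placeholder values)
# fake file names whose pattern encodes the labels -- the files are never opened
fnames = ['/{}_1.jpg'.format(c) for c in classes]
data = ImageDataBunch.from_name_re(
    Path('/tmp'), fnames, r'/([^/]+)_\d+.jpg$',
    ds_tfms=get_transforms(), size=224)

learn = ConvLearner(data, models.resnet34)  # still triggers the resnet34 download (see note below)
learn.model.load_state_dict(
    torch.load('my-model.pth', map_location='cpu'))  # my trained weights, forced onto the CPU
```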
Note that I’m loading models.resnet34 (which means downloading it the first time the code is imported) even though I don’t think it’s actually needed since I’m using the model from disk. I couldn’t figure out how to call the ConvLearner constructor without it.
And here’s the (crazy simple at the moment) API I’ve deployed using this technique: https://cougar-or-not.now.sh/form - upload a photo of a cougar, house cat or bobcat and it will attempt to tell you which one it is.
The underlying code (and pre-calculated model) is available here: https://github.com/simonw/cougar-or-not - including a Dockerfile which builds a container which can then be deployed to a hosting provider such as Zeit Now - that’s how I’m hosting the https://cougar-or-not.now.sh/form deployment.
The nice thing about this way of hosting models is that it’s essentially free: Zeit Now only spins up a server when a request comes in (kind of like AWS Lambda) and their pricing model is extremely generous. It wouldn’t work for giant model files (larger than a few hundred MB) but the 83MB models I got from building on models.resnet34 fit just fine.
Nice workaround! The data object will never complain about missing files unless you try to load them (by asking for data.train_ds[some_index]).
In the future we’ll definitely add something to the library to make this easier, especially with all the new functionality in pytorch v1 for putting models into production.
Yeah, this is a VERY hacky initial setup. It only works if you visit https://cougar-or-not.now.sh/form and submit the image there - any other URL (including / ) currently throws an error.