Multicategory Pets Problem

alex.larrimore · September 22, 2020, 4:46pm

I’m grappling with turning the original classification notebooks (Bears and Pet breeds) into multi-category blocks and getting correct outputs.

For bears, every image I download and try to get to work classifies as a teddy bear - I assume this is because my teddy bear images are so different that the model trains that as basically a not-black and not-grizzly category. My best guess at how to make this work is to include a no-bear category even though multi-classification normally can do that on its own. Am I making the wrong assumption here?

My real concern, however, is the pet breeds notebook. To create my own test data I found an image of a non-example, a basset hound image, and a beagle image all of which it identifies accurately. I edited my beagle and basset hound into the same image and it is not correctly identifying both of them in the combined image. Does anyone have an idea of what I am doing wrong? Here’s my notebook - https://colab.research.google.com/drive/1rpEEAshJhjN_-g2ccI7IAhwhu0a9mN_w?usp=sharing

Thanks in advance!

mccallionr · September 22, 2020, 6:38pm

Hi Alex! Can we have access to your nb?

alex.larrimore · September 22, 2020, 6:40pm

This link should work, I’ll update the bad one in my original post: https://colab.research.google.com/drive/1rpEEAshJhjN_-g2ccI7IAhwhu0a9mN_w?usp=sharing

mccallionr · September 22, 2020, 6:43pm

Thanks, working.

mccallionr · September 22, 2020, 7:35pm

My best guess is that you trained the model using images of single animals, and when you give it an image with two animals in it, it doesn’t assign probabilities that surpass your 0.5 thresh.
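To make the threshold point concrete, here is a minimal sketch in plain Python. The logit values and the three-label vocab are made up for illustration and are not taken from the notebook; the only thing carried over from the thread is the 0.5 threshold applied to per-class sigmoid outputs.

```python
import math

def sigmoid(x):
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def labels_over_threshold(logits, vocab, thresh=0.5):
    """Return every label whose sigmoid probability exceeds thresh."""
    return [name for name, z in zip(vocab, logits) if sigmoid(z) > thresh]

vocab = ["basset_hound", "beagle", "pug"]  # hypothetical subset of labels

# Hypothetical logits for a single-animal image: one class is clearly active.
single = [3.0, -2.0, -4.0]
# Hypothetical logits for a two-animal collage: both classes are only weakly active.
collage = [-0.4, -0.8, -3.0]

print(labels_over_threshold(single, vocab))
print(labels_over_threshold(collage, vocab))
print(labels_over_threshold(collage, vocab, thresh=0.3))
```

With these made-up numbers the collage returns no labels at 0.5 but both dogs at 0.3, which is the kind of behaviour to look for in the raw probabilities before changing anything else.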

I think it was working a little better for your bear model because the Bing ims do contain multiple bears per image sometimes, but I do not know why your bear classifier is only returning teddies.

Jeremy says a model is made up of three things: data, arch, and loss. In this case, I’m going to guess that the arch and the loss are fine, but the data isn’t. If I were you, I’d either move on to the COCO or Imagenette datasets and train using those ims and labels (many ims in those datasets have multiple labels), or create a script that makes “siamese images” using the pets data and train on those new siamese images (luckily, there’s a tutorial for that in the fastai docs under tutorials/advanced!).
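A rough sketch of the pairing half of such a script, in plain Python. It only decides which two files go together and what the combined label set is; actually pasting the two images into one file is left out. The file names and the `make_pairs` helper are hypothetical, not part of fastai or the tutorial.

```python
import random

def make_pairs(items, n_pairs, seed=0):
    """Pick random pairs of (filename, label) items.

    Returns a list of (file_a, file_b, labels), where labels is the sorted
    set of labels present in the pair (one label if both share a breed).
    """
    rng = random.Random(seed)  # seeded so the pairing is reproducible
    pairs = []
    for _ in range(n_pairs):
        (file_a, label_a), (file_b, label_b) = rng.sample(items, 2)
        pairs.append((file_a, file_b, sorted({label_a, label_b})))
    return pairs

# Hypothetical file list; in practice this would come from the pets folder.
items = [
    ("beagle_1.jpg", "beagle"),
    ("beagle_2.jpg", "beagle"),
    ("basset_hound_1.jpg", "basset_hound"),
    ("pug_1.jpg", "pug"),
]

for file_a, file_b, labels in make_pairs(items, 3):
    print(file_a, file_b, labels)
```

Each resulting triple can then be turned into one combined image with a multi-label target, so the model sees two-animal examples during training instead of only at test time.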

Good luck, and keep at it!!

arampacha · September 23, 2020, 3:03pm

Hey @alex.larrimore! I guess one issue might be the following:
You created the combined image by concatenating two images horizontally, which results in a wide image.
Then you have Resize(460) as an item transform in your dataloader. By default it makes a center crop from your image at test time. In this case it may result in an uninformative image that doesn’t contain enough of the beagle. And so, as I see in your notebook, it predicts basset_hound, whose head happens to be in the center.
You can do prediction with with_input=True to see what is actually fed to the model:

img, pred, idx, probs = learn.predict('/content/test_img.png', with_input=True)
img.show()

I tried it and got:
[image: test_img]
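The effect can be estimated without running the model. The sketch below is a simplified model of a square center crop, not fastai's actual implementation, and the 1000x500 collage size is an assumption chosen for illustration.

```python
def center_crop_box(width, height):
    """Largest centered square inside a width x height image, as (left, top, right, bottom)."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side

def visible_fraction(box, x_start, x_end):
    """Fraction of the horizontal span [x_start, x_end) that survives the crop."""
    left, _, right, _ = box
    visible = max(0, min(right, x_end) - max(left, x_start))
    return visible / (x_end - x_start)

# Assumed collage: two 500x500 photos side by side, giving a 1000x500 image.
box = center_crop_box(1000, 500)
print(box)
print(visible_fraction(box, 0, 500))     # left photo
print(visible_fraction(box, 500, 1000))  # right photo
```

Under these assumptions only the inner half of each photo survives, so whichever animal has its head near the seam dominates what the model sees.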

arampacha · September 23, 2020, 3:13pm

Also it would be nice to hear from those more experienced with using image models in production on:

- is it really an issue here, or am I wrong in some way?
- how do you deal with unexpected inputs like this one, to make the model more robust in real-world applications?

alex.larrimore · September 23, 2020, 4:43pm

That makes a lot of sense. Thank you! Totally forgot about how image adjustments would mess things up.

joedockrill · September 23, 2020, 5:05pm

See if you can pad it instead of cropping it
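For a sense of what padding would do to a wide collage, here is a small geometry sketch in plain Python. It is a simplified model of "shrink to fit, then pad", not fastai's code, and the 1000x500 input size is an assumption. In fastai itself the bears chapter of the book switches to padding with `Resize(128, ResizeMethod.Pad, pad_mode='zeros')`, so the same method argument should be usable with a size of 460.

```python
def pad_to_square(width, height, size):
    """Scale the image so its longer side equals size, then pad the shorter side.

    Returns ((new_width, new_height), (left, top, right, bottom)) padding in pixels.
    """
    scale = size / max(width, height)
    new_w = round(width * scale)
    new_h = round(height * scale)
    pad_x = size - new_w
    pad_y = size - new_h
    left, top = pad_x // 2, pad_y // 2
    # Any odd leftover pixel goes to the right/bottom edge.
    return (new_w, new_h), (left, top, pad_x - left, pad_y - top)

# Assumed collage of 1000x500 pixels resized into a 460x460 model input.
resized, padding = pad_to_square(1000, 500, 460)
print(resized)
print(padding)
```

Both animals stay fully in frame, at the cost of each one being drawn at half the scale the model saw during training.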

alex.larrimore · September 23, 2020, 7:03pm

That was a great guess but the training data still seems to be the problem. Even with images I load correctly (thank you for the way to display them!) I’m still always getting one result with this model.

My best next guesses are:

- Use the Siamese example from the advanced tutorials (which is quite complicated and beyond my current PyTorch competency) to update my training set to include paired images that, depending on a random number, will either be the same or a different label

- Train a model using the Pascal dataset and then train the last layers of that model with my pet classification model in the hopes that it will be more comfortable with classifying an image with multiple labels.

How do others build multicategory training sets from scratch? Do you usually write out a csv file with labels as seen in the Pascal set? I’d really like to find a way to make this work with the single image system of downloading images into a folder with the correct label and it seems like a problem someone else has already solved.
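One way to bridge the two approaches, sketched with the standard library only: keep the one-folder-per-label layout, generate the CSV from it, and add extra labels only for the few images that need them. The `write_label_csv` helper, the `extra_labels` mapping, and the file names are all hypothetical; the space-separated labels column is modelled on the Pascal CSV used in the book.

```python
import csv
import tempfile
from pathlib import Path

def write_label_csv(root, out_csv, extra_labels=None):
    """Scan root/<label>/<image> and write rows of (relative path, space-separated labels).

    extra_labels maps a relative path to additional labels for that image.
    """
    root = Path(root)
    extra_labels = extra_labels or {}
    rows = []
    for img in sorted(root.glob("*/*")):
        if not img.is_file():
            continue
        rel = img.relative_to(root).as_posix()
        labels = [img.parent.name] + extra_labels.get(rel, [])
        rows.append((rel, " ".join(labels)))
    with open(out_csv, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["fname", "labels"])
        writer.writerows(rows)
    return rows

# Demo on an empty throwaway folder tree (files are placeholders, not real images).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "pets"
    for rel in ["beagle/1.jpg", "basset_hound/2.jpg", "beagle/collage.jpg"]:
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    rows = write_label_csv(
        root,
        Path(tmp) / "labels.csv",
        extra_labels={"beagle/collage.jpg": ["basset_hound"]},
    )
    print(rows)
```

A CSV in this shape can then be read the same way the Pascal labels are, with the labels column split on spaces.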

alex.larrimore · September 24, 2020, 4:45pm

@mccallionr Is there an easy way to train my learner on two separate sets of data? I am almost positive there is and that it is incredibly simple but for some reason I am missing it. I’ve tried running a normal model, changing the dataset of my learner and then training again as well as loading multiple dataloaders into my cnn_learner initially. I have gotten errors with both approaches and am pretty sure I’m either over- or under-thinking this.

I’m currently tinkering with changing the dataloader and calling learn.fine_tune like in notebook 7’s progressive resizing example but it gives me: ValueError: Target size (torch.Size([2368])) must be the same as input size (torch.Size([1280]))

If I can get that to work I’d be able to train on one data set and then another, but it doesn’t help with the idea of training from two dataloaders simultaneously (alternating batches or something).
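The "alternating batches" idea can be sketched in plain Python as an interleaving iterator. This is not a fastai API, the `alternate` helper is hypothetical, and it only makes sense if both loaders yield targets over the same label vocabulary, since a single model head cannot switch between label sets from batch to batch.

```python
from itertools import zip_longest

_MISSING = object()  # marks the slots of a loader that ran out first

def alternate(*loaders):
    """Yield one batch from each loader in turn until all are exhausted."""
    for group in zip_longest(*loaders, fillvalue=_MISSING):
        for batch in group:
            if batch is not _MISSING:
                yield batch

# Stand-ins for two dataloaders; real ones would yield (inputs, targets) tensors.
loader_a = ["a1", "a2", "a3"]
loader_b = ["b1", "b2"]

order = list(alternate(loader_a, loader_b))
print(order)
```

A custom training loop could iterate over `alternate(...)` instead of a single loader; wiring that into `fine_tune` would need more work than this sketch shows.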

mccallionr · September 29, 2020, 11:11pm

Hi @alex.larrimore! Apologies for the delayed response.

That’s a really interesting idea. This seems similar to what we’re doing already when we use pre-trained models and train a customized head + unfreeze and tune the rest of it.

If you’re going to do something like that, be very very careful about the shapes of the layers in your model & the data coming from your dataloaders. Did you check the shape of the batches produced by each dls?
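The sizes in the ValueError quoted above already hint at the shapes involved. The arithmetic below assumes a batch size of 64 (fastai's default, not stated in the thread), so treat the conclusion as a plausible reading rather than a diagnosis.

```python
def classes_per_item(flat_size, batch_size):
    """Recover the per-item class count from a flattened (batch * classes) size."""
    if flat_size % batch_size:
        raise ValueError(f"{flat_size} is not a multiple of batch size {batch_size}")
    return flat_size // batch_size

batch_size = 64  # assumption: fastai's default batch size

target_classes = classes_per_item(2368, batch_size)  # "Target size" in the error
output_classes = classes_per_item(1280, batch_size)  # "input size" in the error
print(target_classes, output_classes)
```

Under that assumption the targets have 37 classes per item, which matches the number of breeds in the pets dataset, while the model emits 20 outputs, which matches the number of Pascal classes. That would mean the head still has the Pascal output size when it meets pets labels, so the final layer would have to be replaced with one sized for the new vocab before training on the second dataset.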

That being said, I think you may want to explore your options around generating more appropriate/similar data before trying to do anything extra on the architecture/training side. The data is really the first consideration of all modeling, and the choice of arch, loss, and training techniques should be tailored to your data. The other thing I’d recommend is writing a learner entirely from scratch by doing 04_mnist and adapting the code to your problem.

Sorry I don’t have better answers!