Hi! I haven’t seen a topic with this particular type of question so thought I’d post a new topic.
I have a large dataset of fashion images, and each image has 4-7 features associated with it (‘button-up’, ‘light blue color’, etc). I want to train a model that can examine a clothing image and determine what features that piece of clothing has.
Though each example only has a few features, the overall number of features to predict from is sometimes 3000 or more. I tried doing multiclass classification with a resnet50 and got disappointing results. I wasn’t expecting it to blow me away, but I think I can do better.
I’ve been trying chains of models that would segment out the clothing into categories first (jeans, for instance), and then put the image through a jeans-specific multiclass classification model.
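For anyone picturing the chained setup, here is a minimal sketch of the routing logic only. The function name, the stand-in models, and the label names are all made up for illustration; real models would replace the toy callables.

```python
def classify_attributes(image, category_model, attribute_models):
    """Route an image through a coarse category model, then a per-category model."""
    category = category_model(image)
    specialist = attribute_models.get(category)
    if specialist is None:
        # No specialist exists for this category, so return no attributes.
        return category, []
    return category, specialist(image)


# Toy stand-ins (hypothetical): a real pipeline would call trained models here.
def toy_category_model(image):
    return image["category_hint"]


toy_attribute_models = {
    "jeans": lambda image: ["denim", "dark wash"],
    "shirt": lambda image: ["button-up", "light blue color"],
}

result = classify_attributes(
    {"category_hint": "jeans"}, toy_category_model, toy_attribute_models
)
print(result)
```

One thing this makes visible: any mistake in the first stage sends the image to the wrong specialist, so errors compound across the chain.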
Does anyone think there may be a better approach to doing this? Perhaps a different type of model, or a different approach to using multiclass classification?
Thanks for any help or ideas – I’ve found the fast.ai community to be incredibly kind and helpful, and I appreciate any advice!
An image having multiple labels (‘button-up’, ‘light blue color’, etc.) makes this a multi-label problem. If you continue with the course, you’ll be shown how to solve it in lesson 3.
is code for the multilabel classification problem.
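To make the multi-label idea concrete, here is a rough sketch in plain Python (not the fastai notebook code). It shows how a label list becomes a multi-hot target and how each label gets its own sigmoid and yes/no decision. The five-entry vocabulary, the logits, and the 0.5 threshold are all made up for illustration.

```python
import math

# Hypothetical label vocabulary; a real one would have thousands of entries.
VOCAB = ["button-up", "light blue color", "denim", "long sleeve", "striped"]
INDEX = {label: i for i, label in enumerate(VOCAB)}


def multi_hot(labels):
    """Encode a list of label names as a 0/1 vector over the vocabulary."""
    target = [0.0] * len(VOCAB)
    for label in labels:
        target[INDEX[label]] = 1.0
    return target


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def bce_with_logits(logits, target):
    """Mean binary cross-entropy, treating every label as its own yes/no task."""
    eps = 1e-12  # guards against log(0)
    losses = []
    for z, t in zip(logits, target):
        p = sigmoid(z)
        losses.append(-(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)))
    return sum(losses) / len(losses)


def predict(logits, thresh=0.5):
    """Return every label whose sigmoid probability clears the threshold."""
    return [VOCAB[i] for i, z in enumerate(logits) if sigmoid(z) >= thresh]


target = multi_hot(["button-up", "light blue color"])
logits = [2.0, 1.5, -3.0, -2.0, -4.0]  # made-up model outputs
loss = bce_with_logits(logits, target)
print(target, predict(logits), loss)
```

The difference from multiclass is that there is no softmax forcing the labels to compete, so any number of them can be switched on for one image.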
Thanks @denizkavi, yes, I’ve tried that notebook model but it didn’t do very well at all. I’m assuming that’s because instead of 10-12 classes to classify from, there are 3000-4000 classes.
Given the large number of classes, I’m wondering if there is a better approach than normal multi-class classification? Has anyone had an instance where they needed to classify so many different features?
It might be because there are too few images per class. You can look into implementing mixup data augmentation, or if the labels aren’t entirely correct you can check out label smoothing (which works well with mixup).
Here’s how you implement mixup
Lesson 12 talks about these two, but implementing mixup is pretty easy.
It’d just be:
learn = cnn_learner(data, arch, metrics=accuracy).mixup()
Also, fastai applies data augmentation such as rotation by default. If transforms like that would change what your images should be labelled as, you might want to look into that as well.
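In case it helps to see what mixup and label smoothing actually do to the targets, here is a small sketch in plain Python. It is not fastai's implementation: the helper names, `alpha=0.4`, and `eps=0.1` are picked for illustration, and the smoothing shown is the usual one-hot formulation, so multi-hot targets would need a per-label variant.

```python
import random


def mixup_pair(x_a, y_a, x_b, y_b, alpha=0.4, rng=random):
    """Blend two examples and their targets with one Beta(alpha, alpha) weight."""
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam


def smooth_labels(one_hot, eps=0.1):
    """Move eps of the probability mass from the true class to a uniform spread."""
    k = len(one_hot)
    return [(1.0 - eps) * t + eps / k for t in one_hot]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
x_a, y_a = [0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]  # toy "image" and one-hot target
x_b, y_b = [1.0, 1.0, 1.0], [0.0, 0.0, 1.0, 0.0]

x_mix, y_mix, lam = mixup_pair(x_a, y_a, x_b, y_b, rng=rng)
print(lam, x_mix, y_mix)
print(smooth_labels(y_a))
```

Both tricks replace hard 0/1 targets with softer ones, which is why they tend to help when there are few images per class or the labels are noisy.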
thanks @denizkavi! I’ll look up lesson 12 and see if I can implement some of those things
This is really good. Thank you!