Food recognition : What kind of dataset should I choose ? Single label to multi-label classification

Presentation
Hi, I’m working on a mini-project where the goal is to label the food on a picture : fruits, vegetables, eggs,… (not dishes)

The pictures would be taken like so : banana/apple/whatsoever on a table.

The dataset question
I’ve find a cool and very clean dataset of fruits on Kaggle, with 64 kinds of fruits :


I’ve run quick tests and since pictures are almost identical, the network get 97-98% accuracy in no time. But “as expected” the network is bad at generalizing to new images like the banana on the table.

I’ve then been looking at the really rich ImageNet dataset :


Pictures are really diverse (maybe too much sometimes), so I’m confident that my new NN will be able to generalize better. So, right now I consider switching to this imagenet dataset.

But I wonder if getting rid of all my clean fruits dataset from Kaggle is the way to go.
Maybe I should try to mix them both?
Training one before the other?
Using super-strong data augmentation on my clean fruits?
Or maybe I should not bother too much and just use ImageNet? :stuck_out_tongue:

I’ll be happy to hear some recommandations.

Multi-label classification from single-label images ?
Instead of just taking a picture of my banana, I would like to take a picture of my banana, my apple and my orange, together sitting on the table. But there is of course no dataset labeled like this. Instead, I would like to take my network trained for single-label and try to predict multiple-labels.

There is not so much ressource about this subject, but still I’ve find this thread.
I will try to implement the ideas if I have the time to (food recognition is just a small subset of my project).
But I just want to get some though about it, if someone have done that before or may have an interesting opinion.

Thanks :slight_smile:

1 Like

Hi Johan im working on a similar project now. Did you ever figure out the single label to multi label classification thing?
Also what did you do about the dataset?

Any help would be greatly appreciated