Pretraining on open images v4 instead of imagenet?

I think it would be great if, for fastai_v1, the pretrained models were trained on the “Open Images V4” dataset instead of ImageNet.

Reasons:

  1. Open Images is bigger and has more classes, which might make transfer learning work even better

  2. ImageNet is not very diverse/inclusive; Open Images is supposed to be better (see Rachel’s tweet on this; I think there is also a blog post about it, but I can’t find it)

Drawbacks:

  1. Open Images is very big

  2. Training many models on it might be expensive

Any thoughts on this?

It would be an interesting research project to experiment with. Unfortunately, Open Images is also somewhat biased. In addition, it’s largely auto-labeled, so using it properly requires semi-supervised learning methods. It’s not a drop-in replacement for ImageNet.
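
For what it’s worth, here is a minimal sketch (plain PyTorch, not fastai code) of one simple way to cope with the machine-generated labels: treat the per-label confidence from the Open Images annotation files as a weight in a multi-label BCE loss, so unverified labels count for less. The function name and the weighting scheme are my own assumptions; proper semi-supervised approaches (pseudo-labeling, consistency training) would go further.

```python
import torch
import torch.nn.functional as F

def weighted_multilabel_loss(logits, targets, confidences):
    """BCE over multi-label targets, down-weighting machine-generated labels.

    logits:      (batch, n_classes) raw model outputs
    targets:     (batch, n_classes) 0/1 labels from the annotation CSVs
    confidences: (batch, n_classes) per-label confidence; 1.0 for
                 human-verified labels, <1.0 for machine-generated ones
    """
    per_label = F.binary_cross_entropy_with_logits(
        logits, targets, reduction="none")
    # Trust low-confidence (auto-generated) labels less.
    return (per_label * confidences).mean()

# Example: two images, three classes, one machine-generated label.
logits = torch.randn(2, 3)
targets = torch.tensor([[1., 0., 1.],
                        [0., 1., 0.]])
confidences = torch.tensor([[1.0, 1.0, 0.6],   # third label is auto-labeled
                            [1.0, 1.0, 1.0]])
loss = weighted_multilabel_loss(logits, targets, confidences)
```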
