Pretraining on open images v4 instead of imagenet?

I think it would be great if, for fastai_v1, the pretrained models were trained on the “Open Images V4” dataset instead of ImageNet.

Reasons:

  1. Open Images is bigger and has more classes, which might make transfer learning work even better

  2. ImageNet is not very diverse/inclusive; Open Images is supposed to be better (see Rachel’s tweet on this; I think there is also a blog post about it, but I can’t find it)

Drawbacks:

  1. Open Images is very big

  2. Training many models on it might be expensive

Any thoughts on this?

It would be an interesting research project to experiment with. Unfortunately, Open Images is also somewhat biased. In addition, it’s largely auto-labeled, so using it properly requires semi-supervised learning methods. It’s not a drop-in replacement for ImageNet.
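
For what it’s worth, here is a minimal sketch (plain PyTorch, not fastai code) of one simple way to cope with the machine-generated labels: treat the per-label confidence from the Open Images annotation files as a weight in a multi-label BCE loss, so unverified labels count for less. The function name and the weighting scheme are my own assumptions; proper semi-supervised approaches (pseudo-labeling, consistency training) would go further.

```python
import torch
import torch.nn.functional as F

def weighted_multilabel_loss(logits, targets, confidences):
    """BCE over multi-label targets, down-weighting machine-generated labels.

    logits:      (batch, n_classes) raw model outputs
    targets:     (batch, n_classes) 0/1 labels from the annotation CSVs
    confidences: (batch, n_classes) per-label confidence; 1.0 for
                 human-verified labels, <1.0 for machine-generated ones
    """
    per_label = F.binary_cross_entropy_with_logits(
        logits, targets, reduction="none")
    # Trust low-confidence (auto-generated) labels less.
    return (per_label * confidences).mean()

# Example: two images, three classes, one machine-generated label.
logits = torch.randn(2, 3)
targets = torch.tensor([[1., 0., 1.],
                        [0., 1., 0.]])
confidences = torch.tensor([[1.0, 1.0, 0.6],   # third label is auto-labeled
                            [1.0, 1.0, 1.0]])
loss = weighted_multilabel_loss(logits, targets, confidences)
```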
