Transfer learning for an MNIST-like dataset? (Or is it overkill?)

suvash · March 20, 2018, 11:28pm

I’m planning to build a multi-class classifier on an MNIST-like dataset. Is there a standard architecture considered good for such datasets, especially for use with transfer learning?

I’ll probably also build a not-so-deep network from scratch as a learning exercise, but I’m curious whether there’s a ‘gold standard’ architecture for such datasets (MNIST in this case).

Suggestions/recommendations most welcome.

svaisakh · March 21, 2018, 1:53pm

Depends on the dataset in question.

For starters, how complex the model needs to be depends on many factors, one of which is the amount of variation in the dataset.

Take, for example, MNIST and CIFAR10.
Both contain similarly sized images (28×28 grayscale versus 32×32 color) and have the same number of classes (10).
But while handwritten 5’s, say, all look fairly similar perceptually, images of horses do not.
Mathematically, when the data within each class has little variation, the learned features lie on a high-dimensional manifold where the classes are more easily separated.
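
To make the separability point concrete, here is a rough sketch on synthetic data rather than on MNIST or CIFAR10 themselves. It assumes each class is a Gaussian blob around a fixed centre, and it uses the ratio of between-class scatter to within-class scatter as a stand-in for "how separable the classes are". The helper names (`sample_classes`, `separability`) and the spread values are made up for illustration.

```python
import random

N_CLASSES, N_PER_CLASS, DIM = 10, 50, 16

def sample_classes(centres, n_per_class, spread, rng):
    """Draw n_per_class points around each centre; spread is the within-class std."""
    return [[[rng.gauss(m, spread) for m in centre] for _ in range(n_per_class)]
            for centre in centres]

def mean(points):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(p[i] for p in points) / len(points) for i in range(len(points[0]))]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def separability(data):
    """Between-class scatter divided by within-class scatter (higher = easier to separate)."""
    class_means = [mean(points) for points in data]
    overall = mean([p for points in data for p in points])
    between = sum(len(points) * sq_dist(m, overall)
                  for points, m in zip(data, class_means))
    within = sum(sq_dist(p, m)
                 for points, m in zip(data, class_means) for p in points)
    return between / within

rng = random.Random(0)
# Same class centres for both datasets; only the within-class variation differs.
centres = [[rng.uniform(-1.0, 1.0) for _ in range(DIM)] for _ in range(N_CLASSES)]
low_variation = sample_classes(centres, N_PER_CLASS, 0.1, rng)   # "5's all look alike"
high_variation = sample_classes(centres, N_PER_CLASS, 1.0, rng)  # "horses vary a lot"

ratio_low = separability(low_variation)
ratio_high = separability(high_variation)
print(f"low variation:  {ratio_low:.2f}")
print(f"high variation: {ratio_high:.2f}")
```

With the class centres held fixed, the low-variation dataset should score a much higher ratio, which is the sense in which a simpler model can get away with separating it.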

As an aside, this is why you’ll see GANs have great success in modeling human faces but struggle with more complex data.

If the dataset is like MNIST in complexity and in how perceptually similar the samples within each class are, then transfer learning is indeed overkill!
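
For a sense of scale, here is a back-of-the-envelope parameter count for a small from-scratch network. The layer sizes are an assumption: a LeNet-style layout (two 5×5 convolutions with 2×2 pooling, then three dense layers) picked for illustration, not something recommended in this thread. The function names are hypothetical helpers.

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Spatial size after a square convolution."""
    return (size + 2 * padding - kernel) // stride + 1

def conv_params(in_ch, out_ch, kernel):
    """Weights plus one bias per output channel."""
    return out_ch * in_ch * kernel * kernel + out_ch

def dense_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return n_in * n_out + n_out

def small_cnn_params(image_size=28, channels=1, n_classes=10):
    """Count trainable parameters of an assumed LeNet-style network."""
    total = 0
    size = conv_out(image_size, 5, padding=2)  # conv1: 6 filters, 5x5, padded
    total += conv_params(channels, 6, 5)
    size //= 2                                 # 2x2 pooling
    size = conv_out(size, 5)                   # conv2: 16 filters, 5x5, no padding
    total += conv_params(6, 16, 5)
    size //= 2                                 # 2x2 pooling
    flat = 16 * size * size                    # flattened feature vector
    total += dense_params(flat, 120)
    total += dense_params(120, 84)
    total += dense_params(84, n_classes)
    return total

n_params = small_cnn_params()
print(f"{n_params:,} trainable parameters")
```

Under these assumptions the count comes out in the tens of thousands, whereas the ImageNet-pretrained backbones usually used for transfer learning have millions of parameters.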

Hope that helps.

suvash · March 21, 2018, 3:28pm

Thanks for the answer. That certainly helps.