For an internship I was given this task:
Write a DNN on Google Colab that: 1. has fewer than 8,000 parameters, and 2. achieves 99.4% validation accuracy on MNIST.
In Lecture 7 we achieved 99.5%, but with more than 41K parameters. I think it would be a good learning exercise to figure out how to reduce the number of parameters.
Suggestions, please?
What I am doing is trying to reduce the conv layers rather than increase the number of channels. With a simple convolutional architecture I achieved it.
The question is: am I on the right track?
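One way to sanity-check the budget before training: the parameter count of a conv stack can be worked out by hand. A minimal sketch below, with hypothetical channel sizes (not your actual architecture), shows that small channel counts plus global average pooling instead of a dense classifier can stay well under 8,000:

```python
def conv_params(in_ch, out_ch, k=3, bias=True):
    """Parameters in one 2D conv layer: a k*k kernel per
    (input, output) channel pair, plus one bias per output channel."""
    return (k * k * in_ch + bias) * out_ch

# Hypothetical all-convolutional stack for MNIST (1 input channel,
# 10 output classes), ending in global average pooling so no dense
# layer dominates the count.
channels = [1, 8, 16, 16, 10]
total = sum(conv_params(i, o) for i, o in zip(channels, channels[1:]))
print(total)  # 5018 — comfortably under the 8,000-parameter budget
```

The dense layer is usually the culprit: even a single 16-unit hidden layer on a flattened 7x7x16 feature map would cost over 12,000 parameters on its own.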
What is your definition of a parameter here? Do you distinguish between trainable and non-trainable parameters when counting?
As @lgvaz mentioned, you can use transfer learning and freeze (e.g.) the upper layers. Those parameters are then no longer trainable for your specific task, even though they were trained before.
Another option may be to use a network with random weights and freeze them so that they are not trainable. This technique is used, e.g., in extreme learning machines.
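To make the trainable vs. non-trainable distinction concrete, here is a minimal framework-agnostic sketch (the layer names and counts are made up for illustration): frozen parameters still live in the model, but drop out of the trainable total that an under-8K budget would presumably be measured against.

```python
# Hypothetical layer table: (name, parameter count, trainable?).
# Freezing a layer (pretrained weights, or random weights as in
# extreme learning machines) keeps its parameters in the model
# but removes them from the trainable count.
layers = [
    ("conv1", 80,   False),  # frozen
    ("conv2", 1168, False),  # frozen
    ("head",  1450, True),   # only the task-specific head is trained
]

total     = sum(n for _, n, _ in layers)
trainable = sum(n for _, n, t in layers if t)
print(total, trainable)  # 2698 1450
```

Whether frozen parameters count toward the 8,000 limit is exactly the ambiguity raised above, so it is worth clarifying with whoever set the task.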
The authors of https://arxiv.org/abs/1412.8307 report getting below 1% error on MNIST with 10,000 hidden units.
I am curious about how you solved the problem. I am trying to increase accuracy on the MNIST dataset, and a follow-up in this thread would be amazing.