For an internship I was given this task:
Write a DNN on Google Colab that: 1. has fewer than 8,000 parameters, and 2. achieves 99.4% validation accuracy on MNIST.
In Lecture 7 we achieved 99.5%, but with more than 41K parameters. I think it would be a good learning exercise to figure out how to reduce the number of parameters.
Suggestions, please?
What I am doing is trying to reduce the conv layers rather than increase the number of channels. With a simple convolutional architecture I achieved it.
The question is: am I on the right track?
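One way to sanity-check the budget before training: the parameter count of a conv stack can be worked out by hand. A minimal sketch below, with hypothetical channel sizes (not your actual architecture), shows that small channel counts plus global average pooling instead of a dense classifier can stay well under 8,000:

```python
def conv_params(in_ch, out_ch, k=3, bias=True):
    """Parameters in one 2D conv layer: a k*k kernel per
    (input, output) channel pair, plus one bias per output channel."""
    return (k * k * in_ch + bias) * out_ch

# Hypothetical all-convolutional stack for MNIST (1 input channel,
# 10 output classes), ending in global average pooling so no dense
# layer dominates the count.
channels = [1, 8, 16, 16, 10]
total = sum(conv_params(i, o) for i, o in zip(channels, channels[1:]))
print(total)  # 5018 — comfortably under the 8,000-parameter budget
```

The dense layer is usually the culprit: even a single 16-unit hidden layer on a flattened 7x7x16 feature map would cost over 12,000 parameters on its own.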
What is your definition of a parameter here? Do you distinguish between trainable and non-trainable parameters when counting?
As @lgvaz mentioned, you can use transfer learning and freeze (e.g.) the upper layers. Those parameters are then no longer trainable for your specific task, even though they were trained before.
Another option may be to use a network with random weights and freeze them so that they are not trainable. This technique is used, e.g., in extreme learning machines.
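To make the trainable vs. non-trainable distinction concrete, here is a minimal framework-agnostic sketch (the layer names and counts are made up for illustration): frozen parameters still live in the model, but drop out of the trainable total that an under-8K budget would presumably be measured against.

```python
# Hypothetical layer table: (name, parameter count, trainable?).
# Freezing a layer (pretrained weights, or random weights as in
# extreme learning machines) keeps its parameters in the model
# but removes them from the trainable count.
layers = [
    ("conv1", 80,   False),  # frozen
    ("conv2", 1168, False),  # frozen
    ("head",  1450, True),   # only the task-specific head is trained
]

total     = sum(n for _, n, _ in layers)
trainable = sum(n for _, n, t in layers if t)
print(total, trainable)  # 2698 1450
```

Whether frozen parameters count toward the 8,000 limit is exactly the ambiguity raised above, so it is worth clarifying with whoever set the task.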
The authors of https://arxiv.org/abs/1412.8307 report getting below 1% error on MNIST with 10,000 hidden units.
I am curious about how you solved the problem. I am trying to increase accuracy on the MNIST dataset, and a follow-up in this thread would be amazing.