Bugs while implementing MNIST 0-9 from scratch

aniruddh · January 25, 2021, 10:52am

For each tensor in train_set, I have used the integer digits [0,1,2,3,4,5,6,7,8,9] as the train_y labels. I am confused about how the loss function should be written in this case. I read about the categorical_cross_entropy loss function and I know that it works, but I am unable to write my batch accuracy function after that.

In the case of binary classification it made sense. For 0-9, I added a stable softmax function and a categorical_cross_entropy function, but now I don’t know what my batch_accuracy function should look like. Help appreciated.

An alternative approach is provided here, where one-hot encoded labels are added, but what if I do not want to go down that road?
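
For reference, integer labels can be used directly, with no one-hot encoding, by indexing the log-probabilities with the label of each row. The following is only a minimal sketch under assumptions: the function names stable_log_softmax and cross_entropy_int_labels and the toy tensors are hypothetical, and the model is assumed to output one raw score (logit) per class for each image.

```python
import torch

def stable_log_softmax(logits):
    # Subtract the per-row maximum first so exp() cannot overflow.
    shifted = logits - logits.max(dim=1, keepdim=True).values
    return shifted - shifted.exp().sum(dim=1, keepdim=True).log()

def cross_entropy_int_labels(logits, targets):
    # targets holds integer class indices (e.g. digits 0-9), not one-hot rows.
    log_probs = stable_log_softmax(logits)
    # For each row, pick the log-probability assigned to the correct class.
    picked = log_probs[torch.arange(len(targets)), targets]
    return -picked.mean()

# Toy batch (hypothetical values): 2 samples, 3 classes.
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])
loss = cross_entropy_int_labels(logits, targets)
```

On this toy batch the result should agree with PyTorch's built-in torch.nn.functional.cross_entropy, which also accepts integer class labels.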

jimmiemunyi · January 25, 2021, 12:18pm

For batch_accuracy, you want to use torch.argmax, which returns the index of the largest prediction (assuming your model outputs one score per digit class). That index is the predicted digit, so you can compare it directly against the integer labels.
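
A minimal sketch of that idea, which is not necessarily how the notebook does it. It assumes preds has shape (batch_size, num_classes) and targets holds integer labels; the toy values are made up for illustration.

```python
import torch

def batch_accuracy(preds, targets):
    # Index of the highest score in each row is the predicted class.
    predicted_classes = torch.argmax(preds, dim=1)
    # Fraction of rows where the prediction matches the integer label.
    return (predicted_classes == targets).float().mean()

# Toy batch (hypothetical values): 4 samples, 3 classes.
preds = torch.tensor([[0.1, 2.0, 0.3],
                      [1.5, 0.2, 0.1],
                      [0.0, 0.1, 0.9],
                      [0.3, 0.2, 0.1]])
targets = torch.tensor([1, 0, 2, 1])
acc = batch_accuracy(preds, targets)
```

Because softmax preserves the ordering of the scores, argmax gives the same class whether it is applied to the raw outputs or to the softmax probabilities.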

You could try and see how I implemented the batch_accuracy in this notebook:

github.com

jimmiemunyi/deeplearning-experiments/blob/main/notebooks/MNIST_From_Scratch.ipynb

aniruddh · February 5, 2021, 2:42pm

Thanks @jimmiemunyi, argmax is definitely a helpful tool to keep up my sleeve. My model's accuracy increased the first time I trained it. I do not know if this is a stupid mistake, but after restarting the kernel my accuracy suddenly drops and then stays constant. I tried running your exact code on Colab and got the same problem. I do not know whether I should spend more time debugging this or move ahead to the next chapter.

jimmiemunyi · February 7, 2021, 6:00pm

Hello

At exactly which part does the accuracy drop and stay constant? What were you trying to implement? I have just run my notebook again and the model trains with the loss decreasing (although the accuracy levels off at 96% after some time, which is not a bad model).

Personally, I’d suggest you move on for now, then revisit the problem when you have a fresher mindset. You can continue doing the lessons during the week, then set aside the weekend to try to debug the problem. Also, by continuing the lessons you will gain more knowledge (cross entropy and argmax are covered in the next lesson) and thus a better understanding of the problem.