Lesson 4 - Bad accuracy on full MNIST dataset

Hello everyone. After several re-reads of chapter 4, I finally moved on to the further research section and modified the notebook to handle the full MNIST dataset.
I split the “training” images into a training set and a validation set (leaving the “testing” folder aside so I can test on images the model has never seen), but once I get to the fitting part, my accuracy is very bad (it stabilizes at 20%).

I suspect I missed something when creating the DataLoaders, but I’m not sure how to verify that.
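
One simple way to check the data before it goes into the DataLoaders is to confirm that the inputs and labels line up and that every class is represented. The sketch below uses plain Python lists; `summarize_labels` and its arguments are hypothetical names, not part of the chapter's notebook.

```python
from collections import Counter

def summarize_labels(xs, ys, n_classes=10):
    """Return per-class counts after checking that inputs and labels line up."""
    if len(xs) != len(ys):
        raise ValueError(f"{len(xs)} inputs but {len(ys)} labels")
    counts = Counter(ys)
    # Any label outside 0..n_classes-1 points to a labelling bug
    unexpected = sorted(set(counts) - set(range(n_classes)))
    if unexpected:
        raise ValueError(f"unexpected labels: {unexpected}")
    return [counts.get(c, 0) for c in range(n_classes)]

# Toy stand-ins for images and their integer labels
toy_counts = summarize_labels(["a", "b", "c"], [0, 1, 1], n_classes=3)
```

A class with a count of zero, or a length mismatch, would suggest the problem is in the data preparation rather than in the model.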

This is how I split and created the array of images.

imgs = []
timgs = [] # training images
vimgs = [] # validation images

for i in range(10):
    sorted_imgs = (path/'training'/str(i)).ls().sorted()
    shuffle_imgs = copy.deepcopy(sorted_imgs)

    vn = int(len(sorted_imgs) / 100 * 20)
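
The snippet above stops before the actual shuffle and split, and note that `copy.deepcopy` only copies the list without shuffling it. A minimal sketch of how an 80/20 split per digit could look, assuming a seeded shuffle is acceptable; `split_train_valid` and the file names are hypothetical and not my exact code:

```python
import random

def split_train_valid(items, valid_pct=0.2, seed=42):
    """Shuffle a copy of `items` and split it into (train, valid) lists."""
    shuffled = list(items)                  # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)   # seeded for a reproducible split
    n_valid = int(len(shuffled) * valid_pct)
    return shuffled[n_valid:], shuffled[:n_valid]

# Hypothetical file names standing in for the image paths of one digit
files = [f"img_{i}.png" for i in range(10)]
train, valid = split_train_valid(files)
```

Calling this once per digit folder and appending the results to `timgs` and `vimgs` would keep the class proportions the same in both sets.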

And this is how I created the tensors for the dataset.

train_tensors = []
validation_tensors = []

train_stack = []
validation_stack = []

for i in range(10):
    train_tensors.append([tensor(Image.open(o)) for o in timgs[i]])
    validation_tensors.append([tensor(Image.open(o)) for o in vimgs[i]])


What follows is basically vanilla code from the chapter. I wanted to display a confusion matrix to see where things were going wrong (I don’t know why, but that 20% accuracy is suspicious), but it throws an error when I try to create the ClassificationInterpretation.

learn = Learner(dls, simple_net, opt_func=SGD,
                loss_func=mnist_loss, metrics=batch_accuracy)
learn.fit(40, 0.1)
interp = ClassificationInterpretation.from_learner(learn)

Any pointers?

It occurs to me that I may be creating the y tensor incorrectly. The 3-versus-7 example created a tensor with one label per element, either 0 or 1, so I thought I could label the full dataset with the numbers 0 to 9. But I recall that the Titanic example, to label the categories correctly, created one column per class and assigned 1 to the correct class and 0 to the others… maybe that is the problem.

train_x = torch.cat([
    stacked_zeroes, stacked_ones, stacked_twos,
    stacked_threes, stacked_fours, stacked_fives,
    stacked_sixs, stacked_sevens, stacked_eights, stacked_nines]).view(-1, 28*28)

train_y = tensor(
    [0]*len(zeroes) + [1]*len(ones) + [2]*len(twos) +
    [3]*len(threes) + [4]*len(fours) + [5]*len(fives) +
    [6]*len(sixs) + [7]*len(sevens) + [8]*len(eights) + [9]*len(nines)).unsqueeze(1)
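
A possible explanation for the 20% figure, assuming `batch_accuracy` is the unchanged chapter version that compares `(preds > 0.5)` against the labels: a thresholded prediction can only ever equal 0 or 1, so samples labelled 2 to 9 can never count as correct. The sketch below reproduces that ceiling with plain Python lists; `thresholded_accuracy` is a hypothetical stand-in for the metric, and the perfectly balanced labels are an assumption.

```python
def thresholded_accuracy(probs, labels, threshold=0.5):
    """Fraction of samples where the thresholded prediction equals the label."""
    hits = [int(p > threshold) == y for p, y in zip(probs, labels)]
    return sum(hits) / len(hits)

# Toy, perfectly balanced labels: 100 samples per digit
labels = [d for d in range(10) for _ in range(100)]

# Best case for this metric: output high for the 1s and low for everything else
best_probs = [0.9 if y == 1 else 0.1 for y in labels]
best = thresholded_accuracy(best_probs, labels)
```

Only the 0s and 1s can be matched, which is two classes out of ten, so the metric tops out near 20% regardless of how well the network learns.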

Turns out I was right! But now my accuracy is flat at 0.9.
Back to experimenting.
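
One possible reading of the flat 0.9, assuming the metric still thresholds every output at 0.5 and averages over all elements: with one-hot targets, nine of the ten entries in each row are 0, so a model that outputs "low" everywhere already scores 90%. The names `one_hot` and `elementwise_accuracy` below are hypothetical helpers for illustration only.

```python
def one_hot(label, n_classes=10):
    """Encode an integer label as a list with a single 1 at index `label`."""
    return [1 if i == label else 0 for i in range(n_classes)]

def elementwise_accuracy(probs, targets, threshold=0.5):
    """Compare every thresholded output to every target entry, then average."""
    total = hits = 0
    for row_p, row_t in zip(probs, targets):
        for p, t in zip(row_p, row_t):
            hits += int(p > threshold) == t
            total += 1
    return hits / total

# One sample per digit, one-hot encoded
targets = [one_hot(d) for d in range(10)]

# A model that has learned nothing and outputs a low value everywhere
all_low = [[0.1] * 10 for _ in targets]
baseline = elementwise_accuracy(all_low, targets)
```

If this is what is happening, comparing the index of the largest output in each row against the true digit would give a more meaningful accuracy.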