Hi there, how are you doing?

I just finished Lesson 4 (MNIST) and I'm going through it again, doing things a bit differently. Instead of using MNIST_Sample, I used MNIST. I noticed that there is only training data in this set, so I decided to split it into training and validation sets.

However, something seems to be wrong with my dataset: the result I got (using the optimized Learner module) starts at an accuracy of 100% and then degrades.

I hope you could take a look at my code and help me debug this problem.

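# Setup assumed from the lesson notebook (not shown in the original post)
from fastai.vision.all import *
path = untar_data(URLs.MNIST)

Eight_path = (path/'training/8').ls().sorted()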

Five_path = (path/'training/5').ls().sorted()

Eight_list = [tensor(Image.open(i)) for i in Eight_path]

Five_list = [tensor(Image.open(i)) for i in Five_path]

Eight_tensor = torch.stack(Eight_list).float()/255

Five_tensor = torch.stack(Five_list).float()/255

Eight_tensor = Eight_tensor.view(-1,28*28)
Five_tensor = Five_tensor.view(-1,28*28)

Eight_label = tensor([1]*len(Eight_tensor))

Five_label = tensor([0]*len(Five_tensor))

Data = torch.cat([Eight_tensor,Five_tensor])

Data_label = torch.cat([Eight_label, Five_label]).unsqueeze(1)

def Data_split(Dataset, ratio):
    # Splits by position (no shuffling): the first `ratio` fraction of the rows
    # goes to training, the remaining rows go to validation.
    Train_length = round(len(Dataset)*ratio)
    return Dataset[:Train_length], Dataset[Train_length:]

SplitRatio = 0.8

Data_train, Data_valid = Data_split(Data, SplitRatio)

Label_train, Label_valid = Data_split(Data_label, SplitRatio)
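
A quick way to sanity-check a split like this is to count the labels that end up in each part. The sketch below is only an illustration: `label_counts` is a hypothetical helper (not part of fastai or PyTorch), and the plain lists stand in for the label tensors. For a tensor, something like `Label_valid.flatten().tolist()` should give an equivalent list.

```python
from collections import Counter

def label_counts(labels):
    """Count how often each label value occurs (hypothetical helper)."""
    return dict(Counter(labels))

# Toy stand-in for the labels: class 1 first, then class 0,
# concatenated in the same order as Data_label.
toy_labels = [1] * 6 + [0] * 4

# Positional 80/20 cut, without shuffling.
cut = round(len(toy_labels) * 0.8)
train_part, valid_part = toy_labels[:cut], toy_labels[cut:]

print(label_counts(train_part))  # counts per class in the training part
print(label_counts(valid_part))  # counts per class in the validation part
```

If one part turns out to contain only a single class, an accuracy measured on that part says little about how well the two digits are being told apart.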

Train_dset = list(zip(Data_train, Label_train))

Valid_dset = list(zip(Data_valid, Label_valid))

Train_MiniBatch = DataLoader(Train_dset, batch_size = 256)

Valid_MiniBatch = DataLoader(Valid_dset, batch_size = 256)

dls = DataLoaders(Train_MiniBatch, Valid_MiniBatch)

learn = Learner(dls, nn.Linear(28*28,1), opt_func=SGD,
                loss_func=loss_fx, metrics=batch_accuracy)
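
For comparison, a split that shuffles the rows before cutting could look like the sketch below. It is a minimal illustration under assumptions, not the lesson's method: `shuffled_split_indices` and its `seed` parameter are hypothetical names, and it uses only the standard library so it runs without the dataset.

```python
import random

def shuffled_split_indices(n, ratio, seed=42):
    """Return (train_idx, valid_idx) after shuffling range(n).

    Hypothetical helper: `ratio` is the fraction of items used for training.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # local RNG, so the split is reproducible
    cut = round(n * ratio)
    return idx[:cut], idx[cut:]

train_idx, valid_idx = shuffled_split_indices(10, 0.8)
print(len(train_idx), len(valid_idx))  # 8 2

# With the tensors from this post, the same index lists would be applied to
# both the data and the labels so that each image stays paired with its label:
#   Data_train,  Data_valid  = Data[train_idx],       Data[valid_idx]
#   Label_train, Label_valid = Data_label[train_idx], Data_label[valid_idx]
```

Generating one set of indices and reusing it for both tensors is the design choice here; shuffling the data and the labels separately would break the pairing.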