I went through the Chapter 13 guide on building my own CNN and got a learner working for the full MNIST set and also for my color bear images. All of this relied on the fast.ai DataBlock to organize my data. I had also previously used Chapter 4 to build DataLoaders from scratch for a simple neural network, for both the full MNIST set and my bear images.

I'm trying to complete the circle by updating my basic custom DataLoaders to work with my simple convolutional neural network, and I'm stuck. Everything in my learner works perfectly when I assemble my data with a DataBlock, but when I use my own DataLoaders the model does not learn. There are no errors, but accuracy barely moves, and not necessarily in a positive direction.
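One debugging step I'm considering before digging into the data itself: overfit a single batch with a plain-PyTorch loop, to confirm the training machinery can drive the loss down at all. A minimal sketch, using random stand-in data and a throwaway CNN (none of this is my real model or data):

```python
import torch
import torch.nn.functional as F
from torch import nn

torch.manual_seed(0)

# Random stand-in data: one batch of 8 RGB "images" and labels in {0, 1, 2}.
xb = torch.randn(8, 3, 64, 64)
yb = torch.randint(0, 3, (8,))

# A throwaway CNN, just for the check.
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(8, 3),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

# Train repeatedly on the same batch; the loss should fall
# if the loop and loss are wired up correctly.
losses = []
for _ in range(50):
    loss = F.cross_entropy(model(xb), yb)
    opt.zero_grad()
    loss.backward()
    opt.step()
    losses.append(loss.item())

print(f'first loss {losses[0]:.3f}, last loss {losses[-1]:.3f}')
```

If a check like this learns but my real pipeline doesn't, the problem is presumably in how I assemble the batches rather than in the model.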

Here’s the relevant code:

```python
grizzlies = (path/'grizzly').ls().sorted()
blacks = (path/'black').ls().sorted()
teddies = (path/'teddy').ls().sorted()

grizzly_images = [tensor(Image.open(o).convert('RGB').resize([224, 224])) for o in grizzlies]
black_images = [tensor(Image.open(o).convert('RGB').resize([224, 224])) for o in blacks]
teddy_images = [tensor(Image.open(o).convert('RGB').resize([224, 224])) for o in teddies]

stacked_grizzlies = torch.stack(grizzly_images).float()/255
stacked_blacks = torch.stack(black_images).float()/255
stacked_teddies = torch.stack(teddy_images).float()/255
stacked_grizzlies.shape, stacked_blacks.shape, stacked_teddies.shape

# Hold out 20% of each class for validation
valid_g_size = len(stacked_grizzlies) // 5
valid_b_size = len(stacked_blacks) // 5
valid_t_size = len(stacked_teddies) // 5
valid_g_size, valid_b_size, valid_t_size

valid_xg = stacked_grizzlies[:valid_g_size]
train_xg = stacked_grizzlies[valid_g_size:]
valid_xb = stacked_blacks[:valid_b_size]
train_xb = stacked_blacks[valid_b_size:]
valid_xt = stacked_teddies[:valid_t_size]
train_xt = stacked_teddies[valid_t_size:]

train_x = torch.cat([train_xg, train_xb, train_xt]).cuda()
train_x.shape
train_x = train_x.permute(0, 3, 1, 2)  # NHWC -> NCHW
train_x.shape

train_y = tensor([0]*len(train_xg) + [1]*len(train_xb) + [2]*len(train_xt)).cuda().unsqueeze(1)
train_y.shape

dset = list(zip(train_x, train_y))

# We can take a look at the first item in our list to get a better idea of
# what is going on: the image info in x and the category in y.
x, y = dset[0]
x.shape, y.shape

valid_x = torch.cat([valid_xg, valid_xb, valid_xt]).cuda()
valid_x = valid_x.permute(0, 3, 1, 2)
valid_y = tensor([0]*len(valid_xg) + [1]*len(valid_xb) + [2]*len(valid_xt)).cuda().unsqueeze(1)
valid_dset = list(zip(valid_x, valid_y))

dl = DataLoader(dset, batch_size=64)
valid_dl = DataLoader(valid_dset, batch_size=64)
dls = DataLoaders(dl, valid_dl)
```

At the end, I get the exact same tensor shape for my first batch: (64, 3, 224, 224). I can also display the first image from my DataLoaders, so they are still valid images. Any ideas what I am missing?
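One thing I'm not sure about is the `unsqueeze(1)` on my targets, which makes them shape (N, 1) rather than the (N,) that the DataBlock pipeline seems to produce. A toy check (plain torch, made-up numbers) of what `F.cross_entropy` accepts for the target shape:

```python
import torch
import torch.nn.functional as F

preds = torch.randn(4, 3)             # 4 samples, 3 classes
targets = torch.tensor([0, 1, 2, 0])  # class indices, shape (4,)

# cross_entropy expects integer class indices of shape (N,)
ok = F.cross_entropy(preds, targets)
print('shape (N,): loss =', ok.item())

# an extra trailing dimension, shape (N, 1), is rejected
try:
    F.cross_entropy(preds, targets.unsqueeze(1))
    print('shape (N, 1): accepted')
except RuntimeError:
    print('shape (N, 1): rejected')
```

I mention it only because my from-scratch loss from Chapter 4 might be broadcasting over that extra dimension silently instead of erroring.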

Here is a link to the full notebook: Google Colaboratory

Thanks,

Alex