I am trying to do Chapter 4 with the full MNIST set.
I did this to make my train_y and valid_y tensors:
train_y = tensor([0]*len(zeroes) +
                 [1]*len(ones) +
                 [2]*len(twos) +
                 [3]*len(threes) +
                 [4]*len(fours) +
                 [5]*len(fives) +
                 [6]*len(sixes) +
                 [7]*len(sevens) +
                 [8]*len(eights) +
                 [9]*len(nines)).unsqueeze(1)
train_x.shape, train_y.shape
(torch.Size([60414, 784]), torch.Size([60000, 1]))
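To double-check that `unsqueeze` itself can't be dropping anything, I tried a minimal standalone snippet (plain PyTorch, made-up data, not my actual tensors):

```python
import torch

# A small label tensor, built the same way as train_y above
# but with made-up data.
t = torch.tensor([0, 1, 2, 3, 4])

# unsqueeze(1) only adds a trailing dimension; the number of
# entries is unchanged: (5,) -> (5, 1).
print(t.shape)               # torch.Size([5])
print(t.unsqueeze(1).shape)  # torch.Size([5, 1])
```

So the length of `train_y` should just be the total length of the label lists going in.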
valid_x = torch.cat([valid_0_tens,
                     valid_1_tens,
                     valid_2_tens,
                     valid_4_tens,
                     valid_5_tens,
                     valid_6_tens,
                     valid_7_tens,
                     valid_8_tens,
                     valid_9_tens,
                     ]).view(-1, 28*28)
valid_y = tensor([0]*len(valid_0_tens) +
                 [1]*len(valid_1_tens) +
                 [2]*len(valid_2_tens) +
                 [3]*len(valid_3_tens) +
                 [4]*len(valid_4_tens) +
                 [5]*len(valid_5_tens) +
                 [6]*len(valid_6_tens) +
                 [7]*len(valid_7_tens) +
                 [8]*len(valid_8_tens) +
                 [9]*len(valid_9_tens)).unsqueeze(1)
valid_dset = list(zip(valid_x,valid_y))
valid_x.shape, valid_y.shape
(torch.Size([8990, 784]), torch.Size([10000, 1]))
As you can see, the operation seems to have "removed entries" (for lack of a better word) from train_x when making train_y, and likewise "added entries" to valid_x when making valid_y.
Why does it return more or fewer entries in train_y than in train_x? As I understand it, it's the exact same operation as in the chapter itself, on tensors of the same shape, so it shouldn't change anything. I suspect there's some kind of default limit, since the cutoff is so clean (exactly 60,000 and 10,000), but the PyTorch documentation is not too illuminating. Any suggestions? I feel rather dumb for asking, but I am stuck.
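For reference, this is the behavior I'd expect to hold (a synthetic stand-in, not my actual data): `torch.cat` produces exactly the sum of the pieces' row counts, and labels built from the same lengths line up with it.

```python
import torch

# Synthetic stand-in for the per-digit stacks: three "digit"
# groups with 3, 2, and 4 samples of 28*28 pixels each.
groups = [torch.zeros(3, 28*28), torch.zeros(2, 28*28), torch.zeros(4, 28*28)]
x = torch.cat(groups)

# Labels built the same way as train_y/valid_y above.
y = torch.tensor([0]*3 + [1]*2 + [2]*4).unsqueeze(1)

# cat's row count is exactly the sum of the pieces' lengths,
# so x and y must match if the same groups went into both.
print(x.shape)  # torch.Size([9, 784])
print(y.shape)  # torch.Size([9, 1])
```

So if the x and y shapes disagree, it seems like the set of pieces going into x must differ from the lists going into y — unless I'm missing something.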