# Part 1, Chapter 4 newbie question

I am trying to do Chapter 4 with the full MNIST set.

I did this to make my train_y and valid_y tensors:

```python
train_y = tensor([0]*len(zeroes) +
                 [1]*len(ones) +
                 [2]*len(twos) +
                 [3]*len(threes) +
                 [4]*len(fours) +
                 [5]*len(fives) +
                 [6]*len(sixes) +
                 [7]*len(sevens) +
                 [8]*len(eights) +
                 [9]*len(nines)).unsqueeze(1)

train_x.shape,train_y.shape
```

(torch.Size([60414, 784]), torch.Size([60000, 1]))

```python
valid_x = torch.cat([valid_0_tens,
                     valid_1_tens,
                     valid_2_tens,
                     valid_4_tens,
                     valid_5_tens,
                     valid_6_tens,
                     valid_7_tens,
                     valid_8_tens,
                     valid_9_tens,
                     ]).view(-1, 28*28)

valid_y = tensor([0]*len(valid_0_tens) +
                 [1]*len(valid_1_tens) +
                 [2]*len(valid_2_tens) +
                 [3]*len(valid_3_tens) +
                 [4]*len(valid_4_tens) +
                 [5]*len(valid_5_tens) +
                 [6]*len(valid_6_tens) +
                 [7]*len(valid_7_tens) +
                 [8]*len(valid_8_tens) +
                 [9]*len(valid_9_tens)).unsqueeze(1)

valid_dset = list(zip(valid_x,valid_y))
```

(torch.Size([8990, 784]), torch.Size([10000, 1]))

As you can see, the operation seems to have "removed entries" (for lack of a better word) from train_x in making train_y, and likewise "added entries" to valid_x when making valid_y.

Why does train_y end up with fewer entries than train_x, and valid_y with more entries than valid_x? As I understand it, it's the exact same operation as in the tutorial proper, with the same tensor shape, so it shouldn't change anything. I suspect there's some kind of default limit since the cutoff is so clean, but the PyTorch documentation is not too illuminating. Any suggestions? I feel rather dumb for asking, but I am stuck.

Hi and welcome! All I can say is that the sizes of your target tensors are correct. MNIST has 60,000 training items and 10,000 validation items. That's where the clean numbers come from; it's not a PyTorch limitation.

Can you maybe share your notebook? It's hard to tell where the error comes from.


Hi Johannes,
Yes, absolutely:

Thank you for the assist

Found the errors

When converting the images to tensors, you're using the "seven" images for the "eight" tensors.

In the validation set, you missed the "threes".
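As a quick sanity check, the row counts in the question are consistent with these two mistakes. The sketch below assumes the commonly cited per-digit counts for the standard MNIST split (they are not taken from the notebook) and just redoes the arithmetic:

```python
# Assumed per-digit image counts for the standard MNIST split (digits 0 to 9).
train_counts = [5923, 6742, 5958, 6131, 5842, 5421, 5918, 6265, 5851, 5949]
valid_counts = [980, 1135, 1032, 1010, 982, 892, 958, 1028, 974, 1009]

# Error 1: the "eight" stack was built from the "seven" images,
# so the sevens are counted twice and the eights not at all.
train_x_rows = sum(train_counts) - train_counts[8] + train_counts[7]

# Error 2: the threes were left out of valid_x.
valid_x_rows = sum(valid_counts) - valid_counts[3]

print(train_x_rows, sum(train_counts))  # 60414 60000
print(valid_x_rows, sum(valid_counts))  # 8990 10000
```

Under that assumption, the 414 extra rows in train_x are the difference between the number of sevens and eights, and the 1,010 rows missing from valid_x are the threes.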

That should fix it. Maybe you can figure out a way to load the images in a loop instead of writing all the code by hand, which is just asking for trouble.
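One possible way to do that, as a rough sketch: build the inputs and the labels in the same loop, so they can't drift apart. `stacks` and `build_xy` are made-up names for illustration, and the random tensors only stand in for the real per-digit image stacks from the notebook.

```python
import torch

# Hypothetical stand-in for the per-digit image stacks; in the notebook these
# would be built from the image files for each digit.
stacks = {digit: torch.rand(5 + digit, 28, 28) for digit in range(10)}

def build_xy(stacks):
    """Build inputs and labels from the same per-digit stacks in one loop."""
    xs, ys = [], []
    for digit in sorted(stacks):
        imgs = stacks[digit]
        # Flatten each 28x28 image into a row of 784 values.
        xs.append(imgs.view(-1, 28 * 28))
        # One label per image, taken from the same stack, shape (n, 1).
        ys.append(torch.full((len(imgs), 1), digit, dtype=torch.long))
    return torch.cat(xs), torch.cat(ys)

x, y = build_xy(stacks)
assert len(x) == len(y)  # rows and labels always match by construction
print(x.shape, y.shape)
```

Because each digit's labels are sized from the very stack that supplies its rows, a copy-paste slip like a skipped or duplicated digit shows up in both tensors at once instead of as a shape mismatch later.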


It works! And yeah, a loop does make a lot more sense.

Thanks again ^_^.