MNIST Sample with Cross Entropy

Hello

I'm working on chapter 5, multi-class classification with cross-entropy.

I took the MNIST Sample data and tried to classify 3s vs. 7s with cross-entropy.

You can find the code in Google Colab: mnist_with_cross_entropy

Code

from fastai.vision.all import *
path = untar_data(URLs.MNIST_SAMPLE)
path
Path('D:/DATA/y.iqbal/.fastai/data/mnist_sample')
# load each training image as a [28,28] float tensor, scaled to [0,1]
stacked_threes = torch.stack([tensor(Image.open(o))/255 for o in (path/'train/3').ls()])
stacked_sevens = torch.stack([tensor(Image.open(o))/255 for o in (path/'train/7').ls()])
stacked_threes.shape, stacked_sevens.shape
(torch.Size([6131, 28, 28]), torch.Size([6265, 28, 28]))
valid_stacked_threes = torch.stack([tensor(Image.open(o))/255 for o in (path/'valid/3').ls()])
valid_stacked_sevens = torch.stack([tensor(Image.open(o))/255 for o in (path/'valid/7').ls()])
valid_stacked_threes.shape, valid_stacked_sevens.shape
(torch.Size([1010, 28, 28]), torch.Size([1028, 28, 28]))
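Each file opens as a 28x28 grayscale image; dividing by 255 scales the pixel values to [0,1]. A quick check one could run (not in the original notebook):

img = tensor(Image.open((path/'train/3').ls()[0]))
img.shape   # torch.Size([28, 28]), raw pixel values in 0..255 before scaling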
train_x = concat(stacked_threes,stacked_sevens).view(-1,28*28)
# label 1 for every 3, label 0 for every 7
train_y = concat(tensor([1]*len(stacked_threes)),tensor([0]*len(stacked_sevens))).unsqueeze(1)
train_x.shape,train_y.shape
(torch.Size([12396, 784]), torch.Size([12396, 1]))
valid_x = concat(valid_stacked_threes,valid_stacked_sevens).view(-1,28*28)
valid_y = concat(tensor([1]*len(valid_stacked_threes)),tensor([0]*len(valid_stacked_sevens))).unsqueeze(1)
valid_x.shape,valid_y.shape
(torch.Size([2038, 784]), torch.Size([2038, 1]))
dset = list(zip(train_x,train_y))
valid_dset = list(zip(valid_x,valid_y))
dl = DataLoader(dset,batch_size=256)
valid_dl = DataLoader(valid_dset,batch_size=256)
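As a sanity check (not in the original notebook), the first batch can be inspected with first, which fastai's imports provide:

xb,yb = first(dl)
xb.shape, yb.shape   # expect (torch.Size([256, 784]), torch.Size([256, 1]))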
def init_params(size,std=1.0):
    # random parameters, tracked for gradients
    return (torch.randn(size)*std).requires_grad_()
# two-layer net: 784 inputs -> 5 hidden units -> 2 output classes
weights1 = init_params((28*28,5))
bias1 = init_params(5)
weights2 = init_params((5,2))
bias2 = init_params(2)
def linear1(xb):
    res = xb@weights1+bias1       # first linear layer
    res = res.max(tensor(0.0))    # ReLU
    res = res@weights2+bias2      # second linear layer, two output logits
    return res
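Despite its name, linear1 is a small two-layer net. The same model expressed with PyTorch modules, as an equivalent sketch (not part of the original code):

simple_net = nn.Sequential(
    nn.Linear(28*28, 5),   # plays the role of weights1, bias1
    nn.ReLU(),             # res.max(tensor(0.0))
    nn.Linear(5, 2),       # plays the role of weights2, bias2
)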
def mnist_loss(preds,targets):
    # F.cross_entropy wants integer class indices of shape [batch],
    # so flatten the [batch,1] targets first
    targets = targets.T.squeeze()
    return F.cross_entropy(preds,targets)
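F.cross_entropy expects raw logits of shape [batch, classes] and integer class indices of shape [batch], which is why the targets are flattened. A minimal illustration with made-up values:

logits = tensor([[3.0, -1.0],
                 [0.5,  2.0]])    # two samples, two classes
targets = tensor([0, 1])          # class indices, not one-hot
F.cross_entropy(logits, targets)  # scalar loss, averaged over the batch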
def calc_grad(xb,yb):
    preds = linear1(xb)
    loss = mnist_loss(preds,yb)
    loss.backward()
params = weights1,bias1,weights2,bias2
def train_epoch():
    for xb,yb in dl:
        calc_grad(xb,yb)
        for p in params:
            p.data -= p.grad*0.1   # SGD step with learning rate 0.1
            p.grad.zero_()
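The manual update is the same step torch.optim.SGD would take; an equivalent sketch (not in the original code):

opt = torch.optim.SGD(params, lr=0.1)
def train_epoch_with_opt():
    for xb,yb in dl:
        calc_grad(xb,yb)
        opt.step()        # p.data -= p.grad*0.1 for every param
        opt.zero_grad()   # p.grad.zero_()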
def valid_batch_accu(preds,yb):
    # predicted class = index of the larger logit; compare with the labels
    corrects = torch.argmax(preds,dim=1).unsqueeze(1) == yb
    return corrects.float().mean()
def validate_epoch():
    accu = [valid_batch_accu(linear1(xb),yb) for xb,yb in valid_dl]
    return torch.stack(accu).mean()
for i in range(10):
    train_epoch()
    print(validate_epoch())
tensor(0.6342)
tensor(0.6782)
tensor(0.7363)
tensor(0.7773)
tensor(0.8100)
tensor(0.8335)
tensor(0.8486)
tensor(0.8623)
tensor(0.8755)
tensor(0.8857)
a_7_image = tensor(Image.open((path/'valid/7/9711.png')))/255
ten_sor = (a_7_image).view(-1,28*28)   # flatten to a batch of one: [1, 784]
ten_sor.shape
torch.Size([1, 784])
linear1(ten_sor)
tensor([[ 3.6760, -2.5360]], grad_fn=<AddBackward0>)

The code reaches nearly 89% validation accuracy after ten epochs.

But when I test a 7 image, it gives the wrong result. Please check the end of the code above, where a single 7 image is tested against the trained network.

Does this mean the approach (code) is wrong?

Can anybody identify the mistake being made here and point me in the right direction?

Thanks!

Hey, the code is correct :slight_smile:
Have a look at:

train_y = concat(tensor([1]*len(stacked_threes)),tensor([0]*len(stacked_sevens))).unsqueeze(1)

and

valid_y = concat(tensor([1]*len(valid_stacked_threes)),tensor([0]*len(valid_stacked_sevens))).unsqueeze(1)

You are assigning the label 1 to all 3s, and all 7s get the label 0. The result of your prediction was:

tensor([[ 3.6760, -2.5360]], grad_fn=<AddBackward0>)

The first value is bigger than the second, and we would interpret that as the model predicting the first label. Python indices start at 0, so the label you are predicting is 0, which is the label for 7s.
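In code, with the tensors from the post (a small sketch):

pred = linear1(ten_sor)   # tensor([[ 3.6760, -2.5360]])
pred.argmax(dim=1)        # tensor([0]): index 0, the label assigned to the 7s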

Hope that makes sense.


Yes, that totally makes sense.

I was unable to see such a simple interpretation of the prediction.

Thank You :slightly_smiling_face:
