I am trying to figure out how to use an appropriate loss function in the full implementation of the MNIST dataset.

From my research I believe I have to change the original loss function to `nn.CrossEntropyLoss()`, since this is now a multi-class classification problem.

This is my setup:

```
# Train dataset
train_x = torch.cat(train_stacked_nums).view(-1,28*28)
train_y = torch.cat([tensor([i]*len(t)) for i,t in enumerate(train_stacked_nums)]).unsqueeze(1)
train_dset = list(zip(train_x, train_y))
# Validation dataset
valid_x = torch.cat(valid_stacked_nums).view(-1,28*28)
valid_y = torch.cat([tensor([i]*len(t)) for i,t in enumerate(valid_stacked_nums)]).unsqueeze(1)
valid_dset = list(zip(valid_x, valid_y))
print(train_x.shape, train_y.shape, valid_x.shape, valid_y.shape)
```
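As a sanity check, the label-building line can be run on a toy stand-in for `train_stacked_nums` (the tensors below are made up purely for illustration):

```python
import torch
from torch import tensor

# Toy stand-in: two "digit" groups of different sizes (3 images of digit 0, 2 of digit 1)
stacked = [torch.zeros(3, 28, 28), torch.ones(2, 28, 28)]

# Same label construction as above: one class index per image, then add a column dim
y = torch.cat([tensor([i] * len(t)) for i, t in enumerate(stacked)]).unsqueeze(1)
print(y.shape)               # torch.Size([5, 1])
print(y.squeeze().tolist())  # [0, 0, 0, 1, 1]
```

Note that this means the labels come out sorted by digit, so the first rows of `train_y` are all zeros.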

The shapes of my training and validation Xs and Ys are:

```
torch.Size([60000, 784]), torch.Size([60000, 1]), torch.Size([10000, 784]), torch.Size([10000, 1])
```

I am using the same linear function:

```
# Create a function for matrix multiplication
def linear1(xb): return xb @ weights + bias
preds = linear1(train_x)
preds.shape, preds[0]
```

The shape of the predictions and a sample of the first one:

```
(torch.Size([60000, 1]), tensor([16.6286], grad_fn=<SelectBackward0>))
```
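For reference, the one-logit-per-image output follows directly from the single-column weight shape; a minimal shape check with made-up random data:

```python
import torch

torch.manual_seed(0)
x = torch.randn(4, 784)        # a small batch of flattened images
weights = torch.randn(784, 1)  # one output column -> one scalar per image
bias = torch.zeros(1)

preds = x @ weights + bias     # (4, 784) @ (784, 1) -> (4, 1)
print(preds.shape)             # torch.Size([4, 1])
```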

I thought of defining the loss function as:

```
# Loss function
def mnist_loss_cel(preds, targs):
    l = nn.CrossEntropyLoss()
    return l(preds, targs.squeeze())
```
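For comparison, my understanding is that `nn.CrossEntropyLoss` expects raw logits of shape `(N, C)` (one column per class) and a 1-D long tensor of class indices; a toy 10-class example with random values, just for illustration:

```python
import torch
from torch import nn

loss_fn = nn.CrossEntropyLoss()

# 10-class setup: logits are (N, C), targets are 1-D class indices in [0, C)
logits = torch.randn(4, 10)
targs = torch.tensor([3, 1, 0, 7])
loss = loss_fn(logits, targs)
print(loss)  # a positive scalar
```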

But when I run a test it does not seem to work:

```
# Initialise the weights
weights = init_params((28*28,1))
bias = init_params(1)
## Creating the data loaders
# Training data loader
dl = DataLoader(train_dset, batch_size=256)
# Validation data loader
valid_dl = DataLoader(valid_dset, batch_size=256)
# Create a batch for testing
batch = train_x[:4]
# Predict the result
preds = linear1(batch)
```

The result of preds is:

```
tensor([[ 9.1690],
        [-1.9303],
        [ 8.9400],
        [ 2.8060]], grad_fn=<AddBackward0>)
```

But the result of my loss is always zero:

```
loss = mnist_loss_cel(preds,train_y[:4])
loss
```

The result I keep getting is:

```
tensor(0., grad_fn=<NllLossBackward0>)
```
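The zero is reproducible with a minimal example: with a single logit column, the softmax over that one "class" is always 1, so its log is 0 and the loss is exactly zero whenever the target index is 0 (and, per the label construction above, the first rows of `train_y` are all 0):

```python
import torch
import torch.nn.functional as F

one_logit = torch.randn(4, 1)             # same (N, 1) shape as my preds
zeros = torch.zeros(4, dtype=torch.long)  # the first MNIST labels are all 0
loss = F.cross_entropy(one_logit, zeros)
print(loss)  # tensor(0.)
```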

I ran a test with the original loss function:

```
# Original loss function
def mnist_loss(preds, targs):
preds = preds.sigmoid()
return torch.where(targs==1, 1-preds, preds).mean()
# Calculate the loss
loss = mnist_loss(preds,train_y[:4])
loss
```

and I got what I would think is an appropriate result:

```
tensor(0.5406, grad_fn=<MeanBackward0>)
```
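For completeness, the original loss behaves sensibly on hand-picked values (the numbers below are made up): a confident correct "1" and a confident correct "0" should both give a small penalty.

```python
import torch

def mnist_loss(preds, targs):
    preds = preds.sigmoid()
    return torch.where(targs == 1, 1 - preds, preds).mean()

preds = torch.tensor([[2.0], [-2.0]])  # confident "is 1", confident "is 0"
targs = torch.tensor([[1], [0]])
loss = mnist_loss(preds, targs)
print(loss)  # roughly 0.1192, i.e. 1 - sigmoid(2)
```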

What am I missing and why isn’t it working properly?