Lesson 4: PyTorch, ImageDataLoaders and Categories

Hello!

I’ve been going through the course and ended up potentially painting myself into a corner trying to get the full MNIST dataset running against a PyTorch nn.

Let me go through what I’ve got thus far :smiley:

# Get the mnist dataset
path = untar_data(URLs.MNIST)
# Use ImageDataLoaders to get a DataLoader. 
# Customize the path variables, and the image class to make it not be RGB
images = ImageDataLoaders.from_folder(path, train='training', valid='testing', img_cls=PILImageBW)
# Create a pytorch nn
mnist_net = nn.Sequential(
    # Original dataset is still 28x28, we want it to be 1x28*28
    nn.Flatten(),
    nn.Linear(28*28, 30),
    nn.ReLU(),
    nn.Linear(30, 10),
    nn.ReLU(),
)

def batch_accuracy(xb, yb):
    # use mse_loss from pytorch
    # Cast TensorImageBW into Tensor so we can actually use it for math
    # Transform Category into one_hot so that we can have a function with a gradient
    return F.mse_loss(xb.as_subclass(Tensor), F.one_hot(yb, 10))

# Create a learner and train
learn = Learner(images, mnist_net, opt_func=SGD, loss_func=mnist_loss, metrics=accuracy)

learn.fit(2, .1)

This all breaks down and fails with the error message

RuntimeError: Found dtype Long but expected Float

I get this to some degree, but I’m not sure where the Long is coming from. I suspect it’s because I couldn’t cast the Category into a one_hot earlier, while processing the dataset, but I’m not sure.
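Poking at the dtypes in isolation seems to confirm that suspicion: F.one_hot itself returns a Long (int64) tensor, while F.mse_loss wants both arguments as Float. A quick check with a made-up batch (the shapes here are just illustrative):

```python
import torch
import torch.nn.functional as F

preds = torch.rand(4, 10)                            # model output: float32
targets = F.one_hot(torch.tensor([3, 1, 0, 7]), 10)  # one-hot encoded labels

print(preds.dtype)    # torch.float32
print(targets.dtype)  # torch.int64 -- i.e. Long, which mse_loss rejects

# Casting the one-hot targets to float avoids the dtype mismatch:
print(F.mse_loss(preds, targets.float()).dtype)  # torch.float32
```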

So, I could use some help untangling all this. The amount of casting and contorting I’ve ended up doing here also makes me suspect this isn’t the path of least resistance, but my understanding is that it should still be doable.

My possibly misguided but more tactical questions here are:

  • Is there any way to make the DataLoader turn the category into a one_hot before I feed it to the nn?
  • Can I have the DataLoader flatten the images on its own, so that I can ditch the Flatten layer in the nn?

And I guess the overarching question to all this: what else am I doing wrong? :smiley:

Thank you

Hi,

Sorry, I don’t have answers to your questions, but I did something similar: I trained on the MNIST dataset using PyTorch and wrote a blog post about it.

I recommend taking your time, building up the model slowly, and checking the output of every function you use. If it’s too overwhelming, you can skip this part and come back to it later.

Have fun.

I tried to reproduce the error, but running your code I found something else:

RuntimeError: The size of tensor a (64) must match the size of tensor b (10) at non-singleton dimension 1
Cell In[7], line 28 learn.fit(2, .1)

This is because you specified nn.Linear(30, 10) instead of nn.Linear(30, 1). Running the following code works for me:

def mnist_loss(predictions, targets):
    return torch.where(targets == 1, 1-predictions, predictions).mean()

# Get the mnist dataset
path = untar_data(URLs.MNIST)
# Use ImageDataLoaders to get a DataLoader. 
# Customize the path variables, and the image class to make it not be RGB
images = ImageDataLoaders.from_folder(path, train='training', valid='testing', img_cls=PILImageBW)
# Create a pytorch nn
mnist_net = nn.Sequential(
    # Original dataset is still 28x28, we want it to be 1x28*28
    nn.Flatten(),
    nn.Linear(28*28, 30),
    nn.ReLU(),
    nn.Linear(30, 10),
    nn.ReLU(),
)

def batch_accuracy(xb, yb):
    # use mse_loss from pytorch
    # Cast TensorImageBW into Tensor so we can actually use it for math
    # Transform Category into one_hot so that we can have a function with a gradient
    return F.mse_loss(xb.as_subclass(Tensor), F.one_hot(yb, 10))

# Create a learner and train
learn = Learner(images, mnist_net, opt_func=SGD, loss_func=mnist_loss, metrics=accuracy)

learn.fit(2, .1)

Now, this probably isn’t what you need, but running this locally on my PC, I couldn’t reproduce the error. Maybe you could update your post if it turns out it’s missing some information, and I could look into it again.
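For what it’s worth, the size-mismatch error above can be reproduced in isolation. Assuming a batch size of 64, mnist_loss ends up asking torch.where to broadcast the (64,) Category labels against the (64, 10) predictions, which fails:

```python
import torch

predictions = torch.rand(64, 10)       # model output: 64 rows of 10 logits
targets = torch.randint(0, 10, (64,))  # Category labels: 64 integers

# torch.where cannot broadcast shape (64,) against (64, 10):
try:
    torch.where(targets == 1, 1 - predictions, predictions)
except RuntimeError as e:
    print(e)  # size mismatch at non-singleton dimension 1
```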

Hey friends, thank you so much for the help here.

Coming back to this, I realized I should try getting ChatGPT to help with it as well, and that ended up highlighting that I’d botched both the loss and the accuracy functions.

Went with

def batch_accuracy(preds, targets):
    # Take the index of the highest logit in each row as the predicted class
    _, preds = torch.max(preds, dim=1)
    return (preds == targets).float().mean()

def mnist_loss(preds, targets):
    # Cross-entropy takes raw logits plus integer class labels directly,
    # so no one_hot encoding or dtype casting is needed
    return nn.CrossEntropyLoss()(preds, targets)

and that seemed to work well enough :smiley:
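For anyone landing here later, those two functions can be sanity-checked on a dummy batch; the numbers below are just an illustration. Note that cross-entropy wants the integer labels as-is (as Long), which is exactly why all the one_hot casting disappears:

```python
import torch
import torch.nn as nn

def batch_accuracy(preds, targets):
    # Predicted class = index of the highest logit in each row
    _, preds = torch.max(preds, dim=1)
    return (preds == targets).float().mean()

def mnist_loss(preds, targets):
    # Integer (Long) labels go straight into cross-entropy
    return nn.CrossEntropyLoss()(preds, targets)

# Three rows of logits; the argmaxes are classes 0, 1, 2, while the
# labels are 0, 1, 0 -- so two out of three predictions are correct
preds = torch.tensor([[2.0, 0.1, 0.0],
                      [0.0, 3.0, 0.0],
                      [0.0, 0.0, 1.0]])
targets = torch.tensor([0, 1, 0])

print(batch_accuracy(preds, targets))  # tensor(0.6667)
print(mnist_loss(preds, targets))      # a float scalar, no dtype error
```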