Chapter 4 MNIST digit recogniser

anasjehani · October 1, 2023, 9:32pm

Hi Everyone.

As recommended in the chapter's Further Research section, I tried to make a digit recogniser that works for all 9 digits. I tried the following.

df = pd.read_csv('/kaggle/input/digit-recognizer/train.csv')
dblock = DataBlock(get_x = lambda x: x[1:].values, get_y = lambda y: y[0])
dls = dblock.dataloaders(df,batch_size=34)
learn = cnn_learner(dls, resnet34,loss_func = nn.CrossEntropyLoss(),metrics=error_rate,n_out=9)
learn.fit(10, lr=1e-5)

I received this error

--------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[241], line 1
----> 1 learn.fit(10, lr=1e-5)

File /opt/conda/lib/python3.10/site-packages/fastai/learner.py:264, in Learner.fit(self, n_epoch, lr, wd, cbs, reset_opt, start_epoch)
    262 self.opt.set_hypers(lr=self.lr if lr is None else lr)
    263 self.n_epoch = n_epoch
--> 264 self._with_events(self._do_fit, 'fit', CancelFitException, self._end_cleanup)

File /opt/conda/lib/python3.10/site-packages/fastai/learner.py:199, in Learner._with_events(self, f, event_type, ex, final)
    198 def _with_events(self, f, event_type, ex, final=noop):
--> 199     try: self(f'before_{event_type}');  f()
    200     except ex: self(f'after_cancel_{event_type}')
    201     self(f'after_{event_type}');  final()

File /opt/conda/lib/python3.10/site-packages/fastai/learner.py:253, in Learner._do_fit(self)
    251 for epoch in range(self.n_epoch):
    252     self.epoch=epoch
--> 253     self._with_events(self._do_epoch, 'epoch', CancelEpochException)

File /opt/conda/lib/python3.10/site-packages/fastai/learner.py:199, in Learner._with_events(self, f, event_type, ex, final)
    198 def _with_events(self, f, event_type, ex, final=noop):
--> 199     try: self(f'before_{event_type}');  f()
    200     except ex: self(f'after_cancel_{event_type}')
    201     self(f'after_{event_type}');  final()

File /opt/conda/lib/python3.10/site-packages/fastai/learner.py:247, in Learner._do_epoch(self)
    246 def _do_epoch(self):
--> 247     self._do_epoch_train()
    248     self._do_epoch_validate()

File /opt/conda/lib/python3.10/site-packages/fastai/learner.py:239, in Learner._do_epoch_train(self)
    237 def _do_epoch_train(self):
    238     self.dl = self.dls.train
--> 239     self._with_events(self.all_batches, 'train', CancelTrainException)

File /opt/conda/lib/python3.10/site-packages/fastai/learner.py:199, in Learner._with_events(self, f, event_type, ex, final)
    198 def _with_events(self, f, event_type, ex, final=noop):
--> 199     try: self(f'before_{event_type}');  f()
    200     except ex: self(f'after_cancel_{event_type}')
    201     self(f'after_{event_type}');  final()

File /opt/conda/lib/python3.10/site-packages/fastai/learner.py:205, in Learner.all_batches(self)
    203 def all_batches(self):
    204     self.n_iter = len(self.dl)
--> 205     for o in enumerate(self.dl): self.one_batch(*o)

File /opt/conda/lib/python3.10/site-packages/fastai/learner.py:235, in Learner.one_batch(self, i, b)
    233 b = self._set_device(b)
    234 self._split(b)
--> 235 self._with_events(self._do_one_batch, 'batch', CancelBatchException)

File /opt/conda/lib/python3.10/site-packages/fastai/learner.py:199, in Learner._with_events(self, f, event_type, ex, final)
    198 def _with_events(self, f, event_type, ex, final=noop):
--> 199     try: self(f'before_{event_type}');  f()
    200     except ex: self(f'after_cancel_{event_type}')
    201     self(f'after_{event_type}');  final()

File /opt/conda/lib/python3.10/site-packages/fastai/learner.py:216, in Learner._do_one_batch(self)
    215 def _do_one_batch(self):
--> 216     self.pred = self.model(*self.xb)
    217     self('after_pred')
    218     if len(self.yb):

File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
   1496 # If we don't have any hooks, we want to skip the rest of the logic in
   1497 # this function, and just call forward.
   1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1499         or _global_backward_pre_hooks or _global_backward_hooks
   1500         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501     return forward_call(*args, **kwargs)
   1502 # Do not call functions when jit is used
   1503 full_backward_hooks, non_full_backward_hooks = [], []

File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/container.py:217, in Sequential.forward(self, input)
    215 def forward(self, input):
    216     for module in self:
--> 217         input = module(input)
    218     return input

File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
   1496 # If we don't have any hooks, we want to skip the rest of the logic in
   1497 # this function, and just call forward.
   1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1499         or _global_backward_pre_hooks or _global_backward_hooks
   1500         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501     return forward_call(*args, **kwargs)
   1502 # Do not call functions when jit is used
   1503 full_backward_hooks, non_full_backward_hooks = [], []

File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/container.py:217, in Sequential.forward(self, input)
    215 def forward(self, input):
    216     for module in self:
--> 217         input = module(input)
    218     return input

File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
   1496 # If we don't have any hooks, we want to skip the rest of the logic in
   1497 # this function, and just call forward.
   1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1499         or _global_backward_pre_hooks or _global_backward_hooks
   1500         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501     return forward_call(*args, **kwargs)
   1502 # Do not call functions when jit is used
   1503 full_backward_hooks, non_full_backward_hooks = [], []

File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/conv.py:463, in Conv2d.forward(self, input)
    462 def forward(self, input: Tensor) -> Tensor:
--> 463     return self._conv_forward(input, self.weight, self.bias)

File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/conv.py:459, in Conv2d._conv_forward(self, input, weight, bias)
    455 if self.padding_mode != 'zeros':
    456     return F.conv2d(F.pad(input, self._reversed_padding_repeated_twice, mode=self.padding_mode),
    457                     weight, bias, self.stride,
    458                     _pair(0), self.dilation, self.groups)
--> 459 return F.conv2d(input, weight, bias, self.stride,
    460                 self.padding, self.dilation, self.groups)

RuntimeError: Expected 3D (unbatched) or 4D (batched) input to conv2d, but got input of size: [34, 784]

Do you know how I can fix this error? As an aside, should I be using cnn_learner for pixel data?

BobMcDear · October 2, 2023, 12:49pm

Hello,

There are a few issues with the snippet you have provided. First, ResNet-34 is a convolutional neural net that expects a channel dimension and two spatial dimensions (height and width) per sample. However, your data loader supplies it with flattened MNIST images, which are 784-dimensional vectors and should be reshaped to [1, 28, 28]. That can be achieved by replacing the get_x function with lambda x: x[1:].values.reshape(1, 28, 28).

Second, ResNet-34 operates on three-channelled data specifically, but MNIST contains only one channel because it is not coloured. cnn_learner must be informed of this disparity by passing n_in = 1 to it.

Next, Kaggle’s MNIST dataset is stored as integers in the range [0, 255], whereas the network requires them to be floats scaled down to the range [0, 1]. Therefore, get_x must also convert the data type using astype(np.float32) and thereafter divide by 255.

Finally, there are actually 10 digits in MNIST (0 through 9), so n_out should be 10 as well.

Putting together the aforementioned alterations, the revised code would be,

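df = pd.read_csv('/kaggle/input/digit-recognizer/train.csv')
# Position 0 of each row is the label; the remaining 784 values are pixels.
# Reshape them to [channels, height, width] and scale 0..255 down to [0, 1].
dblock = DataBlock(get_x=lambda x: x[1:].values.reshape(1, 28, 28).astype(np.float32) / 255,
                   get_y=lambda y: y[0])
dls = dblock.dataloaders(df, batch_size=34)
# n_in=1 for single-channel input, n_out=10 for the digits 0 through 9.
learn = cnn_learner(dls, resnet34, loss_func=nn.CrossEntropyLoss(), metrics=error_rate,
                    n_out=10, n_in=1)
learn.fit(10, lr=1e-5)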
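
The get_x transformation above can be checked on its own before fastai gets involved. Here is a minimal sketch that assumes only NumPy. The row is synthetic stand-in data for one line of train.csv (label first, then 784 pixel values), and row_to_image is a hypothetical helper name, not part of fastai.

```python
import numpy as np


def row_to_image(row: np.ndarray) -> np.ndarray:
    """Turn one [label, 784 pixels] row into a float32 array of shape (1, 28, 28)."""
    pixels = row[1:]                       # drop the label in position 0
    image = pixels.reshape(1, 28, 28)      # (channels, height, width)
    return image.astype(np.float32) / 255  # integers 0..255 -> floats in [0, 1]


# Synthetic stand-in for one CSV row: label 7 followed by 784 pixel values.
rng = np.random.default_rng(0)
row = np.concatenate([[7], rng.integers(0, 256, size=784)])

image = row_to_image(row)
label = int(row[0])
print(image.shape, image.dtype, label)
```

A batch of 34 such arrays stacks to [34, 1, 28, 28], which is the 4D input that conv2d asked for in the error message.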

Don’t hesitate to reach out should you encounter further problems.

AllenK · October 2, 2023, 7:08pm

cnn_learner is also deprecated; use vision_learner instead. If you are working through the book, you can find the notebooks in the repo: GitHub - fastai/fastbook: The fastai book, published as Jupyter Notebooks (as far as I can see, they have been updated to use vision_learner). The repo for the current Part 1 course is here: GitHub - fastai/course22: The fast.ai course notebooks
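
For the learner in this thread, the switch is probably just a change of name. This is a minimal sketch, assuming a recent fastai 2.x release in which vision_learner accepts the same n_in and n_out keyword arguments used earlier. build_learner is a hypothetical wrapper, and dls stands for whatever DataLoaders you have already built.

```python
def build_learner(dls):
    """Build a 10-class, single-channel ResNet-34 learner with vision_learner.

    fastai and torch are imported inside the function so that defining it does
    not itself require the libraries; in a notebook you would normally have
    `from fastai.vision.all import *` at the top instead.
    """
    from fastai.vision.all import vision_learner, resnet34, error_rate
    from torch import nn

    return vision_learner(
        dls,
        resnet34,
        n_in=1,    # MNIST is greyscale: one input channel
        n_out=10,  # digits 0 through 9
        loss_func=nn.CrossEntropyLoss(),
        metrics=error_rate,
    )
```

Training would then look the same as before: learn = build_learner(dls), followed by learn.fit(10, lr=1e-5).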

Vision tutorial for multiclass: fastai - Computer vision intro
Datasets, including full MNIST: fastai - External data

anasjehani · October 2, 2023, 10:15pm

I really appreciate your response; it was very thorough and clear. Do you know whether fastai favours images over dataframes? In my code I convert the CSV file into a dataframe, but it’s very slow and I found it unwieldy to figure out. I saw this notebook, Digit Recognizer - FastAI v2 - 2020 | Kaggle, where the author is able to fine-tune much faster by converting to PILImages. Thanks again for your response!

anasjehani · October 2, 2023, 10:16pm

Thanks a lot!

AllenK · October 2, 2023, 11:06pm

Yes. For image data, loading from image files will be faster than reading the equivalent data from a dataframe.

For example, this walkthrough downloads files into folders and then uses the filenames to load the image data and the folders to determine the categories/classes.

Loading images from a file listing and converting to image tensors via PIL is handled in the background for you in the datablock layer, via ImageBlock. You can dig deeper into that in the intermediate tutorials: fastai - Data block tutorial
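
If your data starts out as CSV rows, one way to get to that folder-per-class layout is to write each row out as an image file first. The following is a minimal sketch, assuming NumPy and Pillow are installed. rows_to_png_folders is a hypothetical helper, and the rows are synthetic stand-ins for lines of train.csv (label first, then 784 pixel values).

```python
import tempfile
from pathlib import Path

import numpy as np
from PIL import Image


def rows_to_png_folders(rows: np.ndarray, out_dir: Path) -> list[Path]:
    """Save each [label, 784 pixels] row as out_dir/<label>/<index>.png."""
    paths = []
    for i, row in enumerate(rows):
        label = int(row[0])
        # A 2-D uint8 array becomes a single-channel greyscale image.
        pixels = row[1:].reshape(28, 28).astype(np.uint8)
        class_dir = out_dir / str(label)
        class_dir.mkdir(parents=True, exist_ok=True)
        path = class_dir / f"{i}.png"
        Image.fromarray(pixels).save(path)
        paths.append(path)
    return paths


# Three synthetic rows with labels 3, 3 and 8.
rng = np.random.default_rng(0)
labels = np.array([[3], [3], [8]])
rows = np.hstack([labels, rng.integers(0, 256, size=(3, 784))])

with tempfile.TemporaryDirectory() as tmp:
    saved = rows_to_png_folders(rows, Path(tmp))
    folders = sorted(p.parent.name for p in saved)
    with Image.open(saved[0]) as im:
        reloaded = np.array(im)  # read back while the temp folder still exists
```

A directory written this way has the same shape as the one in the walkthrough, so the folder name can serve as the label.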

---

(Historically, having MNIST data in CSV has been relatively framework- and system-agnostic, as well as easily shareable. Thankfully, most systems and tools can now read image data directly from image files.)

anasjehani · October 2, 2023, 11:32pm

Thanks a lot! I’ve been struggling for a few days to get my learner to cooperate with dataframes, but it was a lot easier to just convert to a PILImage. The book, at least in the beginning, only really shows how to handle images, often starting from untar_data() rather than a CSV file. Thanks again for the help; it was far better than ChatGPT, which just sucked me into a rabbit hole of nonsense.

BobMcDear · October 3, 2023, 11:59am

To add to AllenK’s answer, it is also helpful to bear in mind that for smaller datasets such as MNIST, if resources permit, it is substantially faster to hold the data in memory and read slices of it as needed during training.
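
As a rough illustration of that idea, here is a minimal sketch assuming only NumPy. The arrays are synthetic stand-ins for the full dataset, and iterate_minibatches is a hypothetical helper rather than anything from fastai.

```python
import numpy as np


def iterate_minibatches(images, labels, batch_size, rng):
    """Yield shuffled (images, labels) minibatches sliced from in-memory arrays."""
    order = rng.permutation(len(images))
    for start in range(0, len(images), batch_size):
        idx = order[start:start + batch_size]
        yield images[idx], labels[idx]


# Synthetic stand-in for the whole dataset, held in memory as two arrays.
rng = np.random.default_rng(0)
images = rng.random((100, 1, 28, 28), dtype=np.float32)
labels = rng.integers(0, 10, size=100)

batches = list(iterate_minibatches(images, labels, batch_size=34, rng=rng))
batch_sizes = [len(x) for x, _ in batches]
print(batch_sizes)
```

Each batch here is a plain array index into memory, so no per-item file reads or per-row dataframe lookups happen during training.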