Training MNIST dataset

Hey guys!

I have completed lesson 1, and since then I have been trying out different datasets.
Currently I’m trying to train on the MNIST dataset, but it is extremely slow.

I simply download the whole dataset using the untar_data function.

Then I do this:
learn = cnn_learner(data, models.resnet34, metrics=error_rate)

The training process is slow. It takes more than an hour just to pass 10%.

I tried changing the bs parameter of the ImageDataBunch, but it made no difference.

My question is: how can I train this dataset faster?

What environment are you using? Are you enabling the GPU?

Yes. I’m using Google Colab and GPU is enabled

Interesting. I’ve trained much faster than that on Colab. Can you share the notebook?

Yes. Here is the notebook:

%reload_ext autoreload
%autoreload 2
%matplotlib inline

from fastai.vision import *
from fastai.metrics import error_rate

from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)
root_dir = "/content/gdrive/My Drive/"
base_dir = root_dir + 'fastai-v3/mnist'

path = untar_data(URLs.MNIST_SAMPLE, None, base_dir + '/data')

tfms = get_transforms(do_flip=False)
data = ImageDataBunch.from_folder(path, ds_tfms=tfms, size=26, bs=20)

learn = cnn_learner(data, models.resnet34, metrics=error_rate)

GPU is enabled

Try increasing your batch size. Otherwise I’ll run it myself in a moment.

OK, I tried increasing the batch size but it is still quite slow. The values I tried are 64 and 256.
Is it supposed to be slow for this dataset?

No, normally it’s fast. Give me a little and I’ll run it myself to see if I can debug it.

I could not reproduce it with exactly what you used. One epoch took 19 seconds. Did you go to ‘Runtime’, ‘Change Runtime Type’ and select GPU? Without the GPU it took ~3 minutes.

Here’s the code I used:

from fastai.vision import *
from fastai.metrics import error_rate

path = untar_data(URLs.MNIST_SAMPLE)
data = ImageDataBunch.from_folder(path, ds_tfms=get_transforms(do_flip=False),
learn = cnn_learner(data, models.resnet34, metrics=error_rate)
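
If you want to double-check that the runtime really sees the GPU before training, a quick check along these lines can help. This is only a sketch: `describe_device` is a helper name I made up for illustration, and it relies on PyTorch's `torch.cuda.is_available()`, which fastai builds on.

```python
def describe_device():
    """Report which device PyTorch would use (hypothetical helper, not part of fastai)."""
    try:
        import torch  # installed by default on Colab; imported lazily so the check fails softly
    except ImportError:
        return "torch not installed"
    # True only when a CUDA-capable GPU is visible to this runtime
    return "cuda" if torch.cuda.is_available() else "cpu"

print(describe_device())
```

If this prints `cpu` on Colab, the runtime type is probably not set to GPU, or the session needs a restart after changing it.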

OK, for some reason it suddenly became fast when I restarted the environment. Thank you for the help.

No problem! Glad we got it sorted out :slight_smile:

Does fastai have the Fashion MNIST data? If not, how would I create a data bunch from images and labels in a CSV file?

Fastai has the MNIST handwritten digit dataset, and it comes with a CSV file.
Here is the notebook for you to try. You can download the CSV file to see how it works.

For more detail, please watch the Part 1 videos or read the docs.
You can see that the labels CSV file is in the mnist_sample folder. The mnist_sample folder has a train and a valid folder. Both of them contain one folder per class. For example, the 3 folder has images of the digit 3.

In the end, you need to make sure the CSV file has this format: [image: screenshot of the labels CSV file]
In this case, label 0 stands for class 3 and label 1 stands for class 7.
Alternatively, the labels can be the digits themselves. In that case, I think you can see why the label 3 goes with the images under folder 3, and the same goes for 7.
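
To make the link between the folder layout and the digit-as-label CSV concrete, here is a minimal sketch that walks train/valid class folders and writes a labels file. The helper name `write_labels_csv`, the `name,label` header, and the tiny fake dataset are my own assumptions for illustration, so check them against your actual files.

```python
import csv
import tempfile
from pathlib import Path

def write_labels_csv(root, out_name="labels.csv", splits=("train", "valid")):
    """Write one (relative image path, class folder name) row per PNG under root/<split>/<class>/."""
    root = Path(root)
    rows = []
    for split in splits:
        split_dir = root / split
        if not split_dir.is_dir():
            continue  # skip splits that do not exist
        for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
            for img in sorted(class_dir.glob("*.png")):
                # the folder name (e.g. "3" or "7") is used directly as the label
                rows.append((img.relative_to(root).as_posix(), class_dir.name))
    with open(root / out_name, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "label"])  # assumed header
        writer.writerows(rows)
    return rows

# Tiny fake dataset (empty files) just to show the resulting rows
with tempfile.TemporaryDirectory() as tmp:
    for rel in ("train/3/a.png", "train/7/b.png", "valid/3/c.png"):
        p = Path(tmp) / rel
        p.parent.mkdir(parents=True, exist_ok=True)
        p.touch()
    demo_rows = write_labels_csv(tmp)
    demo_csv = (Path(tmp) / "labels.csv").read_text()

print(demo_csv)
# With fastai v1, a file like this can then be loaded with something like
# ImageDataBunch.from_csv(path, csv_labels='labels.csv', ds_tfms=tfms, size=28)
```

The same idea should carry over to other datasets with one folder per class; only the file extension and the split names would need adjusting.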