Cnn_learner is not learning (losses aren't changing)

Hi all. I’m trying to run my own version of the first exercise and am running into the following problem.
I can load my data with ImageDataLoaders, but when I run cnn_learner on that data the losses don’t change and training takes only about 1 second. Any thoughts? I’ve included the code below to help with diagnosing the problem.

path = '…/input/parrotfish/parrotfish'
fish = ImageDataLoaders.from_folder(path, valid_pct=0.2, seed=3, item_tfms = Resize(224))
len(fish.valid_ds), len(fish.train_ds)

(This returns lengths of 13 and 54, i.e. 13 validation and 54 training images, so the images are recognized.)

fish.valid.show_batch(max_n=3, nrows=1) - just checking to see if images are showing up. So far so good.

learn = cnn_learner(fish, resnet34, metrics=error_rate)

And this happens:

epoch  train_loss  valid_loss  error_rate  time
0      nan         0.000000    0.000000    00:01
epoch  train_loss  valid_loss  error_rate  time
0      nan         0.000000    0.000000    00:01

Sorry for the formatting. Any thoughts out there? Thanks.

Hmm, I am not really sure, but one idea worth checking out would be printing the results of

len(fish.train), len(fish.valid)

That will show the number of batches in the train and valid DataLoaders. I’m interested in that because the default batch size is 64, and there are fewer than 64 images in each of your train and valid datasets, so maybe it’s not forming batches correctly. Another thing you could try would be passing in a smaller batch size to ImageDataLoaders, like this:

ImageDataLoaders.from_folder(path, bs=4, valid_pct=0.2, seed=3, item_tfms=Resize(224))
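To make the batch-size concern concrete, here is a minimal sketch of the arithmetic in plain Python (no fastai needed). It assumes the training DataLoader drops an incomplete final batch while the validation one keeps it, which I believe is fastai's default when shuffling, but treat that as an assumption. `num_batches` is a hypothetical helper written for this post, not a fastai function.

```python
import math

def num_batches(n_items: int, bs: int, drop_last: bool) -> int:
    """Number of batches a loader yields for n_items at batch size bs."""
    if drop_last:
        # An incomplete final batch is discarded.
        return n_items // bs
    # An incomplete final batch is kept.
    return math.ceil(n_items / bs)

# Sizes from the post: 54 training images, 13 validation images.
# ASSUMPTION: training drops the last partial batch, validation keeps it.
train_batches = num_batches(54, 64, drop_last=True)
valid_batches = num_batches(13, 64, drop_last=False)
small_bs_batches = num_batches(54, 4, drop_last=True)

print(train_batches, valid_batches, small_bs_batches)
```

If the training count comes out as zero, no training step ever runs, which would be consistent with a nan train_loss and an epoch that finishes in a second. Comparing this against the real `len(fish.train)` is the quickest way to confirm or rule it out.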

Also, just so you know, you can format code in posts by putting a line of 3 backticks before and after each code block.

Thanks. I figured it out. My training labels weren’t loading properly so there was literally nothing to learn. Thanks in any case.
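For anyone who lands here with the same symptom: the post above doesn't say what exactly was wrong with the labels, so the following is only a generic sanity check, not the fix that was used. It assumes a folder-per-class layout (each image's parent folder name is its label, which is how from_folder labels images as far as I know). `count_labels` and `IMAGE_EXTS` are hypothetical names made up for this sketch.

```python
from collections import Counter
from pathlib import Path

# ASSUMPTION: these extensions cover the images in the dataset.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}

def count_labels(root) -> Counter:
    """Count image files per parent-folder name under root."""
    counts = Counter()
    for p in Path(root).rglob("*"):
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS:
            # The parent folder name stands in for the class label.
            counts[p.parent.name] += 1
    return counts

# Usage (with your own dataset path):
# print(count_labels(path))
```

If this shows only a single label, or labels you didn't expect, the model has nothing meaningful to separate. Within fastai, looking at `fish.vocab` should show the classes the DataLoaders actually picked up.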
