#na# value in valid_loss when doing fit_one_cycle after ImageCleaner

Hello everyone,

I used the ImageCleaner for cleaning up the data as suggested in lesson 2. However, when I try to fit the learner with the new databunch I have two issues.
Issue 1: as the model fits, the valid_loss column shows #na# values. That does not happen with the databunch from before using ImageCleaner.
Issue 2: if I ignore the #na# values and try to use ClassificationInterpretation.from_learner(), then I get: “IndexError: index 0 is out of bounds for axis 0 with size 0”.

Has anyone had the same problem?

If you are getting nan values in valid loss, then maybe you should manually check the validation set.
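
One filesystem-level way to do that check (a minimal sketch; check_valid_set is a helper name I made up, not fastai API; within fastai you can also just look at len(data.valid_ds)) is to count image files per class folder and flag anything that is not an image:

```python
from pathlib import Path

# Common image extensions; adjust for your dataset.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}

def check_valid_set(valid_path):
    """Count image files per class subfolder and collect non-image files.

    An empty counts dict means the validation set is empty, which is
    exactly the situation that produces #na# in valid_loss."""
    counts, non_images = {}, []
    for p in Path(valid_path).rglob("*"):
        if not p.is_file():
            continue
        if p.suffix.lower() in IMAGE_EXTS:
            counts[p.parent.name] = counts.get(p.parent.name, 0) + 1
        else:
            non_images.append(p)
    return counts, non_images
```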

Thanks! I will try to figure out how to do that.

Hello Monica, I am hitting the same problem: all the valid_loss values are #na#. May I ask how you solved it?
I use ImageDataBunch to load the CIFAR-10 dataset.

I am also facing the same issue. @MDGODanille, were you able to solve it?

I was able to solve the issue. I was using

.split_none()

while creating databunch using

ImageList.from_df

instead of

.split_by_rand_pct(0.2)

I hope this might help you as well.
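
For reference, a hedged sketch of the difference: .split_none() leaves the validation set empty (hence #na# for valid_loss), while .split_by_rand_pct(0.2) holds out a random 20%. The fastai calls in the comment follow the v1 data block API; the plain-Python random_split below only illustrates the idea and is not fastai's implementation.

```python
# With fastai v1's data block API the fix is, roughly:
#
#   data = (ImageList.from_df(df, path)
#           .split_by_rand_pct(0.2)
#           .label_from_df()
#           .databunch())
#
# What a 20% random hold-out split does, in plain Python:
import random

def random_split(items, valid_pct=0.2, seed=42):
    """Randomly hold out valid_pct of the items for validation."""
    rng = random.Random(seed)
    idxs = list(range(len(items)))
    rng.shuffle(idxs)
    cut = int(valid_pct * len(items))
    valid = [items[i] for i in idxs[:cut]]
    train = [items[i] for i in idxs[cut:]]
    return train, valid

train, valid = random_split(list(range(100)), valid_pct=0.2)
print(len(train), len(valid))  # 80 20
```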

Sorry Karana, I was very busy the past few weeks and could not look into the code. But Jasmeet suggested something below that we could try.

Sorry I could not look into this before; I was busy for the past few weeks. Thank you for your suggestion, I will give it a try.

Hi Jasmeet, I just tried your solution and it worked as well! Thank you very much!

Great! Glad it worked for you as well :slight_smile:

I'm having the same issue. How would you recommend manually checking the validation set?
I used the my_data.valid_ds and my_data.valid_dl attributes, but I don't understand the outputs well enough for that to be helpful. Do you have another suggestion?

I'm using

np.random.seed(42)
data = ImageDataBunch.from_folder(path, train=".", valid_pct=0.2,
                                  ds_tfms=get_transforms(), size=224,
                                  num_workers=4).normalize(imagenet_stats)

learn = cnn_learner(data, models.resnet34, metrics=error_rate)

learn.fit_one_cycle(4)

My CNN seems to train fine with fit_one_cycle, but when I try learn.lr_find() I get #na# values in the valid_loss column.

You can refer to this post for some details.

In my case, I had not understood the concept of a validation set; through this mistake I learned it. If this happens to you, read up on validation: you need a folder set aside for the validation data. Now it works for me.
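
For illustration, ImageDataBunch.from_folder looks for train and valid subfolders by default, so a layout along these lines (class and file names are just an example) is what you would need:

```
data/
  train/
    cats/  cat001.jpg ...
    dogs/  dog001.jpg ...
  valid/
    cats/  cat101.jpg ...
    dogs/  dog101.jpg ...
```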

I am having exactly the same problem as you. Did you find any solution? I have tried tweaking almost everything, but it keeps happening (even on different datasets). Can someone help me, please…

Hey,

I read the whole thread and I looked over my validation set to make sure it was consistent, but the problem remains the same for me.

Whenever I unfreeze my learner (even before using the cleaner), lr_find() shows #na# for valid_loss on every run, while the training loss seems to compute fine.

However, fit_one_cycle() computes the validation loss just fine!

I explored the documentation for lr_find and other threads, but I can't find a proper, definitive answer as to why lr_find would report valid_loss as #na#.

Anybody covered this already ?

I have the same issue, although I’m still able to run the later lines of code successfully. Can anyone else verify? It might just be a benign bug.

I had the same issue while using the from_folder method. On inspection I found that the valid LabelList had zero items, so I checked my validation folder and found a file that was not an image: the .DS_Store file that macOS creates. I deleted it and everything worked perfectly. So my advice: check whether the folder containing the validation data has any non-image files. These files may be hidden; use the terminal with ls -la to list all of them.
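
If it helps, that cleanup can be scripted too. This is a small sketch (remove_hidden_files is my own helper, not a fastai function), assuming anything whose name starts with a dot is safe to delete from the data folders:

```python
from pathlib import Path

def remove_hidden_files(folder):
    """Delete hidden dotfiles (e.g. macOS .DS_Store) anywhere under folder,
    since they would otherwise be picked up as (unreadable) images when
    the databunch is built."""
    removed = []
    for p in Path(folder).rglob(".*"):
        if p.is_file():
            p.unlink()
            removed.append(p.name)
    return removed
```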

Hi elie, thank you for your input. Do you mind sharing how you found the validation folder? I am on Colab and I am not able to see any validation folders created :confused:

Hi @GrigorijSchleifer. I first connect Colab to my Google Drive using the following:

from google.colab import drive
drive.mount('/content/drive')

Then I change the download path to one of the directories in my Drive. In your case, the valid folder will be created in your Drive and you will see it listed. Here is an example of how I downloaded an orange dataset into a folder I named navel inside a folder called oranges:

folder = 'navel'
file = 'navel.csv'
path = Path('/content/drive/My Drive/FASTAI-DL/course-v3/nbs/dl1/data/oranges')
dest = path/folder
dest.mkdir(parents=True, exist_ok=True)
download_images(path/file, dest, max_pics=200)

Alternatively, you can change the fastai config to make it use your Drive directories instead of the default ones.

Config.DEFAULT_CONFIG = {
    'data_path': '/content/drive/My Drive/your folder to data',
    'model_path': '/content/drive/My Drive/folder where you want to keep the model'
}

You can then save the configurations in a config file so that you don’t have to repeat the above steps.

Config.create('/content/drive/My Drive/NLPKIN/myconfig.yml')
Config.DEFAULT_CONFIG_PATH = '/content/drive/My Drive/NLPKIN/myconfig.yml'
