Failing to complete training in Lesson 1 for own data

Hi all, I downloaded about 500 training images for each category, and 50 validation images, from Google image search. After some initial training runs hung at x%, I found and removed some corrupted images, which let training progress further. However, training now hangs as shown below and I cannot complete it. Suggestions?

53%|█████▎ | 8/15 [00:03<00:03, 2.32it/s]

AttributeError Traceback (most recent call last)
~/fastai/courses/dl1/fastai/dataset.py in open_image(fn)
227 try:
--> 228 im = cv2.imread(str(fn), flags).astype(np.float32)/255
229 if im is None: raise OSError(f'File not recognized by opencv: {fn}')

AttributeError: 'NoneType' object has no attribute 'astype'

The above exception was the direct cause of the following exception:

OSError Traceback (most recent call last)
in ()
1 arch=resnet34
2 data = ImageClassifierData.from_paths(PATH, tfms=tfms_from_model(arch, sz))
----> 3 learn = ConvLearner.pretrained(arch, data, precompute=True)
4 learn.fit(0.01, 4)

Thanks for any help.

-L

Hope bumping isn’t against forum rules (I looked but didn’t see anything clarifying it.) Going to be spending some time on fast.ai tonight and would love any suggestions on next steps (other than get a different image set.)

Thanks!

-L

To me it looks like another corrupted or missing image: cv2.imread returns None instead of a numpy array of the image. You could try turning on the pdb magic command

%pdb on

which will enter a debugging session after the exception occurs. You should be able to look at the variable fn and figure out which image is the offending one. If that doesn’t work, try writing a for loop over all the images you downloaded, reading each one using

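import cv2
import numpy as np

for filename in filenamelist:  # filenamelist: full paths of the downloaded images
    print(filename)  # the last name printed before the crash is the bad file
    cv2.imread(filename).astype(np.float32)/255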

This will crash when you encounter problematic images and you will know the filename.
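
The loop above stops at the first bad file. A minimal sketch of a variant that keeps going and collects every file that cannot be decoded, assuming all images sit somewhere under one root folder. The names `find_unreadable` and `read` are hypothetical and not part of fastai; by default it relies on cv2.imread returning None for files it cannot read.

```python
from pathlib import Path

def find_unreadable(root, read=None):
    """Return the files under `root` that `read` cannot decode.

    A file counts as unreadable if `read` returns None or raises.
    """
    if read is None:
        import cv2  # imported lazily; assumes OpenCV is installed
        read = cv2.imread
    bad = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue  # skip the class sub-folders themselves
        try:
            ok = read(str(path)) is not None
        except Exception:
            ok = False
        if not ok:
            bad.append(path)
    return bad
```

Calling `find_unreadable(PATH)` with the same PATH passed to from_paths should list every problem file in one pass, rather than one file per crash.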

Perfect, exactly what I needed, thanks. I had started down the ImageMagick rabbit hole, but wasn’t sure how to output just the failures in a Python context. I’m coming from R, so there’s some adjustment.

@bbrandt – Thank you for the help.

I purposefully chose data I thought would give poor results: two genera of the arum family that are used as aquarium plants, Anubias and Bucephalandra. They look very similar, have identical structure, and there are certain species that I mistake for each other all the time.

With ~450 images of each and 7 epochs, the model returned 90% accuracy (with some images that are just wrong, such as full aquarium shots, mixed in). I haven’t done any tweaking, augmentation, etc. yet. I’ve been doing data science professionally for some time, but just... wow. See below for how visually similar these things are.

Thanks for putting this course together, @jeremy. I’m hooked.

That’s an interesting data set, thanks for sharing the results.

I tried this, but it crashes on every one of my images. I used PIL to save them as Jpeg instead of jpg. Could this be the issue?

I am getting the error AttributeError: 'NoneType' object has no attribute 'astype'. I used Google-image-download to collect my images, but I can’t tell whether they are corrupt or not.

Here is my code: https://github.com/KeeganJustis/Fast.ai2/blob/master/courses/dl1/lesson1.ipynb

Hey Keegan, is your path right? There’s no directory prefix on your filename.


might be worth trying this instead (assuming you’re in dl1):
`cv2.imread('downloads/train/pho/'+filename).astype(np.float32)/255`
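
For context, os.listdir returns bare file names without the folder, so passing them straight to cv2.imread looks in the current working directory, and imread returns None when it cannot find the file. A small sketch of building full paths first; `list_full_paths` is a hypothetical helper name, not something from fastai or OpenCV.

```python
import os

def list_full_paths(folder):
    """Return every entry in `folder` joined with the folder path, sorted by name."""
    return [os.path.join(folder, name) for name in sorted(os.listdir(folder))]
```

The result can be used as the list in the read loop, e.g. `for filename in list_full_paths('downloads/train/pho'): ...`, with the folder taken from the suggestion above.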

That was it. Thanks Sam!

No worries, hope it goes well. It looks like an interesting dataset.

Sam and Keegan, that fix worked wonderfully for me too. Thanks for solving this!

Hey @bbrandt, is there a way to add a line to remove the corrupted / missing images? I batch downloaded 1000 files and am now manually removing each bad image…

Let’s say you saved the path of the corrupted file in a variable named badpath.
Then in a Jupyter cell you can write:
!rm {badpath}
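
With many bad files, the same cleanup can be done in plain Python. A hedged sketch that moves files into a separate folder instead of deleting them, so nothing is lost if a file was flagged by mistake. `quarantine` and its arguments are hypothetical names, and prefixing the parent folder name is just one way to avoid name clashes between classes.

```python
import shutil
from pathlib import Path

def quarantine(bad_paths, dest):
    """Move each existing file in `bad_paths` into `dest`; return the new paths."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    for src in map(Path, bad_paths):
        if not src.is_file():
            continue  # already removed, or the path never existed
        # Prefix the parent folder (the class name) to keep names distinct.
        target = dest / f"{src.parent.name}_{src.name}"
        shutil.move(str(src), str(target))
        moved.append(target)
    return moved
```

The destination folder should live outside the train and valid folders so it is not picked up as another class.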

Hi, I am also having this issue. My path variable is correct, and running cv2.imread over the output of listdir works. How can I use this path with the fastai methods:
arch=resnet34
data= ImageClassifierData.from_paths(PATH,tfms=tfms_from_model(arch,sz))
learn= ConvLearner.pretrained(arch,data,precompute=True)
learn.fit(0.01,3)