Using fastai's code as an example fails to predict correctly

I need to understand what is going wrong when I use the code from the lessons for my experiments. I use the following code to train:

1  from fastai.vision.all import *
2  if __name__ == '__main__':
3      path = 'C:/training/cats/'
4      # the following line was added because of Microsoft Windows and computer
5      num_workers = 0
6      silence = DataBlock(
7          blocks=(ImageBlock, CategoryBlock),
8          get_items=get_image_files,
9          splitter=RandomSplitter(valid_pct=0.2, seed=42),
10         get_y=parent_label,
11         item_tfms=Resize(224, ResizeMethod.Pad, pad_mode='zeros'))
12     dls = silence.dataloaders(path, num_workers=num_workers)
13     learn = cnn_learner(dls, resnet18,
14         model_dir='C:/Users/jbiss/workspace/AI/VoiceRecognition/models/',
15         metrics=error_rate)
16     learn.fine_tune(4)
17     dls.valid.show_batch(max_n=4, nrows=1)
18     dls.train.show_batch(max_n=4, nrows=1)
19     learn.export()
  • Line 3: the training data is the Abyssinian cats from oxford-iiit-pet downloaded in the class
  • Lines 6-11: my DataBlock, copied from the 02_production.ipynb file
  • Line 11: alternate resize transform, added because the process had a problem with B&W images when they were all resized to 224x224

Training results in the following:

epoch   train_loss   valid_loss   error_rate   time
0       0.000000     0.000000     0.000000     00:39

epoch   train_loss   valid_loss   error_rate   time
0       0.000000     0.000000     0.000000     00:36
1       0.000000     0.000000     0.000000     00:37
2       0.000000     0.000000     0.000000     00:37
3       0.000000     0.000000     0.000000     00:37

(show_batch screenshots attached: cat_train_output, cat_valid_output)

I test that model using the following image:
(image attachment: speech_0)

with the following code:

from fastai.vision.all import *
from PIL import Image
import os
  
print('__name__ is: ', __name__)
 
if __name__ == '__main__':
    learn_inf = load_learner('C:/models/export.pkl')
    pred,pred_idx,probs = learn_inf.predict('C:/spectrum_files/spec_1.png')
    print('pred is: ', pred, ', pred_idx is: ', pred_idx, ' and probs is: ', probs)
else: print('cannot do it')

and get the following results:
pred is: cats , pred_idx is: tensor(0) and probs is: tensor([1.])

So, what is going on in my Notebook that results in a totally incorrect prediction but a correct prediction in the course?

What are your other labels besides cat? If I recall, parent_label relies on the folder structure to determine the label.
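For example, parent_label just returns the name of a file's immediate parent folder, so every image under your cats/ folder gets the label 'cats' (a minimal sketch; the file names below are made up):

from fastai.vision.all import parent_label

# parent_label is essentially Path(o).parent.name, i.e. the containing folder's name
print(parent_label('C:/training/cats/Abyssinian_1.jpg'))    # -> 'cats'
print(parent_label('C:/training/cats/Abyssinian_2.jpg'))    # -> 'cats'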

If the training data is for classifying cats vs. spectrograms, then the model will know what to do; but if it's classifying dogs vs. cats and you pass in a spectrogram, the model will classify something it doesn't understand.

The file structure is only "cats", the first 200 Abyssinians downloaded in the lesson. There is no information in the book, at least that I can see, that discusses what is actually needed for any given neural network to work. Therefore, in this case, it should not recognize anything else and should return a "False" result for anything but an Abyssinian.

For any NN to work, you need data. However, an NN classifier is good at telling this vs. that, i.e. dogs or cats, but not as good at dogs vs. not-dogs.
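That also matches the numbers you posted: as far as I can tell, the learner sizes its head from dls.vocab, so with a single folder there is only one category, the head outputs a single logit, softmax over one value is always 1, and the cross-entropy loss is always -log(1) = 0. A tiny sketch of just that piece, independent of fastai:

import torch
import torch.nn.functional as F

# One class -> the head produces a single logit per image.
# Softmax over a length-1 vector is 1 no matter what the logit is,
# so every input is "predicted" as that class with probability 1,
# and the loss is -log(1) = 0 -- hence the all-zero training table.
logit = torch.tensor([[3.7]])                      # any value works
print(torch.softmax(logit, dim=1))                 # tensor([[1.]])
print(F.cross_entropy(logit, torch.tensor([0])))   # tensor(0.)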

If you look at https://github.com/fastai/fastbook/blob/master/02_production.ipynb, search for this line:
Our folder has image files, as we’d expect:

you’ll see the paths of the folders are

bears/black
bears/teddy
bears/ …

where each folder is the label. You mention the book, but the videos are where I did most of the learning (that is what worked for me; I think everyone is different).
The first time I took the class, I probably rewatched each video three times.


OK, I was wondering whether my single directory of images, in this case only Abyssinian cats, would work, for the reason that you bring up: in the fastai lesson there are a number of directories with various bears. In the case of the pets, each file is labeled as a specific breed of cat or dog, which allows the NN to learn one from another, not any one in isolation from everything else in the world.

I need to experiment to see if that is what is wrong. However, I trained with one specific thing in isolation to see if the model would correctly say whether a test image was of that thing or not, true or false, because people train their models for facial recognition. I assumed that they did it the way I am: supervised learning about one thing, faces. However, now that you bring it up, it does seem to make sense that supervised learning might need "false" images for comparison. What I've read so far doesn't get into that part of the practice. I'll try again along that path.
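Roughly what I plan to try (just a sketch; the second folder name and path are placeholders):

from fastai.vision.all import *

if __name__ == '__main__':
    # hypothetical layout -- two labelled folders instead of one:
    #   C:/training2/cats/      (the Abyssinian images)
    #   C:/training2/not_cats/  (anything else, e.g. the spectrogram images)
    path = 'C:/training2/'
    silence = DataBlock(
        blocks=(ImageBlock, CategoryBlock),
        get_items=get_image_files,
        splitter=RandomSplitter(valid_pct=0.2, seed=42),
        get_y=parent_label,
        item_tfms=Resize(224, ResizeMethod.Pad, pad_mode='zeros'))
    dls = silence.dataloaders(path, num_workers=0)   # num_workers=0 for Windows
    print(dls.vocab)                                 # should now list two labels
    learn = cnn_learner(dls, resnet18, metrics=error_rate)
    learn.fine_tune(4)
    learn.export()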

Thanks.

Cool. Just for a sneak peek: one way to do facial stuff is a siamese network. It uses all the same stuff you are learning now, but applies it to an augmented dataset.
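Very roughly, the idea looks like this (plain PyTorch, not fastai's API -- just a sketch of the shape of a siamese setup):

import torch
from torch import nn
from torchvision.models import resnet18

class SiameseSketch(nn.Module):
    # One shared encoder is applied to both images; a small head then
    # scores whether the two embeddings belong to the same identity.
    def __init__(self):
        super().__init__()
        encoder = resnet18()          # no pretrained weights needed for a sketch
        encoder.fc = nn.Identity()    # keep the 512-d embedding
        self.encoder = encoder
        self.head = nn.Linear(512 * 2, 1)

    def forward(self, x1, x2):
        e1, e2 = self.encoder(x1), self.encoder(x2)
        return self.head(torch.cat([e1, e2], dim=1))

# Fake batch: a real dataset would pair images of the same face (label 1)
# and of different faces (label 0), which is the "augmented dataset" part.
a = torch.randn(4, 3, 224, 224)
b = torch.randn(4, 3, 224, 224)
print(SiameseSketch()(a, b).shape)   # torch.Size([4, 1])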

Well, it works. Thanks for helping me readjust my thinking! I have to go back and reread to find out where the sources say that, for an NN to work, at least two sets of images are needed to compare. It makes sense now, but I'd like to find it.

And thanks for that deep learning link, it’s a great example.
