How to get more details on validation results

Hi guys,

In lesson 1, once we’ve got through the whole code (after unfreezing), I get a nice accuracy on my validation set. But now I would like to get more details: can I get a list or array showing:

  • original file name
  • how the model has classified it

thanks

data.val_ds.fnames gives you file names. learn.predict does a forward pass and gives you the model prediction.

thanks @TheShadow29

So I ran data.val_ds.fnames[1] and I can see a file name, but it is not the first one in the list; the order seems to be random.

Then I ran
log_preds = learn.predict()
log_preds[:10]

and I get:
array([[-0.22792, -1.59054],
[-0.28312, -1.40012],
[-0.1861 , -1.77307],
[-0.49857, -0.93496],
[-0.62311, -0.76846],
[-1.2065 , -0.35559],
[-1.10129, -0.40413],
[-1.28512, -0.32382],
[-1.3519 , -0.29941],
[-0.5383 , -0.87644]], dtype=float32)

Questions:

  • What are these 2 values? They do not look like my classification, which should be between 0 and 1.
  • How can I link the classification to a file name?

Check out my post here: "How to predict on the test set"

Python uses 0-based indexing, so the first item in the list is data.val_ds.fnames[0]. However, if you are using the image classifier built from paths, the order can be arbitrary.
Log preds are the log probabilities. Apply np.exp to them to get the probability of each class for each image. See @stephenjohnson's solution as well.
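To make the np.exp step concrete, here is a minimal sketch using only NumPy and the first three rows of the log_preds output quoted above. The variable names probs and preds are illustrative, not part of the fastai API.

```python
import numpy as np

# First three rows of the log_preds array quoted earlier in the thread.
log_preds = np.array([[-0.22792, -1.59054],
                      [-0.28312, -1.40012],
                      [-0.1861 , -1.77307]], dtype=np.float32)

# Exponentiating log probabilities gives probabilities between 0 and 1.
probs = np.exp(log_preds)

# Each row should sum to (approximately) 1 across the two classes.
row_sums = probs.sum(axis=1)

# Index of the most likely class for each image.
preds = np.argmax(probs, axis=1)

print(probs)
print(row_sums)
print(preds)
```

The class index in preds can then be looked up in data.classes, and the row position matches the position in data.val_ds.fnames.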

thanks @stephenjohnson

When I run your code:
log_preds = learn.predict(is_test=True)
after training my model for the first time in lesson 1, I get this error message
'NoneType' object is not iterable

Complete error message is:
TypeError Traceback (most recent call last)
in ()
----> 1 log_preds = learn.predict(is_test=True)
2 preds = np.argmax(log_preds, axis=1)

~/fastai/courses/dl1/fastai/learner.py in predict(self, is_test, use_swa)
355 dl = self.data.test_dl if is_test else self.data.val_dl
356 m = self.swa_model if use_swa else self.model
--> 357 return predict(m, dl)
358
359 def predict_with_targs(self, is_test=False, use_swa=False):

~/fastai/courses/dl1/fastai/model.py in predict(m, dl)
232
233 def predict(m, dl):
--> 234 preda,_ = predict_with_targs_(m, dl)
235 return np.concatenate(preda)
236

~/fastai/courses/dl1/fastai/model.py in predict_with_targs_(m, dl)
244 if hasattr(m, 'reset'): m.reset()
245 res = []
--> 246 for *x,y in iter(dl): res.append([get_prediction(to_np(m(*VV(x)))),to_np(y)])
247 return zip(*res)
248

TypeError: 'NoneType' object is not iterable

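That is because is_test=True makes predict use data.test_dl, which is None here (presumably because no test set was defined), so iterating over it fails. You need to predict on the validation set, not the test set: do learn.predict(is_test=False) or, equivalently, just learn.predict().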
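The failure can be reproduced without fastai. The sketch below is only an illustration: pick_loader is a hypothetical helper that mimics line 355 of the traceback, and the two "loaders" are plain Python stand-ins rather than real data loaders.

```python
def pick_loader(val_dl, test_dl, is_test=False):
    # Hypothetical helper mirroring the selection shown in the traceback:
    # dl = self.data.test_dl if is_test else self.data.val_dl
    return test_dl if is_test else val_dl

# Stand-ins: a validation "loader" with two batches, and no test loader.
val_dl = [("x0", "y0"), ("x1", "y1")]
test_dl = None

# Asking for the test loader returns None, and iterating over None fails.
try:
    for batch in iter(pick_loader(val_dl, test_dl, is_test=True)):
        pass
    error_message = None
except TypeError as err:
    error_message = str(err)

print(error_message)

# The validation loader iterates normally.
n_val_batches = sum(1 for _ in pick_loader(val_dl, test_dl, is_test=False))
print(n_val_batches)
```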

ok, I’m almost there, I’m starting to get the list I want, although some questions remain.
This is the code I run in Lesson 1 after the first training round:

This gives output like the following:
('valid/down/snaps177787_-1.png', 'down', array([0.93993, 0.06007], dtype=float32))
('valid/down/snaps177787_-1.png', 'down', array([0.93993, 0.06007], dtype=float32))
…(one line for each file)

Note that the file name is always the same, which cannot be correct because there are 684 files in this directory. The second value is sometimes 'down' and sometimes 'up', which is plausible. The last value, the array, only ever gives two pairs of numbers, which does not seem right.

If I modify my code by increasing the indent of the last line (itemIndex = itemIndex + 1), then I get the following output:
('valid/down/snaps177787_-1.png', 'down', array([0.93993, 0.06007], dtype=float32))
('valid/down/snaps101057_-1.png', 'down', array([0.93993, 0.06007], dty

This seems better because the file name now changes, although the array values do not. I also get an error message after 5 lines:
IndexError: index 5 is out of bounds for axis 0 with size 5

I’m getting close I think.

1.) Yes the line itemIndex = itemIndex + 1 must be indented. That was a cut/paste typo in my original post. If it isn’t indented then itemIndex isn’t incremented in the loop which it needs to be.

2.) You’ve inserted the lines log_preds,y = learn.TTA() and probs = np.mean(np.exp(log_preds),0) into the middle of the code snippets from my post so the variable log_preds is being modified. Try removing those lines.

learn.TTA() returns 5 x num_items x num_classes. TTA is test time augmentations, so you would need to average along those 5 values.

So it should be something like:

log_preds, y = learn.TTA()
# log_preds has shape 5 x num_items x num_classes; average the probabilities over the 5 passes
probs = np.mean(np.exp(log_preds), axis=0)
preds = np.argmax(probs, axis=1)
for ind, maxInd in enumerate(preds):
    print((data.val_ds.fnames[ind], data.classes[maxInd], probs[ind][maxInd]))

You are getting the error for two reasons: 1. the last line isn’t indented properly, and 2. your log_preds has the extra leading dimension of size 5 described above, so indexing it by item runs out of bounds at index 5. Average over those 5 values first and the other code should work as is.
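The shape handling can be checked with plain NumPy. This is a rough sketch under stated assumptions: the random array stands in for what learn.TTA() is described as returning above (5 x num_items x num_classes), and fnames is a hypothetical list of file names, not real data from the thread.

```python
import numpy as np

rng = np.random.default_rng(0)
n_aug, n_items, n_classes = 5, 4, 2

# Fake per-augmentation probabilities, normalised so each row sums to 1,
# then converted to log probabilities to mimic the described TTA output.
raw = rng.random((n_aug, n_items, n_classes))
log_preds = np.log(raw / raw.sum(axis=2, keepdims=True))  # shape (5, 4, 2)

# Average the probabilities over the 5 augmented passes -> shape (4, 2).
probs = np.mean(np.exp(log_preds), axis=0)

# Most likely class index per item.
preds = np.argmax(probs, axis=1)

# Hypothetical file names; class names as they appear in the thread's output.
fnames = [f"valid/img_{i}.png" for i in range(n_items)]
classes = ["down", "up"]

# One (file name, predicted class, probability) tuple per item.
results = [(fnames[i], classes[c], float(probs[i, c])) for i, c in enumerate(preds)]
for row in results:
    print(row)
```

Without the averaging step, log_preds[ind] indexes the first axis of size 5, which is where "index 5 is out of bounds for axis 0 with size 5" comes from.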


Indeed, that was a copy/paste error on my side. I was testing different solutions I got from other posts, sorry for that.

Now this code is doing what I wanted in my OP:

So the last number, given by
np.exp(log_preds[itemIndex][maxIndex])
is the probability that the picture belongs to the class given in the second value, right?

Thanks a lot for all your help guys !


@Hugues1965 yes this is correct.