How to get more details on validation results

Hugues1965 · August 4, 2018, 5:35pm

Hi guys,

In lesson 1, once we’ve got through the whole code (after unfreezing), I get a nice accuracy on my validation set. But now I would like to get more details: can I get a list or array showing:

original file name
how the model has classified it

thanks

TheShadow29 · August 5, 2018, 4:53am

data.val_ds.fnames gives you file names. learn.predict does a forward pass and gives you the model prediction.

Hugues1965 · August 5, 2018, 3:53pm

thanks @TheShadow29

So I ran data.val_ds.fnames[1] and I can see a file name, but it is not the first in the list; the order seems to be random.

Then I ran
log_preds = learn.predict()
log_preds[:10]

and I get:
array([[-0.22792, -1.59054],
       [-0.28312, -1.40012],
       [-0.1861 , -1.77307],
       [-0.49857, -0.93496],
       [-0.62311, -0.76846],
       [-1.2065 , -0.35559],
       [-1.10129, -0.40413],
       [-1.28512, -0.32382],
       [-1.3519 , -0.29941],
       [-0.5383 , -0.87644]], dtype=float32)

Questions:

What are these 2 values? They do not look like my classification, which should be between 0 and 1.
And how can I link the classification to a file name?

stephenjohnson · August 5, 2018, 6:59pm

Check out my post here: "How to predict on the test set".

TheShadow29 · August 6, 2018, 3:26am

Python uses 0-based indexing, so the first item in the list is data.val_ds.fnames[0]. However, if you are using ImageClassifierData.from_paths, the order can be arbitrary.
log_preds are log probabilities. Apply np.exp to them to get the probability of each image belonging to each class. See @stephenjohnson's solution as well.
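
A minimal sketch of both points, using the first three rows of the output posted above. The file names are made up and the class order ("down", "up") is an assumption for illustration; in the notebook, np.exp and np.argmax do the same thing on the whole array at once, with data.val_ds.fnames and data.classes supplying the real names.

```python
import math

# First three rows of the log_preds output posted earlier in the thread.
log_preds = [
    [-0.22792, -1.59054],
    [-0.28312, -1.40012],
    [-0.1861, -1.77307],
]

# Assumptions for illustration only: made-up file names and class order.
fnames = ["valid/a.png", "valid/b.png", "valid/c.png"]
classes = ["down", "up"]


def to_probs(row):
    """Exponentiate log probabilities to get probabilities."""
    return [math.exp(v) for v in row]


results = []
for fname, row in zip(fnames, log_preds):
    probs = to_probs(row)
    # Index of the largest probability (what np.argmax would return).
    best = max(range(len(probs)), key=probs.__getitem__)
    results.append((fname, classes[best], round(probs[best], 4)))

for r in results:
    print(r)
```

Each exponentiated row sums to roughly 1, which is a quick check that the values really are log probabilities.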

Hugues1965 · August 6, 2018, 7:13am

thanks @stephenjohnson

When I run your code:
log_preds = learn.predict(is_test=True)
after training my model for the first time in lesson 1, I get this error message
'NoneType' object is not iterable

Complete error message is:
TypeError Traceback (most recent call last)
in ()
----> 1 log_preds = learn.predict(is_test=True)
2 preds = np.argmax(log_preds, axis=1)

~/fastai/courses/dl1/fastai/learner.py in predict(self, is_test, use_swa)
355 dl = self.data.test_dl if is_test else self.data.val_dl
356 m = self.swa_model if use_swa else self.model
--> 357 return predict(m, dl)
358
359 def predict_with_targs(self, is_test=False, use_swa=False):

~/fastai/courses/dl1/fastai/model.py in predict(m, dl)
232
233 def predict(m, dl):
--> 234 preda,_ = predict_with_targs_(m, dl)
235 return np.concatenate(preda)
236

~/fastai/courses/dl1/fastai/model.py in predict_with_targs_(m, dl)
244 if hasattr(m, 'reset'): m.reset()
245 res = []
--> 246 for *x,y in iter(dl): res.append([get_prediction(to_np(m(*VV(x)))),to_np(y)])
247 return zip(*res)
248

TypeError: 'NoneType' object is not iterable

TheShadow29 · August 6, 2018, 9:47am

That error means there is no test set loaded (data.test_dl is None), so you need to predict on the validation set, not the test set. Do learn.predict(is_test=False) or, equivalently, just learn.predict().
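
A minimal sketch of why this happens, using a stand-in class (FakeData and pick_loader are hypothetical, not fastai code). The selection mirrors line 355 of the traceback, and iterating over a missing loader raises the same TypeError.

```python
class FakeData:
    """Hypothetical stand-in for the data object; not part of fastai."""

    def __init__(self, val_dl, test_dl=None):
        self.val_dl = val_dl
        self.test_dl = test_dl  # stays None when no test set was supplied


def pick_loader(data, is_test=False):
    # Mirrors the loader selection shown on line 355 of the traceback.
    return data.test_dl if is_test else data.val_dl


# Two made-up validation batches and no test set.
data = FakeData(val_dl=[("x0", "y0"), ("x1", "y1")])

try:
    for batch in iter(pick_loader(data, is_test=True)):
        pass
    error = None
except TypeError as exc:
    error = str(exc)

print(error)

# The validation loader exists, so iterating over it works.
n_val_batches = len(list(iter(pick_loader(data))))
print(n_val_batches)
```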

Hugues1965 · August 6, 2018, 3:01pm

ok, I’m almost there, I’m starting to get the list I want, although some questions remain.
This is the code I run in Lesson 1 after the first training round:

Which gives the following output:
('valid/down/snaps177787_-1.png', 'down', array([0.93993, 0.06007], dtype=float32))
('valid/down/snaps177787_-1.png', 'down', array([0.93993, 0.06007], dtype=float32))
… (one line for each file)

Note that the file name is always the same, which is not correct because there are 684 files in this directory. The second parameter is sometimes 'down', sometimes 'up', which is possible. The last parameter, the array, is only giving two pairs of numbers, which does not seem right.

If I modify my code by increasing the indent of the last line (itemIndex = itemIndex + 1), then I get the following output:
('valid/down/snaps177787_-1.png', 'down', array([0.93993, 0.06007], dtype=float32))
('valid/down/snaps101057_-1.png', 'down', array([0.93993, 0.06007], dty

This seems better because the file name is changing, although the array values are not. I also get an error message after 5 lines:
IndexError: index 5 is out of bounds for axis 0 with size 5

I’m getting close I think.

stephenjohnson · August 6, 2018, 7:39pm

1.) Yes the line itemIndex = itemIndex + 1 must be indented. That was a cut/paste typo in my original post. If it isn’t indented then itemIndex isn’t incremented in the loop which it needs to be.

2.) You’ve inserted the lines log_preds,y = learn.TTA() and probs = np.mean(np.exp(log_preds),0) into the middle of the code snippets from my post so the variable log_preds is being modified. Try removing those lines.
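
To illustrate point 1, here is a small sketch with made-up file names and predictions (it is not the code from the original post). With the increment outside the loop body every line uses index 0, which matches the repeated file name seen above; enumerate removes the manual counter entirely.

```python
# Made-up file names, predictions and class order, for illustration only.
fnames = ["valid/down/a.png", "valid/down/b.png", "valid/up/c.png"]
preds = [0, 0, 1]
classes = ["down", "up"]

# Buggy: the increment sits outside the loop, so itemIndex stays 0.
buggy = []
itemIndex = 0
for maxIndex in preds:
    buggy.append((fnames[itemIndex], classes[maxIndex]))
itemIndex = itemIndex + 1

# Fixed: enumerate supplies the index, so there is no counter to forget.
fixed = []
for itemIndex, maxIndex in enumerate(preds):
    fixed.append((fnames[itemIndex], classes[maxIndex]))

print(buggy)
print(fixed)
```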

TheShadow29 · August 6, 2018, 8:16pm

learn.TTA() returns 5 x num_items x num_classes. TTA is test time augmentations, so you would need to average along those 5 values.

So it should be something like:

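log_preds, y = learn.TTA()
# log_preds has shape 5 x num_items x num_classes; average the probabilities over the 5 augmentations
probs = np.mean(np.exp(log_preds), axis=0)
preds = np.argmax(probs, axis=1)
for ind, maxInd in enumerate(preds):
    # file name, predicted class, probability of the predicted class
    print((data.val_ds.fnames[ind], data.classes[maxInd], probs[ind][maxInd]))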

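The error you are getting has two causes: 1. the last line isn’t indented properly, and 2. your log_preds has the extra leading dimension described above, so indexing it by item runs out after 5 entries. First average over the 5 augmentations so that you have one row per image, and the rest of the code should work as is.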
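
A sketch of the averaging step in plain Python, assuming the TTA output is shaped (augmentations, items, classes) as described above. The numbers are made up and use 2 augmentations instead of 5 to keep it short. The probabilities, not the log values, are averaged here, matching the np.mean(np.exp(log_preds), 0) line quoted earlier in the thread; with NumPy the whole step is that one line.

```python
import math

# Made-up probabilities: 2 augmentations x 3 items x 2 classes.
aug_probs = [
    [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]],
    [[0.7, 0.3], [0.3, 0.7], [0.4, 0.6]],
]
# TTA hands back log probabilities, so build those as the starting point.
log_preds = [[[math.log(p) for p in item] for item in aug] for aug in aug_probs]

n_aug = len(log_preds)
n_items = len(log_preds[0])
n_classes = len(log_preds[0][0])

# Average exp(log_preds) over the augmentation axis (axis 0).
probs = [
    [
        sum(math.exp(log_preds[a][i][c]) for a in range(n_aug)) / n_aug
        for c in range(n_classes)
    ]
    for i in range(n_items)
]

# Predicted class per item: index of the largest averaged probability.
preds = [max(range(n_classes), key=row.__getitem__) for row in probs]

print(len(log_preds), len(probs))
print(preds)
```

Indexing the raw TTA output by item walks the first axis, which only has as many entries as there are augmentations. That is why the loop above failed at index 5.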

Hugues1965 · August 7, 2018, 9:24am

Indeed, that was a copy/paste error on my side. I was testing different solutions I got from other posts, sorry for that.

Now this code is doing what I wanted in my OP:

So the last number, given by:
np.exp(log_preds[itemIndex][maxIndex])
is the probability that the picture belongs to the class given in the second parameter, right?

Thanks a lot for all your help guys !

TheShadow29 · August 7, 2018, 9:37am

@Hugues1965 yes this is correct.