Learner: predict_with_targs vs. predict - what's the difference?

I see that the Learner has two predict functions, predict_with_targs and predict. What is the difference between the two?

A follow-up question that came up while I was trying to understand this myself by opening learner.py: is this function recursively calling itself? I would expect it to be calling model.py’s predict function in the end.

def predict(self, is_test=False, use_swa=False):
    dl = self.data.test_dl if is_test else self.data.val_dl  # pick the test or validation DataLoader
    m = self.swa_model if use_swa else self.model            # optionally use the SWA-averaged weights
    return predict(m, dl)                                    # the module-level predict from model.py

It’s not recursively calling itself, because in Python you’d have to write self.predict to refer to the method. (I know that in other languages like Java that’s not the case, which is probably the source of the confusion.) Instead it’s using the function predict that came in through the imports.
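To make that concrete, here is a minimal, self-contained sketch (all names hypothetical) of how a bare name inside a method resolves to the module-level function rather than to the method of the same name:

    def predict():
        # module-level function, analogous to the predict imported into learner.py
        return 'module-level predict'

    class Learner:
        def predict(self):
            # the bare name below resolves to the module-level function above,
            # so this method does NOT call itself
            return predict()

    print(Learner().predict())  # prints: module-level predict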


predict gives you only the predictions. predict_with_targs gives you both the predictions and the true labels, which you can then use to measure accuracy or feed into a confusion matrix (see the sketch after the code below). Otherwise the code is the same:

def predict_with_targs(m, dl):
    preda, targa = predict_with_targs_(m, dl)  # per-batch predictions and targets
    return to_np(torch.cat(preda)), to_np(torch.cat(targa))


def predict(m, dl):
    preda, _ = predict_with_targs_(m, dl)  # targets are computed but discarded
    return to_np(torch.cat(preda))
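For example, here is a minimal sketch of measuring accuracy with the two outputs (assuming a trained fastai v0.7 learner named learn on a classification dataset):

    import numpy as np

    log_preds, y = learn.predict_with_targs()  # log-probabilities and true labels
    preds = np.argmax(log_preds, axis=1)       # predicted class per example
    print('accuracy:', (preds == y).mean())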

That’s very helpful, thank you William!

That helps, thank you Ramesh!

Weirdly, I am getting two NumPy arrays of different lengths from predict_with_targs.

I am using predict_with_targs on the IMDB sentiment classification from lesson 10:
    log_preds, y = learn.predict_with_targs()

log_preds.shape is (39780, 2)
y.shape is (2699424,)

What am I doing wrong?

I’m also including my model summary below:

    <bound method Learner.summary of SequentialRNN(
      (0): MultiBatchRNN(
        (encoder): Embedding(60002, 400, padding_idx=1)
        (encoder_with_dropout): EmbeddingDropout(
          (embed): Embedding(60002, 400, padding_idx=1)
        )
        (rnns): ModuleList(
          (0): WeightDrop(
            (module): LSTM(400, 1150)
          )
          (1): WeightDrop(
            (module): LSTM(1150, 1150)
          )
          (2): WeightDrop(
            (module): LSTM(1150, 400)
          )
        )
        (dropouti): LockedDropout()
        (dropouths): ModuleList(
          (0): LockedDropout()
          (1): LockedDropout()
          (2): LockedDropout()
        )
      )
      (1): PoolingLinearClassifier(
        (layers): ModuleList(
          (0): LinearBlock(
            (lin): Linear(in_features=1200, out_features=50, bias=True)
            (drop): Dropout(p=0.2)
            (bn): BatchNorm1d(1200, eps=1e-05, momentum=0.1, affine=True)
          )
          (1): LinearBlock(
            (lin): Linear(in_features=50, out_features=2, bias=True)
            (drop): Dropout(p=0.1)
            (bn): BatchNorm1d(50, eps=1e-05, momentum=0.1, affine=True)
          )
        )
      )
    )>

Sounds like you might have language-model data instead of classification data. With language-model data the targets are the flattened stream of next tokens (one target per token rather than one per document), which would explain why y is so much longer than log_preds.
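A quick sanity check along those lines (a sketch, assuming the classifier learner from the lesson 10 notebook):

    log_preds, y = learn.predict_with_targs()
    # classification data has exactly one label per example
    assert log_preds.shape[0] == y.shape[0], \
        'length mismatch: the learner is probably built on language-model data'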


Thanks, that was it! A bug in my code.

Hi, sorry to bother you, but could you share the code you used to create the confusion matrix for the ULMFiT results? Thanks!

@cayman I don’t have the code handy right now, but if you look at lesson 2 of fast.ai, or the notebook that goes with it, you will find it:

    from sklearn.metrics import confusion_matrix  # where confusion_matrix comes from

    cm = confusion_matrix(val_classes, preds)             # true labels vs. predicted labels
    plot_confusion_matrix(cm, val_batches.class_indices)  # plotting helper from the course notebooks
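To adapt that to the ULMFiT outputs discussed above, something like this should work (a sketch: log_preds and y come from learn.predict_with_targs(), plot_confusion_matrix is the course helper from fastai.plots in v0.7, and the class names are assumed):

    import numpy as np
    from sklearn.metrics import confusion_matrix
    from fastai.plots import plot_confusion_matrix

    log_preds, y = learn.predict_with_targs()
    preds = np.argmax(log_preds, axis=1)       # predicted class per review
    cm = confusion_matrix(y, preds)
    plot_confusion_matrix(cm, ['neg', 'pos'])  # assumed IMDB class names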