I am using the following in the Evaluate section of the imdb lesson 10 notebook (on a different text dataset):
pre_preds, actuals = learn.predict_with_targs()
dl = learn.data.val_dl
x_gt = []
y_preds = []
for x, y in iter(dl):
    x_gt.append(x)   # x: input sequence batch
    y_preds.append(y)  # y: target batch (note: these are targets, not predictions, despite the name)
pre_preds.shape, len(actuals), len(y_preds)*len(y_preds[0]), len(x_gt)*x_gt[0].shape[1]
>>((2552, 3), 2552, 2568, 2568)
predict_with_targs iterates over the same dl (see below), so why do I get 16 extra text sequences from the dataloader above?
def predict_with_targs_(m, dl):
    m.eval()
    if hasattr(m, 'reset'): m.reset()
    res = []
    for *x, y in iter(dl): res.append([get_prediction(to_np(m(*VV(x)))), to_np(y)])
    return zip(*res)
I want to check predictions against the actual text, so I need to make sure the two are aligned.
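One way I've tried narrowing this down is to compare per-batch sizes between two passes and find the first batch where they diverge. Here is a minimal, self-contained sketch of that idea; `batch_sizes` and `first_mismatch` are hypothetical helpers, and the toy lists merely stand in for batches pulled from `learn.data.val_dl` (the counts mirror the 2552 vs 2568 above):

```python
def batch_sizes(batches):
    """Number of items in each batch."""
    return [len(b) for b in batches]

def first_mismatch(sizes_a, sizes_b):
    """Index of the first batch whose size differs, or None if identical."""
    for i, (a, b) in enumerate(zip(sizes_a, sizes_b)):
        if a != b:
            return i
    if len(sizes_a) != len(sizes_b):
        return min(len(sizes_a), len(sizes_b))
    return None

# Toy stand-ins for two passes over the same dataloader.
pass_a = [list(range(64))] * 39 + [list(range(56))]  # 39*64 + 56 = 2552 items
pass_b = [list(range(64))] * 40 + [list(range(8))]   # 40*64 + 8  = 2568 items

print(sum(batch_sizes(pass_a)), sum(batch_sizes(pass_b)))        # 2552 2568
print(first_mismatch(batch_sizes(pass_a), batch_sizes(pass_b)))  # 39
```

If the divergence is only in the final batch, the extra 16 sequences are probably a last-batch/padding artifact rather than a shuffling issue.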