Accuracy calculation problem


#1

I am trying to reproduce the accuracy reported by learn.fit (in particular for the IMDB classification notebook):

log_preds = learn.predict()
probs = np.exp(log_preds)
y_true = learn.data.val_y
y_pred = np.argmax(probs, axis=1)
acc = (y_true == y_pred).mean()
print('Accuracy:', acc)

The accuracy calculated this way on the validation set is much lower than the accuracy reported by the learner, e.g.:

epoch      trn_loss   val_loss   accuracy                      
    0      0.218912   0.163022   0.9412    
    1      0.20981    0.169191   0.94068                       
    2      0.17804    0.157355   0.94472           

but the code above gives 0.5006.

Any idea what is wrong? I would like to calculate other performance metrics such as recall, precision, and the confusion matrix for a multiclass problem.
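For the metrics mentioned above, here is a minimal NumPy sketch. It assumes y_true and y_pred are integer class arrays in the same sample order; confusion_matrix and precision_recall are hypothetical helper names written for illustration, not fastai functions, and the labels are toy data.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    # np.add.at accumulates correctly when the same (true, pred) pair repeats
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

def precision_recall(cm):
    """Per-class precision and recall from a confusion matrix."""
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0)  # column sums: how often each class was predicted
    actual = cm.sum(axis=1)     # row sums: how often each class actually occurs
    # Guard against division by zero for classes never predicted or never present
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    return precision, recall

# Toy labels for illustration only (not IMDB results)
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
cm = confusion_matrix(y_true, y_pred, n_classes=3)
precision, recall = precision_recall(cm)
print(cm)
print('Precision:', precision)
print('Recall:', recall)
```

These numbers are only meaningful if each prediction is compared against the label of the same sample.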


(Even Oldridge) #2

If you mean IMDB sentiment, an accuracy of around 0.5 is chance level for a balanced two-class problem. Most likely the order of your dataset is getting mixed up somewhere and you’re comparing each label to the wrong prediction.
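To illustrate why a mixed-up order produces roughly 0.5, here is a small sketch on synthetic data (not the IMDB model). It assumes balanced binary labels and a pretend classifier that is right about 95% of the time; all names and numbers are made up for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10000
y_true = rng.integers(0, 2, size=n)        # balanced binary labels (synthetic)
flip = rng.random(n) < 0.05                # pretend the model is wrong on ~5% of samples
y_pred = np.where(flip, 1 - y_true, y_true)

# Predictions and labels in the same order: accuracy is high
acc_aligned = (y_true == y_pred).mean()

# Same predictions, but returned in a different order than the labels
perm = rng.permutation(n)
acc_shuffled = (y_true == y_pred[perm]).mean()

print('Aligned:', acc_aligned)
print('Shuffled:', acc_shuffled)
```

The predictions themselves are unchanged in the second case; only their pairing with the labels is broken, which is enough to drop the measured accuracy to about chance.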


#3

Yes, IMDB sentiment. I ran the notebook without any other changes, just adding the code for the accuracy evaluation at the bottom. Can someone else try this?


#4

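Solved - it looks like the problem is the sampler used for the validation data loader: val_dl = DataLoader(val_ds, bs, transpose=True, num_workers=1, pad_idx=1, sampler=val_samp). The sampler apparently changes the order in which the validation samples are returned, so the predictions no longer line up with the labels in val_y.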

The following code, with the sampler removed, gives the expected results:

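# Build the validation loader without a sampler so batches follow the dataset order
val_dl = DataLoader(val_ds, bs, transpose=True, num_workers=1, pad_idx=1)
log_preds = predict(learn.model, val_dl)
# argmax over log-probabilities picks the same class as argmax over probabilities
y_pred = np.argmax(log_preds, axis=1)
y_true = md.val_y
print('Accuracy:', (y_pred == y_true).mean())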
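If the sampler needs to stay in place, an alternative might be to put the predictions back into dataset order before comparing. This is only a sketch: it assumes you can get the index order the sampler yields (for instance via list(val_samp), if the sampler is an iterable of indices, which I have not verified here), and restore_order is a hypothetical helper, not part of fastai.

```python
import numpy as np

def restore_order(preds, order):
    """Return preds rearranged so that row i corresponds to dataset item i.

    preds[k] is assumed to be the prediction for dataset item order[k].
    """
    order = np.asarray(order)
    restored = np.empty_like(preds)
    restored[order] = preds
    return restored

# Toy example: a sampler that visits items longest-first
lengths = np.array([3, 10, 7, 1])
order = np.argsort(-lengths)               # hypothetical sampler order
preds_original = np.array([[0.9, 0.1], [0.2, 0.8], [0.4, 0.6], [0.7, 0.3]])
preds_sampled = preds_original[order]      # what the loader would return
restored = restore_order(preds_sampled, order)
print(np.array_equal(restored, preds_original))
```

With the real model this would look something like restore_order(log_preds, list(val_samp)) followed by the same argmax and comparison as above, again assuming the sampler exposes its order that way.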