I am trying to reproduce the accuracy reported by **learn.fit** (in particular for the imdb classification notebook):

```
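import numpy as np

log_preds = learn.predict()        # predictions on the validation set
probs = np.exp(log_preds)          # exponentiate (assumes log-probabilities)
y_true = learn.data.val_y          # validation labels
y_pred = np.argmax(probs, axis=1)  # most likely class per example
acc = (y_true == y_pred).mean()    # fraction of matching labels
print('Accuracy:', acc)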
```

The accuracy calculated this way on the validation set is much lower than the accuracy reported by the learner, e.g.:

```
epoch trn_loss val_loss accuracy
0 0.218912 0.163022 0.9412
1 0.20981 0.169191 0.94068
2 0.17804 0.157355 0.94472
```

but the code above gives 0.5006.

Any idea what is wrong? I would like to calculate other performance metrics, such as recall, precision, and the confusion matrix, for a multiclass problem.
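
For reference, this is roughly what I would like to compute once `y_true` and `y_pred` line up. It is only a minimal NumPy sketch on made-up labels; the helper names `confusion_matrix` and `precision_recall` are my own and are not part of fastai.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Count matrix: rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)  # increment one cell per example
    return cm

def precision_recall(cm):
    """Per-class precision and recall computed from a confusion matrix."""
    tp = np.diag(cm).astype(float)  # correctly predicted examples per class
    predicted = cm.sum(axis=0)      # column sums: examples predicted as each class
    actual = cm.sum(axis=1)         # row sums: examples truly in each class
    # Classes with no predicted (or no actual) examples get 0 instead of NaN.
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    return precision, recall

# Made-up labels for a 3-class problem (illustration only, not real results).
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])

cm = confusion_matrix(y_true, y_pred, n_classes=3)
precision, recall = precision_recall(cm)
accuracy = np.trace(cm) / cm.sum()  # correct predictions over all examples

print(cm)
print('Precision:', precision)
print('Recall:', recall)
print('Accuracy:', accuracy)
```

With real predictions, `y_true` and `y_pred` would be the arrays from the snippet above, provided they refer to the same examples in the same order.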