Following Jeremy’s notebooks from lessons 1 and 2 (PETS dataset), my main goal is to find out which training files (i.e. filenames) are classified incorrectly during training. I’d like to get the loss value, the probability for each class, and the actual label for each item in the training data. I can accomplish this in two ways:
- ClassificationInterpretation.top_losses(), which returns the losses and their indexes.
- DatasetFormatter().from_toplosses(), which returns a dataset and its indexes.
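To make clear what I expect from a "top losses" call, here is a minimal plain-Python sketch (not fastai code). It assumes a per-item cross-entropy loss and returns the losses sorted from largest to smallest together with the index of each item. The function names, file names, probabilities, and labels are all made up for illustration.

```python
import math

def cross_entropy(probs, label):
    """Loss for one item: negative log of the probability given to the true class."""
    return -math.log(probs[label])

def top_losses(all_probs, labels, k=None):
    """Return (losses, indexes), sorted from the largest loss to the smallest."""
    losses = [cross_entropy(p, y) for p, y in zip(all_probs, labels)]
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    if k is not None:
        order = order[:k]
    return [losses[i] for i in order], order

# Hypothetical file names, predicted class probabilities, and actual labels.
filenames = ["cat_1.jpg", "dog_7.jpg", "cat_9.jpg"]
all_probs = [[0.9, 0.1], [0.7, 0.3], [0.4, 0.6]]
labels = [0, 1, 0]

losses, idxs = top_losses(all_probs, labels)
for loss, i in zip(losses, idxs):
    # Predicted class is the one with the highest probability.
    pred = max(range(len(all_probs[i])), key=lambda c: all_probs[i][c])
    print(f"{filenames[i]} loss={loss:.3f} actual={labels[i]} predicted={pred}")
```

As long as the indexes refer to a fixed item order, `filenames[i]` gives me the file behind each loss, which is the mapping I am after.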
After training a model, I create a ClassificationInterpretation object by calling the from_learner() method on learn, a resnet34 model I’ve fitted for 4 epochs. What’s weird is:
- Every time I rerun the from_learner() method, I might get a different number of misclassified images in the confusion matrix. Since the learner is done training and the dataset is the same, shouldn’t the number of misclassified images be the same every time?
- The number of losses and indexes returned by the top_losses() method doesn’t match the actual size of the training dataset.
- Every time I rerun the from_learner() method, the top_losses() method returns a different set of indexes. I’m not sure how to tie them back to the original filenames, since the index values keep changing.
- I see similar behavior on repeated calls: the index (argmax) at which the maximum loss value is stored changes every time the method is called.
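The symptoms above can be reproduced in a small plain-Python sketch (again not fastai code). It rests on an assumption I have not confirmed for my setup: that the training data is read through a loader that shuffles the items and drops the last incomplete batch. The `loader_order` helper, the batch size, and the file names are hypothetical.

```python
import random

def loader_order(n_items, batch_size, shuffle, drop_last, rng):
    """Return dataset indexes in the order a batching loader would visit them."""
    order = list(range(n_items))
    if shuffle:
        rng.shuffle(order)
    batches = [order[i:i + batch_size] for i in range(0, n_items, batch_size)]
    # Optionally discard a final batch that is smaller than batch_size.
    if drop_last and batches and len(batches[-1]) < batch_size:
        batches.pop()
    return [i for batch in batches for i in batch]

filenames = [f"img_{i}.jpg" for i in range(10)]  # hypothetical training files
rng = random.Random(0)

# Two passes over a shuffled loader that drops the last incomplete batch.
run_a = loader_order(10, 4, shuffle=True, drop_last=True, rng=rng)
run_b = loader_order(10, 4, shuffle=True, drop_last=True, rng=rng)
# One pass over an unshuffled loader that keeps every item.
fixed = loader_order(10, 4, shuffle=False, drop_last=False, rng=rng)

print(len(run_a), len(fixed))  # fewer items than the dataset vs. all of them
print(run_a)                   # position p holds dataset item run_a[p]
print(run_b)                   # reshuffled on the second pass
print([filenames[i] for i in fixed] == filenames)
```

Under that assumption, a positional index from one pass does not identify a file, and only 8 of the 10 items are seen at all. With a fixed, complete order the index maps directly to `filenames[i]`.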
Can anybody help me with these?
I shared my notebook here.
This problem doesn’t seem to happen when working on the validation portion of the data.