Calculating Accuracy on Test Set with Labels

I am following the basic formula of the dog breeds example to test a bunch of models on a large dataset I generated. Because I generated the dataset myself, I do have ground-truth labels for the test set.

I am wondering what the most idiomatic way of calculating accuracy on the test set would be within the fastai framework.

What I’ve done is merge a ‘testlabels.csv’ with a dataframe built from the probs output of the test dataloader, and then calculate accuracy by comparing columns in the merged dataframe, but I reckon there’s probably a more elegant way of doing this.

Here’s what I’ve done so far:
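Roughly, something like this (a simplified sketch; it assumes ‘testlabels.csv’ has ‘id’ and ‘label’ columns and that `probs` comes back in the same order as the test file names in `test_fnames`):

```python
import numpy as np
import pandas as pd

# Ground-truth labels for the test set (assumed columns: 'id', 'label').
labels_df = pd.read_csv('testlabels.csv')

# Predicted class for each test item, taken from the probs array
# (assumed to be in the same order as the test file names, test_fnames).
preds_df = pd.DataFrame({
    'id': test_fnames,
    'pred': np.argmax(probs, axis=1),
})

# Merge on id and compare predicted class to true label.
merged = labels_df.merge(preds_df, on='id')
accuracy = (merged['label'] == merged['pred']).mean()
print(f'Test accuracy: {accuracy:.4f}')
```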

Any suggestions would be much appreciated!

You can use the accuracy_np(preds, targs) function. That function appears in the dog breed notebook, but in that case there are no labels for the test set, so the call is commented out and never used.
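Something like this, roughly (a sketch; it assumes your probs array and the rows of testlabels.csv are in the same order, and that the csv has a ‘label’ column):

```python
import pandas as pd
from fastai.metrics import accuracy_np  # old fastai (0.7); adjust the import for your install

# probs: (n_test_items, n_classes) array of predicted probabilities from the test dataloader.
# targs: ground-truth labels (0 or 1), one per test item, in the SAME order as probs.
targs = pd.read_csv('testlabels.csv')['label'].values

print(accuracy_np(probs, targs))
```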

Thanks for the reply. When I do this, though, passing in the vector of truth values (0 or 1) from my csv file, I get around 50% accuracy, which suggests that some sort of scrambling is happening. (Training and validation accuracy are close to 90%, and test accuracy should be at least a few percentage points above chance; according to the evaluation I ran, it seems to be.)

Where might this scrambling have happened?

The following is the source code for accuracy_np:

def accuracy_np(preds, targs):
    preds = np.argmax(preds, 1)
    return (preds==targs).mean()

I’m not sure what’s going on. Can you double-check that your preds and targs vectors are what they should be in order for the above function to work as intended?
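In particular, check that the rows of preds and the entries of targs refer to the same test items in the same order; the test dataloader doesn’t necessarily iterate in the order your csv is written. A rough way to line them up (assuming a fastai 0.7-style `data.test_ds.fnames` and an ‘id’ column in your csv; those names are guesses about your setup):

```python
import os
import pandas as pd

# Ground-truth label for each test id (assumed columns: 'id', 'label').
labels = pd.read_csv('testlabels.csv').set_index('id')['label']

# Test file names in the order the dataloader produced the predictions
# (fastai 0.7 exposes them as data.test_ds.fnames; adjust for your version).
test_ids = [os.path.splitext(os.path.basename(f))[0] for f in data.test_ds.fnames]

# Re-order the targets to match the prediction order, then compute accuracy.
targs = labels.loc[test_ids].values
print(accuracy_np(preds, targs))
```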

After some more finagling, I got the results of accuracy_np to match my own calculations. It definitely was some sort of scrambling, though I have no idea why it happened. Thanks for the help!


Great job, congrats!