Calculating Accuracy on Test Set with Labels

I am following the basic formula of the dog breeds example to test a bunch of models on a large dataset I generated. Because I generated the dataset myself, I do have ground-truth labels for the test set.

I am wondering what the most idiomatic way of calculating accuracy on the test set would be within the fastai framework.

What I’ve done is merge a ‘testlabels.csv’ with a dataframe built from the probs output of the test dataloader, and then calculate accuracy by comparing columns in the merged dataframe, but I reckon there’s probably a more elegant way of doing this.

Here’s what I’ve done so far:
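Roughly, something like this (a simplified sketch; it assumes ‘testlabels.csv’ has ‘id’ and ‘label’ columns and that `probs` comes back in the same order as the test file names in `test_fnames`):

```python
import numpy as np
import pandas as pd

# Ground-truth labels for the test set (assumed columns: 'id', 'label').
labels_df = pd.read_csv('testlabels.csv')

# Predicted class for each test item, taken from the probs array
# (assumed to be in the same order as the test file names, test_fnames).
preds_df = pd.DataFrame({
    'id': test_fnames,
    'pred': np.argmax(probs, axis=1),
})

# Merge on id and compare predicted class to true label.
merged = labels_df.merge(preds_df, on='id')
accuracy = (merged['label'] == merged['pred']).mean()
print(f'Test accuracy: {accuracy:.4f}')
```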

Any suggestions would be much appreciated!

You can use the accuracy_np(preds, targs) function. That function appears in the dog breed notebook, but in that case there are no labels for the test set, so the call is commented out and never used.
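Something like this, roughly (a sketch; it assumes your probs array and the rows of testlabels.csv are in the same order, and that the csv has a ‘label’ column):

```python
import pandas as pd
from fastai.metrics import accuracy_np  # old fastai (0.7); adjust the import for your install

# probs: (n_test_items, n_classes) array of predicted probabilities from the test dataloader.
# targs: ground-truth labels (0 or 1), one per test item, in the SAME order as probs.
targs = pd.read_csv('testlabels.csv')['label'].values

print(accuracy_np(probs, targs))
```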

Thanks for the reply. When I do this, though, passing in the vector of truth values (0 or 1) from my csv file, I get around 50% accuracy, which suggests that some sort of scrambling is happening. (Training and validation accuracy are close to 90%, and test accuracy should be at least a few percentage points above chance; according to the evaluation I ran, it seems to be.)

Where might this scrambling have happened?

The following is the source code for accuracy_np:

def accuracy_np(preds, targs):
    preds = np.argmax(preds, 1)
    return (preds==targs).mean()

I’m not sure what’s going on. Can you double-check that your preds and targs vectors are what they should be in order for the above function to work as intended?
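In particular, check that the rows of preds and the entries of targs refer to the same test items in the same order; the test dataloader doesn’t necessarily iterate in the order your csv is written. A rough way to line them up (assuming a fastai 0.7-style `data.test_ds.fnames` and an ‘id’ column in your csv; those names are guesses about your setup):

```python
import os
import pandas as pd

# Ground-truth label for each test id (assumed columns: 'id', 'label').
labels = pd.read_csv('testlabels.csv').set_index('id')['label']

# Test file names in the order the dataloader produced the predictions
# (fastai 0.7 exposes them as data.test_ds.fnames; adjust for your version).
test_ids = [os.path.splitext(os.path.basename(f))[0] for f in data.test_ds.fnames]

# Re-order the targets to match the prediction order, then compute accuracy.
targs = labels.loc[test_ids].values
print(accuracy_np(preds, targs))
```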

After some more finagling, I got the results of accuracy_np to match my own calculations. It definitely was some sort of scrambling, though I have no idea why it happened. Thanks for the help!


Great job, congrats!