Very low score achieved while using fastai library for load and train

adwivedi · December 31, 2018, 12:47pm

I have tried to replicate the CNN text model implemented here, using the fastai library to load the data and train.
However, I am getting a very low accuracy of 20.25%, compared to the 97.61% achieved in the original implementation.

Can someone help me figure out what I am missing while using the fastai library?
I feel there’s something incorrect in the way I am loading the data using TextDataBunch.from_df, but I don’t know fastai well enough to understand what’s wrong with my implementation.

My implementation is available at: https://gist.github.com/aashudwivedi/ee793f77c7c8efd6e63f1282171a4904
It’s a Google Colab notebook, which you can also run with your own Google account.

rohit_gr · January 1, 2019, 6:20am

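Corrected print_scores():

import torch
from sklearn.metrics import accuracy_score, log_loss

def print_scores(learner, data):
  # Take the targets from get_preds() so they line up with the predictions
  y_probs, y_target = learner.get_preds()
  y_preds = torch.argmax(y_probs, dim=1)
  print('loss = {}'.format(log_loss(y_target, y_probs)))
  print('accuracy = {}'.format(accuracy_score(y_target, y_preds)))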

Output I got:

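get_preds() returns the predictions (or probs) and the targets. The reason y_true is not equal to y_target is probably that the validation data is shuffled (reordered) while predicting, so the predictions have to be compared against the targets that get_preds() returns, not against the labels in their original order.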
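
To see how much a mismatch in order can hurt, here is a minimal sketch in plain Python. Everything in it is made up for illustration (the five classes, the 1000 labels, the `accuracy` helper, the shuffled order); it does not use fastai and is not the notebook's data. It takes a "perfect" model and scores it once against targets in the same order as the predictions, and once against the labels in their original order.

```python
import random

rng = random.Random(0)
num_classes, n = 5, 1000  # made-up sizes, for illustration only

# Hypothetical validation labels in their original (dataframe) order.
y_true = [rng.randrange(num_classes) for _ in range(n)]

# Pretend the data loader visits the rows in a different order.
order = list(range(n))
rng.shuffle(order)

# Targets as the loader would return them, and a "perfect" model's
# predictions, which come out in that same loader order.
y_target = [y_true[i] for i in order]
y_preds = list(y_target)

def accuracy(preds, targets):
    # Fraction of positions where prediction and target agree.
    return sum(p == t for p, t in zip(preds, targets)) / len(targets)

aligned_acc = accuracy(y_preds, y_target)   # same order on both sides
misaligned_acc = accuracy(y_preds, y_true)  # predictions vs. original order
print(aligned_acc, misaligned_acc)
```

Even with every prediction correct, the misaligned comparison drops to roughly chance level, which is why the targets returned together with the predictions are the safe ones to score against.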

adwivedi · January 1, 2019, 9:20am

Thanks Rohit, I figured that there was something wrong with the way I was calculating the accuracy, because the validation accuracy during training was much higher.

Thanks a ton again for your clear and concise answer. This helps a lot with not losing motivation on the first day of the year.

Have a great year ahead.

rohit_gr · January 1, 2019, 11:55am

Also, do one of two things. Either remove the F.log_softmax(out, 1) from your CNN architecture and return just the output (out), because you are using nn.CrossEntropyLoss, which already applies the softmax step internally; or keep the log_softmax and use nn.NLLLoss instead.
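
A small sketch of why the two options are interchangeable, assuming only standard PyTorch. The logits and targets are random, made-up values, not the notebook's model. nn.CrossEntropyLoss on raw outputs should give the same number as nn.NLLLoss on log_softmax outputs.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 3)            # made-up raw outputs: 4 examples, 3 classes
targets = torch.tensor([0, 2, 1, 2])  # made-up class indices

# Option 1: the model returns raw outputs, and the loss applies log-softmax itself.
ce = nn.CrossEntropyLoss()(logits, targets)

# Option 2: the model applies log_softmax, and the loss expects log-probabilities.
nll = nn.NLLLoss()(F.log_softmax(logits, dim=1), targets)

print(ce.item(), nll.item())
```

Since both pairings compute the same loss, it is enough to pick one and keep the model's final layer consistent with it.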