Classifier always predicting one class

jdb100 · June 19, 2019, 2:44pm

I am trying to classify tickets based on their descriptions using ULMFiT. With methods such as gradient boosting I get 85% accuracy, and I want to see whether fastai can beat that. The problem I keep running into is that the classifier always predicts one class, which I suspect means the neural network is stuck in a local minimum.

While training I noticed some confusing behaviour. When training the language model, the learner reaches an accuracy of 61% and an epoch takes about 1 minute. After unfreezing, the accuracy is 42% after 1 epoch, which also takes about 1 minute. That seemed normal so far, but the next epoch drops the accuracy to 21.8475% and takes about 30 minutes. The training and validation losses also rise from around 3.5 to 5. I trained for another 15 epochs to see if anything would change, but the accuracy stayed at exactly 21.8475% in every epoch and both losses stayed at approximately 5.

Despite the confusing results, I went ahead and trained my classifier to see what would happen. After 1 epoch while frozen it reached an accuracy of 53.8462%, and that number never changed afterwards. When I looked at the confusion matrix, I saw that the classifier was simply always predicting one class. I assume this is caused by something wrong with how I train the language model. The 2 classes are balanced as well.
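As a sanity check on the symptom described above, here is a minimal sketch in plain Python (no fastai) showing that a classifier which always outputs one class scores exactly the share of that class in the validation set, so its accuracy never moves. The labels are hypothetical: a 7-to-6 split is used only because 7/13 rounds to the 53.8462% quoted above. The real validation set size is not given in this post.

```python
from collections import Counter


def majority_baseline(targets):
    """Accuracy obtained by always predicting the most common label."""
    counts = Counter(targets)
    return max(counts.values()) / len(targets)


def predicted_class_counts(predictions):
    """Count how often each class is predicted; one key means collapse."""
    return Counter(predictions)


# Hypothetical validation labels (assumption): 7 of class 1, 6 of class 0.
targets = [1] * 7 + [0] * 6
# A collapsed classifier that outputs class 1 for every example.
predictions = [1] * len(targets)

accuracy = sum(p == t for p, t in zip(predictions, targets)) / len(targets)

print(predicted_class_counts(predictions))     # Counter({1: 13})
print(round(accuracy * 100, 4))                # 53.8462
print(accuracy == majority_baseline(targets))  # True
```

If the accuracy reported during training equals the majority-class share of the validation labels, that is consistent with collapsed predictions rather than with a model that is learning slowly.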

from fastai.text import *  # fastai v1 API
import numpy
import pandas

data_lm = (TextList.from_df(queried_dataframe, cols="Combined").use_partial_data(sample_pct=0.1, seed=42).split_by_rand_pct(0.1).label_for_lm().databunch(bs=30))
learn = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.3)
learn.fit_one_cycle(1, 1e-1, moms=(0.8,0.7))
learn.unfreeze()
learn.fit_one_cycle(6,5e-2, moms=(0.8,0.7))
learn.save_encoder('Ticket_Language_Model_Encoder')
labelled_dataframe = pandas.read_csv('Labelled.csv', index_col=0, encoding='ISO-8859-1').dropna().astype({"ID":int,"Quality":int})
labelled_dataframe["Combined"] = labelled_dataframe["DESCRIPTION"] + ' ' + labelled_dataframe["TEXTVALUE"]
data_classifier = (TextList.from_df(labelled_dataframe, vocab=data_lm.vocab, cols="Combined").split_by_rand_pct(seed=42).label_from_df(cols="Quality").databunch(bs=30))
learn_classifier = text_classifier_learner(data_classifier, AWD_LSTM, drop_mult=0.5, pretrained=False, metrics=accuracy)
learn_classifier.load_encoder('Ticket_Language_Model_Encoder')
learn_classifier.freeze()
learn_classifier.fit_one_cycle(1,1e-1, moms=(0.8,0.7))
learn_classifier.freeze_to(-2)
learn_classifier.fit_one_cycle(1,slice(5e-2/(2.6**4),5e-2), moms=(0.8,0.7))
learn_classifier.unfreeze()
learn_classifier.fit_one_cycle(8,slice(1e-2/(2.6**4),1e-2),moms=(0.8,0.7))
preds, targets = learn_classifier.get_preds()
predictions = numpy.argmax(preds, axis=1)
pandas.crosstab(predictions, targets)
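To make the `slice(...)` arguments in the `fit_one_cycle` calls above concrete, here is a rough sketch of the per-layer-group learning rates they imply. It rests on two assumptions about fastai v1 internals that are worth verifying: that a slice with both ends set is spread geometrically across the layer groups, and that the AWD_LSTM classifier has five layer groups (which can be checked with `len(learn_classifier.layer_groups)`). The helper `geometric_lrs` is a hypothetical stand-in written for illustration, not a fastai function.

```python
def geometric_lrs(start, stop, n_groups):
    """Spread learning rates geometrically from start to stop over n_groups."""
    if n_groups == 1:
        return [stop]
    # Constant multiplicative step between consecutive layer groups.
    step = (stop / start) ** (1 / (n_groups - 1))
    return [start * step ** i for i in range(n_groups)]


N_GROUPS = 5  # assumption: number of layer groups in the classifier

# The two slice-based stages from the training code above.
stages = [("freeze_to(-2) stage", 5e-2), ("unfrozen stage", 1e-2)]

for label, top in stages:
    lrs = geometric_lrs(top / 2.6 ** 4, top, N_GROUPS)
    print(label, [f"{lr:.2e}" for lr in lrs])
```

Under these assumptions, dividing by `2.6**4` means each layer group trains at a rate 2.6 times lower than the group after it, with the last group receiving the full 5e-2 or 1e-2.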