Testing classification NLP model on a CSV file or dataframe

Harry_Leafe · August 29, 2019, 8:38am

Hi There

I have a trained classification model that I want to test on unseen data. I have a csv file containing the texts I want it to predict labels for. I have tried many commands to do this in one go (calling learn.predict row by row would be much too slow) but I always get errors. I have also tried using a databunch to format the data appropriately, but that usually gives me errors as well (there is no labels column, since the labels are what I want the model to predict).

What is the easiest way to get an NLP classification model to predict on a set of unseen data?

Thanks for your help!

gustav · September 2, 2019, 10:50am

This is how I did it. Kinda clunky but it works.

First read the csv file into a dataframe. Then put the text column in a TextList.
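For the first step, a minimal sketch with pandas might look like the following. The in-memory csv and its contents are made up for illustration; the only thing taken from the code in this post is that the column holding the texts is called `text`.

```python
import io

import pandas as pd

# Stand-in for the real file: replace io.StringIO(...) with the path to your csv.
csv_file = io.StringIO('text\n"great product, works well"\n"arrived broken"\n')

df_dataset = pd.read_csv(csv_file)

# Drop rows with missing text so that every remaining row gets one prediction.
df_dataset = df_dataset.dropna(subset=["text"]).reset_index(drop=True)
```

Resetting the index after dropping rows keeps the dataframe rows lined up with the predictions by position.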

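from fastai.text import *  # fastai v1: TextList, load_learner, DatasetType

test_tl = TextList(df_dataset.text, vocab=vocab)

# load the exported learner with the texts attached as its test set
predictor = load_learner(model_path, file=model_fn, test=test_tl, bs=128, num_workers=0)

# get_preds returns [predictions, targets]; ordered=True keeps the original row order
preds = predictor.get_preds(ds_type=DatasetType.Test, ordered=True)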
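
To turn the predictions into labels, something like the sketch below could work. The arrays here are illustrative stand-ins so the snippet runs on its own; I am assuming the real probabilities would come from `preds[0].numpy()` and the class names from `predictor.data.classes`, and the column names `predicted_label` and `confidence` are just my choice.

```python
import numpy as np
import pandas as pd

# Illustrative stand-ins (assumptions, not real model output):
# probs would be preds[0].numpy(), classes would be predictor.data.classes.
probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]])
classes = ["negative", "positive"]
df_dataset = pd.DataFrame({"text": ["a", "b", "c"]})

# Index of the highest-probability class for each row.
pred_idx = probs.argmax(axis=1)

# Attach the label and its probability to the dataframe, row by row.
df_dataset["predicted_label"] = [classes[i] for i in pred_idx]
df_dataset["confidence"] = probs.max(axis=1)
```

This relies on the predictions being in the same order as the dataframe rows, which is why ordered=True matters in the get_preds call.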