I’m using fastai v2. I have trained a text classification model and want to use it to predict on a new test dataset. The dataset is a dataframe with a single column, ‘Text’, and no label column.
How do I create a DataLoaders object from this dataframe so I can predict on it? I always get an error when using the TextDataLoaders.from_df() method. The error message and code are attached below. Any help would be appreciated, thank you! @muellerzr Not sure if I’m allowed to tag you here, but I would really appreciate any advice you could give me. Thank you.
/usr/local/lib/python3.6/dist-packages/numpy/core/_asarray.py:83: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify ‘dtype=object’ when creating the ndarray
return array(a, dtype, copy=False, order=order)
@anshu1
Since you have already trained the model, these are the steps you need (a more complete example is in [1]):
1. Load the learner.
2. Tokenize the text column in your test dataframe.
3. Initialize the test dataloader in the learner with the tokenized dataframe.
from fastai.text.all import *  # brings in load_learner and tokenize_df
learn = load_learner('SAVED_MODEL')  # load the exported learner
tokenized_df = tokenize_df(test_df, text_cols='Text', tok_text_col='text')  # returns a tuple; element 0 is the tokenized dataframe
test_dl = learn.dls.test_dl(tokenized_df[0], with_labels=False)  # initialize the test dataloader
learn.get_preds(dl=test_dl)  # get the predictions on your test dataloader using [2]
@msivanes
Happy New Year and thank you SO much for your response - your solution worked!! I do have a follow-up question about getting predictions using get_preds(). I’m using the code below to save my predictions for the test dataset.
@muellerzr Thank you so much for your answer - the preds.argmax suggestion worked. However, I am still getting an error when I run the code with “dls.vocab” (‘dls’ is not defined).
I just wanted to clarify whether ‘dls’ here refers to the test dataloaders object or the model itself.
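For anyone reading along, here is a minimal sketch of the argmax-then-lookup step being discussed. In snippets like this, `dls` usually names the DataLoaders the model was trained with; with a learner restored through load_learner, that object should be reachable as learn.dls (the same attribute used to build test_dl), rather than being the test dataloader or the model. The helper name labels_from_probs and the sample values are hypothetical, and plain Python lists stand in for the tensor that get_preds returns, so the sketch runs without fastai.

```python
def labels_from_probs(probs, class_names):
    """Map each row of class probabilities to the name of its highest-scoring class."""
    labels = []
    for row in probs:
        # argmax: index of the largest probability in this row
        best_idx = max(range(len(row)), key=lambda i: row[i])
        labels.append(class_names[best_idx])
    return labels


# Hypothetical stand-ins for real values (assumptions, not fastai output):
# - probs would come from something like `preds, _ = learn.get_preds(dl=test_dl)`
#   followed by `preds.tolist()`
# - class_names would come from the label vocab held by the learner's own
#   DataLoaders (learn.dls), not from the test dataloader
probs = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]
class_names = ["negative", "positive"]

predicted = labels_from_probs(probs, class_names)
print(predicted)  # ['positive', 'negative', 'positive']
```

With tensors, preds.argmax(dim=1) gives the same indices in one call; the loop above only spells out what that call does before the lookup into the class names.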