How can I then use it to predict on new, previously unseen data? (I don’t have access to that data at the time I build the classifier.) There is a learn.predict(), but I don’t know how to use it when I have multiple text columns (title and text in the example above).
So the best I came up with myself was to do a " ".join() on the title and the text of the new, previously unseen data and use the result as input to the predict() method. If anyone has a better or cleaner way to do this, please let me know.
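A minimal sketch of that join, in case it helps someone. The helper name join_fields and the example strings are made up for illustration, and the learn.predict() call is only shown as a comment because it needs a trained learner:

```python
def join_fields(title, text, sep=" "):
    """Join the title and the body text into a single string."""
    return sep.join([title, text])


# Hypothetical new, unseen item.
new_item = join_fields("Some headline", "Body of the article.")

# With a trained fastai learner available as `learn` (not defined here),
# the joined string would then be passed on like this:
# prediction = learn.predict(new_item)
```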
Brother, predict takes a lot of time. I just put all my text into a list of around 10,000 items and it is taking a very long time; as of typing this, it still hasn’t finished predicting. Did your prediction finish, and how much time did it take?
I have the same issue, a couple of years later now. For anyone reading this thread: I’m not confident that just joining the two text fields will give the same result, because I can see that the dataloaders insert xxfld 1 and xxfld 2 into the text to mark the fields - but I don’t know how to do that in the prediction function.
For fastai==2.5.3, it looks like this is done by _join_texts, at fastai.text.core:197 or thereabouts. It looks like it really does just add xxfld with an increasing index for each text column (i.e. xxfld 1, xxfld 2, etc.) - so that’s what I’m going to do.
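Roughly what I have in mind, as a sketch rather than a copy of the library code. The function name join_with_field_markers is my own, and the exact spacing around the markers is my reading of _join_texts, so treat it as an assumption:

```python
FLD = "xxfld"  # field marker token mentioned above


def join_with_field_markers(*fields):
    """Prefix each field with 'xxfld <index>' (1-based) and join with spaces."""
    parts = [f"{FLD} {i} {field}" for i, field in enumerate(fields, start=1)]
    return " ".join(parts)


# Hypothetical new item with a title and a body text.
item = join_with_field_markers("Some headline", "Body of the article.")
print(item)  # xxfld 1 Some headline xxfld 2 Body of the article.

# With a trained learner available as `learn` (not defined here):
# prediction = learn.predict(item)
```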
I’m hoping that these field markers won’t be stripped out before numericalisation. I might double-check, either by comparing my numericalised inputs with what the dataloader produces or by just checking that my performance matches expectations on the validation set.
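A small sketch of the kind of check I mean: given the tokens of a prepared input and the text vocab, list the tokens that are not in the vocab and so would not survive numericalisation as themselves. The helper name and the toy token and vocab lists are made up; in practice the tokens would come from the learner's tokenizer and the vocab from the dataloaders:

```python
def tokens_missing_from_vocab(tokens, vocab):
    """Return the tokens that do not appear in the vocab, in input order."""
    known = set(vocab)
    return [tok for tok in tokens if tok not in known]


# Toy example: the field marker is in the vocab, one ordinary word is not.
toy_vocab = ["xxunk", "xxpad", "xxbos", "xxfld", "1", "2", "some", "headline"]
toy_tokens = ["xxbos", "xxfld", "1", "some", "headline", "xxfld", "2", "body"]

missing = tokens_missing_from_vocab(toy_tokens, toy_vocab)
print(missing)  # ['body']
```

If xxfld shows up in the missing list for real inputs, the markers are not reaching the model as intended.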