Hello,
I recently finished the first part of the fastai course and wanted to build an NLP model using ULMFiT, as explained in lessons 3 and 4. I have a (pretty large) set of reviews, each with a text and a rating, and I want to create a TextLMDataBunch from the text to train the language model learner on. I run the following code:
bs=64
data_lm = (TextLMDataBunch.from_csv(path, 'valid.csv', text_cols='text')
           .split_by_rand_pct(0.1)
           .label_for_lm()
           .databunch(bs=bs))
After that, I call data_lm.show_batch(). However, the output consists mostly of numbers and xxunk tokens instead of the review text I expected. Does anyone know why this happens?