I was reading the text processing documentation and trying to run the toy example it provides (IMDB).
The dataset is correctly downloaded, untarred, and displayed as a pandas DataFrame.
But when I try to instantiate the DataBunch for the language model, it looks for
train.csv in the same directory where the sample was untarred, and no
train.csv is present:
data_lm = TextLMDataBunch.from_csv(path, 'texts.csv')
FileNotFoundError: File b'/home/poko/.fastai/data/imdb_sample/train.csv' does not exist
Only
texts.csv actually exists in that directory.
Note that I’m following the tutorial step by step here.
Should we split texts.csv into train.csv and valid.csv ourselves?
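In case it helps, here is a minimal sketch of the manual split I have in mind, assuming the file names (train.csv, valid.csv), the 80/20 split ratio, and the label/text column layout of imdb_sample; a toy DataFrame stands in for the real texts.csv:

```python
import pandas as pd

def split_train_valid(df, valid_frac=0.2, seed=42):
    """Randomly hold out valid_frac of the rows as a validation set."""
    valid = df.sample(frac=valid_frac, random_state=seed)
    train = df.drop(valid.index)
    return train, valid

# Toy stand-in for texts.csv (label/text columns, as in imdb_sample)
df = pd.DataFrame({
    "label": ["negative", "positive", "negative", "positive", "negative"],
    "text": ["bad movie", "great film", "dull plot", "loved it", "boring"],
})

train, valid = split_train_valid(df)

# Then write the two files next to texts.csv, e.g.:
# train.to_csv(path/"train.csv", index=False)
# valid.to_csv(path/"valid.csv", index=False)
```

But I would rather not do this by hand if the library is supposed to handle the split itself.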