ULMFit - Portuguese

I am trying to reproduce the results but I encounter an error here:

dest = path/'corpus2_100'
(dest/'tmp').ls()

error:

FileNotFoundError: [Errno 2] No such file or directory: '/home/user/.fastai/data/ptwiki/corpus2_100/tmp'

Hi @tbone, thanks for using my models.

Your question is about the following lines of my notebook lm3-portuguese-classifier-TCU-jurisprudencia.ipynb (nbviewer):

The folder tmp, with the two files spm.model and spm.vocab, was created in the previous notebook lm3-portuguese.ipynb when the SentencePiece tokenizer was used for the first time to create the DataBunch of the Portuguese LM.

To get these 2 files, you need to download them from this page: https://github.com/piegu/language-models/tree/master/models
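If it helps, the download step could be scripted like the sketch below. It only assumes the two file names mentioned above (spm.model and spm.vocab); the base URL is a placeholder you would replace with the raw-file URL of the GitHub page, and the `fetch` parameter is a hypothetical hook so the logic can be exercised without network access:

```python
from pathlib import Path
from urllib.request import urlretrieve

def fetch_spm_files(dest, base_url, fetch=urlretrieve):
    """Download spm.model and spm.vocab into dest/'tmp' if they are missing.

    dest     : folder the notebook calls `dest` (e.g. path/'corpus2_100')
    base_url : URL prefix where the raw files live (placeholder, adjust)
    fetch    : callable (url, filename) used to download; defaults to urlretrieve
    Returns the list of file names that were actually downloaded.
    """
    tmp = Path(dest) / 'tmp'
    tmp.mkdir(parents=True, exist_ok=True)  # create the missing folder
    downloaded = []
    for name in ('spm.model', 'spm.vocab'):
        target = tmp / name
        if not target.exists():
            fetch(f'{base_url}/{name}', str(target))
            downloaded.append(name)
    return downloaded
```

After running it once, `(dest/'tmp').ls()` should show both files and the classifier notebook can load the SentencePiece processor from `dest`.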

Hello everyone,

First of all, thank you very much for sharing such valuable work!

I have a question regarding the notebook lm3-portuguese-classifier-TCU-jurisprudencia.ipynb
I am quite new to the fastai library, so my question may be a bit silly. I am looking for the part where you use the test set to evaluate your models (bwd, fwd and ensemble) by building the confusion matrix. As far as I understand, you loaded the data for each of the tests using the following line:

data_clas = load_data(path, f'{lang}_textlist_class_tcu_jurisp_reduzido_sp15_multifit_v2', bs=bs, num_workers=1)

This database was created with the following lines:

data_clas = (TextList.from_df(df_trn_val, path, vocab=data_lm.vocab, cols=reviews, processor=SPProcessor.load(dest))
.split_by_rand_pct(0.1, seed=42)
.label_from_df(cols=label)
.databunch(bs=bs, num_workers=1))
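To make my concern concrete: as I understand it, `split_by_rand_pct(0.1, seed=42)` deterministically carves 10% of df_trn_val off as the validation set. A rough pure-Python sketch of that behaviour (my own illustration, not fastai's actual implementation):

```python
import random

def rand_split(n_items, valid_pct=0.1, seed=42):
    """Deterministically split item indices into (train, valid) lists,
    mimicking in spirit what split_by_rand_pct(valid_pct, seed) does."""
    rng = random.Random(seed)          # fixed seed -> same split every run
    idx = list(range(n_items))
    rng.shuffle(idx)
    cut = int(valid_pct * n_items)     # first valid_pct of the shuffle is validation
    return sorted(idx[cut:]), sorted(idx[:cut])  # (train, valid)
```

Because the seed is fixed, every run of the notebook reproduces the same train/validation partition of df_trn_val, so every row of df_trn_val ends up in one of those two sets.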

and then saved it as f'{lang}_textlist_class_tcu_jurisp_reduzido_sp15_multifit_v2',
which means that you used df_trn_val to create it. However, this data was also used
during the training phase, where it was split into training and validation sets.

Are you using the same data for the test phase that you used for training/validation? Did I
misunderstand something? Again, I am quite new to fastai, so I may have missed something.