How can I make this old-version code work with the current version?

Hi guys, I'm trying to tokenize Chinese text. The code I found was written for an old version of fastai:
[https://github.com/wshuyi/demo_chinese_text_classification_bert_fastai/blob/master/demo_refactored_dianping_classification_with_BERT_fastai.ipynb]
My question is how to rewrite this piece of code so that it works with the current fastai version:

databunch = TextClasDataBunch.from_df(path, train, valid, test,
                                      tokenizer=bert_fastai_tokenizer,
                                      vocab=bert_vocab,
                                      include_bos=False,
                                      include_eos=False,
                                      text_cols="comment",
                                      label_cols='sentiment',
                                      bs=batch_size,
                                      collate_fn=partial(pad_collate, pad_first=False, pad_idx=0),
                                      )

DataLoaders in fastai v2 is similar to the DataBunch from fastai v1.

It seems like what you want to do is convert the TextClasDataBunch call to TextDataLoaders. See here for the API…
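As a rough illustration of that conversion, here is a minimal sketch, not a drop-in port. It assumes train and valid are pandas DataFrames with the "comment" and "sentiment" columns from the question. CharTokenizer and build_dataloaders are hypothetical names made up for this example, and the character-level tokenizer is only a stand-in for the BERT tokenizer in the notebook. The sketch does not reproduce vocab=bert_vocab, include_bos=False, include_eos=False, the test set, or the custom collate_fn, so treat those as open points to check against the docs.

```python
class CharTokenizer:
    """Hypothetical stand-in tokenizer: one token per Chinese character."""

    def __call__(self, items):
        # A fastai v2 base tokenizer is called on a batch of strings and
        # yields one list of tokens per string.
        for text in items:
            tokens = []
            for chunk in str(text).split():
                # Keep fastai's special markers (such as xxbos) whole and
                # split everything else into single characters.
                tokens.extend([chunk] if chunk.startswith("xx") else list(chunk))
            yield tokens


def build_dataloaders(train, valid, batch_size=32):
    """Sketch: build v2 DataLoaders from separate train/valid DataFrames."""
    # Imported here so the tokenizer above can be tried without fastai installed.
    import pandas as pd
    from fastai.text.all import TextDataLoaders

    # v2 takes a single DataFrame; a boolean column marks the validation rows.
    df = pd.concat(
        [train.assign(is_valid=False), valid.assign(is_valid=True)],
        ignore_index=True,
    )
    return TextDataLoaders.from_df(
        df,
        text_col="comment",        # was text_cols= in v1
        label_col="sentiment",     # was label_cols= in v1
        valid_col="is_valid",
        tok_tfm=CharTokenizer(),   # assumption: replaces tokenizer= from v1
        bs=batch_size,
    )
```

Calling build_dataloaders(train, valid, batch_size) would then stand in for the old databunch, and dls.show_batch() is a quick way to check that the tokens look right.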

I'd also point out that there is a tutorial on using the HuggingFace library with fastai. Additionally, the blurr library is a nice extension for integrating HuggingFace and fastai v2.

Thanks for the help, I will read the docs again!

blurr is great. Hopefully the community can help to continue making it better, with PRs etc.

Yijin