Hi guys, I'm trying to tokenize Chinese text. The code I found was written for an old version of fastai:
[demo_chinese_text_classification_bert_fastai/demo_refactored_dianping_classification_with_BERT_fastai.ipynb at master · wshuyi/demo_chinese_text_classification_bert_fastai · GitHub]
My question is how to rewrite this piece of code so that it works with the current fastai version:
databunch = TextClasDataBunch.from_df(path, train, valid, test,
    tokenizer=bert_fastai_tokenizer,
    vocab=bert_vocab,
    include_bos=False,
    include_eos=False,
    text_cols="comment",
    label_cols="sentiment",
    bs=batch_size,
    collate_fn=partial(pad_collate, pad_first=False, pad_idx=0),
)
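
To make clear what I need to reproduce: as far as I understand it, the `collate_fn` argument above pads every sequence in a batch to the length of the longest one, using index 0 and putting the padding at the end (`pad_first=False`). Below is a plain-Python sketch of that behaviour. It is not fastai's actual implementation, and `pad_batch` is a hypothetical name I made up for illustration.

```python
from typing import List, Sequence, Tuple


def pad_batch(
    samples: Sequence[Tuple[Sequence[int], int]],
    pad_idx: int = 0,
    pad_first: bool = False,
) -> Tuple[List[List[int]], List[int]]:
    """Pad token-id sequences to a common length (hypothetical helper).

    Each sample is a (token_ids, label) pair. Shorter sequences are filled
    with pad_idx, at the end unless pad_first is True.
    """
    max_len = max(len(ids) for ids, _ in samples)
    padded = []
    for ids, _ in samples:
        padding = [pad_idx] * (max_len - len(ids))
        # pad_first=False keeps the real tokens at the start of the row
        row = padding + list(ids) if pad_first else list(ids) + padding
        padded.append(row)
    labels = [label for _, label in samples]
    return padded, labels


# Two made-up samples with arbitrary token ids
batch = [([101, 5, 102], 1), ([101, 102], 0)]
ids, labels = pad_batch(batch, pad_idx=0, pad_first=False)
print(ids)     # [[101, 5, 102], [101, 102, 0]]
print(labels)  # [1, 0]
```

So whatever replaces `TextClasDataBunch.from_df` would need to give the same result: padding at the end with index 0, and no extra BOS/EOS tokens added by fastai.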