Hi,
I want to fit an LSTM for intent classification using the pretrained Dutch RoBERTa model.
I installed fastai v2.3.0 on AWS and loaded the Dutch RoBERTa language model:
#Step 1: download the Dutch RoBERTa model
from transformers import RobertaTokenizer, RobertaForSequenceClassification
dtokenizer = RobertaTokenizer.from_pretrained("pdelobelle/robbert-v2-dutch-base")
dmodel = RobertaForSequenceClassification.from_pretrained("pdelobelle/robbert-v2-dutch-base")
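Step 1 runs fine. For example, a quick sanity check like the following (with an arbitrary Dutch sentence) prints the sub-word tokens:

#quick check that the downloaded tokenizer works
print(dtokenizer.tokenize("Dit is een voorbeeldzin."))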
When I then follow the steps in the article Using RoBERTa with fast.ai for NLP | by Dev Sharma | Analytics Vidhya | Medium to build a FastAI wrapper around the Transformers RobertaTokenizer, it gives me a NameError.
#Step 2: build a FastAI wrapper around the Transformers RobertaTokenizer (from: Using RoBERTa with fast.ai for NLP | by Dev Sharma | Analytics Vidhya | Medium)
class FastAiRobertaTokenizer(BaseTokenizer):
    def __init__(self, tokenizer: RobertaTokenizer, max_seq_len: int=128, **kwargs):
        self._pretrained_tokenizer = tokenizer
        self.max_seq_len = max_seq_len
    def __call__(self, *args, **kwargs):
        return self
    def tokenizer(self, t: str) -> List[str]:
        #wrap each sequence in RoBERTa's <s> ... </s> special tokens
        return ["<s>"] + self._pretrained_tokenizer.tokenize(t)[:self.max_seq_len - 2] + ["</s>"]
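(For context: the article then plugs this wrapper into fastai's v1 Tokenizer, roughly like below, but I never get that far because defining the class already fails.)

#next step from the article (fastai v1 API), for context only
fastai_tokenizer = Tokenizer(tok_func=FastAiRobertaTokenizer(dtokenizer, max_seq_len=128), pre_rules=[], post_rules=[])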
Error message:

NameError                                 Traceback (most recent call last)
<ipython-input-...> in <module>
      1 #Step 2: build a FastAI wrapper around the Transformers RobertaTokenizer
----> 2 class FastAiRobertaTokenizer(BaseTokenizer):
      3     def __init__(self, tokenizer: RobertaTokenizer, max_seq_len: int=128, **kwargs):
      4         self._pretrained_tokenizer = tokenizer
      5         self.max_seq_len = max_seq_len

<ipython-input-...> in FastAiRobertaTokenizer()
      6     def __call__(self, *args, **kwargs):
      7         return self
----> 8     def tokenizer(self, t: str) -> List[str]:
      9         return ["<s>"] + self._pretrained_tokenizer.tokenize(t)[:self.max_seq_len - 2] + ["</s>"]

NameError: name 'List' is not defined
How can I solve this? Should I install version 1 of FastAI, or can I solve it differently?
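My own guess is that List comes from Python's standard typing module, so adding an import like this at the top might already fix the NameError:

#possible fix: List is part of the standard typing module
from typing import List

But I'm not sure whether the rest of the article's fastai v1 code (e.g. BaseTokenizer) will then still work on fastai v2.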
Thanks,
Wendy