Fastai integration with huggingface pytorch-transformers?

Hi @abhikjha,

The problem here is in the Transformers library: it uses the internet every time you call the from_pretrained function.

However, if you look at this kernel, it seems that he is able to get around the problem.
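The usual workaround (a minimal sketch; the directory below is only a placeholder for wherever the files were uploaded as a Kaggle dataset) is to point from_pretrained at a local folder instead of a model name, so nothing needs to be downloaded:

from transformers import DistilBertTokenizer, DistilBertForSequenceClassification

# Assumes the pretrained files were saved beforehand and uploaded as a Kaggle dataset;
# the path is only a placeholder.
local_dir = '/kaggle/input/distilbert-base-uncased'
tokenizer = DistilBertTokenizer.from_pretrained(local_dir)
model = DistilBertForSequenceClassification.from_pretrained(local_dir)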

Good luck in the competition!


Thanks @maroberti for the quick reply and for sharing the link to the kernel. I had seen it earlier, but it uses DistilBERT. I was thinking of using RoBERTa, which I guess is more robust and could give better predictions.

If he can do it with DistilBERT, you can normally do the same with RoBERTa quite easily by following his process.
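A minimal sketch of the substitution (assuming the standard 'roberta-base' checkpoint; set num_labels to whatever your task needs): swap the DistilBERT classes for their RoBERTa counterparts and keep the rest of his pipeline unchanged.

from transformers import RobertaConfig, RobertaTokenizer, RobertaForSequenceClassification

config = RobertaConfig.from_pretrained('roberta-base')
config.num_labels = 2  # placeholder: number of target classes
tokenizer = RobertaTokenizer.from_pretrained('roberta-base')
model = RobertaForSequenceClassification.from_pretrained('roberta-base', config=config)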


Hi @maroberti,

Again thanks for the article, it really helped out a lot.

Do you have more insight on how to split the models?

I tried as suggested and printed the model.
This is what I was able to come up with (for XLNet):

list_layers = [learner.model.transformer.word_embedding,
               learner.model.transformer.layer[0],
               learner.model.transformer.layer[1],
               learner.model.transformer.layer[2],
               learner.model.transformer.layer[3],
               learner.model.transformer.layer[4],
               learner.model.transformer.layer[5],
               learner.model.transformer.layer[6],
               learner.model.transformer.layer[7],
               learner.model.transformer.layer[8],
               learner.model.transformer.layer[9],
               learner.model.transformer.layer[10],
               learner.model.transformer.layer[11]]

learner.split(list_layers)

But I get this error when I try it:


AttributeError                            Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/IPython/core/formatters.py in __call__(self, obj)
    697                 type_pprinters=self.type_printers,
    698                 deferred_pprinters=self.deferred_printers)
--> 699             printer.pretty(obj)
    700             printer.flush()
    701             return stream.getvalue()

5 frames
/usr/local/lib/python3.6/dist-packages/fastai/core.py in func_args(func)
    278 def func_args(func)->bool:
    279     "Return the arguments of `func`."
--> 280     code = func.__code__
    281     return code.co_varnames[:code.co_argcount]
    282 

AttributeError: 'method-wrapper' object has no attribute '__code__'

Has anyone tried using ALBERT? When I try to train the model, I get this layer and index output.

Additionally, from my experiments, I find that ALBERT performs worse than RoBERTa. Has anyone else experimented with ALBERT?

Here, Melissa Rajaram found a solution for using Transformers with fastai in the Google QUEST competition.


Thanks for sharing it. I am already in touch with her on this. The idea was to use the internet first and, using save_pretrained, save the config, model and vocab files to the hard drive, then upload these as a dataset to the Kaggle kernel in order to use them without internet access.
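As a minimal sketch of that save step with RoBERTa (the folder name is just a placeholder; loading offline is then the same from_pretrained call pointed at the uploaded folder):

from transformers import RobertaTokenizer, RobertaForSequenceClassification

# With internet available: download once, then write everything to a local folder
tokenizer = RobertaTokenizer.from_pretrained('roberta-base')
model = RobertaForSequenceClassification.from_pretrained('roberta-base')
tokenizer.save_pretrained('roberta-base-local')   # vocab and merges files
model.save_pretrained('roberta-base-local')       # config.json and the model weights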

Did you also try your approach with the OpenAI GPT-2 model? I tried earlier and ran into some errors while creating the DataBunch…


Can you be more precise about the error while creating the DataBunch?

As far as I know, you would have to add a custom head layer because in transformers there is no GPT2 architecture for Sequence Classification. Normally it would not return errors during the creation of the DataBunch as it uses the same tokenizer as RoBERTa.
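Something like this minimal sketch is what I have in mind (the GPT2Classifier name and the last-token pooling are just placeholder choices on my side):

import torch.nn as nn
from transformers import GPT2Model

class GPT2Classifier(nn.Module):
    "GPT2 backbone with a simple linear classification head."
    def __init__(self, n_labels, model_name='gpt2'):
        super().__init__()
        self.transformer = GPT2Model.from_pretrained(model_name)
        self.head = nn.Linear(self.transformer.config.n_embd, n_labels)

    def forward(self, input_ids):
        hidden_states = self.transformer(input_ids)[0]  # (batch, seq_len, n_embd)
        # use the last token's hidden state as the sequence representation
        return self.head(hidden_states[:, -1, :])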

Also, I don't know if OpenAI GPT-2 was made for text classification. Did you find someone using GPT-2 for this kind of task?

Thanks for the quick response :slight_smile:

Well, I used GPT-2 for a classification task where the labels are in 30 different columns - the Google QUEST competition on Kaggle. I will re-run the kernel and post the exact error here in some time, but it was mainly to do with the .label_from_df(cols=targets) part of the DataBunch. I tried BERT, RoBERTa, XLNet and ALBERT using your code with some tweaks, and all of them worked beautifully :slight_smile: but GPT-2 was difficult to handle.

Yes, you are absolutely right about the custom head layer for the classification task; I read this on HF's webpage as well. But the modelling stage comes later, after creating the DataBunch, so I was not sure how to deal with this.


Here is the error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-34-7dbc54dd5cab> in <module>
     25 databunch_5 = (TextList.from_df(train, cols=['question_title','question_body','answer'], processor=transformer_processor_g)
     26              .split_by_rand_pct(0.1,seed=42)
---> 27              .label_from_df(cols=targets)
     28              .add_test(test)
     29              .databunch(bs=32, collate_fn=partial(pad_collate, pad_first=False, pad_idx=0)))

/opt/conda/lib/python3.6/site-packages/fastai/data_block.py in _inner(*args, **kwargs)
    478             self.valid = fv(*args, from_item_lists=True, **kwargs)
    479             self.__class__ = LabelLists
--> 480             self.process()
    481             return self
    482         return _inner

/opt/conda/lib/python3.6/site-packages/fastai/data_block.py in process(self)
    532         "Process the inner datasets."
    533         xp,yp = self.get_processors()
--> 534         for ds,n in zip(self.lists, ['train','valid','test']): ds.process(xp, yp, name=n)
    535         #progress_bar clear the outputs so in some case warnings issued during processing disappear.
    536         for ds in self.lists:

/opt/conda/lib/python3.6/site-packages/fastai/data_block.py in process(self, xp, yp, name, max_warn_items)
    712                     p.warns = []
    713                 self.x,self.y = self.x[~filt],self.y[~filt]
--> 714         self.x.process(xp)
    715         return self
    716 

/opt/conda/lib/python3.6/site-packages/fastai/data_block.py in process(self, processor)
     82         if processor is not None: self.processor = processor
     83         self.processor = listify(self.processor)
---> 84         for p in self.processor: p.process(self)
     85         return self
     86 

/opt/conda/lib/python3.6/site-packages/fastai/text/data.py in process(self, ds)
    308         if self.vocab is None: self.vocab = Vocab.create(ds.items, self.max_vocab, self.min_freq)
    309         ds.vocab = self.vocab
--> 310         super().process(ds)
    311 
    312 class OpenFileProcessor(PreProcessor):

/opt/conda/lib/python3.6/site-packages/fastai/data_block.py in process(self, ds)
     51     def __init__(self, ds:Collection=None):  self.ref_ds = ds
     52     def process_one(self, item:Any):         return item
---> 53     def process(self, ds:Collection):        ds.items = array([self.process_one(item) for item in ds.items])
     54 
     55 PreProcessors = Union[PreProcessor, Collection[PreProcessor]]

/opt/conda/lib/python3.6/site-packages/fastai/data_block.py in <listcomp>(.0)
     51     def __init__(self, ds:Collection=None):  self.ref_ds = ds
     52     def process_one(self, item:Any):         return item
---> 53     def process(self, ds:Collection):        ds.items = array([self.process_one(item) for item in ds.items])
     54 
     55 PreProcessors = Union[PreProcessor, Collection[PreProcessor]]

/opt/conda/lib/python3.6/site-packages/fastai/text/data.py in process_one(self, item)
    304         self.vocab,self.max_vocab,self.min_freq = vocab,max_vocab,min_freq
    305 
--> 306     def process_one(self,item): return np.array(self.vocab.numericalize(item), dtype=np.int64)
    307     def process(self, ds):
    308         if self.vocab is None: self.vocab = Vocab.create(ds.items, self.max_vocab, self.min_freq)

TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'

Were you able to split the model using learner.split()?