Using MixUp on NLP Models

Hello everyone!
I just finished watching Lesson 12 of part 2 and wanted to try using MixUp for an NLP model.

I tried to accomplish this by applying mixup after calculating the embeddings and came across several problems along the way.
First of all, the random lambda values were being turned into integers, because the original input is the numericalized sentence and the MixUp callback has this line:
lambd = last_input.new(lambd)
I added a .float() after last_input.new(lambd) and that fixed the problem.
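
For reference, here is roughly how the fix looks inside the callback (the surrounding lines are paraphrased from fastai v1's MixUpCallback, so treat this as a sketch rather than the exact source):

lambd = np.random.beta(self.alpha, self.alpha, last_target.size(0))
lambd = np.concatenate([lambd[:, None], 1 - lambd[:, None]], 1).max(1)
# last_input is the numericalized text (a LongTensor), so .new() builds an
# integer tensor and every sampled lambda gets truncated to 0 or 1:
# lambd = last_input.new(lambd)
# Casting back to float keeps the sampled values intact:
lambd = last_input.new(lambd).float()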
Then I had to modify the forward passes of the AWD_LSTM module and the MultiBatchEncoder module, roughly along the lines of the sketch below.
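
To give an idea of what "mixup after the embeddings" means here (this is just a sketch with made-up names like mixup_embeddings, not the exact code from the notebook): the embedding layer still receives the integer token ids, and the interpolation happens on the resulting float embeddings before they go into the LSTM.

import torch

def mixup_embeddings(emb_a, emb_b, lambd):
    # emb_a, emb_b: (batch, seq_len, emb_dim) float tensors from the embedding layer
    # lambd:        (batch,) float tensor of per-example mixing coefficients
    lambd = lambd.view(-1, 1, 1)               # broadcast over seq_len and emb_dim
    return lambd * emb_a + (1 - lambd) * emb_b

# Hypothetical use inside a modified AWD_LSTM.forward:
#   emb = self.encoder_dp(input_ids)                  # (bs, sl, emb_dim)
#   emb = mixup_embeddings(emb, emb[shuffle], lambd)  # shuffle and lambd come from the callback
#   ... feed emb through the LSTM layers as usual ...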

After like 2 days of getting thousands of errors each time I tried to fit a model using this, I finally got it to work (or so I think).
Here is the notebook in which I implemented this, in case anyone wants to use it. Please note that it's probably a really dirty way of doing this and I haven't really experimented with it much yet.

Notebook

My next goal is to train the IMDB sentiment classifier with this and see how far it gets.

I hope someone finds this interesting.
Cheers!


Hey @Fmcurti! Really nice work.

Did you get to benchmark this on IMDB?

Hi @Andreas_Daiminger! Yes, I tried reproducing this notebook and adding MixUp to the classification part.
Although the results weren't bad, I wasn't able to surpass the performance of plain ULMFiT.
I just pushed the notebook to the repo in case you want to check it out!


Thanks @Fmcurti
