Love this classic paper review by Yannic: it explains two of the fundamental concepts in the transformer architecture:
positional encoding
queries, keys, and values
My thought: at the end of the day, positional encoding is just feature engineering on token position, the same idea as breaking a date field into multiple columns (day, day_of_week, month, …).
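To make the analogy concrete, here is a minimal sketch of the sinusoidal scheme from the original Transformer paper. Each sin/cos pair cycles at a different period, much like day, day_of_week, and month each cycle at different rates; the function name is illustrative, not from any library.

```python
import math

def positional_encoding(pos, d_model=8):
    """Sinusoidal positional encoding for a single token position,
    following the 'Attention Is All You Need' scheme: each pair of
    dimensions is a sin/cos at a different frequency."""
    pe = []
    for i in range(0, d_model, 2):
        angle = pos / (10000 ** (i / d_model))
        pe.append(math.sin(angle))  # "fast" columns for low i
        pe.append(math.cos(angle))  # shifted copy, so phase is recoverable
    return pe

# Position 0 encodes to alternating sin(0)=0 and cos(0)=1
print(positional_encoding(0, 4))  # [0.0, 1.0, 0.0, 1.0]
```

Just as with the date columns, no single dimension identifies the position on its own, but together they pin it down, and nearby positions get similar vectors.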
I am trying to submit the notebook on Kaggle, but this error message comes up: "Cannot submit. Your Notebook cannot use internet access in this competition. Please disable internet in the Notebook editor and save a new version."
I have disconnected from the internet, but some fastai functions require internet access.
It’s telling you the problem - you need to disable the ‘internet’ option for this notebook. It’s in the options on the top right of the main Kaggle window.
How do you decide whether you can get rid of outliers? Removing those values made your score go up, but those values will probably exist in the test set as well. Do you remove the rows entirely, or do you use a different method for handling outliers?
Trying to run “Getting started with NLP” via Kaggle, I hit an error at the line `tokz = AutoTokenizer.from_pretrained(model_nm)` with the message: "ValueError: Connection error, and we cannot find the requested files in the cached path. Please try again or make sure your Internet connection is on." I am a bit surprised this was not already reported.
I’m mostly focused on the live stream rather than running the notebook, so this can wait, but I’m mentioning it in case others have the same issue.
You don’t have an internet connection in your notebook. You need to enable it, and you might need to verify your identity to do so.
If you are submitting your notebook, it technically needs to be offline and not accessing the internet, so you can instead add the model as a dataset to your notebook, something like this guide:
For me it’s just trial and error until I get a feel for what works on the machine. Maybe someone else has a better approach?
EDIT: I should clarify - my default position is to maximise batch size (for speed and loss normalization), so the trial and error is to see what is the largest batch size I can use. But I’m not sure if this is an entirely correct assumption.
I normally start with a small batch size and increase in powers of 2. Try small, then go higher and higher. Eventually you’ll get an out-of-memory error, and that’s when you start reducing.
A lot of this is trial and error (for me personally at least).
With that said, you can explore techniques such as mixed precision and gradient accumulation to train with bigger effective batch sizes regardless of the compute you are running on. In the end, you generally want to train with batches as big as you can.
I need to add some description in the notebook of what exactly was changed, which is not much; it’s all the great work by Jeremy. I uploaded the datasets package and the DeBERTa model as Kaggle datasets so they can be accessed when offline.