Lesson 8 - Official topic

fastai v1 had some tools that might be similar to what you are looking for:
https://docs.fast.ai/text.interpret.html
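
For reference, the usage pattern there looks roughly like this (a sketch assuming a fastai v1 text classifier learner named `learn`; check the docs above for the exact API):

```python
from fastai.text import *

# Assumes `learn` is a trained fastai v1 text classifier
# (e.g. from text_classifier_learner)
interp = TextClassificationInterpretation.from_learner(learn)

# Highlight which tokens contributed most to a prediction
interp.show_intrinsic_attention("I really loved that movie, it was great!")

# Inspect the examples the model got most wrong
interp.show_top_losses(5)
```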

Probably a lot of hidden information gets ingrained into the text during the translation process.

Would some images associated with specific sentences also help as augmentation? If so, how could that be implemented?

Is the language model that is loaded for IMDb composed only of the word embedding parameters, or does it also include the weights of the LSTM model?
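
One way to check for yourself (a sketch using the fastai v2 API; the parameter names follow the AWD_LSTM implementation and may differ in other versions):

```python
from fastai.text.all import *

# Build an IMDb language-model learner with the pretrained weights
path = untar_data(URLs.IMDB)
dls = TextDataLoaders.from_folder(path, valid='test', is_lm=True)
learn = language_model_learner(dls, AWD_LSTM, pretrained=True)

# List the encoder's parameters: the pretrained state covers the whole
# encoder, i.e. the embedding matrix *and* the stacked LSTM weights
for name, p in learn.model[0].named_parameters():
    print(name, tuple(p.shape))
# 'encoder.weight' -> the word embedding matrix
# 'rnns.0.*', ...  -> the LSTM layer weights loaded alongside it
```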


Fastbook chapter 10 questionnaire solutions:

Compare your answers or feel free to contribute!

@rachel’s SciPy 2019 Keynote: https://www.youtube.com/watch?v=KChtdexd5Jo

Regarding the threat of using advanced language models to manipulate public opinion, I covered this in more detail in my SciPy keynote last summer:

Edited to add: the first half is a high-level overview of the use of transfer learning in NLP (which will be review for you now), and the second half is on the risks of manipulating public opinion, disinformation, etc.


In the previous lesson’s MNIST example, you showed us that “under the hood” the model was learning parts of the image, like the curves of a 3 or the angles of a 7.

Is there a way to look under the hood of the language models to see if they are learning rules of grammar/syntax?

Would it be a good idea to fine-tune models with examples of domain-specific grammar/syntax (like technical manuals), or does that miss the point of having the model learn for itself?


It seems some people are interested in NLP model interpretability.

Again, PyTorch Captum seems to be an amazing tool for studying this.

They even have an example for text classification over here

It seems there is a fastai2 callback for ResNet interpretation using Captum over here. Maybe this could be useful for doing something similar with NLP models?

Additionally, fastai v1 had an NLP interpretation class over here
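
The pattern in that Captum text example boils down to layer integrated gradients over the embedding layer. A minimal self-contained sketch (the toy model and token ids below are stand-ins, not taken from the tutorial):

```python
import torch
import torch.nn as nn
from captum.attr import LayerIntegratedGradients

# Toy classifier standing in for a real one -- only the Captum pattern matters
class ToyClassifier(nn.Module):
    def __init__(self, vocab_size=100, emb_dim=16, n_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.head = nn.Linear(emb_dim, n_classes)

    def forward(self, input_ids):
        # Mean-pool token embeddings, then classify
        return self.head(self.embedding(input_ids).mean(dim=1))

model = ToyClassifier().eval()
lig = LayerIntegratedGradients(model, model.embedding)

input_ids = torch.randint(1, 100, (1, 6))  # hypothetical token ids
baselines = torch.zeros_like(input_ids)    # all-<pad> reference input

# Attribute the score of class 1 back to each input token
attrs = lig.attribute(input_ids, baselines=baselines, target=1)
print(attrs.sum(dim=-1))  # per-token attribution scores
```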


Again, Rachel’s mic is breaking up.


I don’t know NLP too well, but from what I understand, there is a debate in the literature about whether attention layers serve as explanations. Note that attention in NLP gave rise to the popular Transformer models (introduced in the paper Attention Is All You Need).

For example,


The mic is better now.


It actually sounds a little robotic, or like there’s some noise.

One thing that’s useful is to look at the embeddings: you can, for example, see that words with similar or related meanings are close to each other in the embedding space. That’s a sign that the model has learned these connections.
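
A sketch of that idea, assuming a trained fastai language-model learner named `learn` (the `model[0].encoder` path follows the AWD_LSTM structure):

```python
import torch.nn.functional as F

# Embedding matrix of the trained language model
emb = learn.model[0].encoder.weight.detach()  # (vocab_size, emb_dim)
vocab = list(learn.dls.vocab)

def neighbours(word, k=5):
    idx = vocab.index(word)
    sims = F.cosine_similarity(emb[idx][None], emb)  # similarity to every word
    top = sims.topk(k + 1).indices[1:]               # skip the word itself
    return [vocab[i] for i in top]

print(neighbours('good'))  # should surface semantically related tokens
```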


Can we get data augmentation by asking our language model to translate a database of sentences from English to English?


What do you mean by English => English? I would think there’s no translation being done.

Seems like a neat idea, but where would you find this database? That’s often the problem with bootstrapping these kinds of projects.

Translating from English to an intermediate language and back to English (back-translation) is often done for NLP data augmentation.
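
A sketch of that round trip using the Helsinki-NLP Marian models from Hugging Face (an example choice of models, not something from the lesson):

```python
from transformers import MarianMTModel, MarianTokenizer

def load(name):
    return MarianTokenizer.from_pretrained(name), MarianMTModel.from_pretrained(name)

tok_en_fr, en_fr = load('Helsinki-NLP/opus-mt-en-fr')
tok_fr_en, fr_en = load('Helsinki-NLP/opus-mt-fr-en')

def translate(texts, tok, model):
    batch = tok(texts, return_tensors='pt', padding=True)
    out = model.generate(**batch)
    return tok.batch_decode(out, skip_special_tokens=True)

sentences = ["The movie was surprisingly good."]
# English -> French -> English gives a paraphrased variant of each sentence
augmented = translate(translate(sentences, tok_en_fr, en_fr), tok_fr_en, fr_en)
print(augmented)
```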


My guess is that the gain was too high, causing the mic to saturate, which gives rise to distortion.
