Lesson 4 In-Class Discussion ✅

#93

Is there bias in language models, such as gender or racial bias in embeddings? How do we deal with it?

5 Likes

(Rachel Thomas) #94

I will check about this next time.

0 Likes

(Soumya Gosukonda) #95

There are many algorithms, statistical as well as neural (LSTMs), that can be used to build language models.

1 Like

(Hamel Husain) #96

Do the wiki-text model and the fine-tuned version share the same vocabulary?

0 Likes

#97

As a matter of fact yes! Check the language model zoo topic :wink:

9 Likes

(Vincent) #99

I can hear you clearly.

1 Like

(Sasha) #100

Curious to see if a similar model for Russian exists. Articles on Wikipedia in other languages can be much more limited in quantity and length.

0 Likes

(Alex) #101

Yes, always. There are debiasing techniques…

2 Likes

(Srinivasan Venugopal) #102

Can transfer learning from wikitext be used in different domains for which wikitext has no context?

0 Likes

(Rachel Thomas) #103

There is bias in language models. I’ve talked about it some here and here (including some approaches for dealing with it).

13 Likes
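
To make the question about bias in embeddings concrete, here is a minimal sketch of one commonly cited idea: estimate a "bias direction" from a pair of words and project it out of another word's vector. This is not necessarily what the linked posts describe. The three-dimensional vectors are hypothetical toy values invented for illustration, and all function names are made up for this sketch.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def bias_direction(vec_a, vec_b):
    """Unit vector pointing from vec_b to vec_a (e.g. from 'she' to 'he')."""
    diff = [x - y for x, y in zip(vec_a, vec_b)]
    n = norm(diff)
    return [x / n for x in diff]

def bias_score(vec, direction):
    """Cosine between vec and the unit bias direction; 0 means no component along it."""
    return dot(vec, direction) / norm(vec)

def neutralize(vec, direction):
    """Remove the component of vec that lies along the bias direction."""
    scale = dot(vec, direction)
    return [x - scale * d for x, d in zip(vec, direction)]

# Hypothetical toy embeddings, invented for illustration only.
emb = {
    "he": [1.0, 0.2, 0.1],
    "she": [-1.0, 0.2, 0.1],
    "engineer": [0.4, 0.9, 0.3],
}

g = bias_direction(emb["he"], emb["she"])
before = bias_score(emb["engineer"], g)
after = bias_score(neutralize(emb["engineer"], g), g)
print(f"bias score before: {before:.3f}, after: {after:.3f}")
```

Real embeddings have hundreds of dimensions, and a single word pair is a crude estimate of the direction, so treat this as an illustration of the projection step only.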

(Francisco Ingham) #104

It exists

1 Like

(George Zhang) #105

Can Jeremy mention exactly which GPU he used for the IMDB notebook, especially since many people run out of GPU memory running it on a P4?

1 Like

(Pradeep Banavara) #106

So is the wikitext103 base model somewhat superior to other models, such as Reddit-based ones, because wikitext contains more English vocabulary?

3 Likes

(Bart Fish) #107

I believe it was a 1080Ti with 11GB.

0 Likes

(Sathya Iyer) #108

Is the idea of a pre-trained language model similar to BERT, which I came across while looking into language models?

6 Likes

(Charlie Harrington) #111

What if you want to keep the unknown words (rather than replacing them with xxx)? As long as a word is used more than once, will it be kept?

5 Likes
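
A minimal sketch of how a frequency cutoff and a size cap on the vocabulary typically interact with unknown-word replacement. The function names, the `max_vocab` and `min_freq` parameters, and the `xxunk` placeholder are assumptions chosen for illustration; this is not the library's actual implementation.

```python
from collections import Counter

UNK = "xxunk"  # assumed placeholder for out-of-vocabulary tokens

def build_vocab(tokens, max_vocab=60000, min_freq=2):
    """Keep at most max_vocab of the most frequent tokens seen at least min_freq times."""
    counts = Counter(tokens)
    kept = [tok for tok, c in counts.most_common(max_vocab) if c >= min_freq]
    return [UNK] + kept  # index 0 is reserved for unknown tokens

def numericalize(tokens, itos):
    """Map tokens to integer ids; anything outside the vocabulary becomes id 0 (UNK)."""
    stoi = {tok: i for i, tok in enumerate(itos)}
    return [stoi.get(tok, 0) for tok in tokens]

corpus = "the movie was great the plot was thin zyzzyva".split()

# Words seen only once fall below min_freq=2 and are mapped to UNK.
itos = build_vocab(corpus, max_vocab=8000, min_freq=2)
ids = numericalize(corpus, itos)
print([itos[i] for i in ids])

# Lowering min_freq to 1 keeps every word, as long as max_vocab is not exceeded.
itos_all = build_vocab(corpus, max_vocab=8000, min_freq=1)
print(len(itos_all))
```

In this sketch, raising or lowering `max_vocab` changes the vocabulary size directly, while `min_freq` decides whether rare words survive at all.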

(Rachel Thomas) #112

The core idea is that even if wikitext is not that similar to the corpus you are interested in, you can still pre-train with wikitext, and then fine tune using the corpus you’re interested in.

3 Likes
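
The pre-train-then-fine-tune idea can be sketched with a toy model. The example below swaps the neural language model for a tiny count-based bigram model, purely so that it runs without a GPU or any library; the class, the corpora, and the smoothing constant are all hypothetical and chosen for illustration.

```python
import math
from collections import defaultdict

class BigramLM:
    """Tiny count-based bigram model, standing in for a neural language model."""

    def __init__(self, vocab):
        self.vocab = set(vocab)
        self.counts = defaultdict(lambda: defaultdict(float))

    def train(self, text):
        """Add bigram counts; calling this again continues from the current state."""
        toks = text.split()
        for prev, nxt in zip(toks, toks[1:]):
            self.counts[prev][nxt] += 1.0

    def prob(self, prev, nxt, k=0.1):
        """Add-k smoothed estimate of P(nxt | prev)."""
        row = self.counts.get(prev, {})
        total = sum(row.values())
        return (row.get(nxt, 0.0) + k) / (total + k * len(self.vocab))

    def avg_neg_log_prob(self, text):
        """Average negative log-probability per bigram (lower is better)."""
        toks = text.split()
        pairs = list(zip(toks, toks[1:]))
        return -sum(math.log(self.prob(p, n)) for p, n in pairs) / len(pairs)

# Hypothetical stand-ins for a general corpus and a target-domain corpus.
general = "the cat sat on the mat . the dog sat on the rug ."
domain = "the film was great . the film was dull ."
held_out = "the film was great ."

lm = BigramLM(vocab=(general + " " + domain).split())

lm.train(general)                     # "pre-training" on the general corpus
before = lm.avg_neg_log_prob(held_out)
general_before = lm.prob("cat", "sat")

lm.train(domain)                      # "fine-tuning" on the corpus of interest
after = lm.avg_neg_log_prob(held_out)

print(f"held-out loss before fine-tuning: {before:.3f}, after: {after:.3f}")
```

The fine-tuned model fits the target text better while keeping what it learned from the general corpus, which is the intuition behind starting from wikitext even when the target domain looks different.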

(Ilia) #113

Is the fastai library available by default in Kaggle kernels? It seems they don’t allow installing custom packages in GPU kernels. There is a kernel-only Quora competition, so I wonder if it is possible to use the library there.

0 Likes

(Xiwang Li) #118

How do you change the size of the vocabulary, say from 6000 to 8000?

0 Likes

(George Zhang) #119

Can Jeremy confirm?

0 Likes