Lesson 4 In-Class Discussion ✅

chans.best · November 17, 2018, 3:22am

I was thinking about the language model and how it is able to predict the next word. An idea that struck me: would it be possible to get a score for a sentence out of the model, for use in sentence comparison?

Ideally:
sentence[w1…wn] -> language model -> wn+1
and
sentence[w1…wn] -> language model -> classifier + sigmoid -> 0,1

Could it be something like:
sentence[w1…wn] -> language model + ??? -> sentence representation[1212,1521515,0212,451]
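
One way the "???" step could be sketched, purely as an assumption on my part and not something the course confirms: pool the per-token hidden states of the language model into one fixed-size vector (last state, max over tokens, mean over tokens, concatenated) and compare two sentences by cosine similarity. The hidden states below are made-up numbers, and `concat_pool` and `cosine_similarity` are hypothetical helper names. In practice the vectors would come from the trained encoder.

```python
import math

def concat_pool(hidden_states):
    """Collapse per-token vectors (n_tokens x dim) into one vector: [last, max, mean]."""
    if not hidden_states:
        raise ValueError("need at least one token vector")
    dim = len(hidden_states[0])
    n = len(hidden_states)
    last = list(hidden_states[-1])
    max_pool = [max(h[i] for h in hidden_states) for i in range(dim)]
    mean_pool = [sum(h[i] for h in hidden_states) / n for i in range(dim)]
    return last + max_pool + mean_pool

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up 2-dimensional hidden states for two short sentences (illustrative only).
sent_a = [[0.1, 0.9], [0.4, 0.2], [0.7, 0.5]]
sent_b = [[0.2, 0.8], [0.6, 0.4]]

vec_a = concat_pool(sent_a)   # fixed length 3 * dim, whatever the sentence length
vec_b = concat_pool(sent_b)
score = cosine_similarity(vec_a, vec_b)
print(len(vec_a), len(vec_b), score)
```

The point of the pooling is that sentences of different lengths end up as vectors of the same size, so they can be compared directly. Whether such a score is actually meaningful for sentence comparison is an open question here.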

I know this is an advanced topic, and I found the link below in the advanced forum, but I would like advanced users to share ideas about it in

MaheshKhatri · November 17, 2018, 4:11am

The same steps will need to be done in neural nets too.

MaheshKhatri · November 17, 2018, 4:27am

Yes. Of course.

MaheshKhatri · November 17, 2018, 4:47am

https://pytorch.org/docs/0.3.1/torch.html

lesscomfortable · November 17, 2018, 5:54am

I don’t know what this might be. I assume you ran .fit already?

ymittal23 · November 17, 2018, 6:54am

So it has a hidden meaning. That will be helpful when writing code, thanks for clearing that up.

edwardjross · November 17, 2018, 9:11am

I’ve run it successfully on 16GB cards (P5000 in Paperspace and a P100 in GCP) on the cloud as is.

Have you tried decreasing bptt on the learner? This helped me in an earlier version of the course. Good luck.
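
A rough sketch of why this can help, assuming the problem is GPU memory: the activations kept for backpropagation grow roughly in proportion to bs × bptt, so halving bptt roughly halves that part of the footprint. The values of bs and bptt below are illustrative, not taken from anyone's setup, and `tokens_per_batch` is a hypothetical helper.

```python
def tokens_per_batch(bs, bptt):
    """Number of tokens the RNN processes (and keeps activations for) in one batch."""
    return bs * bptt

baseline = tokens_per_batch(bs=48, bptt=70)   # illustrative values
reduced = tokens_per_batch(bs=48, bptt=35)    # same batch size, half the sequence length
ratio = reduced / baseline
print(baseline, reduced, ratio)
```

Weights and optimizer state do not shrink with bptt, so the total saving is smaller than this ratio suggests.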

sgugger · November 17, 2018, 2:47pm

Pull the latest version of the course notebooks. TextFilesList has now disappeared and we always use TextList.

kofi · November 17, 2018, 3:11pm

Yes, I did.

jcatanza · November 17, 2018, 6:18pm

Thanks @sgugger

I did refresh the repo before running the notebook, running
git pull in the /notebooks/course-v3 folder, and
pip install fastai --upgrade

Is this what you mean? If not, what do you mean?

balnazzar · November 17, 2018, 10:48pm

Try using them as a single 22 GB card, with DataParallel.

FourMoBro · November 18, 2018, 1:03am

DataParallel is one of the first things I add to the notebook. I have had great success with it for images/camvid, but I am afraid it does not work for NLP. I noted this before, perhaps in a different thread.

Here is the error it throws on 1.0.27, which was the same for previous versions:

~/anaconda3/envs/course1018/lib/python3.6/site-packages/torch/nn/modules/module.py in __getattr__(self, name)
    516                 return modules[name]
    517         raise AttributeError("'{}' object has no attribute '{}'".format(
--> 518             type(self).__name__, name))
    519 
    520     def __setattr__(self, name, value):

AttributeError: 'DataParallel' object has no attribute 'reset'
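
My reading of the error, sketched with toy classes rather than real PyTorch: torch.nn.DataParallel keeps the wrapped model in its .module attribute and does not forward custom methods such as reset(), so code that calls model.reset() on the wrapper fails. `ToyRNN` and `ToyWrapper` below are stand-ins I made up to show the mechanism. They are not the fastai or PyTorch classes.

```python
class ToyRNN:
    """Stand-in for a model that defines an extra method, like reset()."""
    def __init__(self):
        self.hidden = [1.0, 2.0]

    def reset(self):
        self.hidden = [0.0, 0.0]

class ToyWrapper:
    """Stand-in for a wrapper that stores the model as .module and forwards nothing else."""
    def __init__(self, module):
        self.module = module

model = ToyWrapper(ToyRNN())

try:
    model.reset()              # the wrapper itself has no reset()
    wrapper_has_reset = True
except AttributeError as err:
    wrapper_has_reset = False
    print("wrapper:", err)

model.module.reset()           # the wrapped model still has it
print("wrapped model hidden state:", model.module.hidden)
```

Reaching through .module works in this toy case. Whether that alone is enough for the NLP training loop, I have not verified.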

nikhil.ikhar · November 18, 2018, 4:10am

Yes. When people say AI is biased, it also means the data on which the AI was trained is biased.

balnazzar · November 18, 2018, 10:29am

I think fastai has nothing to do with this: it is PyTorch stuff.

However, it could be worth asking the developers (Jeremy, Sgugger, etc.) about such issues. AWD-LSTM is truly a beast of an RNN, and it would be a shame not to use parallelization: lacking it could completely hinder its usage on non-enterprise hardware.

bluesky314 · November 18, 2018, 10:33am

When I run

data_clas = (TextList.from_folder(path, vocab=data_lm.vocab) # vocab is passed in from our pretrained model so that the numericalization of the same words is exactly the same
         #grab all the text files in path
         .split_by_folder(valid='test')
         #split by train and valid folder (that only keeps 'train' and 'test' so no need to filter)
         .label_from_folder(classes=['neg', 'pos'])
         #remove docs with labels not in above list (i.e. 'unsup')
         .filter_missing_y()
         #label them all with their folders
         .databunch(bs=bs))

data_clas.save('tmp_clas')

I get

TypeError                                 Traceback (most recent call last)
<ipython-input-25-ef1d6c6e4867> in <module>
  3              .split_by_folder(valid='test')
  4              #split by train and valid folder (that only keeps 'train' and 'test' so no need to filter)
----> 5              .label_from_folder(classes=['neg', 'pos'])
  6              #remove docs with labels not in above list (i.e. 'unsup')
  7              .filter_missing_y()

TypeError: 'bool' object is not callable

It loads for a bit then throws this. Any fix? I am using the latest version.

luffylucky · November 18, 2018, 11:38am

You should update the fastai library with conda: conda update -c fastai fastai.

pksahu · November 18, 2018, 12:22pm

What’s the difference between the fastai.tabular and fastai.structured libraries?

adpostma · November 18, 2018, 4:33pm

The collab notebook returns an error on the learner in the “use_nn” part. Replacing (min_score=0., max_score=5.) with (y_range=(0.,5.)) repairs this error (see also the docs on collab_learner).
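
For anyone wondering what y_range does: as far as I understand it, the raw model output is squashed with a sigmoid and rescaled to lie between the two bounds. A minimal sketch of that idea in plain Python, where `scale_to_range` is a hypothetical name of mine and not the library function:

```python
import math

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def scale_to_range(raw, y_range):
    """Map an unbounded raw score into the open interval (low, high)."""
    low, high = y_range
    return sigmoid(raw) * (high - low) + low

for raw in (-10.0, 0.0, 10.0):
    print(raw, scale_to_range(raw, (0., 5.)))
```

A raw score of 0 lands in the middle of the range, and large positive or negative scores approach the bounds without ever reaching them.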

jcatanza · November 18, 2018, 6:19pm

Thanks for your reply @luffylucky

I’m confused about these two different methods of updating the fastai library:

pip install fastai --upgrade

and

conda update -c fastai fastai

Are the methods equivalent, or are there circumstances in which one is to be preferred over the other? Or should we use both?

I’d appreciate any light you (or anyone else) could shed on this matter.

luffylucky · November 18, 2018, 6:40pm

You can use both of them, it’s OK. But since the library’s source code changes quickly, the version on PyPI (installed with pip) is sometimes not updated as fast, so when you use pip, maybe nothing happens.
You can follow all the released versions for pip on the site pypi/fastai, and for conda here: conda/fastai
As advice, I think the conda way is preferred!
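
To see which version actually ended up installed after either command, something like the sketch below could be run in the notebook. It assumes Python 3.8 or newer, because importlib.metadata is not in older standard libraries, and `installed_version` is a hypothetical helper name.

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

fastai_version = installed_version("fastai")
print("fastai:", fastai_version or "not installed")
```

Comparing that number before and after an update shows whether the update really changed anything.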