Part 2 Lesson 10 wiki

virtusman · December 3, 2018, 3:38pm

All you have to do is make sure you do:
from fastai import *

Everything you need will be imported automatically!

gandyer · December 8, 2018, 5:50pm

Can the ULMFiT pre-trained model be used for text summarization? I’ve seen it applied to text classification. Are there any examples of summarization using this pre-trained model? If so, please point me to them. Thanks.

seb0 · December 18, 2018, 12:27pm

seq2seq is not part of fastai yet (afaik).

RupeshGoud · December 24, 2018, 7:10am

I’m getting this error when running the lesson 10 notebook: ‘Tokenizer’ object has no attribute ‘proc_all_mp’.
I’ve looked at the code, and “proc_all_mp” doesn’t seem to be implemented. Has the code changed?
How can I solve this? Please help.

RupeshGoud · December 26, 2018, 7:38am

I was using the latest version of fastai before; this was resolved by downgrading to fastai 0.7.

ant3ng · January 2, 2019, 3:45am

I don’t understand the first issue: why does the range run from n_lbls+1 to len(df.columns)?
Did you manage to understand that? If so, please tell me the reason.

wyquek · January 2, 2019, 5:07am

It seems that for i in range(n_lbls+1, len(df.columns)) only does anything when you have more than one text column.

Suppose you have 4 labels, df[0] to df[3], followed by 3 text columns, df[4] to df[6]:

n_lbls=4
for i in range(n_lbls+1, len(df.columns)) becomes for i in range(5,7), which adds df[5] and df[6] to text (the first text column, df[4], is handled before the loop)

In most cases you have only 1 label followed by 1 text column, so n_lbls=1
and the loop becomes for i in range(2,2), which is empty and hence doesn’t add anything to text
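
To check the two cases above, here is a minimal sketch in plain Python. The helper name text_column_indices is made up for illustration and is not part of fastai; it also assumes, as in the notebook, that the first text column df[n_lbls] is handled before the loop.

```python
def text_column_indices(n_lbls, n_cols):
    """Hypothetical helper: which columns hold text when labels come first."""
    first = n_lbls                           # assumed to be added before the loop
    extra = list(range(n_lbls + 1, n_cols))  # the columns the loop would add
    return first, extra

print(text_column_indices(4, 7))  # 4 labels + 3 text columns -> (4, [5, 6])
print(text_column_indices(1, 2))  # 1 label + 1 text column   -> (1, [])
```

The empty list in the second case is why the loop does nothing for a plain label-plus-text file.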

ant3ng · January 2, 2019, 6:17am

I can’t think of a case where there is more than one label.
But your reply helps me understand that this code can be applied not just to IMDb but to other cases as well.
Thanks!

fadiaaniket · January 4, 2019, 11:29am

I am getting the same error “ValueError: not enough values to unpack (expected 2, got 1)”. Did you find any solution for this error?

pradla · January 20, 2019, 7:15pm

Has anyone tried making predictions on a large dataset? I have a 100 GB dataset with 1 billion rows and I want the fastai model to predict the sentiment of each row. Although I have a server and the dataset would probably fit into RAM, the kernel keeps dying when I use the fastai dataloader.

Another thing I tried was using Parquet files. However, I am unable to generate an iterator from Parquet files.

Any suggestions on how to proceed? Any advice is good advice.

dana · January 22, 2019, 12:14am

Have you found anything for text summarization? Thanks.

mmcki · February 17, 2019, 5:05pm

Has anyone gotten Colab to run the lesson 10 IMDb notebook without crashing from RAM issues?

I know the solution involves doing batches and reducing the loaded RAM, but I’m not sure how to implement it and just wanted to play with NLP more than debug and fix the data loader.
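
For reference, the batching idea can be sketched roughly as below. This is only a generic standard-library pattern, not the fastai data loader: iter_batches and process_batch are hypothetical names, and the in-memory CSV stands in for the real file.

```python
import csv
import io
from itertools import islice

def iter_batches(rows, batch_size):
    """Yield lists of at most batch_size items from any iterable."""
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

def process_batch(texts):
    # Hypothetical stand-in for the real work (tokenizing, predicting, ...).
    return [len(t.split()) for t in texts]

# Small in-memory CSV for illustration; with a real file, pass the open
# file handle to csv.reader so rows are read lazily from disk.
sample = io.StringIO("a good film\nbad\nvery good plot indeed\n")
rows = (row[0] for row in csv.reader(sample))

token_counts = []
for batch in iter_batches(rows, batch_size=2):
    token_counts.extend(process_batch(batch))
print(token_counts)
```

Only one batch of raw text is held at a time; for a really large input you would write each batch's results to disk instead of collecting them in a list.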

nickyeolk · February 18, 2019, 1:23pm

I had a similar issue. Basically, the save-and-load functions of fast.ai were somewhat broken, so I was working with text I had loaded several cells earlier. It was loaded using data_lm = TextDataBunch.from_csv(path, 'texts.csv'), which eventually caused my GPU to run out of memory. Changing the code to data_lm = TextLMDataBunch.from_csv(path, 'for_lm.csv') instead seemed to do the trick, and I was able to proceed with the notebook.

nickyeolk · February 18, 2019, 1:34pm

I did it by loading the WikiText model, i.e. learn = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.3), and then skipping to the language generation cell without running the model training cell. The predictions look very much like Wikipedia text.

dreambeats · March 3, 2019, 1:33pm

I hope this isn’t a double post, but the highlighted link leads to an ‘Object not found’ error on the Stanford website. Does anyone know where it is supposed to lead?

cahya · March 26, 2019, 10:48am

I think this is the link: http://web.stanford.edu/class/cs224n/

nhada · August 27, 2019, 11:04am

Can we train this model to do Hindi-English transliteration?

fredguth · September 19, 2019, 6:25pm

Just to add my 2¢ here: you can think of word2vec as a one-layer network that you use as a feature extractor. With multiple layers, it would work better.