Part 2 Lesson 10 wiki

(Luca) #553

All you have to do is make sure you do:
from fastai import *

everything you need will be imported automatically!


(Subash Gandyer) #554

Can the ULMFiT pre-trained model be used for text summarization? I’ve seen it used for text classification. Are there any examples of summarization using this pre-trained model? If so, please point me to them. Thanks.


(Sebastian Fleck) #555

seq2seq is not part of fastai yet (afaik).


(Rupesh Goud) #556

I’m getting this error when running the lesson 10 notebook: ‘Tokenizer’ object has no attribute ‘proc_all_mp’.
I’ve looked at the code and “proc_all_mp” doesn’t seem to be implemented. Has the code been changed?
How can I solve this? Please help.


(Rupesh Goud) #557

I was using the latest version of fastai before; the error went away after downgrading to fastai 0.7.


(Tomoaki Ando) #558

I don’t understand the first issue: why does the range run from n_lbls+1 to len(df.columns)?
Did you manage to work that out? If so, please tell me the reason.


(魏璎珞) #559

It seems that for i in range(n_lbls+1, len(df.columns)) only does anything when you have more than one text column.

Suppose you have 4 labels, df[0] to df[3], followed by 3 text columns, df[4] to df[6]:

for i in range(n_lbls+1, len(df.columns)) becomes for i in range(5,7), which adds df[5] and df[6] to the text (df[4], the first text column, is added before the loop).

In most cases you have only 1 label followed by 1 text column, so n_lbls=1 and
the range becomes for i in range(2,2), which is empty and adds nothing to the text.
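
The two cases above can be checked with a minimal sketch in plain Python. The helper name extra_text_columns and its n_cols parameter (standing in for len(df.columns)) are made up for illustration; they are not part of the notebook.

```python
def extra_text_columns(n_lbls, n_cols):
    """Column indices the loop visits: every text column after the first one.

    Hypothetical helper; n_cols plays the role of len(df.columns).
    """
    return list(range(n_lbls + 1, n_cols))

# 4 label columns (0-3) followed by 3 text columns (4-6): 7 columns in total.
print(extra_text_columns(n_lbls=4, n_cols=7))  # [5, 6]

# 1 label column followed by 1 text column: the range is empty.
print(extra_text_columns(n_lbls=1, n_cols=2))  # []
```

So the loop only kicks in from the second text column onwards.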


(Tomoaki Ando) #560

I couldn’t think of a case with more than one label.
But your reply helped me understand that this code can be applied not just to IMDB but to other cases as well.


(Aniket Fadia) #561

I am getting the same error “ValueError: not enough values to unpack (expected 2, got 1)”. Did you find any solution for this error?


(pradla) #562

Has anyone tried making predictions on a large dataset? I have a 100 GB dataset with 1 billion rows, and I want the fastai model to predict the sentiment of each row. Although I have a server and the dataset would probably fit into RAM, the kernel keeps dying when I use the fastai dataloader.

Another thing I tried is using parquet files. However, I am unable to generate an iterator from parquet files.

Any suggestions on how to proceed? Any advice is good advice :slight_smile:


(dana) #563

Have you found anything for text summarization? Thanks.


(Maxwell McKinnon) #564

Has anyone gotten Colab to run the lesson 10 imdb notebook without crashing from RAM issues?

I know the solution involves processing in batches and reducing the amount of data loaded into RAM, but I’m not sure how to implement it, and I wanted to play with NLP rather than debug and fix the data loader.



I had a similar issue. Basically, the save-and-load functions were somewhat broken, so I was working with text I had loaded several cells up. That text was loaded using data_lm = TextDataBunch.from_csv(path, 'texts.csv'), which eventually caused my GPU to run out of memory. Changing the code to data_lm = TextLMDataBunch.from_csv(path, 'for_lm.csv') seemed to do the trick, and I was able to proceed with the notebook.



I did it by loading the wikitext model, i.e. learn = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.3), and then skipping to the language generation cell without running the model training cell. The predictions look very much like Wikipedia text.


(JamesT) #567

I hope this isn’t a double post, but the highlighted link leads to an ‘Object not found’ error on the Stanford website. Does anyone know where it is actually supposed to lead?


(Cahya) #568

I think this is the link
