Fastai v2 chat

fmobrj75 · April 20, 2020, 4:54pm

@sgugger It seems to me that it conflicts with splits. When I create a dataset without splits, the wgts seem to work. Otherwise it complains about the sizes of the parameters (weighted_dataloaders):

ValueError: 'a' and 'p' must have same size

sgugger · April 20, 2020, 4:57pm

I did not design that particular API, so there is no need to tag me. I have no more knowledge of what is causing your error than you do.

fmobrj75 · April 20, 2020, 5:00pm

I did not know that. Sorry.

fmobrj75 · April 20, 2020, 5:14pm

Hi Boris.

Do you know how the weights must be passed to the weighted_dataloaders? When I do not have splits, it works fine with an array of weights. But when I have splits, I get the following error:

ValueError: 'a' and 'p' must have same size

boris · April 20, 2020, 5:34pm

They should be passed only to the training dataset. They won’t be used on the validation dataset (which will process all the items it has).
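A minimal sketch of that point, in plain Python rather than fastai: the weights array has to line up with the training items only, so it is sliced with the training indices before sampling. The helper split_indices and the weight values are hypothetical, and the sketch does not call weighted_dataloaders itself. The 'a' and 'p' wording in the error above is what numpy.random.choice reports when the population and the probabilities differ in length, which is what happens when weights for the full dataset meet a smaller training split.

```python
import random

def split_indices(n_items, valid_pct=0.2, seed=42):
    """Hypothetical random splitter: returns (train_idxs, valid_idxs)."""
    rng = random.Random(seed)
    idxs = list(range(n_items))
    rng.shuffle(idxs)
    cut = int(n_items * valid_pct)
    return idxs[cut:], idxs[:cut]

# One weight per item of the FULL dataset (made-up values).
all_wgts = [1.0, 1.0, 5.0, 1.0, 5.0, 1.0, 1.0, 5.0, 1.0, 1.0]

train_idxs, valid_idxs = split_indices(len(all_wgts))

# Keep only the weights of the training items, in the same order.
train_wgts = [all_wgts[i] for i in train_idxs]

# Normalise so the weights form a probability distribution.
total = sum(train_wgts)
train_probs = [w / total for w in train_wgts]

# Weighted sampling (with replacement) over training items only.
sampled = random.Random(0).choices(train_idxs, weights=train_probs, k=len(train_idxs))
```

The validation indices never take part in the sampling, which matches the reply above: the validation set simply processes all of its items.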

fmobrj75 · April 20, 2020, 5:37pm

Oh, ok! Thank you very much!

WaterKnight · April 20, 2020, 7:34pm

Thank you for fixing them. I also found some duplicated imports and opened a pull request.

My solution didn’t work. Sorry, please disregard what I said!

adpq · April 21, 2020, 10:37pm

I’m trying to run the fastbook notebooks. I installed fastai2 and fastcore as editable installs, but I get the following error when I try to run the first line of the notebook:

ModuleNotFoundError: No module named 'fastai2'

How can I get fastai2 to work in these notebooks?

paidion · April 22, 2020, 7:25am

How about running !pip install fastai2?

zlapp · April 23, 2020, 4:00pm

Is the Monte Carlo Dropout functionality introduced in the PR "Ability to use dropout at prediction time (Monte Carlo Dropout)" available in fastai v2?

much_learner · April 23, 2020, 7:31pm

What’s the point of having a size argument in both Resize in item_tfms and aug_transforms in batch_tfms?

eniolasonowo · April 23, 2020, 11:23pm

Good day, I’d like to ask whether you had success with it using fastai. I am stuck at training: after passing it to a learner and trying to train, I get a "no target" error.

vijayabhaskar · April 24, 2020, 1:47pm

Can someone help me with this problem?

dls_clas = DataBlock(
    blocks=[TextBlock.from_df(text_cols=["reviews"],vocab=dls_lm.vocab),RegressionBlock(n_out=1)],
    get_x    = ColReader("reviews"),
    get_y    = ColReader("sentiment"),
    splitter = RandomSplitter(0.1)
).dataloaders(traindf, bs=128, seq_len=80)

This throws:

AttributeError: 'Series' object has no attribute 'reviews'

I’m confused about how TextBlock.from_df(text_cols=["reviews"],vocab=dls_lm.vocab) works. Any help is appreciated.

danjjohns · April 24, 2020, 5:56pm

The easiest solution is to create df["text"], which contains your reviews/text data. I’m not sure why, but I do know this fixed my issue. Then change get_x to "text" and text_cols to "text".

vijayabhaskar · April 24, 2020, 6:17pm

Sorry, I don’t quite understand. Do you mean get_x="reviews" instead of using ColReader? And how do I get the "sentiment"?

boris · April 24, 2020, 6:22pm

When you use item_tfms, the resize is done on each file independently. This is needed so that all items have the same size and can be batched and loaded on the GPU.
After that, you can do additional transforms on the GPU through batch_tfms, which are faster, such as scaling/rotating and resizing again to a smaller size.
In that case you would have resized to a larger size in item_tfms so that your other transforms are more accurate and retain more detail.
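The two-stage idea can be sketched without fastai, using nested lists as stand-in images. This is only an illustration under stated assumptions: resize_nearest and center_crop_batch are hypothetical helpers, not fastai functions, and a plain centre crop stands in for the richer batch-level augmentations. In fastai terms, the first step plays the role of Resize in item_tfms and the second the role of the size argument of aug_transforms in batch_tfms.

```python
def resize_nearest(img, size):
    """Resize a 2D list-of-lists image to size x size (nearest-neighbour)."""
    h, w = len(img), len(img[0])
    return [[img[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]

def center_crop_batch(batch, size):
    """Crop every image of an already-uniform batch to a centred size x size window."""
    full = len(batch[0])
    off = (full - size) // 2
    return [[row[off:off + size] for row in img[off:off + size]] for img in batch]

# Three "images" with different shapes, as they might come off disk.
images = [
    [[1] * 6 for _ in range(4)],   # 4 rows x 6 columns
    [[2] * 5 for _ in range(8)],   # 8 rows x 5 columns
    [[3] * 7 for _ in range(7)],   # 7 rows x 7 columns
]

# Item step: each image is resized on its own to a common, larger size,
# so that all of them can be stacked into one batch.
batch = [resize_nearest(img, 8) for img in images]

# Batch step: a single operation over the whole batch, down to the final size.
final = center_crop_batch(batch, 4)
```

Only after the item step do all images share one shape, which is what makes a single batched operation possible.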

DrC · April 24, 2020, 6:31pm

How would you go about calling functions in utils.py, like search_images_bing (without copying it to the local dir)?

pip install utils doesn’t do it; that’s a different PyPI package.

sgugger · April 24, 2020, 8:13pm

The tokenizer will read the texts in “reviews” and tokenize them, but the result will be in a column called “texts” (unless you pass an argument to change that, should be something like output_col). So your get_x should use the column named “texts”.

sgugger · April 24, 2020, 8:14pm

It’s not a package, just a utils file. You need to have it in the same directory as the notebook you are working on.
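For anyone who would rather not copy the file, standard Python can load a module from an explicit path with importlib. This is a general Python sketch, not something suggested in the thread: the helper load_module_from_path, the module name book_utils, and the toy greet function are all hypothetical, and a temporary file stands in for the real utils.py.

```python
import importlib.util
import tempfile
from pathlib import Path

def load_module_from_path(path, name="book_utils"):
    """Load a Python source file as a module without copying it anywhere."""
    spec = importlib.util.spec_from_file_location(name, str(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

# A temporary file stands in for the real utils.py kept in another directory.
with tempfile.TemporaryDirectory() as tmp:
    utils_path = Path(tmp) / "utils.py"
    utils_path.write_text("def greet(name):\n    return 'hello ' + name\n")

    utils = load_module_from_path(utils_path)
    result = utils.greet("fastai")
```

Keeping the file next to the notebook, as described above, remains the simpler option; this route only helps when the file has to stay where it is.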

danjjohns · April 24, 2020, 8:57pm

Sorry, that wasn’t clear. Use get_x=ColReader("text") and get_y=ColReader("label"). After it is tokenized, the text and label column names are changed to text and label.
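The renaming described in the replies above can be sketched with a toy tokenizer. This is not fastai's implementation: tokenize_rows and col_reader are hypothetical stand-ins, rows are plain dicts rather than a DataFrame, and the output column name "text" is an assumption taken from the reply above. It only shows why reading the original column name fails once tokenization has moved the data to a new column.

```python
def tokenize_rows(rows, text_col, output_col="text"):
    """Toy tokenizer: drop text_col and store its whitespace-split tokens in output_col."""
    out = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != text_col}
        new_row[output_col] = row[text_col].split()
        out.append(new_row)
    return out

def col_reader(col):
    """Hypothetical column getter: returns a function that reads one column of a row."""
    return lambda row: row[col]

rows = [{"reviews": "great movie", "sentiment": 0.9}]
tokenized = tokenize_rows(rows, text_col="reviews")

# Reading the original column name fails after tokenization.
# (A dict raises KeyError here; the thread's error was an AttributeError on a Series.)
try:
    col_reader("reviews")(tokenized[0])
    old_name_works = True
except KeyError:
    old_name_works = False

# Reading the output column works, and untouched columns keep their names.
tokens = col_reader("text")(tokenized[0])
target = col_reader("sentiment")(tokenized[0])
```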