Fastai v2 chat

@sgugger It seems to me that it conflicts with splits. When I create a dataset without splits, the wgts seem to work. Otherwise it complains about the sizes of the parameters (weighted_dataloaders):

ValueError: 'a' and 'p' must have same size

I have not designed that particular API, so there is no need to tag me. I have no more knowledge of what is causing your error than you do :wink:

I did not know that. Sorry.

Hi Boris.

Do you know how the weights must be passed to the weighted_dataloaders? When I do not have splits, it works fine with an array of weights. But when I have splits, I get the following error:

ValueError: 'a' and 'p' must have same size

They should be passed only to the training dataset. They won’t be used on the validation dataset (which will process all the items it has).
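For what it's worth, the message 'a' and 'p' must have same size comes from NumPy's random choice, which raises it when the number of probabilities differs from the number of items being sampled. That fits the advice above: the weights should cover the training items only. Here is a minimal pure-Python sketch of the idea, using random.choices as a stand-in for the real sampler. The names items, weights, train_idx and valid_idx are made-up illustration data, not fastai API.

```python
import random

# Made-up data: 10 items and one weight per item of the full dataset.
items = list(range(10))
weights = [1.0 if i % 2 == 0 else 3.0 for i in items]

# Made-up split (a splitter would normally produce these indices).
train_idx = [0, 1, 2, 3, 4, 5, 6, 7]
valid_idx = [8, 9]

train_items = [items[i] for i in train_idx]
rng = random.Random(0)

# Passing the weights of the full dataset fails: 10 weights for 8 items.
try:
    rng.choices(train_items, weights=weights, k=5)
    mismatch_error = None
except ValueError as err:
    mismatch_error = err

# Keep only the weights of the training items so the sizes match.
train_wgts = [weights[i] for i in train_idx]
sample = rng.choices(train_items, weights=train_wgts, k=5)
```

The validation items need no weights at all, since every one of them is processed once.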

Oh, ok! Thank you very much!

Thank you for fixing them. I also found some duplicated imports, so I opened a pull request.

My solution didn’t work. Sorry, forget what I said! :zipper_mouth_face:

I’m trying to run the fastbook notebooks. I installed fastai2 and fastcore as editable installs, but I get the following error when I try to run the first line of the notebook:

ModuleNotFoundError: No module named 'fastai2'

How can I get fastai2 to work in these notebooks?

How about running !pip install fastai2?
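A ModuleNotFoundError after an editable install often means the notebook kernel runs a different Python environment from the one where the package was installed. Here is a small sketch to check this from inside the notebook, using only the standard library. The helper name is_installed is made up for illustration.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if the current interpreter can find the top-level module."""
    return importlib.util.find_spec(module_name) is not None


# The interpreter the notebook kernel is actually running.
print("kernel python:", sys.executable)
print("fastai2 importable:", is_installed("fastai2"))
```

If the second line prints False, installing with that same interpreter (for example !{sys.executable} -m pip install fastai2 in a notebook cell) targets the environment the kernel really uses.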

Is the Monte Carlo Dropout functionality introduced in the PR "Ability to use dropout at prediction time (Monte Carlo Dropout)" available in fastai v2?

What’s the point of having a size argument in both Resize in item_tfms and aug_transforms in batch_tfms?

Good day, I’d like to ask whether you had any success with it using fastai. I am stuck at training: after passing it to a learner and trying to train, I get a no target error.

Can someone help me with this problem?

dls_clas = DataBlock(
    blocks=[TextBlock.from_df(text_cols=["reviews"],vocab=dls_lm.vocab),RegressionBlock(n_out=1)],
    get_x    = ColReader("reviews"),
    get_y    = ColReader("sentiment"),
    splitter = RandomSplitter(0.1)
).dataloaders(traindf, bs=128, seq_len=80)

This throws:

AttributeError: 'Series' object has no attribute 'reviews'

I’m confused about how TextBlock.from_df(text_cols=["reviews"],vocab=dls_lm.vocab) works. Any help is appreciated.

The easiest solution is to create df["text"], which contains your reviews/text data. I’m not sure why, but I do know this fixed my issue. Then change get_x to "text" and text_cols to "text".

Sorry, I don’t quite understand. Do you mean get_x="reviews" instead of using ColReader? And how do I get the "sentiment"?

When you use item_tfms, the resize is done on each file independently. This is needed so that all images have the same size and can be collated into a batch and loaded onto the GPU.
After that, batch_tfms lets you apply additional transforms on the GPU, where they are faster, such as scaling, rotating, and resizing again to a smaller size.
In that case you would resize to a larger size in item_tfms so that the later transforms are more accurate and retain more detail.
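To make the two stages concrete, here is a toy pure-Python sketch. It does not use fastai. The helper resize_nearest and the sizes 8 and 4 are made up for illustration, with nested lists standing in for images.

```python
def make_image(height, width):
    """Build a fake grayscale image as a list of rows."""
    return [[r * width + c for c in range(width)] for r in range(height)]


def resize_nearest(img, size):
    """Nearest-neighbour resize of a 2D list to size x size."""
    h, w = len(img), len(img[0])
    return [[img[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


# Source files come in different shapes, so they cannot be stacked as-is.
images = [make_image(6, 8), make_image(10, 5), make_image(7, 7)]

# Item-level step (the role of Resize in item_tfms): one image at a time,
# to a common and fairly large size.
item_size = 8
batch = [resize_nearest(img, item_size) for img in images]

# Batch-level step (the role of size in aug_transforms): the whole batch is
# already uniform, so it can be brought down to the final, smaller size.
final_size = 4
small_batch = [resize_nearest(img, final_size) for img in batch]

batch_shapes = {(len(img), len(img[0])) for img in batch}
small_shapes = {(len(img), len(img[0])) for img in small_batch}
```

In the real library the second step runs on the GPU for the whole batch at once. The loop here only mimics the order of operations.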

How would you go about calling functions from utils.py, like search_images_bing, without copying the file to the local directory?

pip install utils doesn’t help; that is a different PyPI package.

The tokenizer will read the texts in "reviews" and tokenize them, but the result will be in a column called "text" (unless you pass an argument to change that, should be something like output_col). So your get_x should use the column named "text".
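Here is a toy sketch of why reading "reviews" fails after tokenization. It is not fastai's implementation. The function tokenize_rows, its out_col parameter and the sample rows are all made up, and plain dicts stand in for DataFrame rows.

```python
def tokenize_rows(rows, text_col, out_col="text"):
    """Replace text_col with a tokenized column named out_col (assumed name)."""
    tokenized = []
    for row in rows:
        # Every other column is carried over unchanged.
        new_row = {k: v for k, v in row.items() if k != text_col}
        new_row[out_col] = row[text_col].lower().split()
        tokenized.append(new_row)
    return tokenized


rows = [
    {"reviews": "Great movie", "sentiment": 0.9},
    {"reviews": "Not my thing", "sentiment": 0.2},
]
tok_rows = tokenize_rows(rows, text_col="reviews")

# The getter for x has to read the output column, not the original one.
xs = [row["text"] for row in tok_rows]
ys = [row["sentiment"] for row in tok_rows]
has_reviews = "reviews" in tok_rows[0]
```

In this sketch the original column is gone once the rows are tokenized, which mirrors the AttributeError about 'reviews' quoted earlier.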

It’s not a package, just a utils file. You need to have it in the same directory as the notebook you are working on.
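If copying the file is not an option, one general Python workaround (not specific to fastai) is to add the folder that holds it to sys.path before importing. The sketch below is self-contained: it writes a throwaway module named shared_helpers_demo, a made-up name, into a temporary folder standing in for the folder that contains utils.py.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Stand-in for the folder that holds the shared file.
shared_dir = Path(tempfile.mkdtemp())
(shared_dir / "shared_helpers_demo.py").write_text(
    "def greet(name):\n    return f'hello {name}'\n"
)

# Make the folder importable without copying the file next to the notebook.
if str(shared_dir) not in sys.path:
    sys.path.append(str(shared_dir))
importlib.invalidate_caches()

shared_helpers_demo = importlib.import_module("shared_helpers_demo")
message = shared_helpers_demo.greet("fastai")
```

With the real file, the same idea would be to append the folder containing utils.py to sys.path and then run from utils import search_images_bing.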

Sorry, that wasn’t clear. Use get_x=ColReader("text") and get_y=ColReader("label"). After it is tokenized, the text column and label column names are changed to text and label.
