Fastai v2 chat

They should be passed only to the training dataset. They won’t be used on the validation dataset (which will process all the items it has).

1 Like

Oh, ok! Thank you very much!

Thank you for fixing them, I found some imports being duplicated too. I opened a pull request.

My solution didn’t work. Sorry, pretend I didn’t say anything! :zipper_mouth_face:

I’m trying to run the fastbook notebooks. I installed fastai2 and fastcore as editable installs, but I get the following error when I try to run the first line of the notebook:

ModuleNotFoundError: No module named 'fastai2'

How can I get fastai2 to work in these notebooks?

How about !pip install fastai2?

Is the Monte Carlo Dropout functionality introduced in the PR “Ability to use dropout at prediction time (Monte Carlo Dropout)” available in fastai v2?
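I don’t know the status of that PR in v2, but since fastai models are plain PyTorch modules, you can get Monte Carlo Dropout behaviour by hand: put only the dropout layers back in train mode at inference and run several stochastic forward passes. A minimal PyTorch sketch (the toy model is made up for illustration):

```python
import torch
import torch.nn as nn

def enable_mc_dropout(model: nn.Module) -> None:
    """Put only the dropout layers in train mode so they stay active at inference."""
    for m in model.modules():
        if isinstance(m, nn.Dropout):
            m.train()

# Hypothetical toy model, just for illustration.
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Dropout(0.5), nn.Linear(16, 2))

model.eval()              # normal eval: dropout disabled, batchnorm etc. frozen
enable_mc_dropout(model)  # re-enable dropout for Monte Carlo sampling

x = torch.randn(1, 4)
with torch.no_grad():
    # Multiple stochastic forward passes give a distribution over predictions.
    samples = torch.stack([model(x) for _ in range(20)])
mean, std = samples.mean(0), samples.std(0)  # std is a rough uncertainty estimate
```

The same trick should work on a fastai model since `learn.model` is an `nn.Module`.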

What’s the point of having a size argument in both Resize in item_tfms and aug_transforms in batch_tfms?

Good day, I’d like to ask if you had success with it using fastai. I’m stuck at training: after passing it to a learner and trying to train, I get a “no target” error.

Can someone help me with this problem?

dls_clas = DataBlock(
    blocks=[TextBlock.from_df(text_cols=["reviews"],vocab=dls_lm.vocab),RegressionBlock(n_out=1)],
    get_x    = ColReader("reviews"),
    get_y    = ColReader("sentiment"),
    splitter = RandomSplitter(0.1)
).dataloaders(traindf, bs=128, seq_len=80)

This throws:

AttributeError: 'Series' object has no attribute 'reviews'

I’m confused about how TextBlock.from_df(text_cols=["reviews"], vocab=dls_lm.vocab) works. Any help is appreciated.

The easiest solution is to create df["text"] containing your reviews/text data. I’m not sure why, but I do know this fixed my issue. Then change get_x to "text" and text_cols to "text".

Sorry, I don’t understand properly. Do you mean get_x="reviews" instead of using ColReader? And how do I get the "sentiment"?

When you use item_tfms, the resize is done on each file independently. This is needed so all the images have the same size and can be collated into a batch and loaded on the GPU.
After that, you can apply additional transforms through batch_tfms, such as scaling/rotating and resizing again to a smaller size; these run on the GPU and are faster.
In that case you would have resized to a larger size in item_tfms so that your other transforms are more accurate and preserve more detail.

1 Like

How would you go about calling functions in utils.py, like search_images_bing, without copying it to the local dir?

pip install utils doesn’t do it; that’s a different PyPI package.

The tokenizer will read the texts in “reviews” and tokenize them, but the result will be in a column called “texts” (unless you pass an argument to change that, should be something like output_col). So your get_x should use the column named “texts”.

1 Like

It’s not a package, just a utils file. You need to have it in the same directory as the notebook you are working in.

Sorry, that wasn’t clear. get_x=ColReader("text"), get_y=ColReader("label"). After tokenization, the text column and label column are named text and label.

1 Like

Empty images
I’m setting up to run a vision model on some images that have been annotated with coco-json style annotations. Many of the training images have no objects of interest in them, therefore no bounding boxes.

I’m not sure how to properly annotate ‘empty’ images. I’ve tried adding a list of tuples to the annotation list for each of the empty photos, in two styles:

  1. ([],[]) #bbox, class
  2. ([[0.,0.,0.,0.]],[]) #bbox, class

It almost works: I’ve gotten the data and annotations into a fastai2 dataloader, and all seems fine until I try show_batch; then it throws an error. I’ve found some conflicting advice in the forums. Does anyone know for sure how to do it right?

Or is it preferable to train the model exclusively on images that do contain objects of interest?

Hey all. Here are the times I’m running into while building a language model; I want to make sure they seem right. Training is still going and all of the metrics are spot on, I just want to make sure I’m doing it right.
Dataframe records - 17.1 million
Average length of text = 13 words
Time to build Dataloader = 2 hours 15 min
batch size = 256
seq_len = 128
Time per Epoch = 2 hours 35 min
GPU utilized = 20%

Again, everything seems to be doing very well, just checking whether this is normal behavior or not :slight_smile:

When I try to load the saved encoder from the LM into the classification model, I get

RuntimeError: Error(s) in loading state_dict for AWD_LSTM:
	size mismatch for encoder.weight: copying a param with shape torch.Size([14280, 400]) from checkpoint, the shape in current model is torch.Size([14304, 400]).
	size mismatch for encoder_dp.emb.weight: copying a param with shape torch.Size([14280, 400]) from checkpoint, the shape in current model is torch.Size([14304, 400]).

I’m following the fastbook draft; I did exactly what it said.
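The shapes in that error are vocab sizes: the saved encoder was trained with a 14280-token vocab, but the classifier’s DataLoaders built their own 14304-token vocab, so the embedding matrices disagree. In the fastbook flow the usual fix is to pass the language model’s vocab (vocab=dls_lm.vocab in TextBlock.from_df) when building the classifier’s data. A plain PyTorch illustration of why the mismatch breaks load_state_dict:

```python
import torch
import torch.nn as nn

lm_emb = nn.Embedding(14280, 400)    # encoder embedding, sized to the LM vocab
clas_emb = nn.Embedding(14304, 400)  # classifier built with a different vocab

mismatch = False
try:
    clas_emb.load_state_dict(lm_emb.state_dict())  # row counts differ
except RuntimeError:
    mismatch = True  # "size mismatch for weight: ..."

# Built with the same vocab size, the checkpoint loads cleanly.
clas_emb_fixed = nn.Embedding(14280, 400)
clas_emb_fixed.load_state_dict(lm_emb.state_dict())
```

So the thing to check is that the classifier DataBlock reuses dls_lm.vocab rather than rebuilding a vocab from the classification data.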

How do I use torchvision models with fastai2? I need to use mobilenet_v2, which is available in fastai1.