Lesson 3 In-Class Discussion ✅

Yes, I forgot to pull.

I am getting this as well. I ran git pull followed by restarting the kernel, but still get the error.

Could someone talk a bit more about the data block philosophy? I’m not quite sure how the blocks are meant to be used. Do they have to be in a certain order? Is there any other library that uses this style of programming I could look at?
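To give a feel for the idea (this is a toy sketch in plain Python, not fastai code, and the class and method names here are made up for illustration): the data block API is a "builder pipeline" where each step returns an object that only exposes the *next* valid step, which is why the order matters.

```python
# Toy sketch of the builder-pipeline idea behind a data block API:
# list the items -> split into train/valid -> attach labels.
# Each step returns a new object exposing only the next step.

class ItemList:
    def __init__(self, items):
        self.items = items

    def split_by_pct(self, valid_pct=0.2):
        # Step 1: decide the train/valid split.
        n_valid = int(len(self.items) * valid_pct)
        return SplitData(self.items[:-n_valid], self.items[-n_valid:])

class SplitData:
    def __init__(self, train, valid):
        self.train, self.valid = train, valid

    def label_with(self, labeler):
        # Step 2: only after splitting do we attach labels.
        return LabeledData([(x, labeler(x)) for x in self.train],
                           [(x, labeler(x)) for x in self.valid])

class LabeledData:
    def __init__(self, train, valid):
        self.train, self.valid = train, valid

# The chain cannot be written out of order: ItemList has no label_with,
# and SplitData has no split_by_pct.
data = (ItemList([f"img_{i}.png" for i in range(10)])
        .split_by_pct(0.2)
        .label_with(lambda name: name.endswith("0.png")))
print(len(data.train), len(data.valid))  # 8 2
```

The real API follows the same shape (create an item list, split it, label it, transform it, turn it into a databunch), so the order is part of the design.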

Looking at the MNIST example at the end of the lesson 1 notebook, I saw that the unzipped version of the file had 3 items in it: labels.csv, a valid folder, and a train folder.

That being said, when creating the ImageDataBunch like this:
data = ImageDataBunch.from_csv(path, ds_tfms=tfms, size=28)

How does the default value of valid_pct = 0.2 come into the picture? Will the ImageDataBunch just have a validation set of 20% of what was in the validation folder or does the function look at all the images in both the valid and train folder in order to make the validation set?
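My understanding (hedged, this is a plain-Python sketch of the behavior, not fastai source) is that a `valid_pct`-style split takes *all* filenames listed in the CSV and holds out a random fraction of them, regardless of which folder each file lives in:

```python
# Sketch of what a valid_pct split does: a random hold-out over all
# filenames from the CSV, ignoring the train/valid folder structure.
import random

def random_split(filenames, valid_pct=0.2, seed=42):
    rng = random.Random(seed)        # fixed seed for reproducibility
    shuffled = filenames[:]
    rng.shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_pct)
    return shuffled[n_valid:], shuffled[:n_valid]   # train, valid

files = ([f"train/{i}.png" for i in range(80)]
         + [f"valid/{i}.png" for i in range(20)])
train, valid = random_split(files, valid_pct=0.2)
print(len(train), len(valid))  # 80 20
```

Note that with this kind of split the resulting validation set can contain files from both the train and valid folders, since the folder names play no role.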

How frequently does the fit / training function use the validation set? Is it once (or multiple times) per epoch?

It’s unclear to me when the fitting process looks at the training set vs. the validation set.

This is a toy example where we put everything together to be able to test efficiently. I don’t recommend using it for a first understanding.
Here, for instance, the file labels.csv has filenames from both train and valid.

It runs a full pass through the validation set after each epoch, so once per epoch. The validation set is used to see whether your model generalizes and to tune hyperparameters; it is not used for updating the weights. The model uses the validation set’s information only indirectly, when the user changes hyperparameters according to the training and validation losses.
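The loop structure described above can be sketched in framework-agnostic plain Python (the `MeanModel` and `sgd_step` names are stand-ins invented for this example): training batches update the weights, then one full validation pass per epoch only *measures* loss.

```python
# Sketch of a fit loop: weights change only during the training phase;
# the validation phase runs exactly once per epoch, metrics only.

def fit(epochs, model, train_batches, valid_batches, update):
    history = []
    for epoch in range(epochs):
        # Training phase: every batch updates the model.
        for xb, yb in train_batches:
            update(model, xb, yb)
        # Validation phase: one full pass, no updates to the model.
        valid_loss = sum(model.loss(xb, yb) for xb, yb in valid_batches)
        history.append(valid_loss / len(valid_batches))
    return history

class MeanModel:
    """Toy stand-in 'model': predicts the mean of the targets it has seen."""
    def __init__(self):
        self.total, self.count = 0.0, 0
    def predict(self, x):
        return self.total / self.count if self.count else 0.0
    def loss(self, xb, yb):
        return abs(self.predict(xb) - yb)

def sgd_step(model, xb, yb):
    # Toy stand-in for a gradient step: absorb the new target.
    model.total += yb
    model.count += 1

train = [(0, 1.0)] * 4
valid = [(0, 1.0)] * 2
print(fit(3, MeanModel(), train, valid, sgd_step))  # [0.0, 0.0, 0.0]
```

The point of the structure is that `model.loss` in the validation phase never calls `update`, so nothing the model learns ever comes from the validation data directly.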

I tried to load some X-ray images (grayscale) using ImageDataBunch. When I try to visualize them with show_batch, they come out in black and white (not grayscale). Is a custom dataset required?

You have loaded the labels for the training set via train_v2.csv. How do you load the labels for the validation and test sets?

Why do we use size 128 for the planet dataset when loading the images?

Does anyone know where I can find the notebook with the example-dataset material? It’s not the same as https://github.com/fastai/course-v3/blob/master/nbs/dl1/lesson3-planet.ipynb

Does the order of things matter for Data Blocks?

More generally, I’d love a run through of how to include a test set and extract not just labels but predictions (for prep to submit for a competition, for example).

If your data are 16-bit grayscale (likely), then you will have to write your own open_image.
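For context, a hedged sketch of the key step such a custom open function would need (shown with plain lists for illustration; real code would build a tensor, and `normalize_16bit` / `to_rgb` are names invented here): divide by 65535 instead of 255 so pixel values land in [0, 1], and replicate the single channel to the three channels a pretrained RGB model expects.

```python
# Sketch of the two adjustments needed for 16-bit grayscale input:
# a 16-bit scale factor, and grayscale -> 3-channel replication.

def normalize_16bit(pixels):
    """pixels: flat list of ints in 0..65535 -> floats in [0, 1]."""
    return [p / 65535 for p in pixels]

def to_rgb(gray):
    """Replicate one grayscale channel into three identical channels."""
    return [gray, gray, gray]

img = normalize_16bit([0, 32768, 65535])
rgb = to_rgb(img)
print(len(rgb), rgb[0][0], rgb[0][-1])  # 3 0.0 1.0
```

Dividing a 16-bit image by 255, as an 8-bit loader would, is what produces the washed-out "black and white" look mentioned earlier in the thread: almost every value saturates past 1.0.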

Can you add a test folder if you created labels from a CSV file?

Can we use fastai for feature extraction?

Looks like it’s here: https://github.com/fastai/fastai/blob/master/docs_src/data_block.ipynb

Worked for me; I’m using the dev version.

Is it possible to do multi-label classification with a model trained on single labels?

Can we use the data block API with a text dataset?