A walk with fastai2 - Vision - Study Group and Online Lectures Megathread

They are train and valid. There is no dsrc.test anymore, as we can have an unlimited number of datasets on any one object. As such, if you do dsrc[2] (index into it), this will be one of our test datasets. (If we had more than one, this could be dsrc[4].)

This is why we can do the following later:
a,b = learn.get_preds(ds_idx=2); this grabs the third dataset in our dataloaders (which is our test set).
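For example (a minimal sketch, assuming a trained learn built on a dsrc with three splits, as in the posts above):

preds, targs = learn.get_preds(ds_idx=0)  # first DataLoader (the train split)
preds, targs = learn.get_preds(ds_idx=1)  # second DataLoader (valid), which is the default
preds, targs = learn.get_preds(ds_idx=2)  # third DataLoader, i.e. our test split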

Thanks! In playing with it, I saw that dsrc[0], and by extension dsrc[1] and dsrc[2], are all tuples, with the first item of the tuple being the image and the second the label.

Ah I may be slightly wrong. Give me one moment :slight_smile:

Also - the way to access a test image is NOT through show_at(dsrc.test, 3), as with train and valid, but through dsrc[2][0].show() - at least that is what I got to work.

Is there a better way to access and show test images, like show_at(dsrc.train, idx) for the train and valid images?

Got it. You access them via .subset, i.e. we can do:

show_at(dsrc.subset(2), 1)

With subset 2 being that test set. Does this help, @Srinivas?

Subset 0 is train, 1 valid, etc…
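To make that concrete (a small sketch, reusing a dsrc built with three splits as above):

show_at(dsrc.subset(0), 1)  # item 1 of the training subset, same as show_at(dsrc.train, 1)
show_at(dsrc.subset(1), 1)  # item 1 of the validation subset
show_at(dsrc.subset(2), 1)  # item 1 of the test subset, which has no named shortcut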

Yes, that makes more sense. I was actually able to do dsrc[1000], which is also a tuple of length 2, which says that dsrc (i.e. the dataset) has a len of 12954, which is probably the sum of train and test: 9025 + 3929.

Yes, and the length of 2 means that it contains an x and a y.
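A quick way to check that (a small sketch with the same dsrc):

x, y = dsrc[1000]    # any index returns an (image, label) tuple
x.show()             # the x is the image
print(y, len(dsrc))  # the y is the label; len(dsrc) counts items across all subsets (9025 + 3929 = 12954 here)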

yup - that clears things up

Quick question on this subset: x,y = pets.subset(1)[0] is the same as pets.valid[0], so subset(1) is the validation set, and similarly subset(0) is the training set. But when you have more than one training set (for k-fold cross validation), you cannot use the shortcuts any more; you would always have to index into subset, correct? @Srinivas @muellerzr

Yes. That is correct.
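So the indexing looks like this (a sketch, assuming pets was built with more than two splits; subset numbers beyond 1 are illustrative):

x, y = pets.subset(0)[0]  # same as pets.train[0]
x, y = pets.subset(1)[0]  # same as pets.valid[0]
x, y = pets.subset(2)[0]  # a further split (another fold or a test set); it has no named shortcut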

OK - Moving on to dataloaders - dls
I see that dls.n_subsets is 3. And when I do len(dls.subset(0)) I get 7220, which is the length of the training set, and strangely its type is Dataset (!!)
Now, when I do dls.subset(1) I get a list index out of range error - EH?? When there are 3 subsets in dls? What am I missing?

It depends on how you set up your subsets. We were able to because we used separate indices beforehand (the splits), then built the dsrc etc. and kept the test. You could instead make a bunch of different dataloaders, each with your different train split. There are a few ways you could do it :slight_smile:
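A rough sketch of the "separate indices beforehand" approach (items, tfms, and the index lists are hypothetical; the class was called DataSource in some earlier fastai2 builds before becoming Datasets):

splits = [train_idxs, valid_idxs, test_idxs]  # three lists of indices into items
dsrc = Datasets(items, tfms, splits=splits)   # one subset per split
dls = dsrc.dataloaders(bs=64)                 # one DataLoader per subset
test_dl = dls[2]                              # the DataLoader over the test split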

Index into your dataloaders (try dls[0])

dls[0] seems to be a transform
<fastai2.data.core.TfmdDL at 0x7f1a12646588>

Also - dls.subset(0) worked!! And its length is that of the training set.

That’s not a transform, that’s a Transformed DataLoader :wink:
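In other words (a quick sketch with the dls from above):

train_dl = dls[0]            # TfmdDL over the training subset
valid_dl = dls[1]            # TfmdDL over the validation subset
xb, yb = dls[0].one_batch()  # it behaves like a regular DataLoader, so we can grab a batch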

Yes, I did not read that carefully. However, I am still able to access dls.subset(0), but dls.subset(1) and dls.subset(2) give errors.

That I’m unsure of. I’ve always indexed them. Does that exist on the git version too?

Do you mean the dev version? If so, I will need to learn how to load that and try. I think @barnacl did that for something else. @barnacl, could you share how you switched to the dev version of fastai2 pls? Thx

Check the notebook with dblock summary. That’s the dev install

For anyone running 04_Segmentation on Windows: I had some CUDA errors, and what fixed it for me was changing this line

dls = camvid.dataloaders(path/'images', bs=8)

to

dls = camvid.dataloaders(path/'images', bs=8, num_workers=0)

The explanation is here: https://pytorch.org/docs/stable/notes/windows.html#cuda-ipc-operations
