I was having some trouble with test sets in v2, so I decided to make a brief tutorial notebook on them. The purpose is two-fold: first, to show what test sets can be, and second, to show that we can now have labelled test sets! My notebook is available here.
A detailed walk-through is below:
Create your DataLoader using your test set. In my example notebook I use the ADULT dataset.
Thank you @muellerzr for sharing. There is a small typo:
dbuch = to.databunch() should be replaced by dbch = to.databunch() in order to be consistent with learn = Learner(dbch, model, CrossEntropyLossFlat(), opt_func=opt_func, metrics=accuracy)
Otherwise the latter line will raise the following error: NameError: name 'dbch' is not defined
Likewise, dbunch_test = to_test.databunch(shuffle_train=False) should be replaced by dbch_test = to_test.databunch(shuffle_train=False) in order to be consistent with preds = learn.get_preds(dl=dbch_test.train_dl)
Also, the markdown line should be changed accordingly (really minor typo but just to be consistent):
We can pass in our dbch_test’s dataloader (either train_dl or valid_dl) in the dl argument for both and it will operate on them!
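For reference, here is how the corrected lines fit together end-to-end. This is a minimal sketch assuming the early-v2 tabular API (package name fastai2 at the time); the column lists and the TabularModel/Adam definitions below are illustrative stand-ins for whatever the notebook actually uses as model and opt_func.

from fastai2.tabular.all import *

path = untar_data(URLs.ADULT_SAMPLE)
df = pd.read_csv(path/'adult.csv')
df_main, df_test = df.iloc[:-1000], df.iloc[-1000:]  # hold out a labelled test set

procs = [Categorify, FillMissing, Normalize]
cat_names = ['workclass', 'education', 'marital-status', 'occupation', 'relationship', 'race']
cont_names = ['age', 'fnlwgt', 'education-num']

splits = RandomSplitter()(range_of(df_main))
to = TabularPandas(df_main, procs, cat_names, cont_names, y_names='salary', splits=splits)
# note: ideally the test set would be processed with statistics fit on the training data
to_test = TabularPandas(df_test, procs, cat_names, cont_names, y_names='salary')

dbch = to.databunch()
dbch_test = to_test.databunch(shuffle_train=False)

model = TabularModel(get_emb_sz(to), len(to.cont_names), 2, [200, 100])
opt_func = Adam
learn = Learner(dbch, model, CrossEntropyLossFlat(), opt_func=opt_func, metrics=accuracy)
learn.fit_one_cycle(1)

# the labelled test set lives in dbch_test's train_dl (shuffling disabled above)
preds = learn.get_preds(dl=dbch_test.train_dl)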
Is there an easy way to use an image test set with labels as well? I am trying to use test_dl() so that the validation transforms are applied to the test set, but I cannot figure out a way to include the labels. Currently I am manually extracting them as shown below:
# assumes the usual v2 star import, e.g. from fastai2.vision.all import * (package name at the time of this thread)
# construct the test data loader (test_dl applies the validation transforms)
test_items = get_image_files(path_to_test_set)
test_dl_ = test_dl(dbunch_val, test_items)
# manually extract the labels for the test set
y_labels = L(map(parent_label, test_items))
_, o2i = uniqueify(y_labels, sort=True, bidir=True)  # map each class name to an index
y = torch.from_numpy(np.array(L(map(o2i.get, y_labels))))
# check the accuracy
preds = learn.get_preds(dl=test_dl_)
accuracy(preds[0], y)
Do you have this working with images? If so, what type of data loader do you use? I have tried TfmdDL with many variations of the following, without success.
That is what I originally tried, but the labels are missing. The label transform is removed because a test set usually won’t contain labels. If I add it again, the training augmentation transforms, rather than the validation ones, appear to be applied when I call show_batch().
Ahh yes, very true. That’s an unlabeled test set. @sgugger is there a way to go about this? I’m trying to find an easy way like there was for tabular, but I didn’t see any. Perhaps add an option to label in test_dl? Or is there a method for doing so with the DataBlock that I’m not quite seeing?
No, this is just to change the behavior of the transforms (when they are different on the training vs validation set). You can’t add new transforms with this.
Do you think it’s a good idea to move to v2 right now if I’ve only gotten to lesson 3 in the fastai course, or do you suggest sticking with v1 until v2 is more developed? Are the benefits of v2 obvious to you? Thank you for your effort!
I’d say go until lesson 4, when you’re comfortable with how it all works (tabular and images), and then you can start to move over! That way you understand as a whole how fastai v1 works, since v2 is very similar. In terms of benefits, absolutely, there are tons of reasons why I prefer the v2 library over v1! (That’s why I started the study group: to help others with migrating.)
It seems a bit weird to do it this way. You know you can have several validation sets in a DataSource/DataBunch? Just send all the items and a list of three splits instead of two.
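For example, a minimal sketch assuming the early-v2 API (package fastai2) where a DataSource accepts any number of index splits; path, the 80/10/10 split, and the transforms are placeholders, and attribute names may differ across v2 revisions:

from fastai2.vision.all import *

items = get_image_files(path)

# three splits instead of two: train, validation, and a labelled "test" set
idxs = np.random.permutation(len(items))
cut1, cut2 = int(0.8 * len(items)), int(0.9 * len(items))
splits = (list(idxs[:cut1]), list(idxs[cut1:cut2]), list(idxs[cut2:]))

dsrc = DataSource(items, tfms=[[PILImage.create], [parent_label, Categorize]], splits=splits)
dbunch = dsrc.databunch(after_item=[Resize(224), ToTensor], bs=64)

# the extra subset keeps validation-style transform behavior; evaluate with:
# preds, targs = learn.get_preds(dl=dbunch.dls[2])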