I am using the basic_train module to create custom datasets. There are three dataloaders that need to be passed: train_dl, valid_dl and fix_dl. What’s fix_dl? It’s not optional, and it hasn’t been documented either. There are no traces of it in any of the notebooks.
fix_dl is optional now; it was a bug to have it be a non-default arg.
Hello. My question is similar to the one from @prajjwal1 about fix_dl in DataBunch, so I am posting it here.
I do not understand dl_tfms as an argument of the ImageDataBunch class in the fastai v1 documentation. What is the purpose of dl_tfms, which also appears in the DataBunch class? Thanks.
Note: the vision.data page of the fastai v1 documentation uses ds_tfms, but I never saw dl_tfms there.
Hi Sylvain,
I have a problem with fix_dl too when I try to create a DataBunch with a test set. I’m working on Colab using fastai version 1.0.42. This is the code that I use to create the DataBunch …
data = ImageDataBunch.from_folder(path=train_path, valid_pct=0.2, test=test_path,
                                  ds_tfms=tfms, size=100, bs=32)
data.normalize(imagenet_stats)
That’s the output I get …
ImageDataBunch;
Train: LabelList
y: CategoryList (39124 items)
[Category Avocado ripe, Category Avocado ripe, Category Avocado ripe, Category Avocado ripe, Category Avocado ripe]...
Path: data/fruits/fruits-360/Training
x: ImageItemList (39124 items)
[Image (3, 100, 100), Image (3, 100, 100), Image (3, 100, 100), Image (3, 100, 100), Image (3, 100, 100)]...
Path: data/fruits/fruits-360/Training;
Valid: LabelList
y: CategoryList (9781 items)
[Category Papaya, Category Plum 3, Category Walnut, Category Tomato Maroon, Category Cherry Wax Red]...
Path: data/fruits/fruits-360/Training
x: ImageItemList (9781 items)
[Image (3, 100, 100), Image (3, 100, 100), Image (3, 100, 100), Image (3, 100, 100), Image (3, 100, 100)]...
Path: data/fruits/fruits-360/Training;
Test: LabelList
y: EmptyLabelList (0 items)
[]...
Path: .
x: ImageItemList (0 items)
[]...
Path: data/fruits/fruits-360/Training
As you can see, the Test LabelList is empty. I think that the problem is in the create method of the DataBunch…
@classmethod
def create(cls, train_ds:Dataset, valid_ds:Dataset, test_ds:Optional[Dataset]=None, path:PathOrStr='.', bs:int=64,
           num_workers:int=defaults.cpus, dl_tfms:Optional[Collection[Callable]]=None, device:torch.device=None,
           collate_fn:Callable=data_collate, no_check:bool=False)->'DataBunch':
    "Create a `DataBunch` from `train_ds`, `valid_ds` and maybe `test_ds` with a batch size of `bs`."
    datasets = cls._init_ds(train_ds, valid_ds, test_ds)
    val_bs = bs
    dls = [DataLoader(d, b, shuffle=s, drop_last=s, num_workers=num_workers) for d,b,s in
           zip(datasets, (bs,val_bs,val_bs,val_bs), (True,False,False,False)) if d is not None]
    return cls(*dls, path=path, device=device, dl_tfms=dl_tfms, collate_fn=collate_fn, no_check=no_check)
… when you call the constructor unpacking the dataloaders …
return cls(*dls, path=path, device=device, dl_tfms=dl_tfms, collate_fn=collate_fn, no_check=no_check)
…because in the constructor of the DataBunch the third parameter is fix_dl and not test_dl. As proof of this, when I execute this command …
len(data.valid_dl.dataset), len(data.fix_dl.dataset), len(data.test_dl.dataset)
I get the following results…
(9781, 39124, 1)
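The positional-binding concern described above can be sketched without fastai. ToyBunch and bind below are hypothetical names invented for illustration; only the parameter order (train_dl, valid_dl, fix_dl, test_dl) is taken from the description in this post, and nothing here reproduces the actual DataBunch code.

```python
# Hypothetical stand-in for a DataBunch-like constructor; this is not fastai code.
class ToyBunch:
    def __init__(self, train_dl, valid_dl, fix_dl=None, test_dl=None):
        self.train_dl, self.valid_dl = train_dl, valid_dl
        self.fix_dl, self.test_dl = fix_dl, test_dl


def bind(loaders):
    "Unpack `loaders` positionally, the same way `cls(*dls, ...)` does."
    return ToyBunch(*loaders)


# Three loaders: the third one lands in `fix_dl`, and `test_dl` stays None.
three = bind(["train", "valid", "test"])
print(three.fix_dl, three.test_dl)  # test None

# Four loaders: each one lines up with its intended parameter.
four = bind(["train", "valid", "fix", "test"])
print(four.fix_dl, four.test_dl)  # fix test
```

Whether the real code behaves like the first or the second call depends on how many datasets _init_ds returns, which is not shown here. The four-element tuples passed to zip in create hint that it may return four, in which case the unpacking would line up as in the second call.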
Let me know if I’m right and this is a bug or otherwise if I missed something.
Thank you!
That is a bug if it happens, but I can’t reproduce it. Your code gives me the fix_ds and the test_ds with the right number of elements. Can you double-check your version of fastai and then try your code on master?
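One way to double-check the installed version from inside a notebook is sketched below. It relies only on the standard library (Python 3.8 or later); installed_version is a hypothetical helper name, not part of fastai.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    "Return the installed version string of `package`, or None if it is not installed."
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the fastai version if the package is installed, otherwise None.
print(installed_version("fastai"))
```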
Yes, sure. I will check on master and let you know. In the meantime, you can reproduce the error using my notebook on Colab. This is the public link:
https://colab.research.google.com/drive/1Irwe5wwvW25ZCsfAYeOXkmp01vibme7-
Andrea
Use print(learn.summary())
Hi,
I think dl_tfms stands for data loader transformations, in case it helps someone.
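To illustrate the difference that reading implies, here is a minimal pure-Python sketch, assuming that item-level transforms (in the spirit of ds_tfms) run on each item while data loader transforms (in the spirit of dl_tfms) run on each whole batch. ToyBatchLoader, batches and center are hypothetical names; this is not how fastai implements it.

```python
from typing import Callable, Iterator, List, Sequence


def batches(items: Sequence, bs: int) -> Iterator[List]:
    "Yield consecutive batches of at most `bs` items."
    for i in range(0, len(items), bs):
        yield list(items[i:i + bs])


class ToyBatchLoader:
    "Hypothetical loader: item transforms run per item, batch transforms per batch."

    def __init__(self, items: Sequence, bs: int,
                 item_tfms: Sequence[Callable] = (),
                 batch_tfms: Sequence[Callable] = ()):
        self.items, self.bs = items, bs
        self.item_tfms, self.batch_tfms = list(item_tfms), list(batch_tfms)

    def __iter__(self):
        for batch in batches(self.items, self.bs):
            # Item-level step: each function sees one item at a time.
            for f in self.item_tfms:
                batch = [f(x) for x in batch]
            # Batch-level step: each function sees the whole batch at once.
            for f in self.batch_tfms:
                batch = f(batch)
            yield batch


def center(batch: List[float]) -> List[float]:
    "Subtract the batch mean, which only makes sense at the batch level."
    mean = sum(batch) / len(batch)
    return [x - mean for x in batch]


loader = ToyBatchLoader([1, 2, 3, 4, 5], bs=2,
                        item_tfms=[lambda x: x * 10],
                        batch_tfms=[center])
result = list(loader)
print(result)  # [[-5.0, 5.0], [-5.0, 5.0], [0.0]]
```

The centering step needs the whole batch to compute a mean, which is the kind of operation that cannot be expressed as a per-item transform.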