TfmdDL has two parameters (among others): bs=64 and batch_size=None. Can anyone explain the difference between the two and how they are meant to be used? A subsequent show_batch() call crashes whenever batch_size is set to anything other than 1.
It’s from DataLoader:
So it’s just for compatibility. Use bs. (No idea why setting batch_size would cause a problem, however.)
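To make the "just for compatibility" point concrete, here is a minimal plain-Python sketch of the alias pattern. `SketchLoader` is a hypothetical class written for this reply, not fastai's code; it only assumes that `batch_size`, when given, overrides `bs`.

```python
class SketchLoader:
    """Hypothetical loader, NOT fastai's DataLoader: shows an alias parameter."""

    def __init__(self, items, bs=64, batch_size=None):
        # Assumption: batch_size exists only for PyTorch-style callers and,
        # when given, simply overrides bs.
        if batch_size is not None:
            bs = batch_size
        self.items, self.bs = list(items), bs

    def __iter__(self):
        # Yield consecutive chunks of at most bs items.
        for i in range(0, len(self.items), self.bs):
            yield self.items[i:i + self.bs]


a = list(SketchLoader(range(5), bs=2))
b = list(SketchLoader(range(5), batch_size=2))
print(a == b)  # both spellings produce the same batches
```

Under that assumption the two arguments are interchangeable, so sticking to `bs` avoids any ambiguity about which one wins.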
I have the following code that loads the COCO_TINY dataset:
Data acquisition
ds_items = get_image_files(ds_source/'train')
ds_split = RandomSplitter(seed=SEED_VL, valid_pct=0.2)(ds_items)
train, valid = (ds_items[i] for i in ds_split)
ds_bbox = lambda o: get_x_y[o.name][0]
ds_label = lambda o: get_x_y[o.name][1]
Datasets
def to_np(x): return np.array(x, dtype=np.float32)
dsets = Datasets(ds_items, [PILImage.create, [ds_bbox, to_np, TensorBBox.create], [ds_label, MultiCategorize(add_na=True)]], splits=ds_split, n_inp=1)
Transformations and dataloaders (WORKS!)
aft_itm = [BBoxLabeler(as_item=False), PointScaler(y_first=False), ToTensor()]
aft_btch = [IntToFloatTensor(), AffineCoordTfm(size=SZ)]
tdl_trn = TfmdDL(dsets, bs=1, num_workers=4, after_item=aft_itm, after_batch=aft_btch, device=default_device())
dldr = DataLoaders(tdl_trn)
If bs is set to 1 in TfmdDL, the code works fine: tdl_trn.one_batch() runs correctly and tdl_trn.show_batch(figsize=(5,5)) shows the image. However, if bs=2 (or any number greater than 1), tdl_trn.one_batch() crashes with the following error:
RuntimeError Traceback (most recent call last)
in ()
----> 1 x,y,z = tdl_trn.one_batch()
2 type(x), type(y), type(z), x.shape, y
11 frames
/usr/local/lib/python3.6/dist-packages/torch/utils/data/utils/collate.py in default_collate(batch)
53 storage = elem.storage().new_shared(numel)
54 out = elem.new(storage)
---> 55 return torch.stack(batch, 0, out=out)
56 elif elem_type.__module__ == 'numpy' and elem_type.__name__ != 'str_' \
57 and elem_type.__name__ != 'string_':
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 1 and 2 in dimension 1 at /pytorch/aten/src/TH/generic/THTensor.cpp:689
Any help on how to solve this issue would be appreciated.
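The traceback ends in `torch.stack`, which needs every item in the batch to have the same shape. The sketch below mimics that check in plain Python so it runs without PyTorch; `shape` and `stack` are hypothetical helpers written for illustration, and the box lists are made-up data, not values from COCO_TINY. Two per-item box lists of different length are used only as an example; differently sized images would fail the same way.

```python
def shape(x):
    """Shape of a nested list, e.g. [[0, 0, 1, 1]] -> (1, 4)."""
    dims = []
    while isinstance(x, list):
        dims.append(len(x))
        x = x[0] if x else None
    return tuple(dims)


def stack(batch):
    """Hypothetical stand-in for stacking: every item must share one shape."""
    shapes = {shape(item) for item in batch}
    if len(shapes) > 1:
        raise ValueError(f"cannot stack items with shapes {sorted(shapes)}")
    return list(batch)


one_box = [[0.1, 0.1, 0.5, 0.5]]                          # shape (1, 4)
two_boxes = [[0.0, 0.0, 0.3, 0.3], [0.2, 0.2, 0.9, 0.9]]  # shape (2, 4)

stack([one_box])  # a batch of one item can always be stacked
try:
    stack([one_box, two_boxes])  # mixed shapes are rejected
except ValueError as err:
    print(err)
```

This would also be consistent with bs=1 working: a batch with a single item has nothing to disagree with.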
That’s pretty hard to read. Can you please use markdown code formatting and try to make your post as clear as possible?
My apologies, Sir. I am pretty inept at formatting things. I have edited the code above and hope it is clearer.
IIRC coco_tiny images are not all the same size. You should include a Resize method first in your item_tfms (after_item) to get it working.
That’s the reason I set AffineCoordTfm(size=SZ). At least I thought so.
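As far as I understand it, the catch is ordering: after_batch transforms run on a batch that has already been collated, so a resize placed there comes too late to fix mismatched items. Below is a toy plain-Python sketch of that ordering. `collate`, `make_batch`, `resize_item`, and `resize_batch` are hypothetical helpers, not fastai functions, and the "images" are just lists of numbers.

```python
def collate(items):
    """Hypothetical collation: refuses items of different length."""
    if len({len(it) for it in items}) > 1:
        raise ValueError("items differ in size")
    return list(items)


def make_batch(items, after_item=None, after_batch=None):
    """Apply per-item steps, then collate, then per-batch steps."""
    if after_item:
        items = [after_item(it) for it in items]
    batch = collate(items)  # sizes must already match at this point
    return after_batch(batch) if after_batch else batch


def resize_item(it, size=4):
    """Crop or zero-pad one item to a fixed length (toy stand-in for a resize)."""
    return (it + [0] * size)[:size]


def resize_batch(batch, size=4):
    """The same resize, but applied only after collation has succeeded."""
    return [resize_item(it, size) for it in batch]


items = [[1, 2, 3], [4, 5, 6, 7, 8]]

try:
    make_batch(items, after_batch=resize_batch)  # resize comes too late
except ValueError as err:
    print("after_batch only:", err)

print("after_item:", make_batch(items, after_item=resize_item))
```

In the toy version, moving the resize into the per-item step is what lets collation succeed, which mirrors the suggestion to put a Resize in after_item.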