I think the 2 should be -2. You’re cutting your model at the second layer instead of the second to last, which then doesn’t work with the split.
This split will be made easier later; it's basically the layers at which to separate the model into three groups for differential learning rates.
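As a sketch of what the split buys you (the exact fit signature may have shifted in the current v1 API, so treat the call as an assumption): with three layer groups you can pass a slice of learning rates and get one rate per group.

learn.split([model[0][6], model[1]])  # two split points -> three layer groups
learn.fit(5, lr=slice(1e-5, 1e-3))    # one learning rate per group, lowest for the earliest layers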
I am not sure if I should start an extra thread for this, but I am currently trying to debug strange behavior when I iterate through a DataBunch built from ObjectDetectDatasets based on PNG images with bounding boxes (so kind of segmentation):
Test code for a dummy dataset of 100 entries each for train and valid:
# Create ObjectDetectDatasets
train_ds = get_datasets(PATH_train)
valid_ds = get_datasets(PATH_valid)

size = 128
bs = 4  # bs=1 is working!

# Create DataBunch
def get_data(bs, size):
    return DataBunch.create(train_ds, valid_ds, bs=bs, size=size, ds_tfms=None, path=PATH)

data = get_data(bs, size)

# Test DataBunch DataLoader
for i in range(100):
    print(i, end=', ')
    next(iter(data.train_dl.dl))
Output:
bs = 1
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99,
For bs > 1 it stops the loop after an unreproducible number of steps (I guess due to random shuffling) with this error:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-132-d0590afb83a6> in <module>()
1 for i in range(100):
2 print(i, end=', ')
----> 3 next(iter(data.train_dl.dl))
~/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/utils/data/dataloader.py in __next__(self)
351 self.reorder_dict[idx] = batch
352 continue
--> 353 return self._process_next_batch(batch)
354
355 next = __next__ # Python 2 compatibility
~/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/utils/data/dataloader.py in _process_next_batch(self, batch)
372 self._put_indices()
373 if isinstance(batch, ExceptionWrapper):
--> 374 raise batch.exc_type(batch.exc_msg)
375 return batch
376
RuntimeError: Traceback (most recent call last):
File "/home/paperspace/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 114, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/paperspace/fastai/fastai/torch_core.py", line 86, in data_collate
return torch.utils.data.dataloader.default_collate(to_data(batch))
File "/home/paperspace/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 198, in default_collate
return [default_collate(samples) for samples in transposed]
File "/home/paperspace/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 198, in <listcomp>
return [default_collate(samples) for samples in transposed]
File "/home/paperspace/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 198, in default_collate
return [default_collate(samples) for samples in transposed]
File "/home/paperspace/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 198, in <listcomp>
return [default_collate(samples) for samples in transposed]
File "/home/paperspace/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 175, in default_collate
return torch.stack(batch, 0, out=out)
RuntimeError: invalid argument 0: Tensors must have same number of dimensions: got 2 and 3 at /opt/conda/conda-bld/pytorch-nightly_1538165619353/work/aten/src/TH/generic/THTensorMoreMath.cpp:1308
(or, on other runs, got 3 and 2).
The error also occurs with the show_image_batch() method.
I found this thread on the PyTorch forum, which points in the direction of PNG files with different channel numbers: https://discuss.pytorch.org/t/runtimeerror-invalid-argument-0/17919/5
However, in the fastai library the open_image() function uses .convert('RGB'), and when I debug the tensor shapes I always find the same shape for each element: 3 channels x width x height.
What I don’t get is: why does it work with bs = 1?
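For what it's worth, stacking outside fastai suggests an explanation: a batch of one always stacks, whatever the shape, so a per-sample shape mismatch only ever surfaces with bs > 1. A minimal pure-PyTorch repro (hypothetical shapes, just to trigger the error):

import torch

t1 = torch.zeros(2, 4)     # 2-D target, e.g. a bounding-box tensor
t2 = torch.zeros(1, 2, 4)  # 3-D target from another sample

torch.stack([t1], 0)       # works: a one-element batch stacks regardless of shape
torch.stack([t1, t2], 0)   # RuntimeError: ...same number of dimensions: got 2 and 3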
Maybe somebody has a tip?
Maybe I am using parts of the library which are currently under development?
Thank you & best regards
Michael
PS: When I try to visualize the images with show_image_batch() and bs = 1, I get this error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-68-7d54941a820f> in <module>
1 # http://docs.fast.ai/vision.data.html
----> 2 show_image_batch(data.train_dl, data.train_ds.classes, rows=3, figsize=(5,5))
~/fastai/fastai/vision/data.py in show_image_batch(dl, classes, rows, figsize, denorm)
44 x = x[:rows*rows].cpu()
45 if denorm: x = denorm(x)
---> 46 show_images(x,y[:rows*rows].cpu(),rows, classes, figsize)
47
48 def show_images(x:Collection[Image],y:int,rows:int, classes:Collection[str], figsize:Tuple[int,int]=(9,9))->None:
AttributeError: 'list' object has no attribute 'cpu'
When I define a custom show_image_batch() function without the .cpu() in the Jupyter notebook, I get this error:
NameError: name 'show_image' is not defined
@Tcapelle If you print the model from tvm.resnet34() next to your body, you can see where the create_body function cuts.
For example, with 2 instead of -2 you only see the input stage with the 7x7 kernel and a subsequent batchnorm layer:
Sequential(
(0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
(1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
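Roughly, here is what the cut does, as a plain-torchvision sketch (an assumption about create_body's behavior, not the fastai source):

import torch.nn as nn
import torchvision.models as tvm

m = tvm.resnet34(pretrained=True)
body = nn.Sequential(*list(m.children())[:-2])  # cut=-2: drop the final avgpool and fc
print(body)  # with cut=2 you would instead keep only the first conv and batchnorm, as above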
Thanks, I have already figured it out: I changed this to -2.
Anyway,
lr_find(learn)
>> ---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-98-dcd2a06c9caf> in <module>()
----> 1 lr_find(learn)
2 # learn.recorder.plot()
/usr/local/lib/python3.6/dist-packages/fastai/train.py in lr_find(learn, start_lr, end_lr, num_it, **kwargs)
24 cb = LRFinder(learn, start_lr, end_lr, num_it)
25 a = int(np.ceil(num_it/len(learn.data.train_dl)))
---> 26 learn.fit(a, start_lr, callbacks=[cb], **kwargs)
27
28 def to_fp16(learn:Learner, loss_scale:float=512., flat_master:bool=False)->Learner:
/usr/local/lib/python3.6/dist-packages/fastai/basic_train.py in fit(self, epochs, lr, wd, callbacks)
135 callbacks = [cb(self) for cb in self.callback_fns] + listify(callbacks)
136 fit(epochs, self.model, self.loss_fn, opt=self.opt, data=self.data, metrics=self.metrics,
--> 137 callbacks=self.callbacks+callbacks)
138
139 def create_opt(self, lr:Floats, wd:Floats=0.)->None:
/usr/local/lib/python3.6/dist-packages/fastai/basic_train.py in fit(epochs, model, loss_fn, opt, data, callbacks, metrics)
88 except Exception as e:
89 exception = e
---> 90 raise e
91 finally: cb_handler.on_train_end(exception)
92
/usr/local/lib/python3.6/dist-packages/fastai/basic_train.py in fit(epochs, model, loss_fn, opt, data, callbacks, metrics)
78 for xb,yb in progress_bar(data.train_dl, parent=pbar):
79 xb, yb = cb_handler.on_batch_begin(xb, yb)
---> 80 loss = loss_batch(model, xb, yb, loss_fn, opt, cb_handler)
81 if cb_handler.on_batch_end(loss): break
82
/usr/local/lib/python3.6/dist-packages/fastai/basic_train.py in loss_batch(model, xb, yb, loss_fn, opt, cb_handler, metrics)
16 if not is_listy(xb): xb = [xb]
17 if not is_listy(yb): yb = [yb]
---> 18 out = model(*xb)
19 out = cb_handler.on_loss_begin(out)
20
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
475 result = self._slow_forward(*input, **kwargs)
476 else:
--> 477 result = self.forward(*input, **kwargs)
478 for hook in self._forward_hooks.values():
479 hook_result = hook(self, input, result)
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/container.py in forward(self, input)
90 def forward(self, input):
91 for module in self._modules.values():
---> 92 input = module(input)
93 return input
94
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
475 result = self._slow_forward(*input, **kwargs)
476 else:
--> 477 result = self.forward(*input, **kwargs)
478 for hook in self._forward_hooks.values():
479 hook_result = hook(self, input, result)
/usr/local/lib/python3.6/dist-packages/fastai/vision/models/unet.py in forward(self, up_in)
28 up_out = self.upconv(up_in)
29 cat_x = torch.cat([up_out, self.hook.stored], dim=1)
---> 30 x = F.relu(self.conv1(cat_x))
31 x = F.relu(self.conv2(x))
32 return self.bn(x)
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
475 result = self._slow_forward(*input, **kwargs)
476 else:
--> 477 result = self.forward(*input, **kwargs)
478 for hook in self._forward_hooks.values():
479 hook_result = hook(self, input, result)
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py in forward(self, input)
311 def forward(self, input):
312 return F.conv2d(input, self.weight, self.bias, self.stride,
--> 313 self.padding, self.dilation, self.groups)
314
315
/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py in handler(signum, frame)
271 # This following call uses `waitid` with WNOHANG from C side. Therefore,
272 # Python can still get and update the process status successfully.
--> 273 _error_if_any_worker_fails()
274 if previous_handler is not None:
275 previous_handler(signum, frame)
RuntimeError: DataLoader worker (pid 2631) is killed by signal: Bus error.
This is the call I would like to understand: learn.split([model[0][6], model[1]])
I was pretty comfortable with fastai v0.7 and I am having a hard time with this…
You didn’t specify any transform, so I'm guessing you don't have images of the same size. The error message indicates PyTorch isn't able to group them into a batch.
This error message has nothing to do with the model being split. A quick search led me there; I don't know if it is applicable to you or not.
learn.model.state_dict() is not working.
I solved this by reducing the bs; I never had this problem in v0.7 with exactly the same dataset and params.
We need more info to help you: stack trace, exact code, and error message at least.
Thank you for your fast reply!
You guys are great!
I now added a ds_tfms and tfms (see code below).
I have to specify the ds_tfms as a list because otherwise I get an error that it cannot be indexed.
Calling data.train_ds.tfms, data.valid_ds.tfms, data.train_dl.tfms, and data.valid_dl.tfms returns the information on the transformation and seems to be looking ok.
However, I get this error with show_image_batch():
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-31-7d54941a820f> in <module>
----> 1 show_image_batch(data.train_dl, data.train_ds.classes, rows=3, figsize=(5,5))
~/fastai/fastai/vision/data.py in show_image_batch(dl, classes, rows, figsize, denorm)
40 denorm:Callable=None) -> None:
41 "Show a few images from a batch."
---> 42 x,y = next(iter(dl))
43 if rows is None: rows = int(math.sqrt(len(x)))
44 x = x[:rows*rows].cpu()
~/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/utils/data/dataloader.py in __next__(self)
602 self.reorder_dict[idx] = batch
603 continue
--> 604 return self._process_next_batch(batch)
605
606 next = __next__ # Python 2 compatibility
~/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/utils/data/dataloader.py in _process_next_batch(self, batch)
623 self._put_indices()
624 if isinstance(batch, ExceptionWrapper):
--> 625 raise batch.exc_type(batch.exc_msg)
626 return batch
627
AttributeError: Traceback (most recent call last):
File "/home/paperspace/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 137, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/paperspace/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 137, in <listcomp>
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/paperspace/fastai/fastai/vision/data.py", line 190, in __getitem__
x = apply_tfms(self.tfms, x, **self.kwargs)
File "/home/paperspace/fastai/fastai/vision/image.py", line 422, in apply_tfms
tfms = sorted(listify(tfms), key=lambda o: o.tfm.order)
File "/home/paperspace/fastai/fastai/vision/image.py", line 422, in <lambda>
tfms = sorted(listify(tfms), key=lambda o: o.tfm.order)
AttributeError: 'functools.partial' object has no attribute 'tfm'
This is my code:
paths = [path1, path2]
stats = ([0.4914, 0.4914, 0.4914], [0.2492, 0.2492, 0.2492])
norm, denorm = normalize_funcs(*stats)
tfms = get_transforms()

def get_tfms_datasets(size, paths, tfms):
    datasets = get_datasets(paths)
    return transform_datasets(*datasets, test_ds=None, tfms=tfms, size=size)

def get_data(bs, size, paths):
    return DataBunch.create(*get_tfms_datasets(size, tfms=tfms, paths=paths), bs=bs, size=size, ds_tfms=[norm, norm], tfms=tfms)

data = get_data(bs, size, paths)
I also upgraded to the latest pytorch-nightly.
There must still be an issue with how I apply the tfms to the data…?
Best regards
Michael
Yes, ds_tfms must be a list of two lists of transforms (one for the training set, one for the validation set), as explained in the docs.
Then in your last DataBunch you're mixing the arguments: tfms are the transforms that will be applied to the batches, so it should be [norm], and ds_tfms should be your tfms variable.
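Concretely, your get_data would become something like this (an untested sketch, keeping your helper as-is):

def get_data(bs, size, paths):
    return DataBunch.create(*get_tfms_datasets(size, tfms=tfms, paths=paths),
                            bs=bs, size=size,
                            ds_tfms=tfms,   # per-item augmentations: (train list, valid list)
                            tfms=[norm])    # batch-level transforms, e.g. normalization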
I think the error comes from loading the weights of resnet34.
from fastai.vision.models.unet import *

body = create_body(tvm.resnet34(True), -2)  # /root/.torch/models/
model = DynamicUnet(body, n_classes=2).cuda()
learn = Learner(data, model, metrics=metrics, loss_fn=CrossEntropyFlat())
learn.split([model[0][7], model[1]])
learn.freeze()
lr_find(learn)
>>RuntimeError Traceback (most recent call last)
<ipython-input-78-dcd2a06c9caf> in <module>()
----> 1 lr_find(learn)
2 # learn.recorder.plot()
/usr/local/lib/python3.6/dist-packages/fastai/train.py in lr_find(learn, start_lr, end_lr, num_it, **kwargs)
24 cb = LRFinder(learn, start_lr, end_lr, num_it)
25 a = int(np.ceil(num_it/len(learn.data.train_dl)))
---> 26 learn.fit(a, start_lr, callbacks=[cb], **kwargs)
27
28 def to_fp16(learn:Learner, loss_scale:float=512., flat_master:bool=False)->Learner:
/usr/local/lib/python3.6/dist-packages/fastai/basic_train.py in fit(self, epochs, lr, wd, callbacks)
136 callbacks = [cb(self) for cb in self.callback_fns] + listify(callbacks)
137 fit(epochs, self.model, self.loss_fn, opt=self.opt, data=self.data, metrics=self.metrics,
--> 138 callbacks=self.callbacks+callbacks)
139
140 def create_opt(self, lr:Floats, wd:Floats=0.)->None:
/usr/local/lib/python3.6/dist-packages/fastai/basic_train.py in fit(epochs, model, loss_fn, opt, data, callbacks, metrics)
69 cb_handler = CallbackHandler(callbacks)
70 pbar = master_bar(range(epochs))
---> 71 cb_handler.on_train_begin(epochs, pbar=pbar, metrics=metrics)
72
73 exception=False
/usr/local/lib/python3.6/dist-packages/fastai/callback.py in on_train_begin(self, epochs, pbar, metrics)
186 self.state_dict = _get_init_state()
187 self.state_dict['n_epochs'],self.state_dict['pbar'],self.state_dict['metrics'] = epochs,pbar,metrics
--> 188 self('train_begin')
189
190 def on_epoch_begin(self)->None:
/usr/local/lib/python3.6/dist-packages/fastai/callback.py in __call__(self, cb_name, **kwargs)
180 def __call__(self, cb_name, **kwargs)->None:
181 "Call through to all of the `CallbakHandler` functions."
--> 182 return [getattr(cb, f'on_{cb_name}')(**self.state_dict, **kwargs) for cb in self.callbacks]
183
184 def on_train_begin(self, epochs:int, pbar:PBar, metrics:MetricFuncList)->None:
/usr/local/lib/python3.6/dist-packages/fastai/callback.py in <listcomp>(.0)
180 def __call__(self, cb_name, **kwargs)->None:
181 "Call through to all of the `CallbakHandler` functions."
--> 182 return [getattr(cb, f'on_{cb_name}')(**self.state_dict, **kwargs) for cb in self.callbacks]
183
184 def on_train_begin(self, epochs:int, pbar:PBar, metrics:MetricFuncList)->None:
/usr/local/lib/python3.6/dist-packages/fastai/callbacks/lr_finder.py in on_train_begin(self, **kwargs)
22 def on_train_begin(self, **kwargs:Any)->None:
23 "Initialize optimizer and learner hyperparameters."
---> 24 self.learn.save('tmp')
25 self.opt = self.learn.opt
26 self.opt.lr = self.sched.start
/usr/local/lib/python3.6/dist-packages/fastai/basic_train.py in save(self, name)
167 def save(self, name:PathOrStr):
168 "Save model with `name` to `self.model_dir`."
--> 169 torch.save(self.model.state_dict(), self.path/self.model_dir/f'{name}.pth')
170
171 def load(self, name:PathOrStr):
/usr/local/lib/python3.6/dist-packages/torch/serialization.py in save(obj, f, pickle_module, pickle_protocol)
207 >>> torch.save(x, buffer)
208 """
--> 209 return _with_file_like(f, "wb", lambda f: _save(obj, f, pickle_module, pickle_protocol))
210
211
/usr/local/lib/python3.6/dist-packages/torch/serialization.py in _with_file_like(f, mode, body)
132 f = open(f, mode)
133 try:
--> 134 return body(f)
135 finally:
136 if new_fd:
/usr/local/lib/python3.6/dist-packages/torch/serialization.py in <lambda>(f)
207 >>> torch.save(x, buffer)
208 """
--> 209 return _with_file_like(f, "wb", lambda f: _save(obj, f, pickle_module, pickle_protocol))
210
211
/usr/local/lib/python3.6/dist-packages/torch/serialization.py in _save(obj, f, pickle_module, pickle_protocol)
286 f.flush()
287 for key in serialized_storage_keys:
--> 288 serialized_storages[key]._write_file(f, _should_read_directly(f))
289
290
RuntimeError: cuda runtime error (59) : device-side assert triggered at /pytorch/torch/csrc/generic/serialization.cpp:15
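(Side note: as I understand it, CUDA errors are reported asynchronously, so the Python line blamed in a trace like this may not be the real culprit. Rerunning with CUDA_LAUNCH_BLOCKING=1 should surface the actual failing op; a minimal way to set it, assuming it runs before any CUDA work:)

import os
os.environ['CUDA_LAUNCH_BLOCKING'] = '1'  # must be set before CUDA initializes, e.g. in the first notebook cell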
I will add that when this error occurs, I am forced to restart the kernel; even a model that worked before (a few lines earlier) then triggers:
Traceback (most recent call last):
File "/usr/lib/python3.6/multiprocessing/queues.py", line 234, in _feed
obj = _ForkingPickler.dumps(obj)
File "/usr/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
cls(buf, protocol).dump(obj)
File "/usr/local/lib/python3.6/dist-packages/torch/multiprocessing/reductions.py", line 240, in reduce_storage
fd, size = storage._share_fd_()
RuntimeError: unable to write to file </torch_352_228954310>
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-87-aec71e564917> in <module>()
----> 1 x,y = next(iter(md.train_dl))
/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py in __next__(self)
596 while True:
597 assert (not self.shutdown and self.batches_outstanding > 0)
--> 598 idx, batch = self._get_batch()
599 self.batches_outstanding -= 1
600 if idx != self.rcvd_idx:
/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py in _get_batch(self)
575 # need to call `.task_done()` because we don't use `.join()`.
576 else:
--> 577 return self.data_queue.get()
578
579 def __next__(self):
/usr/lib/python3.6/multiprocessing/queues.py in get(self, block, timeout)
92 if block and timeout is None:
93 with self._rlock:
---> 94 res = self._recv_bytes()
95 self._sem.release()
96 else:
/usr/lib/python3.6/multiprocessing/connection.py in recv_bytes(self, maxlength)
214 if maxlength is not None and maxlength < 0:
215 raise ValueError("negative maxlength")
--> 216 buf = self._recv_bytes(maxlength)
217 if buf is None:
218 self._bad_message_length()
/usr/lib/python3.6/multiprocessing/connection.py in _recv_bytes(self, maxsize)
405
406 def _recv_bytes(self, maxsize=None):
--> 407 buf = self._recv(4)
408 size, = struct.unpack("!i", buf.getvalue())
409 if maxsize is not None and size > maxsize:
/usr/lib/python3.6/multiprocessing/connection.py in _recv(self, size, read)
377 remaining = size
378 while remaining > 0:
--> 379 chunk = read(handle, remaining)
380 n = len(chunk)
381 if n == 0:
/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py in handler(signum, frame)
271 # This following call uses `waitid` with WNOHANG from C side. Therefore,
272 # Python can still get and update the process status successfully.
--> 273 _error_if_any_worker_fails()
274 if previous_handler is not None:
275 previous_handler(signum, frame)
RuntimeError: DataLoader worker (pid 351) is killed by signal: Bus error.
The last one is linked to PyTorch. On what environment are you running this?
https://colab.research.google.com/gist/tcapelle/4083ffd865fabc8703175515d521a7f2/tgs-fastai-v1.ipynb
Ok, so on Colab there is an issue with PyTorch, as reported here. You need to increase the amount of shared memory you have.
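If you can't raise the shared-memory limit, a workaround worth trying is loading the data in the main process, since the worker processes are what go through /dev/shm (a sketch; I'm assuming num_workers is forwarded to the underlying DataLoader):

# slower, but with no worker processes the DataLoader never touches
# shared memory, which avoids the Bus error
data = DataBunch.create(train_ds, valid_ds, bs=bs, size=size, num_workers=0)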
Do you know why the first model runs just fine then, while the segmentation one does not?
Thank you for your help and patience.
But I am not sure if I understood it correctly.
I set up like this (and I also tried the other combination, see below):
def get_tfms_datasets(size, paths, tfms):
    datasets = get_datasets(paths)
    # print('get_tfms_datasets tfms:\n', tfms)
    return transform_datasets(*datasets, test_ds=None, tfms=tfms, tfm_y=True, size=size)

def get_data(bs, size, paths):
    return DataBunch.create(*get_tfms_datasets(size=size, paths=paths, tfms=[get_transforms(), get_transforms()]), bs=bs, size=size, tfms=rsna_norm)

data = get_data(bs, size, paths)
These are the results:
data.train_dl.tfms & data.valid_dl.tfms show the norm function:
[functools.partial(<function _normalize_batch at 0x7f0162126510>, mean=tensor([0.4914, 0.4914, 0.4914]), std=tensor([0.2492, 0.2492, 0.2492]))]
data.train_ds.tfms & data.valid_ds.tfms show the data augmentation functions:
([RandTransform(tfm=TfmCrop (crop_pad), kwargs={'row_pct': (0, 1), 'col_pct': (0, 1)}, p=1.0, resolved={}, do_run=True, is_random=True),
RandTransform(tfm=TfmPixel (flip_lr), kwargs={}, p=0.5, resolved={}, do_run=True, is_random=True),
RandTransform(tfm=TfmCoord (symmetric_warp), kwargs={'magnitude': (-0.2, 0.2)}, p=0.75, resolved={}, do_run=True, is_random=True),
RandTransform(tfm=TfmAffine (rotate), kwargs={'degrees': (-10.0, 10.0)}, p=0.75, resolved={}, do_run=True, is_random=True),
RandTransform(tfm=TfmAffine (zoom), kwargs={'row_pct': (0, 1), 'col_pct': (0, 1), 'scale': (1.0, 1.1)}, p=0.75, resolved={}, do_run=True, is_random=True),
RandTransform(tfm=TfmLighting (brightness), kwargs={'change': (0.4, 0.6)}, p=0.75, resolved={}, do_run=True, is_random=True),
RandTransform(tfm=TfmLighting (contrast), kwargs={'scale': (0.8, 1.25)}, p=0.75, resolved={}, do_run=True, is_random=True)],
[RandTransform(tfm=TfmCrop (crop_pad), kwargs={}, p=1.0, resolved={}, do_run=True, is_random=True)])
But with:
show_image_batch(data.train_dl, data.train_ds.classes, rows=3, figsize=(5,5))
I still get the same AttributeError: 'list' object has no attribute 'tfm' from above.
When I debug the error, the dl at line 42 shows the norm function:
/home/paperspace/fastai/fastai/vision/data.py(42)show_image_batch()
40 denorm:Callable=None) -> None:
41 "Show a few images from a batch."
---> 42 x,y = next(iter(dl))
43 if rows is None: rows = int(math.sqrt(len(x)))
44 x = x[:rows*rows].cpu()
ipdb> dl
DeviceDataLoader(dl=<torch.utils.data.dataloader.DataLoader object at 0x7f016054b780>, device=device(type='cuda'), tfms=[functools.partial(<function _normalize_batch at 0x7f0162126510>, mean=tensor([0.4914, 0.4914, 0.4914]), std=tensor([0.2492, 0.2492, 0.2492]))], collate_fn=<function data_collate at 0x7f016922a840>)
This also happens when I exchange the two tfms functions from above with each other:
def get_data(bs, size, paths):
    return DataBunch.create(*get_tfms_datasets(size=size, paths=paths, tfms=[rsna_norm, rsna_norm]), bs=bs, size=size, tfms=get_transforms())
With that setup I still get the AttributeError: 'functools.partial' object has no attribute 'tfm', even though while debugging I can see the tfm attribute in the dl that generates the error:
DeviceDataLoader(dl=<torch.utils.data.dataloader.DataLoader object at 0x7f01605384a8>, device=device(type='cuda'), tfms=[[RandTransform(tfm=TfmCrop (crop_pad), kwargs={'row_pct': (0, 1), 'col_pct': (0, 1)}, p=1.0, resolved={}, do_run=True, is_random=True), RandTransform(tfm=TfmPixel (flip_lr), kwargs={}, p=0.5, resolved={}, do_run=True, is_random=True), RandTransform(tfm=TfmCoord (symmetric_warp), kwargs={'magnitude': (-0.2, 0.2)}, p=0.75, resolved={}, do_run=True, is_random=True), RandTransform(tfm=TfmAffine (rotate), kwargs={'degrees': (-10.0, 10.0)}, p=0.75, resolved={}, do_run=True, is_random=True), RandTransform(tfm=TfmAffine (zoom), kwargs={'row_pct': (0, 1), 'col_pct': (0, 1), 'scale': (1.0, 1.1)}, p=0.75, resolved={}, do_run=True, is_random=True), RandTransform(tfm=TfmLighting (brightness), kwargs={'change': (0.4, 0.6)}, p=0.75, resolved={}, do_run=True, is_random=True), RandTransform(tfm=TfmLighting (contrast), kwargs={'scale': (0.8, 1.25)}, p=0.75, resolved={}, do_run=True, is_random=True)], [RandTransform(tfm=TfmCrop (crop_pad), kwargs={}, p=1.0, resolved={}, do_run=True, is_random=True)]], collate_fn=<function data_collate at 0x7f016922a840>)
I also checked the docs for ObjectDetectDataset and transform_datasets.
I am not sure what I am messing up or in which direction I should debug further.
If I understand v1 myself (a stretch!), the list error comes from tfms=[get_transforms(), get_transforms()], which should be ds_tfms=get_transforms(). And I am not sure that you need get_tfms_datasets, as the DataBunch creator already calls it; just pass in your datasets.
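Something like this, then (a sketch, untested; I'm assuming your get_datasets returns the (train, valid) pair as in your earlier code):

train_ds, valid_ds = get_datasets(paths)
data = DataBunch.create(train_ds, valid_ds,
                        bs=bs, size=size,
                        ds_tfms=get_transforms(),  # (train, valid) lists of per-item transforms
                        tfms=rsna_norm)            # batch-level normalization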
I think it’s due to our inefficient implementation of bounding boxes for data augmentation. It will change soon as we commit the transformations for points inside the main fastai library.
Nothing is working now =( resnet34 is not there anymore, Darknet does not work, sniff…
What would you recommend to be able to help in dev, a Paperspace instance?
Colab is free, most people will try the library there first, and the K80 is not bad.
Kaggle is not working either.