Got it, thanks. Looking forward to v2! Really appreciate what you guys do.
Same question, did you solve it?
Hi @KarlH,
Did you solve the issue with the 'background' data? I could create a databunch with both labelled and unlabelled data. Everything seems OK, but when I tried to fit the data I got the following error:
RuntimeError Traceback (most recent call last)
<ipython-input-64-f250d77c386e> in <module>
----> 1 learn.fit_one_cycle(4, 1e-3, wd=1e-3)
~/anaconda3/envs/Fastai/lib/python3.7/site-packages/fastai/train.py in fit_one_cycle(learn, cyc_len, max_lr, moms, div_factor, pct_start, final_div, wd, callbacks, tot_epochs, start_epoch)
21 callbacks.append(OneCycleScheduler(learn, max_lr, moms=moms, div_factor=div_factor, pct_start=pct_start,
22 final_div=final_div, tot_epochs=tot_epochs, start_epoch=start_epoch))
---> 23 learn.fit(cyc_len, max_lr, wd=wd, callbacks=callbacks)
24
25 def fit_fc(learn:Learner, tot_epochs:int=1, lr:float=defaults.lr, moms:Tuple[float,float]=(0.95,0.85), start_pct:float=0.72,
~/anaconda3/envs/Fastai/lib/python3.7/site-packages/fastai/basic_train.py in fit(self, epochs, lr, wd, callbacks)
198 else: self.opt.lr,self.opt.wd = lr,wd
199 callbacks = [cb(self) for cb in self.callback_fns + listify(defaults.extra_callback_fns)] + listify(callbacks)
--> 200 fit(epochs, self, metrics=self.metrics, callbacks=self.callbacks+callbacks)
201
202 def create_opt(self, lr:Floats, wd:Floats=0.)->None:
~/anaconda3/envs/Fastai/lib/python3.7/site-packages/fastai/basic_train.py in fit(epochs, learn, callbacks, metrics)
99 for xb,yb in progress_bar(learn.data.train_dl, parent=pbar):
100 xb, yb = cb_handler.on_batch_begin(xb, yb)
--> 101 loss = loss_batch(learn.model, xb, yb, learn.loss_func, learn.opt, cb_handler)
102 if cb_handler.on_batch_end(loss): break
103
~/anaconda3/envs/Fastai/lib/python3.7/site-packages/fastai/basic_train.py in loss_batch(model, xb, yb, loss_func, opt, cb_handler)
28
29 if not loss_func: return to_detach(out), to_detach(yb[0])
---> 30 loss = loss_func(out, *yb)
31
32 if opt is not None:
~/anaconda3/envs/Fastai/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
545 result = self._slow_forward(*input, **kwargs)
546 else:
--> 547 result = self.forward(*input, **kwargs)
548 for hook in self._forward_hooks.values():
549 hook_result = hook(self, input, result)
~/DeepLearning/ObjectDetection/loss/RetinaNetFocalLoss.py in forward(self, output, bbox_tgts, clas_tgts)
53 focal_loss = torch.tensor(0, dtype=torch.float32).to(clas_preds.device)
54 for cp, bp, ct, bt in zip(clas_preds, bbox_preds, clas_tgts, bbox_tgts):
---> 55 bb, focal = self._one_loss(cp, bp, ct, bt)
56
57 bb_loss += bb
~/DeepLearning/ObjectDetection/loss/RetinaNetFocalLoss.py in _one_loss(self, clas_pred, bbox_pred, clas_tgt, bbox_tgt)
28
29 def _one_loss(self, clas_pred, bbox_pred, clas_tgt, bbox_tgt):
---> 30 bbox_tgt, clas_tgt = self._unpad(bbox_tgt, clas_tgt)
31 matches = match_anchors(self.anchors, bbox_tgt)
32 bbox_mask = matches >= 0
~/DeepLearning/ObjectDetection/loss/RetinaNetFocalLoss.py in _unpad(self, bbox_tgt, clas_tgt)
15
16 def _unpad(self, bbox_tgt, clas_tgt):
---> 17 i = torch.min(torch.nonzero(clas_tgt - self.pad_idx))
18 return tlbr2cthw(bbox_tgt[i:]), clas_tgt[i:] - 1 + self.pad_idx
19
RuntimeError: invalid argument 1: cannot perform reduction function min on tensor with no elements because the operation does not have an identity at /opt/conda/conda-bld/pytorch_1565272271120/work/aten/src/THC/generic/THCTensorMathReduce.cu:64
Checking the forums, it seems to be something related to the images/labels, so I guess it is due to the [0,0,0,0] ['background'] trick. Any idea how to solve it?
Thanks!
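For what it's worth, the traceback points at _unpad: when an image carries only padding/background labels, torch.nonzero(clas_tgt - self.pad_idx) comes back empty and torch.min has nothing to reduce. Below is a minimal sketch of a guarded version. It assumes pad_idx=0 and that the rest of the loss can cope with empty targets (not shown here, and it would need checking). unpad_targets is a hypothetical standalone name, and the tlbr2cthw conversion from the original is left out to keep the example self-contained.

```python
import torch

def unpad_targets(bbox_tgt, clas_tgt, pad_idx=0):
    """Drop leading padding rows; return empty targets when nothing is labelled."""
    nonpad = torch.nonzero(clas_tgt - pad_idx)   # indices of real (non-padding) labels
    if nonpad.numel() == 0:                      # the case that makes torch.min fail
        return bbox_tgt[:0], clas_tgt[:0]
    i = int(nonpad.min())
    return bbox_tgt[i:], clas_tgt[i:] - 1 + pad_idx

# An image with no objects: every class target equals pad_idx.
empty_bb, empty_cl = unpad_targets(torch.zeros(3, 4), torch.tensor([0, 0, 0]))

# One padded row followed by one real box of class 3.
bb, cl = unpad_targets(torch.tensor([[0., 0., 0., 0.], [0.1, 0.2, 0.3, 0.4]]),
                       torch.tensor([0, 3]))
```

With this guard the anchor-matching step receives zero boxes instead of crashing, so it also has to accept an empty target tensor.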
Very nice work by @Bronzi88. Running fastai version 1.0.57, I got the same error message as @Joan when trying to run the examples/CocoTiny_Retina_Net.ipynb notebook. I was able to fix it by using squishing for resizing and removing the default transforms. This seems to avoid bboxes ending up outside of the transformed image.
data = (ObjectItemList.from_folder(path)  # path: dataset root folder
.split_by_rand_pct()
.label_from_func(get_y_func)
.transform(tfm_y=True, size=size, resize_method=ResizeMethod.SQUISH)
.databunch(bs=64, collate_fn=bb_pad_collate))
Hi @hallvagi, in my case this is not solving the problem. Anything else that you tried?
How is your unlabelled data labelled? With ['background'] and [0,0,0,0]?
Ok, my bad. I was using the coco_sample dataset, and it only has labelled data.
When removing the labels from one of the images with img2bbox['000000318219.jpg'] = [[[0., 0., 0., 0.]],['background']]
the notebook still runs fine, but the databunch now has 2 'background' classes.
Using img2bbox['000000318219.jpg'] = [[[0., 0., 0., 0.]],[]]
leaves the number of classes correct, and also works in my case.
But when using img2bbox['000000318219.jpg'] = [[[]],[]]
it crashes with an index out of bounds error.
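A small sketch of the variant that worked for me above, assuming img2bbox is a plain dict mapping file name to [bboxes, labels]. add_empty_entries is a hypothetical helper, not part of fastai.

```python
def add_empty_entries(img2bbox, fnames):
    """Give every unlabelled image one dummy box and an empty label list."""
    out = dict(img2bbox)
    for fn in fnames:
        if fn not in out:
            # One all-zero box, no class label: keeps the class count unchanged.
            out[fn] = [[[0., 0., 0., 0.]], []]
    return out

img2bbox = {'a.jpg': [[[10., 20., 50., 60.]], ['cat']]}
img2bbox = add_empty_entries(img2bbox, ['a.jpg', '000000318219.jpg'])
```

Labelled images are left untouched. Only file names missing from the dict get the dummy entry.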
Hi, I am stuck with the same error and unable to understand which data is causing it. I am trying to run the notebook given in course-v3 dl2.
Great work! @Bronzi88 and thank you for sharing.
I am attempting to apply your methods to another dataset from LLNL.
I am able to get the data bunch working and have the bounding boxes loaded correctly.
I am also able to get the anchors working…
However, after loading the model and when I start to find the learning rate, it crashes:
At first, I thought it was because I had an image without a bounding box but I went through the images and they all have a bb. Would you have any guidance on how I can fix this issue?
Thanks,
Tom
Hi,
yes, your number of boxes doesn't fit the number of expected boxes for the model.
So this line:
anchors = create_anchors(sizes=[(32,32)], ratios=[1], scales=[0.35, 0.5, 0.6])
has to be consistent with this line:
model = RetinaNet(encoder, n_classes=data.train_ds.c, n_anchors=3, sizes=[32], chs=32, final_bias=-4., n_conv=2)
With kind regards,
Christian
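A quick way to sanity-check that consistency before building the model. This is only a sketch: it assumes create_anchors produces one anchor per (ratio, scale) pair at every grid cell, as in the course notebook version. anchors_per_cell and total_anchors are hypothetical helper names.

```python
def anchors_per_cell(ratios, scales):
    """Number of anchor shapes at each grid position."""
    return len(ratios) * len(scales)

def total_anchors(sizes, ratios, scales):
    """Total anchors over all feature-map grids."""
    return sum(h * w for h, w in sizes) * anchors_per_cell(ratios, scales)

sizes, ratios, scales = [(32, 32)], [1], [0.35, 0.5, 0.6]
n_anchors = anchors_per_cell(ratios, scales)   # value to pass as n_anchors to RetinaNet
n_total = total_anchors(sizes, ratios, scales)
print(n_anchors, n_total)
```

For the two lines quoted above this gives 3 anchors per cell, matching n_anchors=3, and 3072 anchors in total for the single 32x32 grid.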
Thanks @Bronzi88 - I was able to build a model with your help. I wasn't able to add different transform/augmentation techniques, as I believe the bounding boxes were getting transformed outside the image boundary, but I'll keep working on that later. I am currently stuck on getting a small inference pipeline working. I have a folder of test images; they are all PNGs and have been converted to greyscale, which is the same format that my model was trained on. I'm trying to follow the two examples that you mentioned earlier in this thread but am having a difficult time getting it to work.
It looks like the image needs to be converted from an array to an image tensor, which is then fed through a small inference process. Below is where I am currently.
I'm currently dealing with data structure errors where I think the pil2tensor process is not getting the data in the format required for inference. For example, this type of error:
RuntimeError: Given groups=1, weight of size 64 3 7 7, expected input[1, 2, 256, 256] to have 3 channels, but got 2 channels instead
Could you offer some guidance or code suggestions on how to pass an image from a test_directory to get the model output returned for this use case?
Thank you very much!
Tom
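One possible cause, and this is only a guess from the error message: a greyscale PNG with an alpha channel opens in PIL mode 'LA', which gives 2 channels, while the ResNet stem expects 3. Here is a minimal sketch that forces 3 channels before any tensor conversion, assuming Pillow is available. load_rgb is a hypothetical helper name.

```python
from PIL import Image

def load_rgb(path_or_img):
    """Open an image (or take an already opened one) and force 3 RGB channels."""
    img = path_or_img if isinstance(path_or_img, Image.Image) else Image.open(path_or_img)
    return img.convert('RGB')

# Stand-in for a greyscale+alpha test image; a real call would pass a file path.
demo = Image.new('LA', (256, 256))
rgb = load_rgb(demo)
print(demo.getbands(), '->', rgb.getbands())
```

If I remember correctly, fastai v1's open_image already converts to RGB by default, so loading the test files through it rather than through a raw array may sidestep the problem.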
I have a question/problem regarding the Pascal Notebook from the fastai-course-v3 and I hope that maybe someone can point me in the right direction.
The Pascal Notebook itself works fine for my own object detection dataset. After playing around with it for some time, I wanted to try my own trained ResNet as a backbone.
However, I am stuck at using my saved ResNet model as a backbone for the RetinaNet…
Some thoughts I had so far:
- When loading my own model as a Learner (e.g. with own_resnet18.load('res18-stage5')), it is associated with the dataset I used for training.
- The RetinaNet is built with a pretrained RetinaNet from the model zoo and the last layers need to be cut off.
- If I access the model of the Learner object via own_resnet18.model, I can manually cut off the last layers, but there is an additional Sequential wrapped around the remaining architecture (this looks different from the encoder in the Pascal notebook).
So I think I can manage to get rid of that extra Sequential around my own model, but is that even the right way?
Or, to be more precise: are the weights still stored inside the Sequential container? (I bet not? Also, I am sorry if this is a stupid question, I am still new to all this stuff…)
I would really appreciate any help or hints on how I can reuse an existing trained model for another task.
Never mind, I guess I was too tired to make it work yesterday. I found the solution today: I used the state dict to transfer the weights.
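For anyone landing here later, a minimal sketch of what such a state-dict transfer can look like in plain PyTorch. The two tiny nn.Sequential models are stand-ins, not the actual ResNet or RetinaNet encoder, and transfer_matching_weights is a hypothetical helper name.

```python
import torch
from torch import nn

def transfer_matching_weights(src, dst):
    """Copy every tensor whose key and shape match from src into dst."""
    src_state, dst_state = src.state_dict(), dst.state_dict()
    matched = {k: v for k, v in src_state.items()
               if k in dst_state and v.shape == dst_state[k].shape}
    # strict=False lets dst keep its own values for keys that were not matched.
    dst.load_state_dict(matched, strict=False)
    return sorted(matched)

# Stand-in for the trained classifier: a conv body plus a head.
trained = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Conv2d(8, 16, 3), nn.Linear(16, 10))
# Stand-in for the encoder: the same body without the head.
encoder = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Conv2d(8, 16, 3))

copied = transfer_matching_weights(trained, encoder)
print(copied)
```

On the earlier question: the weights live in the submodules whether or not they are wrapped in a Sequential. An extra wrapper only adds a prefix to the state-dict keys, so with a wrapped model the keys need renaming before they match.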
That's exactly what I almost recommended. There's an excellent thread on the forum that shows how to do that with any pretrained model.
Thanks @muellerzr for your reply! Would you mind sharing a link to that thread, or do you remember what it was called so I can search for it?
Here
That was quick, thank you!
I use it so much, it's bookmarked.
IIRC part of it needs to be refactored a bit (let me know if you get any issues). I'm going to get to it for the study group, so I'll definitely have that in the next week or two. It has to do with the fact that sometimes the state_dict contains both an optimizer and a model, so we want to modify just the model part.
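To illustrate that last point: as far as I know, a fastai v1 checkpoint saved with the optimizer is a dict with 'model' and 'opt' entries, while a bare state dict maps parameter names straight to tensors. Here is a sketch of picking out just the model weights. extract_model_state is a hypothetical helper, and the values are plain placeholders instead of tensors so the example runs without PyTorch.

```python
def extract_model_state(checkpoint):
    """Return the model weights whether or not optimizer state was saved alongside."""
    if isinstance(checkpoint, dict) and 'model' in checkpoint and 'opt' in checkpoint:
        return checkpoint['model']
    return checkpoint

# Placeholder checkpoints; real ones would come from torch.load(path, map_location='cpu').
with_opt = {'model': {'0.weight': 'W', '0.bias': 'b'}, 'opt': {'state': {}}}
bare = {'0.weight': 'W', '0.bias': 'b'}

weights_a = extract_model_state(with_opt)
weights_b = extract_model_state(bare)
```

Both calls return the same parameter dict, which can then be handed to load_state_dict.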
Hey, can you share the link to the extra lessons on object detection with fastai v1?
Thanks,
Harshit
Hey Everyone,
Does anyone know how we can save the databunch objects that we have created? I tried saving them with the databunch.save method, but it gives me a ctypes error.
ERROR:
ValueError Traceback (most recent call last)
in
----> 1 data.save('test_data_bs8.pkl')
3 frames
/usr/local/lib/python3.7/dist-packages/fastai/basic_data.py in save(self, file)
153 warn("Serializing the DataBunch only works when you created it using the data block API.")
154 return
--> 155 try_save(self.label_list, self.path, file)
156
157 def add_test(self, items:Iterator, label:Any=None, tfms=None, tfm_y=None)->None:
/usr/local/lib/python3.7/dist-packages/fastai/torch_core.py in try_save(state, path, file)
414 #To avoid the warning that come from PyTorch about model not being checked
415 warnings.simplefilter("ignore")
--> 416 torch.save(state, target)
417 except OSError as e:
418 raise Exception(f"{e}\n Can't write {path/file}. Pass an absolute writable pathlib obj fname.")
/usr/local/lib/python3.7/dist-packages/torch/serialization.py in save(obj, f, pickle_module, pickle_protocol, _use_new_zipfile_serialization)
378 if _use_new_zipfile_serialization:
379 with _open_zipfile_writer(opened_file) as opened_zipfile:
--> 380 _save(obj, opened_zipfile, pickle_module, pickle_protocol)
381 return
382 _legacy_save(obj, opened_file, pickle_module, pickle_protocol)
/usr/local/lib/python3.7/dist-packages/torch/serialization.py in _save(obj, zip_file, pickle_module, pickle_protocol)
587 pickler = pickle_module.Pickler(data_buf, protocol=pickle_protocol)
588 pickler.persistent_id = persistent_id
--> 589 pickler.dump(obj)
590 data_value = data_buf.getvalue()
591 zip_file.write_record('data.pkl', data_value, len(data_value))
ValueError: ctypes objects containing pointers cannot be pickled
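The traceback does not say which object inside the label list holds the ctypes pointer. One way to narrow it down is to try pickling each attribute separately. The sketch below uses only the standard library. find_unpicklable is a hypothetical helper and Holder is a stand-in object, not a fastai class.

```python
import ctypes
import pickle

def find_unpicklable(obj, name="obj"):
    """Return (attribute path, exception type) for each attribute that fails to pickle."""
    bad = []
    for attr, value in getattr(obj, "__dict__", {}).items():
        try:
            pickle.dumps(value)
        except Exception as e:
            bad.append((f"{name}.{attr}", type(e).__name__))
    return bad

class Holder:
    """Stand-in for an object that carries one unpicklable attribute."""
    pass

h = Holder()
h.items = [1, 2, 3]                          # pickles fine
h.handle = ctypes.pointer(ctypes.c_int(1))   # raises the same ValueError as above
report = find_unpicklable(h, "h")
print(report)
```

Running the same helper on data.label_list, and then on whatever attribute it flags, should point to the transform, callback, or custom function that keeps a native handle. That object would then need to be removed or recreated after loading.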