Dynamic SSD implementation for fastai v1

OK, nice workaround @vha14, I'll give it a try. But does that mean that I have to do the learner's pre-processing myself?

Also, another question: how would you go about loading this model into pure PyTorch?
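
In case it helps, one hedged route is to save just the state_dict and reload it into a plain nn.Module on the PyTorch side. SSDModel and its arguments below are placeholders for whatever module class the implementation actually defines:

import torch

# save only the weights from the fastai learner
torch.save(learn.model.state_dict(), 'ssd_weights.pth')

# on the pure-PyTorch side, rebuild the same architecture and load the weights;
# SSDModel and its constructor arguments are assumptions about your setup
model = SSDModel(num_classes=21)
model.load_state_dict(torch.load('ssd_weights.pth'))
model.eval()

Note that you would still have to replicate the learner's input normalization and resizing yourself before feeding images to the module.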

@rohitgeo

I am using this code:

img = open_image('my_image.jpg')
m = ssd.learn.model.eval()
xb, _ = data.one_item(img)
xb = xb.cuda()
out = m(xb)
for item in out:
    print(item.shape)

Which outputs:
torch.Size([1, 189, 93])
torch.Size([1, 189, 4])

I'm trying to get the bounding boxes and labels out, but my code doesn't appear to return that info. How would I do this for an arbitrary image?
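
For reference, those two tensors are raw activations (per-anchor class scores and per-anchor box offsets), not boxes yet. A minimal decoding sketch, not the exact code from this implementation: it assumes you have the model's anchors tensor and an actn_to_bb function like the one in the course notebook that maps box activations to anchor-relative coordinates.

clas_act, bbox_act = out                        # [1, 189, n_classes], [1, 189, 4]
clas_act, bbox_act = clas_act[0], bbox_act[0]   # drop the batch dimension

boxes = actn_to_bb(bbox_act, anchors)           # decode activations into box coords
scores, labels = clas_act.sigmoid().max(dim=1)  # best class score per anchor

keep = scores > 0.25                            # confidence threshold, tune as needed
print(boxes[keep], labels[keep], scores[keep])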

I'm having a consistent problem with both an SSD implementation I built on my own and with this one. All of my classes get assigned to background (whether I use focal loss or not). This appears to be caused by the loss function, though mine is identical to the one defined here. Could it instead be a problem with my data, or with the way classes are defined in my data? Has anyone seen anything similar? Basically, I achieve low loss numbers after some epochs of training, but the actual results are not valid at all, using either my own dataset or PASCAL VOC.

Our implementation used fastai 1.0.39, so if something has changed since then, that might be responsible…

Thanks for the response. It's not an issue related to versioning. I believe the issue was with my learning rate: I was using one that was too small, and the model was getting "stuck" in a part of the loss landscape where it assigned background to all classes, which obviously performs better than random. I've fixed that by using a larger learning rate, but am now sorting out an issue where the last class (by index) is being assigned to a number of boxes. I have a feeling this is related to me building the model assuming the last class was background and then switching background to the index-0 class, since the DataBunch assigns that automatically.

Thank you so much for this. However, I ran into an error with show_results:
ssd.show_results(rows=4, thresh=0.1)
IndexError: index 3 is out of bounds for dimension 1 with size 3

The same error occurs with: ssd.learn.predict(data.train_ds[0][0])

Finally, like @rbunn80130, can you explain how we can apply NMS to reduce the number of bounding boxes?

Thank you

I made it work. Sorry, I had used the default ObjectItemList instead of your SSDObjectItemList. I will continue to dig into your source code to learn how to use NMS.

Thank you
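
For anyone else digging into NMS: below is a minimal greedy NMS sketch in plain PyTorch, not the code from this repo. Recent torchvision versions also ship a ready-made torchvision.ops.nms(boxes, scores, iou_threshold).

import torch

def nms(boxes, scores, iou_thresh=0.5):
    # boxes: [N, 4] as (x1, y1, x2, y2); scores: [N]
    order = scores.argsort(descending=True)
    keep = []
    while order.numel() > 0:
        i = order[0].item()       # highest-scoring remaining box
        keep.append(i)
        if order.numel() == 1:
            break
        rest = order[1:]
        # intersection of the top box with the remaining ones
        xy1 = torch.max(boxes[i, :2], boxes[rest, :2])
        xy2 = torch.min(boxes[i, 2:], boxes[rest, 2:])
        wh = (xy2 - xy1).clamp(min=0)
        inter = wh[:, 0] * wh[:, 1]
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_thresh]   # drop boxes that overlap too much
    return torch.tensor(keep, dtype=torch.long)

You would run this per class on the thresholded predictions, keeping only the indices it returns.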

Hello Juan @jccj, could you please share how you plugged a different backbone into the SSD implementation being discussed here? Somehow I cannot get my head around these things currently. Thank you very much in advance.
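
While waiting for Juan's answer, here is a rough sketch of how fastai v1's helpers can cut a different torchvision backbone; whether the SSD head in this implementation accepts the result directly depends on its constructor, so treat this as a starting point only:

from fastai.vision import models
from fastai.vision.learner import create_body
from fastai.callbacks.hooks import num_features_model

# backbone with the classifier head cut off
body = create_body(models.resnet50, pretrained=True)
# number of output channels the SSD head would have to expect (2048 for resnet50)
n_feat = num_features_model(body)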

This seems like a good place to ask.

I’m trying to adapt a dataset to the RetinaNet / coco model used in this notebook in the dl2 folder.

The problem is that some of my images don't have any objects in them. Does anyone know how to adapt the code in the notebook to deal with this?
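
One hedged workaround, if the pipeline chokes on empty targets, is simply to filter out unannotated images before building the data. The names below (img2bbox mapping a filename to a (bboxes, labels) pair, trn_fns) follow the pascal notebooks and are assumptions about your setup:

# drop images whose annotation list is empty, then keep only matching files
img2bbox = {k: v for k, v in img2bbox.items() if len(v[0]) > 0}
trn_fns = [f for f in trn_fns if f in img2bbox]

Keeping truly empty images would instead require the loss function to treat "all anchors background" as a valid target.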

I found that loc_loss and clas_loss are not of the same order of magnitude (clas_loss is much bigger than loc_loss). Digging into the source code, I found that loc_loss is calculated by:
loc_loss = ((a_ic[pos_idx] - gt_bbox[pos_idx]).abs()).mean()

and clas_loss is calculated by:
clas_loss = self._loss_f(b_c, gt_clas)
where
_loss_f = F.binary_cross_entropy_with_logits(x, t, w, size_average=False)/(self.num_classes-1)

The difference comes from the reduction: we take the mean of loc_loss but, with size_average=False, sum clas_loss.

I tried a new version of _loss_f:
F.binary_cross_entropy_with_logits(x, t, w, reduction='mean')/(self.num_classes-1)
It seems to bring the two losses to the same order of magnitude.
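
The scale gap is easy to demonstrate in isolation: with reduction='sum' (which is what size_average=False amounts to), BCE grows with the number of elements, while reduction='mean' normalizes that away. A small standalone check (the shapes here are made up):

import torch
import torch.nn.functional as F

x = torch.randn(189, 92)   # logits: 189 anchors x 92 non-background classes
t = torch.zeros(189, 92)   # all-background one-hot targets

loss_sum  = F.binary_cross_entropy_with_logits(x, t, reduction='sum')
loss_mean = F.binary_cross_entropy_with_logits(x, t, reduction='mean')
print(loss_sum / loss_mean)   # == x.numel(), i.e. 17388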

Since we don't have any standard metric (like mAP), it is not easy to compare performance. In my understanding, if loc_loss >> clas_loss, the model will try to minimize the localization error much more than the classification error. Am I right?

What do you think? @vha14, @rohitgeo

It is correct that loc_loss and clas_loss are currently not weighted (we use a weight of 1 for both). In Lesson 9 Jeremy said that this is not an issue, but did not offer an explanation why. You should integrate and measure mAP to get a sense of how the model performs.


Thanks. Actually, I did implement one (Pascal VOC metrics, based on the RetinaNet notebook) and found that both versions give the same results. That confirms what Jeremy said. I was thinking my implementation was wrong, but maybe it is not. I still don't understand why :smiley: I remember from Andrew Ng's course, and from other courses I took in school (on things like the LQR controller or the Kalman filter), that we were taught to weight such losses to the same order of magnitude.

Can someone help me show my results, please?
I'm trying to show a test image with the predictions, but I can't do it without errors.

Once I get the predictions with ssd.learn.model(img), I don't know how to proceed. I've been looking at the original code: next should come analyze_pred, and finally reconstruct and plot, but I really don't know how to do that.

Probably this is very easy, but I've been trying for a long time without success.
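
In fastai v1 the intended path is Learner.predict, which calls analyze_pred and reconstruct for you, provided the label class (SSDObjectCategoryList or similar) implements them. A hedged sketch, with the file name as a placeholder:

img = open_image('my_test_image.jpg')   # hypothetical file
pred = ssd.learn.predict(img)           # runs analyze_pred + reconstruct internally
img.show(y=pred[0])                     # pred[0] should be the reconstructed ImageBBox

If predict fails, that usually means the label class does not implement those hooks, and you would have to call the decoding steps manually as discussed earlier in the thread.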

I implemented the Dynamic SSD and it worked on my own dataset. But I got an error when I tried to export the model:
D:\Software\envs\data245\lib\site-packages\torch\serialization.py in _save(obj, f, pickle_module, pickle_protocol)
    295     pickler = pickle_module.Pickler(f, protocol=pickle_protocol)
    296     pickler.persistent_id = persistent_id
--> 297     pickler.dump(obj)
    298
    299     serialized_storage_keys = sorted(serialized_storages.keys())

TypeError: can't pickle weakref objects

Can anyone help me with this? Thank you!

Thanks. How do I create a pickle file after training? I got a "can't pickle weakref object" error.

@mojiamenke Kevin, were you able to resolve this? I am also getting this error.

Thanks, Mark
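
For anyone hitting the same weakref error: it usually means something non-picklable (a callback, hook, or similar object holding a weak reference) is attached to the learner at export time. Two hedged workarounds, assuming a standard fastai v1 Learner:

# option 1: save only the weights instead of exporting the whole learner
learn.save('ssd-stage1')        # writes models/ssd-stage1.pth

# option 2: strip references that cannot be pickled before exporting;
# which attribute holds the weakref depends on your setup, so inspect
# learn.callbacks and any hooks you registered
learn.callbacks = []
learn.export('ssd-export.pkl')

Option 1 means you must rebuild the learner (data plus model) before calling learn.load on the inference side.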