Is it possible to make an object detector from scratch?

The more I try, the more difficult I find it…
I’ve been trying to reproduce the typical SSD multi-object detector, but its results are very bad. At first I couldn’t even load a pretrained resnet50 and use transfer learning on my own dataset. But even when I copy the steps from the notebooks or the forums, I find the results quite bad.
After looking for information, I found that there are problems with the combining function for the output layers. Have those problems been fixed?

Can you share what information you have looked at (links?) and details of the problems (errors?) with the combining function for the output layers?

This post describes the flatten function I use on the conv outputs (flatten_conv).
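As a rough illustration of that flattening step, here is a dependency-free sketch. It assumes flatten_conv does the usual SSD-style reshape: a conv output of shape (batch, k*n, rows, cols) becomes (batch, rows*cols*k, n), so each anchor box gets its own row of n values. The name flatten_conv_lists and the nested-list input are hypothetical and only for demonstration. On real tensors the same effect is normally obtained by moving the channel axis last and reshaping.

```python
def flatten_conv_lists(x, k):
    """Reshape [batch][k*n][rows][cols] into [batch][rows*cols*k][n].

    Hypothetical pure-Python stand-in for a tensor reshape; k is the
    number of anchor boxes per grid cell.
    """
    out = []
    for sample in x:
        n_channels = len(sample)
        n = n_channels // k  # values per anchor box
        rows, cols = len(sample[0]), len(sample[0][0])
        boxes = []
        for r in range(rows):
            for c in range(cols):
                # Gather every channel at this grid cell (channels-last order).
                cell = [sample[ch][r][c] for ch in range(n_channels)]
                # Split the cell's channels into k consecutive chunks, one per anchor.
                for a in range(k):
                    boxes.append(cell[a * n:(a + 1) * n])
        out.append(boxes)
    return out


# Tiny example: batch of 1, 4 channels (k=2 anchors x n=2 values), 2x2 grid.
# Each value encodes its origin as channel*100 + row*10 + col.
x = [[[[ch * 100 + r * 10 + c for c in range(2)] for r in range(2)]
      for ch in range(4)]]
flat = flatten_conv_lists(x, k=2)
print(len(flat), len(flat[0]), len(flat[0][0]))  # 1 8 2
```

If the predictions and the anchor boxes are not enumerated in the same order (row, then column, then anchor in this sketch), the loss pairs each prediction with the wrong box, so this ordering is worth checking when results look bad.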
The problem is that I don’t see (and I can’t find anyone else who reports) results as good as the ones I could get using TensorFlow pretrained models.
The last thing is that I’m not sure whether I’m using pretrained models; following the pascal-multi notebook doesn’t seem to do it:

  • f_model = resnet34
  • head_reg4 = SSD_MultiHead(k, -4.)
  • models = ConvnetBuilder(f_model, 0, 0, 0, custom_head=head_reg4)
  • learn = ConvLearner(md, models)

I tried with resnet34(pretrained=True), but then I get different errors when I try to train.

So I assume the answer is no: current results on object detectors are quite bad, and it’s still behind other frameworks in transfer learning?