I am trying to build an object detection model (based on one of the YOLO or SSD architectures). My task is to localize, i.e., draw bounding boxes around, specific fields of interest on documents. For example, I want to train a model that can draw bounding boxes around the first name, last name, and DOB on an identification document.
The following are my questions:
- What kind of annotated dataset would I need to train this model? Specifically, do I need to annotate background bounding boxes too in my data?
- What is a good dataset size? 5,000 annotated documents? 10,000?
- Can I use a pre-trained object detection model that has been trained on a large open dataset and fine-tune it on my data, similar to what we typically do with transfer learning for image classification?
- Does fastai support this kind of transfer learning for object detection?
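
To make the annotation question concrete, here is a minimal sketch of the kind of labels I have in mind. It assumes only the fields of interest are annotated (no background box, which is exactly what I am unsure about) and converts pixel-coordinate boxes into the normalized `class x_center y_center width height` text format commonly used for YOLO training. The class names, the `Box` type, the `to_yolo_line` helper, and the image size are all hypothetical, chosen just for illustration.

```python
from dataclasses import dataclass

# Hypothetical field classes for an ID document (illustrative only).
CLASSES = ["first_name", "last_name", "dob"]


@dataclass
class Box:
    """One annotated field, in pixel coordinates of the document image."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def to_yolo_line(box: Box, img_w: int, img_h: int) -> str:
    """Convert a pixel-space box to a normalized YOLO-style label line."""
    class_id = CLASSES.index(box.label)
    x_center = (box.x_min + box.x_max) / 2 / img_w
    y_center = (box.y_min + box.y_max) / 2 / img_h
    width = (box.x_max - box.x_min) / img_w
    height = (box.y_max - box.y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Assumed example: a 1000x600 px scan with two labeled fields.
boxes = [
    Box("first_name", 100, 50, 300, 90),
    Box("dob", 100, 200, 260, 240),
]
lines = [to_yolo_line(b, img_w=1000, img_h=600) for b in boxes]
print("\n".join(lines))
```

Each document image would then get one such label file listing only the fields present on it.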
Would love some thoughts from folks who have experience in this area or any pointers to existing work that does this.