I had a couple of questions about evaluation metrics for object detection tasks (e.g., detecting a person for self-driving applications):
I’ve noticed that pre-trained models available online report a certain mAP along with the dataset (COCO, PASCAL VOC, etc.) it was achieved on. Do you take that mAP as a baseline when fine-tuning the same pre-trained model on your own dataset? Or is the objective to match the SoTA result and improve on it? How do you judge whether a given mAP is good or bad?
The mAP reported for pre-trained models is usually measured on a held-out validation/test set and averaged over a large number of object classes. How is it affected if we train the same model on a dataset with just one class (person)?
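For context on the single-class question: mAP is simply the mean of the per-class average precisions, so with one class, mAP reduces to that class's AP. Here is a minimal sketch using made-up precision-recall numbers and the VOC-style 11-point interpolated AP (the exact AP definition varies: COCO averages over IoU thresholds, newer VOC uses all-point interpolation):

```python
def average_precision(recalls, precisions):
    """VOC-style 11-point interpolated AP from a precision-recall curve."""
    ap = 0.0
    for t in [i / 10 for i in range(11)]:  # recall thresholds 0.0, 0.1, ..., 1.0
        # highest precision at any recall >= t (0 if the curve never reaches t)
        p = max((p for r, p in zip(recalls, precisions) if r >= t), default=0.0)
        ap += p / 11
    return ap

# Toy precision-recall points for two hypothetical classes (made-up numbers)
pr_curves = {
    "person": ([0.2, 0.5, 0.8], [0.9, 0.7, 0.5]),
    "car":    ([0.3, 0.6, 0.9], [0.8, 0.6, 0.4]),
}

per_class_ap = {c: average_precision(r, p) for c, (r, p) in pr_curves.items()}
map_all_classes = sum(per_class_ap.values()) / len(per_class_ap)

# With only the "person" class in the dataset, mAP collapses to a single AP:
map_person_only = per_class_ap["person"]
```

So a single-class "mAP" isn't directly comparable to a multi-class COCO or VOC number: it reflects only how hard your one class is, not an average over easy and hard categories.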