Object Detection and Identification Basic Question

Hello,
I am having difficulty understanding the order of operations in object detection and identification: does the image get classified first and then localized with a bounding box, or is it the other way round?

Thank You,
Kal

It depends somewhat on the architecture type (one-stage vs. two-stage detector). Usually, though, you have two parallel output heads trained with a combined loss: one head for classification and one head for bounding box regression. Both use features from a base CNN, optionally combined with an FPN (feature pyramid network). Two-stage detectors also use an RPN (region proposal network), which first proposes candidate regions; the two heads then classify and refine each proposal.

Consequently, a single forward pass through the network usually outputs the bounding box coordinates and the classification score for each box at the same time.
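
To make the "two parallel heads" idea concrete, here is a rough pure-Python sketch. It is a toy under stated assumptions, not how any particular detector is implemented: the sizes (NUM_CLASSES, FEATURE_DIM, NUM_LOCATIONS), the random weights, the random "backbone features", and the helper names (linear, detect, combined_loss) are all made up for illustration, and the loss is a simple cross-entropy plus L1 sum.

```python
import math
import random

random.seed(0)

# Hypothetical sizes, chosen only for this illustration.
NUM_CLASSES = 3     # e.g. background, cat, dog
FEATURE_DIM = 8     # length of the feature vector at each location
NUM_LOCATIONS = 4   # number of candidate locations / anchors


def random_layer(out_dim, in_dim):
    """Create a randomly initialised linear layer as (weights, bias)."""
    weights = [[random.uniform(-0.5, 0.5) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def linear(weights, bias, x):
    """Compute weights @ x + bias using plain Python lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Turn raw class logits into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Two separate heads that read the SAME shared features.
cls_head = random_layer(NUM_CLASSES, FEATURE_DIM)  # classification head
box_head = random_layer(4, FEATURE_DIM)            # box regression head


def detect(features_per_location):
    """One forward pass: each location gets class scores and a box together."""
    outputs = []
    for feats in features_per_location:
        scores = softmax(linear(*cls_head, feats))
        box = linear(*box_head, feats)  # 4 box coordinates
        outputs.append((scores, box))
    return outputs


def combined_loss(scores, box, target_class, target_box):
    """Classification loss (cross-entropy) plus box loss (L1), summed."""
    cls_loss = -math.log(scores[target_class])
    box_loss = sum(abs(p - t) for p, t in zip(box, target_box))
    return cls_loss + box_loss


# Stand-in for the features a base CNN (with or without an FPN) would produce.
features = [[random.uniform(-1.0, 1.0) for _ in range(FEATURE_DIM)]
            for _ in range(NUM_LOCATIONS)]

detections = detect(features)
scores, box = detections[0]
loss = combined_loss(scores, box, target_class=1,
                     target_box=[0.1, 0.2, 0.5, 0.6])
```

Neither head waits for the other here: both read the same features, and training would minimise their summed loss. In a two-stage detector the features fed to the heads would come from the regions proposed by the RPN rather than from every location.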