Understanding translation invariance in SSD object detection

Does SSD achieve translation invariance? If so, how does it achieve it?

Yes, it does. From the paper:

SSD is the first work to combine the idea of convolutional bounding box priors with the ability to predict multiple object categories. In our method, the offset adjustment and confidences for multiple categories of each prior are predicted from the underlying 1×1 feature at each location on a feature map, as opposed to the whole feature map as done in MultiBox [2] and YOLO [15]. This results in a more compact network, which is crucial for efficient detection of a large number of object categories. Additionally, this adds translational invariance in the model output layers, and reduces overfitting and improves detection performance.
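To make the quoted idea concrete, here is a minimal sketch (assuming PyTorch; the channel count, number of priors, and the 38×38 map are illustrative values borrowed from SSD300's conv4_3 layer, not a full implementation) of an SSD-style prediction head: a small convolution slides over the feature map and, at each location, predicts class confidences and box offsets for each prior. Because the same weights are applied at every location, the output layers are translation invariant by construction.

```python
import torch
import torch.nn as nn

num_classes = 21   # e.g. PASCAL VOC: 20 classes + background (assumption)
num_priors = 4     # default boxes per feature-map location (assumption)

# SSD-style prediction heads: a 3x3 conv applied at every location.
# Each location emits, per prior, num_classes confidences and 4 offsets.
conf_head = nn.Conv2d(512, num_priors * num_classes, kernel_size=3, padding=1)
loc_head = nn.Conv2d(512, num_priors * 4, kernel_size=3, padding=1)

feature_map = torch.randn(1, 512, 38, 38)  # e.g. conv4_3 output in SSD300
conf = conf_head(feature_map)              # shape (1, 4*21, 38, 38)
loc = loc_head(feature_map)                # shape (1, 4*4,  38, 38)

# The same conv weights run at every (x, y): shifting an object in the
# input shifts WHICH location responds, but not HOW it responds.
print(conf.shape, loc.shape)
```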

Basically, we can understand that as "we put a lot of boxes everywhere on the input and on feature maps at different depths, each of them predicting all categories".

If an object's center is located at (x1, y1) on your input and you translate the object to (x2, y2), then there will be another box/prior at that new location that can detect the object.
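A toy sketch of why such a prior always exists (the helper functions here are hypothetical, and the 38×38 grid over a 300-pixel input is borrowed from SSD300): prior centers tile the whole image, so wherever the object center lands, some prior sits within about half a grid step of it.

```python
def prior_centers(fmap_size, image_size):
    """Centers of default boxes for one feature map, in image pixels.
    Uses the common SSD convention: one center per cell, offset by 0.5."""
    step = image_size / fmap_size
    return [((i + 0.5) * step, (j + 0.5) * step)
            for j in range(fmap_size) for i in range(fmap_size)]

def nearest_prior(point, centers):
    """Prior center closest to a given object center."""
    return min(centers, key=lambda c: (c[0] - point[0]) ** 2
                                      + (c[1] - point[1]) ** 2)

centers = prior_centers(fmap_size=38, image_size=300)  # SSD300 conv4_3 grid
print(nearest_prior((50, 60), centers))    # a prior near the original object
print(nearest_prior((150, 200), centers))  # after translating: another prior
```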

Can you elaborate on that?
From my understanding of SSD, taking feature maps at different depths helps you achieve the scale-invariance property. As far as I know, the Faster R-CNN paper achieved translation invariance by using Region of Interest pooling over the proposed regions (anchors); see the question Translation in-variance for my understanding of translation invariance. But in SSD, how do the anchor boxes over different regions for the same category achieve translation invariance?

The output of SSD is a grid (actually multiple grids of different sizes). For each grid cell it makes A predictions, where A is the number of anchor boxes. So if the grid is 13×13, then SSD outputs 13×13×A predictions, each of which is roughly centered on a grid cell.
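To make the counting concrete, here is a small sketch using the actual SSD300 grid sizes from the paper (the 13×13 grid above is just an illustrative size): summing size × size × A over all feature maps gives SSD300's well-known total of 8732 default boxes.

```python
# SSD300 feature-map sizes and priors per location (A), per the paper.
grids = [(38, 4), (19, 6), (10, 6), (5, 6), (3, 4), (1, 4)]

total = sum(size * size * num_priors for size, num_priors in grids)
for size, num_priors in grids:
    print(f"{size}x{size} grid -> {size * size * num_priors} predictions")
print("total default boxes:", total)  # 8732 for SSD300
```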

@Learntocode: That is correct for scale invariance, and @machinethink explained the grid generation of priors pretty well. One last comment I can add: to handle the many overlapping boxes for the same category, there is a final step of non-maximum suppression (NMS). So if you have one box with 0.9 confidence and another box overlapping it with 0.5 confidence, basically only the 0.9-confidence prior will remain.
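A minimal greedy NMS sketch that reproduces this example (the 0.5 IoU threshold is an assumed typical value; real implementations also run this per class and pre-filter low-score boxes):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) < iou_thresh for k in keep):
            keep.append(i)
    return keep

boxes = [(10, 10, 50, 50), (12, 12, 52, 52)]  # heavily overlapping pair
scores = [0.9, 0.5]
print(nms(boxes, scores))  # [0] -- only the 0.9-confidence box survives
```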