
Is it possible to combine Fast R-CNN (2-stage) and YOLO (1-stage), and if so, how?

Why was the input resolution changed to 416 x 416 when anchor boxes were added? Why does using anchor boxes cause a small decrease in accuracy? How does using anchor boxes decouple the class prediction mechanism from the spatial location?

Why, if we use standard k-means with Euclidean distance, do larger boxes generate more error than smaller boxes? How is d(box, centroid) = 1 - IOU(box, centroid) derived?
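For reference, a minimal sketch of the two distance measures this question compares. It assumes boxes are described only by (width, height) and are aligned at a common corner, which is how anchor clustering is usually set up; the function names and sample boxes are illustrative, not taken from any YOLO codebase.

```python
def iou_wh(box, centroid):
    """IoU of two boxes given as (width, height), aligned at a common corner."""
    w1, h1 = box
    w2, h2 = centroid
    inter = min(w1, w2) * min(h1, h2)
    union = w1 * h1 + w2 * h2 - inter
    return inter / union


def iou_distance(box, centroid):
    """Distance used for anchor clustering: 1 - IoU (0 means a perfect match)."""
    return 1.0 - iou_wh(box, centroid)


def euclidean_distance(box, centroid):
    """Plain Euclidean distance between the (width, height) pairs."""
    return ((box[0] - centroid[0]) ** 2 + (box[1] - centroid[1]) ** 2) ** 0.5


# Two box/centroid pairs with the same relative mismatch (20% in each dimension).
small_box, small_centroid = (10.0, 10.0), (12.0, 12.0)
large_box, large_centroid = (100.0, 100.0), (120.0, 120.0)

print(euclidean_distance(small_box, small_centroid))  # grows with box size
print(euclidean_distance(large_box, large_centroid))
print(iou_distance(small_box, small_centroid))        # same for both pairs
print(iou_distance(large_box, large_centroid))
```

With Euclidean distance the large pair is ten times farther apart than the small pair even though the relative mismatch is identical, whereas 1 - IOU is independent of scale.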

How is Pr(object) * IOU(b, object) = σ(t_o) derived in YOLO v2? Why is this expression not used in YOLO v3?

Why is the sigmoid function applied to tx and ty? Why is the exponential function applied to tw and th (which then scale the priors pw and ph)?
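For reference, a minimal sketch of the box decoding this question refers to, assuming the formulation in the YOLO v2 paper: bx = σ(tx) + cx, by = σ(ty) + cy, bw = pw * e^tw, bh = ph * e^th. Here (cx, cy) is the top-left corner of the grid cell and (pw, ph) is the anchor (prior) size; the function name `decode_box` is illustrative.

```python
import math


def sigmoid(x):
    """Logistic function, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    """Turn raw network outputs into a box in grid-cell units.

    The sigmoid keeps the predicted center inside the cell at (cx, cy);
    the exponential keeps width and height positive and scales the prior.
    """
    bx = cx + sigmoid(tx)
    by = cy + sigmoid(ty)
    bw = pw * math.exp(tw)
    bh = ph * math.exp(th)
    return bx, by, bw, bh


# All-zero outputs give the cell center and exactly the prior size.
print(decode_box(0.0, 0.0, 0.0, 0.0, cx=3, cy=4, pw=2.0, ph=5.0))
```

However large tx and ty become, the decoded center cannot leave its grid cell, and tw, th of any sign still give a positive width and height.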

In yolov3-spp.cfg, I did not see anything about the 3 different scales. Could anyone advise?

Why does the YOLO v3 output tensor size need to be multiplied by N*N? What does N represent?
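For reference, a small sketch of the output-size arithmetic, assuming the YOLO v3 paper's setup: each scale outputs an N x N x [3 * (4 + 1 + 80)] tensor, where N is the grid size at that scale (input size divided by the stride), 3 is the number of boxes per cell, 4 is the number of box offsets, 1 is the objectness score, and 80 is the number of COCO classes. The function name and default values are illustrative.

```python
def yolo_v3_output_shapes(input_size=416, num_classes=80,
                          boxes_per_cell=3, strides=(32, 16, 8)):
    """Return the (N, N, channels) output shape for each detection scale."""
    shapes = []
    for stride in strides:
        n = input_size // stride  # grid cells along each side at this scale
        channels = boxes_per_cell * (4 + 1 + num_classes)
        shapes.append((n, n, channels))
    return shapes


print(yolo_v3_output_shapes())
# [(13, 13, 255), (26, 26, 255), (52, 52, 255)]
```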