YOLO Architecture

user123random · June 16, 2019, 3:04am

Is it possible to combine Fast R-CNN (2-stage) and YOLO (1-stage), and if so, how?
Why, with the addition of anchor boxes, was the input resolution changed to 416 x 416? Why does using anchor boxes give a small decrease in accuracy? How does using anchor boxes decouple the class prediction mechanism from the spatial location?
Why, if we use standard k-means with Euclidean distance, do larger boxes generate more error than smaller boxes? How is d(box, centroid) = 1 - IOU(box, centroid) derived?
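To make the k-means question concrete, here is a minimal Python sketch of the two distances as I understand them. It assumes boxes are compared by width and height only, as if they shared a corner; the function names and the sample boxes are my own, not from the paper.

```python
import math

def iou_wh(box, centroid):
    """IOU of two (width, height) boxes aligned at the same corner (assumption)."""
    w1, h1 = box
    w2, h2 = centroid
    inter = min(w1, w2) * min(h1, h2)
    union = w1 * h1 + w2 * h2 - inter
    return inter / union

def iou_distance(box, centroid):
    """d(box, centroid) = 1 - IOU(box, centroid)."""
    return 1.0 - iou_wh(box, centroid)

def euclidean_distance(box, centroid):
    """Plain Euclidean distance in (width, height) space."""
    return math.hypot(box[0] - centroid[0], box[1] - centroid[1])

# Hypothetical pairs with the same 10% relative mismatch at two sizes.
small = ((10.0, 10.0), (11.0, 11.0))
large = ((100.0, 100.0), (110.0, 110.0))

for name, (box, centroid) in (("small", small), ("large", large)):
    print(name, euclidean_distance(box, centroid), iou_distance(box, centroid))
```

In this sketch the Euclidean distance grows with box size even though the relative mismatch is the same, while 1 - IOU stays the same for both pairs.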
How is Pr(object) * IOU(b, object) = σ(to) derived in YOLO v2? Why is this expression not used in YOLO v3?
Why apply the sigmoid function to tx and ty? Why apply the exponential function to tw and th (which then scale pw and ph)?
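For reference, this is the box decoding from the YOLO v2 paper as I read it: bx = σ(tx) + cx, by = σ(ty) + cy, bw = pw * e^tw, bh = ph * e^th. A minimal Python sketch follows; the function name `decode_box` and the sample values are my own assumptions for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    """Turn raw predictions into a box, in grid-cell units.

    (cx, cy) is the top-left corner of the grid cell and
    (pw, ph) is the anchor (prior) width and height.
    """
    bx = sigmoid(tx) + cx    # sigmoid keeps the center offset in [0, 1]
    by = sigmoid(ty) + cy
    bw = pw * math.exp(tw)   # exp keeps the scale factor positive
    bh = ph * math.exp(th)
    return bx, by, bw, bh

# Hypothetical values: all-zero predictions in cell (3, 4) with a 2 x 5 anchor.
print(decode_box(0.0, 0.0, 0.0, 0.0, 3, 4, 2.0, 5.0))
```

With all-zero predictions, the center lands in the middle of the cell and the box equals the anchor.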
In yolov3-spp.cfg, I do not see anything about the 3 different scales. Could anyone advise?
Why does the YOLO v3 output tensor size need to be multiplied by N*N? What does N represent?
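The tensor I am asking about is the N x N x [3 * (4 + 1 + 80)] one from the YOLO v3 paper. My current reading, which may be wrong, is that N is the number of grid cells per side at a given scale. A small Python sketch under that assumption, with a 416 x 416 input and strides 32, 16 and 8 (the function name `output_shape` is mine):

```python
def output_shape(n, boxes_per_cell=3, num_classes=80):
    """Shape of one output tensor: N x N x [boxes * (4 + 1 + classes)].

    4 = box offsets, 1 = objectness score, num_classes = 80 for COCO.
    """
    depth = boxes_per_cell * (4 + 1 + num_classes)
    return (n, n, depth)

input_size = 416
for stride in (32, 16, 8):
    n = input_size // stride  # assumed: grid cells per side at this scale
    print(stride, output_shape(n))
```

If that reading is right, the N*N factor just counts the grid cells, each of which carries its own 3 * (4 + 1 + 80) predictions.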