To me they are almost the same. Are they interchangeable?
Discussed in Lesson 14 video at 1:59:20
My paraphrasing… “this is deep learning, where people often make minor changes to an existing concept or apply it in a new area and call it something different, inadequately citing where it came from. Basically FPNs are a different implementation of the concept behind U-Nets”
They are indeed very similar concepts. One of the main differences I see is that U-Nets output an image with the same dimensions as the input image (for example a mask of the original image), whereas FPNs use the intermediate feature maps.
The Feature Pyramid Network (FPN) looks a lot like the [U-net](https://vitalab.github.io/deep-learning/2017/02/27/unet.html). The main difference is that there are multiple prediction layers: one for each upsampling layer. Like the U-Net, the FPN has lateral connections between the bottom-up pyramid (left) and the top-down pyramid (right). But where U-net only copies the features and appends them, FPN applies a 1x1 convolution layer before adding them. This allows the bottom-up pyramid, called the “backbone”, to be pretty much whatever you want.
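To make the difference in the merge step concrete, here is a rough framework-free sketch in NumPy. It is only an illustration under assumptions, not code from either paper: the feature-map shapes, the names `c3`/`c4`/`c5`, the width `d = 32`, and the random 1x1 weights are all made up, and the extra convolutions that real U-Nets and FPNs apply after merging (plus the prediction heads) are left out.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1x1(x, out_channels):
    """1x1 convolution: a per-pixel linear map over channels.
    Weights are random here, for illustration only."""
    w = rng.standard_normal((out_channels, x.shape[0]))
    return np.einsum("oc,chw->ohw", w, x)

def upsample2x(x):
    """Nearest-neighbour upsampling of a (C, H, W) map by a factor of 2."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def unet_merge(skip, top_down):
    """U-Net style: copy the encoder features and concatenate on channels."""
    return np.concatenate([skip, upsample2x(top_down)], axis=0)

def fpn_merge(skip, top_down):
    """FPN style: 1x1 conv on the lateral features, then element-wise add."""
    lateral = conv1x1(skip, top_down.shape[0])
    return lateral + upsample2x(top_down)

# Hypothetical backbone outputs as (channels, height, width), fine to coarse.
c3 = rng.standard_normal((64, 32, 32))
c4 = rng.standard_normal((128, 16, 16))
c5 = rng.standard_normal((256, 8, 8))

# U-Net: channel counts stack up, so the decoder must know the encoder widths.
u4 = unet_merge(c4, c5)

# FPN: every level is projected to the same width d and kept as an output.
d = 32
p5 = conv1x1(c5, d)
p4 = fpn_merge(c4, p5)
p3 = fpn_merge(c3, p4)
fpn_outputs = [p3, p4, p5]

print("U-Net merge:", u4.shape)
print("FPN levels: ", [p.shape for p in fpn_outputs])
```

Because the 1x1 convolution maps whatever channel count the backbone produces to a fixed width, the top-down path does not care what the backbone looks like, which is the "pretty much whatever you want" point in the quote. The list of per-level maps is also what the earlier reply means by FPNs using the intermediate feature maps rather than a single full-resolution output.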