Hi all, I'm trying to understand the perceptual losses paper and am having a hard time with this graph.
I have a few questions:
1) Why does the loss remain constant for the perceptual losses method (green line)?
2) Why does the x-axis say "L-BFGS iteration" when they are using Adam?
3) In the supplementary material, in the case of super-resolution, the convolutional layers in the res_blocks don't use any padding because it causes artifacts. Because of that, the output after the 2 conv layers has a different shape than the input to the res_block. To handle this, they center-crop the input to match the size of the output of the 2 conv layers. I can understand cropping raw images, but at the res block stage they are features (whatever that means), right? What is the intuition behind cropping features?
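To make question 3 concrete, here is a minimal sketch of how I understand the cropping. It is not the paper's code: I'm assuming 3x3 kernels with stride 1, using a single-channel feature map stored as a list of lists, and standing in a simple box filter for the learned convolutions. All function names (conv_out_size, center_crop, box_conv3x3, residual_block) are made up for illustration.

```python
def conv_out_size(size, kernel=3, stride=1, padding=0):
    # Standard output-size formula for a convolution along one dimension.
    return (size + 2 * padding - kernel) // stride + 1


def center_crop(feature_map, out_h, out_w):
    # Keep the central out_h x out_w window of a 2D feature map.
    in_h, in_w = len(feature_map), len(feature_map[0])
    top = (in_h - out_h) // 2
    left = (in_w - out_w) // 2
    return [row[left:left + out_w] for row in feature_map[top:top + out_h]]


def box_conv3x3(x):
    # Unpadded 3x3 convolution with a fixed averaging kernel
    # (a stand-in for a learned conv layer; assumption for illustration).
    h, w = len(x), len(x[0])
    return [[sum(x[i + di][j + dj] for di in range(3) for dj in range(3)) / 9.0
             for j in range(w - 2)]
            for i in range(h - 2)]


def residual_block(x):
    # Two unpadded convs shrink each spatial dimension by 4 in total.
    y = box_conv3x3(box_conv3x3(x))
    # Crop the skip connection so it lines up spatially with the conv output.
    skip = center_crop(x, len(y), len(y[0]))
    return [[a + b for a, b in zip(row_y, row_s)] for row_y, row_s in zip(y, skip)]


x = [[float(i * 8 + j) for j in range(8)] for i in range(8)]
out = residual_block(x)
print(len(out), len(out[0]))  # 4 4
```

If I read it right, the feature map still has a spatial layout, so output position (i, j) is centered on input position (i + 2, j + 2), and the center crop just keeps the skip connection aligned with the conv output. Is that the right way to think about it?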