Understanding SuperResolution Loss Functions & Other Nuances

I did some preliminary testing with the FastAI code for SuperResolution using a U-Net with a ResNet backbone. Our use case is detecting image manipulation, and I figured the SuperResolution approach could be useful for that: the model would learn which pixels were manipulated and how to correct them.

However, it isn’t working. It seems to train “well” in that the validation loss is low (although that may not mean much, according to the training videos), but the predicted image is much closer to the input than to the target. Basically, an untrained model and a trained model give me the same image (at least to my eyes).

In order to move forward, I need to understand what is going on. I’m starting with the loss functions: pixel loss, gram loss, and feature loss, where in some cases each loss has multiple entries. It looks like the notebook uses a weighted sum of these losses. I watched the video for this notebook, but a lot was glossed over, and Jeremy said he would cover the details in Part 2.
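For context, here is my rough understanding of how that kind of weighted loss is put together, sketched in plain PyTorch (pixel L1 + VGG feature losses + Gram-matrix losses). The layer indices and weights below are placeholders I picked for illustration, not necessarily the notebook’s exact values:

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

def gram_matrix(x):
    # x: (batch, channels, h, w) -> (batch, channels, channels) correlations between feature maps
    b, c, h, w = x.shape
    feats = x.view(b, c, h * w)
    return feats @ feats.transpose(1, 2) / (c * h * w)

class PerceptualLoss(torch.nn.Module):
    """Weighted sum of pixel loss, VGG feature (content) losses, and Gram (style) losses."""
    def __init__(self, layer_ids=(12, 22, 32), feat_wgts=(20, 70, 10), gram_scale=5e3):
        super().__init__()
        vgg = models.vgg16_bn(weights="DEFAULT").features.eval()
        for p in vgg.parameters():
            p.requires_grad_(False)          # VGG is a fixed feature extractor, never trained
        self.vgg, self.layer_ids = vgg, layer_ids
        self.feat_wgts, self.gram_scale = feat_wgts, gram_scale

    def _features(self, x):
        feats, out = [], x
        for i, layer in enumerate(self.vgg):
            out = layer(out)
            if i in self.layer_ids:          # keep activations from the chosen intermediate layers
                feats.append(out)
        return feats

    def forward(self, pred, target):
        loss = F.l1_loss(pred, target)       # pixel loss
        f_pred, f_targ = self._features(pred), self._features(target)
        for fp, ft, w in zip(f_pred, f_targ, self.feat_wgts):
            loss += w * F.l1_loss(fp, ft)    # feature (content) loss per layer
            loss += self.gram_scale * w * F.l1_loss(gram_matrix(fp), gram_matrix(ft))  # gram (style) loss per layer
        return loss
```

Each term shows up as a separate entry in the training metrics, which would explain the multiple entries per loss.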

Are there any resources that explain the theory behind what the notebook is doing with these loss functions, and/or has Part 2 been released yet? I was under the impression it would be out by June 2019.

Also, does anyone have a good understanding of the various nuances of super-resolution training, for example:

  1. Why does pct_start get reduced from 0.9 to 0.3 when upsampling the image? (See the schedule sketch just after this list.)
  2. Why is VGG used as a feature extractor?
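On question 1, here is a self-contained sketch of what pct_start actually controls. fastai’s fit_one_cycle and PyTorch’s OneCycleLR implement the same 1cycle schedule, and pct_start means the same thing in both; the model, step count, and learning rate below are placeholders:

```python
import torch
from torch.optim.lr_scheduler import OneCycleLR

# Stand-in model and optimizer; only the schedule matters here.
model = torch.nn.Linear(10, 10)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# pct_start = fraction of total steps spent ramping the LR up to max_lr;
# the remaining steps anneal it back down.
#   pct_start=0.9 -> 90% of steps warm up, only the last 10% anneal
#   pct_start=0.3 -> max_lr is reached after 30% of steps, then 70% of steps anneal
sched = OneCycleLR(opt, max_lr=1e-3, total_steps=1000, pct_start=0.3)

lrs = []
for _ in range(1000):
    opt.step()
    sched.step()
    lrs.append(sched.get_last_lr()[0])   # inspect the resulting LR curve
```

So with 0.9 most of training is spent ramping the LR up, while with 0.3 most of it is spent annealing back down; what I’d like to understand is why the latter is preferred once the images are upsampled.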

Thanks.

Hi, were you able to extract features from ResNet?

@hamelsmu any updates on the questions in this thread?

Can you please remind me why you tagged me? I don’t know anything about this.

I just happened to see that you were one of the admins on the fastai Discord server and had answered some threads regarding Part 2. Sorry if I tagged the wrong person.
I was asking about any notes or insights on the SuperResolution Deep Dive.