Had to educate myself a little, so I took a glance at the document you referred to.
Found some points that support what I was saying, if I understand correctly.
…
3.3.1 Perceptual Loss
To ensure that the generated images resemble the input image, we require the generator not only
minimize the adversarial loss (JG in Section 3.2) but also the L1 difference between the input image
and the downsized generated image. Hence we define the new generator loss function to be a weighted sum of the above two terms…
…
5.3 Mode collapse
Figure 3(e) and (k) show strong mode collapse with several images being clearly identical, consistent
with existing literature (Goodfellow, 2016 and reference therein). Normally we would not expect to
observe mode collapse in super-resolution images, since they are enforced to look similar to input
files by the L1 loss term in the generator loss (see Section 3.3.1). Their appearance suggests that in
these two examples, the L1 loss term should have a larger weight.
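
To make sure I'm reading 3.3.1 and 5.3 the same way you are, here is a rough sketch of how I picture that generator loss. This is my own toy illustration, not the paper's code: the function names, the average-pooling downsize, the single `l1_weight` on the L1 term, and all the numbers are assumptions on my part (the excerpt only says "a weighted sum of the above two terms").

```python
def downsize(img, factor):
    """Average-pool a 2D image (list of rows) by an integer factor.
    Assumption: the paper's actual downsizing method may differ."""
    h, w = len(img), len(img[0])
    return [
        [
            sum(img[y * factor + dy][x * factor + dx]
                for dy in range(factor) for dx in range(factor)) / factor ** 2
            for x in range(w // factor)
        ]
        for y in range(h // factor)
    ]


def l1_difference(a, b):
    """Mean absolute difference between two same-sized 2D images."""
    n = len(a) * len(a[0])
    return sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)) / n


def generator_loss(adversarial_loss, low_res_input, generated, factor, l1_weight):
    """Weighted sum: adversarial term + l1_weight * L1(input, downsized output).
    Assumption: one scalar weight applied to the L1 term only."""
    l1 = l1_difference(low_res_input, downsize(generated, factor))
    return adversarial_loss + l1_weight * l1


# Toy 2x2 input and two hypothetical 4x4 generator outputs.
low_res = [[0.0, 1.0],
           [1.0, 0.0]]
faithful = [[0.0, 0.0, 1.0, 1.0],
            [0.0, 0.0, 1.0, 1.0],
            [1.0, 1.0, 0.0, 0.0],
            [1.0, 1.0, 0.0, 0.0]]          # downsizes back to low_res exactly
collapsed = [[0.5] * 4 for _ in range(4)]  # same output regardless of input

# Pretend the discriminator is fooled equally by both (made-up value 0.25).
for weight in (1, 10):
    loss_faithful = generator_loss(0.25, low_res, faithful, 2, weight)
    loss_collapsed = generator_loss(0.25, low_res, collapsed, 2, weight)
    print(weight, loss_faithful, loss_collapsed)
```

With these made-up numbers, the output that ignores the input costs 0.75 at weight 1 and 5.25 at weight 10, while the faithful one stays at 0.25. If I understand 5.3 correctly, that is the point of giving the L1 term a larger weight: an output that doesn't match its input gets more expensive relative to the adversarial term.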