Why use Adaptive Loss for GAN critic?

When you train a critic, it uses an AdaptiveLoss wrapped around BCEWithLogitsLoss.

This is because the critic model's output size depends on the input size.
For example, an input tensor of shape [4, 3, 128, 128] produces an output of shape [4, 25].

Here is the code for the model:

import torch.nn as nn
# `_conv`, `_conv_args`, `res_block` and `Flatten` are fastai helpers (fastai.vision.gan / fastai.layers)

def gan_critic(n_channels=3, nf=128, n_blocks=3, p=0.15):
    "Critic to train a `GAN`."
    layers = [_conv(n_channels, nf, ks=4, stride=2), nn.Dropout2d(p/2),
              res_block(nf, dense=True, **_conv_args)]
    nf *= 2  # the dense res block doubles the channel count
    for i in range(n_blocks):
        # each block halves the spatial size; self-attention on the first
        layers += [nn.Dropout2d(p), _conv(nf, nf*2, ks=4, stride=2, self_attention=(i==0))]
        nf *= 2
    # final unpadded conv emits one logit per spatial position; Flatten gives
    # [bs, h*w], so the output width depends on the input resolution
    layers += [_conv(nf, 1, ks=4, bias=False, padding=0, use_activ=False), Flatten()]
    return nn.Sequential(*layers)
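
For reference, here is how the output width tracks the input size (a minimal check, assuming gan_critic is imported from fastai.vision.gan):

import torch
from fastai.vision.gan import gan_critic

critic = gan_critic()
for size in (64, 128, 256):
    x = torch.randn(4, 3, size, size)
    print(size, critic(x).shape)
# expected with fastai's default padding: 64 -> [4, 1], 128 -> [4, 25], 256 -> [4, 169]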

Then later, the AdaptiveLoss expands the target to match the output size: a target of shape [4] such as [1, 0, 1, 0] (corresponding to [real_img, gen_img, real_img, gen_img]) gets expanded to [4, 25].

This looks like [[1]*25, [0]*25, [1]*25, [0]*25], i.e. each label repeated across all 25 positions.
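
Concretely, the expansion amounts to this (a quick tensor sketch of what AdaptiveLoss does internally):

import torch

output = torch.randn(4, 25)                    # critic logits for 128x128 inputs
target = torch.tensor([1, 0, 1, 0])            # [real, fake, real, fake], shape [4]
expanded = target[:, None].expand_as(output)   # [4, 1] broadcast to [4, 25]
print(expanded[0])                             # a row of 25 ones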

We then apply BCEWithLogitsLoss to this expanded target. Why not just have gan_critic output something of shape [4, 1] and use a plain, non-adaptive loss? What benefit does the current approach have over that one?
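
To spell out the alternative I have in mind, here is a hypothetical variant (not fastai code, just a sketch reusing the same helpers) that pools down to one logit per image, so a plain BCEWithLogitsLoss would apply directly:

import torch.nn as nn

# hypothetical: average-pool the spatial grid so the critic always emits [bs, 1]
def pooled_gan_critic(n_channels=3, nf=128, n_blocks=3, p=0.15):
    layers = [_conv(n_channels, nf, ks=4, stride=2), nn.Dropout2d(p/2),
              res_block(nf, dense=True, **_conv_args)]
    nf *= 2
    for i in range(n_blocks):
        layers += [nn.Dropout2d(p), _conv(nf, nf*2, ks=4, stride=2, self_attention=(i==0))]
        nf *= 2
    layers += [nn.AdaptiveAvgPool2d(1),   # [bs, nf, h, w] -> [bs, nf, 1, 1]
               Flatten(),                 # [bs, nf]
               nn.Linear(nf, 1)]          # [bs, 1]: one real/fake logit per image
    return nn.Sequential(*layers)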

Here is the AdaptiveLoss code:

# `Module` is fastai's nn.Module subclass that calls super().__init__ for you
class AdaptiveLoss(Module):
    "Expand the `target` to match the `output` size before applying `crit`."
    def __init__(self, crit):
        self.crit = crit  # the wrapped criterion, e.g. nn.BCEWithLogitsLoss()

    def forward(self, output, target):
        # [bs] -> [bs, 1] -> broadcast to the output's shape, e.g. [bs, 25]
        return self.crit(output, target[:, None].expand_as(output).float())
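
And a minimal usage sketch, assuming the class above (with fastai's Module available):

import torch
import torch.nn as nn

loss_func = AdaptiveLoss(nn.BCEWithLogitsLoss())
output = torch.randn(4, 25)            # critic logits for 128x128 inputs
target = torch.tensor([1, 0, 1, 0])    # [real, fake, real, fake]
loss = loss_func(output, target)       # target expanded to [4, 25] internally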