How to use mini-batch discrimination without fully connected layers

I am working on a shoe2handbag style transfer task with CycleGAN, like the one addressed by Kim et al. (2017) in their DiscoGAN paper. With the original dataset provided by the authors, everything works fine. When I use my own dataset (of similar size and, IMO, better quality), I quickly run into mode collapse.

I understand that CycleGAN has some built-in mechanisms that should prevent mode collapse, but they are ineffective in my case.

After some research, I came across mini-batch discrimination. However, I am unsure how to implement it in a discriminator without fully connected layers.
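To make the question concrete, here is a minimal NumPy sketch of one adaptation that might work, not a tested solution. It assumes the convolutional feature map can be global-average-pooled before the mini-batch statistics are computed, and that the resulting per-sample statistics are appended back as constant feature maps so the remaining layers can stay convolutional. The function name `minibatch_discrimination_maps`, the tensor `T`, and all shapes are hypothetical. In a real discriminator `T` would be a trainable parameter inside an autodiff framework rather than a random array.

```python
import numpy as np

def minibatch_discrimination_maps(features, T):
    """Append mini-batch similarity statistics to a conv feature map.

    features: (N, A, H, W) activations from a convolutional layer.
    T:        (A, B, C) projection tensor (hypothetical; learned in practice).
    Returns:  (N, A + B, H, W) array with B extra constant channels.
    """
    n, a, h, w = features.shape
    _, b, c = T.shape

    # Global average pooling removes the spatial dimensions, so the
    # projection below does not depend on the input resolution.
    pooled = features.mean(axis=(2, 3))                  # (N, A)

    # Project each sample to B kernels of dimension C.
    M = (pooled @ T.reshape(a, b * c)).reshape(n, b, c)  # (N, B, C)

    # L1 distance between every pair of samples, per kernel.
    l1 = np.abs(M[:, None] - M[None, :]).sum(axis=3)     # (N, N, B)

    # Similarity to the other samples in the batch. The self-term
    # exp(0) = 1 is subtracted here; some formulations keep it.
    o = np.exp(-l1).sum(axis=1) - 1.0                    # (N, B)

    # Broadcast each statistic to a constant H x W map and append it.
    stat_maps = np.broadcast_to(o[:, :, None, None], (n, b, h, w))
    return np.concatenate([features, stat_maps], axis=1)

# Usage with random stand-in data (shapes are arbitrary assumptions).
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 6, 8, 8))
T = rng.normal(size=(6, 4, 3))
out = minibatch_discrimination_maps(feats, T)
print(out.shape)  # (4, 10, 8, 8)
```

With a fully collapsed batch (all samples identical), every extra channel takes its maximum value of N - 1, which is the signal the following convolutional layers could pick up on. Whether this is a sensible replacement for the fully connected version is exactly what I am unsure about.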

Does anyone have any experience with this?