I’ve been using a CycleGAN to try to predict spectrograms from other spectrograms, and I’ve been struggling to get anything beyond vaguely structured noise. I’m confident it’s possible, since there’s a research paper doing this exact thing, but I’ve tried quite a few things now with little luck.
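For context, the part of the objective I’d expect to drive the output towards real structure is the cycle-consistency term: translate a spectrogram to the other domain, translate it back, and penalise the L1 difference from the original. A minimal numpy sketch of just that term (the function name, shapes, and the standard weight of 10 are my own choices, not from any particular implementation):

```python
import numpy as np

def cycle_consistency_loss(real_a, rec_a, real_b, rec_b, lam=10.0):
    """L1 cycle loss: |G_BA(G_AB(a)) - a| + |G_AB(G_BA(b)) - b|, weighted by lam.

    real_a / real_b are input spectrograms; rec_a / rec_b are their
    round-trip reconstructions through both generators.
    """
    loss_a = np.mean(np.abs(rec_a - real_a))
    loss_b = np.mean(np.abs(rec_b - real_b))
    return lam * (loss_a + loss_b)

# toy spectrograms, shaped (freq_bins, time_frames)
a = np.random.rand(128, 64).astype(np.float32)
b = np.random.rand(128, 64).astype(np.float32)

# a perfect round-trip reconstruction gives zero loss
print(cycle_consistency_loss(a, a, b, b))  # → 0.0
```

One thing I’ve been watching is that this term can stay low even when the translated (mid-cycle) output is noise, since the generators only need an invertible mapping, not a realistic one, so a falling cycle loss hasn’t told me much on its own.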
I’ve got to the point where I’ve tried to shrink the problem as much as possible (feature size, dataset size, downsampling, etc.), and I’ve done my best to work through tuning various parameters. But all I can produce at the moment is something that looks like a noisy spectrogram with no real structure to it.
It’s becoming expensive, in both money and time, to keep iterating through these parameters, e.g. trying a different number of epochs to see if it eventually converges. At this point I feel as though I know it’s possible, but I’m questioning whether I need a larger dataset, more epochs, more features, more downsampling, a different architecture implementation, etc. I don’t really know where to go from here, so any thoughts are hugely appreciated.