Transfer learning with a pre-trained StyleGAN

Hi everyone, I trust you are all doing well!

I have had some success using a pre-trained StyleGAN (trained on a source data distribution that is reasonably close to my target distribution) to initialize a subsequent training run on my target dataset.

My question: does anyone have opinions on chaining a sequence of such pre-trained models (more than one), each trained on data progressively closer to the eventual target distribution, to extract maximum value from the limited data available in each training run?

As an example, if my objective is to generate very intricate art using StyleGAN, and such art data is highly limited, I hypothesize that a few intermediate training runs before the actual run will help me cope with the limited training data: first simple art (hopefully abundant, and containing information relevant to the objective), then somewhat complex art (more limited in availability, but more relevant), followed by the actual training run on the highly limited target data.
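To make the staged sequence above concrete, here is a minimal sketch of how I picture chaining the runs: each stage resumes from the checkpoint the previous stage produced. Everything named here is hypothetical and for illustration only: the `Stage` fields, the dataset paths, the `kimg` budgets, and `fake_train`, which merely stands in for a real StyleGAN training call and does no training.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Stage:
    name: str      # hypothetical label for the stage
    data_dir: str  # hypothetical path to that stage's dataset
    kimg: int      # assumed training budget, in thousands of images


def run_curriculum(
    stages: List[Stage],
    train_fn: Callable[[Stage, Optional[str]], str],
    initial_ckpt: Optional[str] = None,
) -> List[str]:
    """Train the stages in order, resuming each one from the previous checkpoint.

    train_fn(stage, resume_from) must return the checkpoint it produced.
    """
    ckpt = initial_ckpt
    history = []
    for stage in stages:
        ckpt = train_fn(stage, ckpt)  # the output of one stage seeds the next
        history.append(ckpt)
    return history


def fake_train(stage: Stage, resume_from: Optional[str]) -> str:
    """Stand-in trainer (assumption): records the lineage, trains nothing."""
    return f"{resume_from}->{stage.name}"


# Hypothetical stages, ordered from most abundant to most limited data.
STAGES = [
    Stage("simple_art", "data/simple", kimg=2000),
    Stage("complex_art", "data/complex", kimg=1000),
    Stage("target_art", "data/target", kimg=300),
]

if __name__ == "__main__":
    for ckpt in run_curriculum(STAGES, fake_train, initial_ckpt="source"):
        print(ckpt)
```

In a real setup, `train_fn` would wrap whatever training entry point is in use and pass the previous checkpoint as the resume point. The decreasing `kimg` values only reflect my guess that the later, smaller datasets can tolerate less training before the discriminator overfits.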

Does transfer learning work this way, or will it end up delivering an under-powered model?

PS: I am aware of ADA (adaptive discriminator augmentation), and it helps. Even so, my target data is still not enough to allow a long enough training run before the discriminator overfits.

Thoughts please!