if i want to train a gan to do complete image generation (like, for example, those pokemon generators) i assume i use random noise inputs and actual pokemon as targets (or whatever, there are enough pokemon already).
if i then fed that trained gan a photo of an actual thing, would it do some kind of style-transfer-ish transformation?
for “proper” style transfer i’ve seen people talking about training on only one image. do i really just train with noise and one target style image, or is it more complicated?
i don’t want to do a hundred epochs and find i was pointing in the wrong direction the whole time. is there a lesson on this i’m not seeing?
It’s more complicated. What you proposed is a two-step process. Style transfer is in and of itself done based off one input image (at least in the implementations involving fastai). A good example would be: “Make my photo look more like one of Picasso’s paintings.” Whereas with your approach it would be “Make me a random Pokemon, and then make other pokemon look similar to that pokemon.” Does this make sense? In the latter we generate a new pokemon, then use style transfer to make other pokemon similar to it. (Hopefully that clears up a little confusion? Or maybe I’m just being redundant.)
no i think i wasn’t being clear, i’m asking 3 different questions.
can i do “make me a pokemon out of thin air” with random noise inputs and a whole bunch of different pokemon targets (or maybe let’s say manga as that’s a little more consistent) and the same basic gan approach from lesson 7?
if i gave that gan a picture, would it attempt to create something which looked like the input in the style of the outputs it understands, eg: turn people into manga avatars?
am i supposed to do style transfer with noise input and 1 style image target?
i guess i’m confused about the difference between 2 and 3 other than the specificity of the style being transferred. could you not make a van Gogh style transfer trained on all of his work, rather than a Starry Night style transfer?
I’m no expert on GANs and it’s been a while since I’ve watched the Lesson 7 video. But I think for this kind of problem - where you have random pixels as input and some type of images (in your case pokemon or mangas) as output - you would use a Wasserstein GAN like in this notebook: https://github.com/fastai/course-v3/blob/master/nbs/dl1/lesson7-wgan.ipynb
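The key thing in that setup is that the generator only ever sees a fixed-size noise vector, never an image. A minimal sketch of that sampling step in plain numpy (not fastai’s actual API; `latent_dim`, the image size, and the stand-in “generator” are all just illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical latent size; the WGAN notebook similarly uses a fixed-size noise vector.
latent_dim = 100
batch_size = 16

# Each "input" to the generator is just gaussian noise of shape (batch, latent_dim).
noise = rng.standard_normal((batch_size, latent_dim))

# A stand-in for a trained generator: any function mapping latent vectors to images.
# Here it's a fixed random linear map + tanh, purely to show the shapes involved.
W = rng.standard_normal((latent_dim, 64 * 64 * 3)) * 0.01
fake_images = np.tanh(noise @ W).reshape(batch_size, 3, 64, 64)

print(fake_images.shape)  # (16, 3, 64, 64)
```

This is also why question 2 runs into trouble: the trained generator’s input isn’t an image-shaped tensor at all, so there’s no natural way to hand it a photo.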
On this one I can only speculate. Probably the GAN would still produce an image that looks like the type of images it has been trained on (e.g. pokemon or manga). But I don’t think it would preserve the structure of an input image such as a portrait of a human when it creates the output, because it hasn’t been trained on such a task.
1: yes, you can train a GAN to generate pokemon-like images from random noise.
2: no, the trained GAN will need the same kind of input (gaussian noise, for instance) to generate good images. If you feed it data from a different distribution (e.g. a real image of a pokemon), you will not get a good result.
3: for style transfer (with an approach using gram matrices), you optimize an image (i.e. create a new image) with 1 target for the style and 1 target for the content.
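To make the gram matrix part concrete: it just measures channel-to-channel correlations of a feature map, and the style loss compares those correlations between the style target and the image being optimized. A small sketch in plain numpy (not the fastai implementation; the feature-map shapes are made up for illustration):

```python
import numpy as np

def gram_matrix(features):
    """Channel-by-channel correlations of a (channels, height, width) feature map."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    # Normalize so the loss doesn't depend on the feature map size.
    return (flat @ flat.T) / (c * h * w)

def style_loss(feat_a, feat_b):
    """Mean squared difference between the two gram matrices."""
    return np.mean((gram_matrix(feat_a) - gram_matrix(feat_b)) ** 2)

rng = np.random.default_rng(0)
style_feat = rng.standard_normal((64, 32, 32))      # features of the style image
generated_feat = rng.standard_normal((64, 32, 32))  # features of the optimized image

# Identical feature maps give zero style loss; different ones give a positive value.
print(style_loss(style_feat, style_feat))           # 0.0
print(style_loss(style_feat, generated_feat) > 0)   # True
```

In practice the features come from a pretrained CNN, the content target uses the raw feature activations rather than gram matrices, and you minimize a weighted sum of the two losses with respect to the image pixels.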