Style Transfer w/ Inception v1

I found a project that used Inception v1 for style transfer:

If you look at the source, you can see that the author tried to choose the best layers for content and style in each of the respective models:

# weights for the individual models
# assume that corresponding layers' top blob matches its name
VGG19_WEIGHTS = {"content": {"conv4_2": 1},
                 "style": {"conv1_1": 0.2,
                           "conv2_1": 0.2,
                           "conv3_1": 0.2,
                           "conv4_1": 0.2,
                           "conv5_1": 0.2}}
VGG16_WEIGHTS = {"content": {"conv4_2": 1},
                 "style": {"conv1_1": 0.2,
                           "conv2_1": 0.2,
                           "conv3_1": 0.2,
                           "conv4_1": 0.2,
                           "conv5_1": 0.2}}
GOOGLENET_WEIGHTS = {"content": {"conv2/3x3": 2e-4,
                                 "inception_3a/output": 1-2e-4},
                     "style": {"conv1/7x7_s2": 0.2,
                               "conv2/3x3": 0.2,
                               "inception_3a/output": 0.2,
                               "inception_4a/output": 0.2,
                               "inception_5a/output": 0.2}}
CAFFENET_WEIGHTS = {"content": {"conv4": 1},
                    "style": {"conv1": 0.2,
                              "conv2": 0.2,
                              "conv3": 0.2,
                              "conv4": 0.2,
                              "conv5": 0.2}}

I’ve run this project in the past, and the generated images are visibly different between VGG and Inception v1.


This is great :slight_smile: It would be great to build on top of this and do a rigorous analysis of these different models for style transfer, because I don’t believe this has been studied in the research literature yet! Perhaps you could start the ball rolling by pasting examples using each model here, so we can see some of the differences?


I actually used this repo in the past:

Inception v1


Anecdotally, I’ve read that VGG seems to produce more pleasing results. The speculation is that because it’s a much bigger (and slower) network to run, its content layers capture information that is useful for style transfer but not strictly required for ImageNet classification.


this is really neat! thanks for sharing!

Has anyone actually run this? I’m curious to try Inception on Gatys texture generation. It looks like for style transfer the Gram matrix calculations are done at slightly different points in the network than in texture generation (pool4, pool3, pool2, pool1, conv1_1).

I can get standard Gatys running on VGG19, generating a texture in about 5 minutes. Given that VGG19 has ~144M parameters, I’m hoping I can get that time down significantly by using Inception. Or, because of the reduced parameter count, Inception might need more optimization passes to find a solution, so it becomes a wash time-wise?

I’m also curious where good points would be for gathering the Gram matrices. With VGG, there were obvious “choke points” where we could gather Gram statistics (and also because Gatys lays it out :slight_smile: ), but I’m not completely clear where those would be here. I’m considering the concat layers, but maybe sticking with the pooling layers is more correct (though then we lose sibling-branch statistics…)
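For what it's worth, a Gram matrix computed at an Inception concat layer does capture cross-branch statistics "for free", because the concat just stacks the branch outputs along the channel axis, so correlations between channels of different branches show up as off-diagonal blocks. A quick NumPy sketch of that point (branch channel counts are the ones from inception_3a, but the layer choice is purely illustrative):

```python
import numpy as np

def gram(feat):
    """Gram matrix for a (C, H, W) activation, normalized by the
    number of spatial positions."""
    flat = feat.reshape(feat.shape[0], -1)
    return flat @ flat.T / flat.shape[1]

# Fake branch outputs of an Inception block (1x1, 3x3, 5x5, pool-proj),
# channel counts as in inception_3a: 64 + 128 + 32 + 32 = 256.
rng = np.random.default_rng(0)
branches = [rng.random((c, 28, 28)) for c in (64, 128, 32, 32)]

# The concat stacks branches on the channel axis, so one Gram matrix
# over the concat covers within-branch AND cross-branch correlations.
concat = np.concatenate(branches, axis=0)   # shape (256, 28, 28)
G = gram(concat)                            # shape (256, 256)
```

By contrast, taking Gram statistics only at the pooling layers after the block would summarize the merged signal downstream, which is where the "lose sibling branch statistics" worry comes from.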