Semantic Style Transfer

I’m open to ideas. One option is to improve what we completed this weekend: better and faster style transfer, better segmentation techniques (ResNeXt, Markov random fields), incorporating ideas from recent papers, adding a WGAN on top, etc.

Another idea is to take the semantic style transfer idea and combine it with speech recognition to build a “hands-free photo editing tool”: users upload a photo and then issue voice commands like “crop cat”, “blur cat”, “copy and paste cat”, “apply fire style to mountains”, and so on.

I’m also curious to see how realistic a movie we can create with this. Could we film a person walking and use style transfer to make them look like a character from Avatar?

The organizers of this upcoming event requested slides/demo of our idea. It’s not required, but it’s fair game to work on an existing idea.


Sounds like the start of a great product for movie studios :slight_smile: I imagine that the same techniques can be applied to help doctors give voice commands to focus on potential trouble spots in medical images.


@brendan - Did you end up using Mask R-CNN?

Also, here is a recent paper that came out last week on photorealistic style transfer.


This is great work! Seems that the hackathon environment is well suited to getting projects done end to end. :slight_smile:

Which hackathon was it?


At the event they also open sourced a framework called Kur, which is like Keras, but driven by .yaml config files.


We were just talking about how amazing that paper is! Interested in helping us implement it in Keras?

For segmentation we didn’t have time to implement Mask R-CNN, but we found a helpful GitHub repo with a DilatedNet implementation that seemed to work OK.

If anyone is interested in helping out, we are going to try to implement a few of these new papers! It could be a cool class project :slight_smile:

Here are a few techniques we could try implementing:

Video Style Transfer with Optical Flow

Deep Photo Style Transfer

Fast Patch-based Style Transfer of Arbitrary Style

Targeted Style Transfer Using Instance-aware Semantic Segmentation

*Uses Markov Random Fields

Mask R-CNN

Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization

The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation

*Yoshua Bengio is one of the authors
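To give a flavor of one item on the list: the core operation in the AdaIN paper is just re-normalizing content features so their channel-wise statistics match the style features’ statistics. Here’s a minimal NumPy sketch (the (C, H, W) layout and the eps value are my assumptions, not from the paper):

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """Adaptive Instance Normalization on feature maps of shape (C, H, W)."""
    # Per-channel statistics over the spatial dimensions
    c_mean = content.mean(axis=(1, 2), keepdims=True)
    c_std = content.std(axis=(1, 2), keepdims=True)
    s_mean = style.mean(axis=(1, 2), keepdims=True)
    s_std = style.std(axis=(1, 2), keepdims=True)
    # Shift/scale content features to match the style statistics
    return s_std * (content - c_mean) / (c_std + eps) + s_mean
```

In the paper this sits between a VGG encoder and a learned decoder; the sketch above is only the normalization step itself.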

I’d be interested in getting together to write an implementation later this week, if folks are around. Should be fun!


Yay! @Matthew, @sravya8 and I were planning to meet Friday if that works for you?

@brendan Unfortunately, I am on the east coast. Waiting for the day when neural networks can do teleportation :slight_smile:

Indeed. East coast blues for me too.

@thunderingtyphoons @davecg I don’t think being on the east coast is totally debilitating… :wink: If we book a room, we can easily have a Skype call running at the same time, or just use private chat on the forums.

@thunderingtyphoons @davecg Indeed, but even if we don’t live stream Friday’s session I’m sure we can still collaborate if this interests you.

For example, we have a Slack channel for coordinating and sharing progress. I can invite you if you’d like; just ping me with an email address.


What time were you guys planning to meet (PST), and what’s the focus? I’m super interested in style transfer and segmentation right now and would love to participate if time permits.

Count me in! I’m super excited about the recent paper on photorealistic style transfer — anyone else want to try it together? @Even, ping me your email address and I’ll add you to the Slack channel where we are coordinating the work.

Right now @Matthew is exploring Arbitrary Style Transfer, @sravya8 is working on Style Transfer For Videos, and I’m exploring segmentation.

Friday 12pm - 5pm at 101 Howard 5th floor study room.


Photorealistic style is the one that interests me too. I’m going to focus on the photorealism component of the loss function to start and see if I can’t figure out how to port it to Keras.
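For context, the photorealism component in that paper is a quadratic regularizer: for each color channel of the output, it penalizes v^T M v, where M is the Matting Laplacian of the input photo. Building M is the hard part; assuming it’s precomputed, the loss itself is tiny. A toy sketch (the function name and the dense-matrix form are my choices for illustration):

```python
import numpy as np

def photorealism_loss(output, laplacian):
    """Quadratic photorealism regularizer.

    output:    H x W x 3 stylized image
    laplacian: (H*W, H*W) Matting Laplacian of the *input* photo,
               assumed precomputed (in practice it's sparse)
    """
    loss = 0.0
    for c in range(output.shape[2]):
        v = output[:, :, c].reshape(-1)  # vectorize the channel
        loss += v @ (laplacian @ v)
    return loss
```

Because a Laplacian annihilates constant vectors, a flat image incurs zero penalty, while images whose local structure deviates from the input photo’s are penalized — which is the intuition behind the term.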

I’m also going to dive back into the Wasserstein GAN paper implementations and see if I can get (or ideally find) earthmover’s distance working in TensorFlow. I feel like the insights there are actually a lot broader than just GANs, and it has a lot of applicability to a project I’m working on.

I’ve seen a few papers that use histogram difference minimization to try to improve the style transfer, and I’m curious to see if they’re improved by using earthmover’s distance.
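One thing worth knowing before wrestling with TensorFlow: for 1-D histograms, earthmover’s distance has a closed form — it’s just the L1 distance between the two CDFs. That makes it cheap to prototype against histogram-based losses. A generic NumPy sketch (function name and bin-width handling are mine, not from any paper):

```python
import numpy as np

def emd_1d(hist_a, hist_b, bin_width=1.0):
    """Earthmover's distance between two 1-D histograms over the same bins."""
    # Normalize to probability distributions
    p = hist_a / hist_a.sum()
    q = hist_b / hist_b.sum()
    # In 1-D, EMD equals the L1 distance between the CDFs
    return np.abs(np.cumsum(p) - np.cumsum(q)).sum() * bin_width
```

For example, moving all the mass two bins over costs exactly 2 bin-widths — unlike a plain histogram difference, which can’t tell “two bins away” from “ten bins away”.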


I’m thinking of starting at 10am. Five hours doesn’t seem like enough time to implement this…

Even better. 10am then. I’ll see if I can find a room. We’re going to need a healthy head start.


Did you have any success running the code from the paper (Deep Photo Style Transfer)?

I took a look at this after your post. It was a journey… The results are good though:

I don’t think he used Linux or LuaJIT. I had to: