Does anyone have experience with masking techniques for neural style transfer? Matthew and I are at a hackathon working on a project to combine style transfer with semantic image segmentation. There are some good papers on the topic (like this one), but we were wondering what the best approach is.
Their most promising results come from an interesting residual architecture I hadn’t seen before, which they call ResNeXt:
A bit of a rabbit hole, but I really enjoyed diving down it. @jeremy I’m curious whether you have encountered ResNeXt blocks before? It seems like an interesting architectural change. I’m reading the paper now to see if I can glean the foundations behind it.
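For anyone else going down the same rabbit hole: as I understand the paper, a ResNeXt block replaces the single 3x3 conv of a ResNet bottleneck with many parallel low-dimensional paths ("cardinality"), which collapses into one grouped convolution. Here is a minimal PyTorch sketch, assuming a fixed-resolution block (no striding/downsampling) and hyperparameters I picked for illustration:

```python
import torch
import torch.nn as nn

class ResNeXtBlock(nn.Module):
    """Aggregated-transform residual block: `cardinality` parallel paths,
    implemented efficiently as a single grouped 3x3 convolution."""

    def __init__(self, channels=256, cardinality=32, bottleneck=4):
        super().__init__()
        width = cardinality * bottleneck  # e.g. 32 groups * 4 channels each
        self.net = nn.Sequential(
            nn.Conv2d(channels, width, 1, bias=False),   # reduce
            nn.BatchNorm2d(width),
            nn.ReLU(inplace=True),
            # grouped conv = the "split-transform-merge" step of ResNeXt
            nn.Conv2d(width, width, 3, padding=1, groups=cardinality, bias=False),
            nn.BatchNorm2d(width),
            nn.ReLU(inplace=True),
            nn.Conv2d(width, channels, 1, bias=False),   # expand back
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # standard residual connection around the aggregated transforms
        return self.relu(x + self.net(x))
```

With `groups=1` this reduces to a plain ResNet bottleneck, which is what makes the comparison in the paper so clean.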
E.g., nudity. This could be a browser extension for kids that automatically blurs explicit content. The same technique could be applied to trademarks, logos, celebrity faces, etc.
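Once the segmentation model gives you a binary mask for the offending region, the blurring step itself is just a masked blend. A toy NumPy sketch (the crude 3x3 box blur with wrapping edges is a stand-in for a real Gaussian blur, and the function names are mine):

```python
import numpy as np

def box_blur3(img):
    """Crude 3x3 box blur on a 2D array; edges wrap, which is fine for a demo."""
    out = np.zeros_like(img, dtype=np.float32)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out += np.roll(img, (dy, dx), axis=(0, 1))
    return out / 9.0

def blur_masked(img, mask):
    """Blur only the pixels flagged by the segmentation mask (1 = blur)."""
    m = mask.astype(np.float32)
    return m * box_blur3(img) + (1.0 - m) * img
```

In a real extension you would run this per channel and feather the mask edges so the blur boundary isn’t visible.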
Background Transfer
We identify the person, create a mask, then invert the mask to stylize only the non-person pixels.
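The compositing step is the easy part: with the person mask in hand, keep the original pixels where the mask is 1 and take the stylized pixels everywhere else. A minimal NumPy sketch (function name is mine; assumes `content` and `stylized` are HxWxC float arrays and `person_mask` is an HxW binary array):

```python
import numpy as np

def composite(content, stylized, person_mask):
    """Keep the person from the content image; use the stylized
    image for everything outside the (inverted) mask."""
    m = person_mask.astype(np.float32)[..., None]  # HxW -> HxWx1 for broadcasting
    return m * content + (1.0 - m) * stylized
```

A soft (0..1) mask from the segmentation network works even better here, since the fractional values feather the boundary for free.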
I’m open to ideas. One option is to improve what we completed this weekend: better and faster style transfer, better segmentation techniques (ResNext, Markov random fields), incorporate ideas from recent papers, add WGAN on top, etc.
Another idea is to take the semantic style transfer idea and combine it with speech recognition to build a “hands-free photo editing tool,” where users can upload a photo and then issue voice commands like “crop cat,” “blur cat,” “copy and paste cat,” “apply fire style to mountains,” etc.
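The command grammar for something like this can stay very small: a verb, a segmentation target, and an optional style argument. A hypothetical sketch of the dispatch layer (everything here — the op names, the phrasing patterns — is my own assumption about what the tool would accept):

```python
import re

# hypothetical set of operations the editor would support
OPERATIONS = {"blur", "crop", "copy", "paste"}

def parse_command(text):
    """Map a voice command to (operation, target, style_arg)."""
    # "apply <style> style to <target>" -> a stylize op on that segment
    m = re.match(r"apply (\w+) style to (\w+)", text)
    if m:
        return ("stylize", m.group(2), m.group(1))
    verb, _, target = text.partition(" ")
    if verb in OPERATIONS and target:
        return (verb, target, None)
    raise ValueError(f"unrecognized command: {text!r}")
```

Each parsed target would then be looked up in the segmentation output to get the mask the operation acts on.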
I’m also curious to see how realistic a movie we could create with this. Could we film a person walking and use style transfer to make them look like a character from Avatar?
The organizers of this upcoming event requested slides/demo of our idea. It’s not required, but it’s fair game to work on an existing idea.
Sounds like the start of a great product for movie studios. I imagine the same techniques could be applied to help doctors give voice commands to focus on potential trouble spots in medical images.