Neural Style Transfer using S4TF

regrettable-username · May 3, 2019, 1:07am

Hey everyone,

As a way of learning Swift for TensorFlow, I decided to take a stab at implementing Neural Style Transfer. I wasn’t sure it would be doable since, in keeping with Chris’ analogy, we’re building the airplane as we’re flying it. Nevertheless, I was able to get something working that produces pretty decent results, such as this:

The notebook is available on GitHub. I’ve also written up a blog post about it on Medium.

This is only a first pass. I didn’t spend a ton of time trying to write the slickest Swift code, as I wasn’t even sure I’d be able to solve some of the problems I was facing. I’ll detail some of the hurdles in case someone else is trying to solve something similar or has a better workaround.

The first challenge was loading and dealing with a pre-trained network.
I used a pre-trained version of VGG-19 that came from TF 2.0’s tf.keras.applications.VGG19 and wrote its parameters out to a checkpoint file. I then used Raw.restoreV2(prefix:tensorNames:shapeAndSlices:dtypes:) to load the kernels and biases for each convolution. I left the fully-connected layers out of the checkpoint since they weren’t needed, which was a huge win in terms of file size.

Another thing worth pointing out, also related to the checkpoint file, came up when I tried to use Just to download my checkpoint .tar.gz file. The server was redirecting to a CDN, and Just was writing the HTML from the “redirecting” landing page into my tar file. I spent a good while wondering what I was doing wrong before I figured that out. I ended up just using wget via the nifty extension from the swift_dev notebooks. Worked like a charm. It looks as though Just isn’t being maintained (or at least it is very inactive). There are a ton of issues already piled up there, but I’ll submit this as one of them. I’ll probably fork Just and fix the issue myself, as it’s a nice API otherwise. I’ll be sure to put up a PR if I do.
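A cheap guard against the redirect problem described above is to look at the first bytes of the downloaded file before extracting it. The sketch below is plain Swift with no networking. `looksLikeGzip` and the two sample byte arrays are hypothetical names made up for illustration; the only fact it relies on is that a gzip stream starts with the bytes 0x1f 0x8b.

```swift
/// Returns true when `bytes` starts with the two-byte gzip magic number.
/// (Hypothetical helper, not part of Just or S4TF.)
func looksLikeGzip(_ bytes: [UInt8]) -> Bool {
    return Array(bytes.prefix(2)) == [0x1f, 0x8b]
}

// What a "redirecting" landing page might look like when saved to disk.
let redirectPage = Array("<html><body>Redirecting...</body></html>".utf8)
// The first four bytes of a typical gzip file: magic number, deflate method, flags.
let gzipHeader: [UInt8] = [0x1f, 0x8b, 0x08, 0x00]

print(looksLikeGzip(redirectPage)) // false
print(looksLikeGzip(gzipHeader))   // true
```

Running a check like this right after the download would have flagged the HTML payload immediately instead of failing later during extraction.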

The next challenge was retrieving the layer activations in a way that played nice with autodiff. This is something I’ll revisit, but my solution was to create a differentiable struct that stores the layer activations of interest and to return that from the model layer’s call(_:) method.

After that, the biggest thing was trying to get the optimizer to update the input image itself. The problem was that, as far as I could tell, allDifferentiableVariables / KeyPathIterable only knows about properties of self, not the input to the call(_:) method. This makes total sense. So my workaround was to have another layer that has the input image as a property and just returns that property from call(_:), ignoring the input. I then sequenced the input of this “ImageTensorLayer” with the VGG19 model which did the trick.

The last thing I’ll mention is how I used Adam (note: I just started working on an L-BFGS optimizer) to update only the image itself and not the other model parameters. I’m convinced there is a super simple way to do this properly, but I couldn’t find it. My workaround was as simple as copy-pasting the Adam optimizer and adding a break statement after the first iteration through allDifferentiableVariables. This only worked because the image tensor was the first parameter in the model. I believe I could do some introspection on each variable to selectively “freeze” parameters, but I’m sure it’s not as simple as that.
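To make the "freeze everything but the image" idea concrete, here is a minimal sketch in plain Swift rather than S4TF. It is not the notebook's optimizer: `MaskedAdam`, the `trainable` mask, and the two-scalar toy model are all assumptions made for illustration. It uses an explicit mask instead of a break statement, so it does not depend on the image being the first parameter.

```swift
/// A tiny Adam optimizer over a flat list of scalar parameters.
/// `trainable[i] == false` freezes parameter i (hypothetical mask, not an S4TF API).
struct MaskedAdam {
    var learningRate = 0.1
    var beta1 = 0.9
    var beta2 = 0.999
    var epsilon = 1e-8
    var beta1Power = 1.0   // running value of beta1^step
    var beta2Power = 1.0   // running value of beta2^step
    var firstMoments: [Double]
    var secondMoments: [Double]
    let trainable: [Bool]

    init(trainable: [Bool]) {
        self.trainable = trainable
        firstMoments = Array(repeating: 0, count: trainable.count)
        secondMoments = Array(repeating: 0, count: trainable.count)
    }

    mutating func update(_ parameters: inout [Double], along gradients: [Double]) {
        beta1Power *= beta1
        beta2Power *= beta2
        // Frozen parameters are skipped entirely, so they never change.
        for i in parameters.indices where trainable[i] {
            let g = gradients[i]
            firstMoments[i] = beta1 * firstMoments[i] + (1 - beta1) * g
            secondMoments[i] = beta2 * secondMoments[i] + (1 - beta2) * g * g
            // Bias-corrected moment estimates.
            let mHat = firstMoments[i] / (1 - beta1Power)
            let vHat = secondMoments[i] / (1 - beta2Power)
            parameters[i] -= learningRate * mHat / (vHat.squareRoot() + epsilon)
        }
    }
}

// Parameter 0 plays the role of the image, parameter 1 the role of a VGG weight.
var parameters = [5.0, 2.0]
var optimizer = MaskedAdam(trainable: [true, false])
for _ in 0..<200 {
    // Gradient of the toy loss p0^2 + p1^2.
    let gradients = parameters.map { 2 * $0 }
    optimizer.update(&parameters, along: gradients)
}
print(parameters) // parameter 0 has moved toward 0; parameter 1 is still 2.0
```

In a real model the mask would have to be derived from the parameter structure, which is the introspection step mentioned above.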

All in all, it was a pretty good way to dig into S4TF’s internals and really start understanding the autodiff system and how things fit together. I’d love to contribute what I can back to the project if it’s something that makes sense. As mentioned above, I’m working on an L-BFGS optimizer, which should produce much better results without having to tweak so many hyperparameters.

Looking forward to everyone’s feedback.

-James

P.S. Here’s the tweet for those who are interested.

I just published Neural Style Transfer with Swift for TensorFlow https://t.co/DACoVVCUNK
— James Thompson (@WellFedWookiee) May 2, 2019

clattner · May 4, 2019, 12:48am

This is super awesome James, I love it!

machinethink · May 4, 2019, 10:29am

Nice work! I’ve recently implemented this style transfer method for a client in Swift using iOS deep learning primitives (so not TF), and it looks like doing it with S4TF is definitely less work and much less code!

L-BFGS should be possible too (we use it on iOS). It only uses a lot of memory if you make it keep a large history.
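As a rough illustration of the memory point: L-BFGS keeps a history of vector pairs (the parameter step and the gradient change from each remembered iteration), each as long as the parameter vector. The helper below is a back-of-the-envelope sketch in plain Swift; `lbfgsHistoryBytes` and the 512 x 512 RGB Float32 image are assumptions for illustration, and the estimate ignores the current gradient, search direction, and any network activations.

```swift
/// Approximate bytes needed for the L-BFGS history: `historySize` pairs of
/// vectors, each holding one scalar per optimized parameter.
/// (Hypothetical helper for illustration only.)
func lbfgsHistoryBytes(parameterCount: Int, historySize: Int, bytesPerScalar: Int = 4) -> Int {
    return 2 * historySize * parameterCount * bytesPerScalar
}

// Assumed example: a 512 x 512 RGB image optimized as Float32 values.
let pixelValues = 512 * 512 * 3
let smallHistory = lbfgsHistoryBytes(parameterCount: pixelValues, historySize: 10)
let largeHistory = lbfgsHistoryBytes(parameterCount: pixelValues, historySize: 100)

let mebibyte = 1024 * 1024
print(smallHistory / mebibyte, "MiB") // 60 MiB
print(largeHistory / mebibyte, "MiB") // 600 MiB
```

The cost grows linearly with the history length, which is why a short history keeps the method practical when the image is the only thing being optimized.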

regrettable-username · May 4, 2019, 8:31pm

Hey, Matthijs. It’s very interesting to hear that you were able to run an optimization algorithm such as L-BFGS on the device. My first thought would probably have been that it wouldn’t be possible for memory and speed reasons. I sat down with Core ML about a year and a half ago, but it seems like I ought to dive back into it. I’ve been meaning to do some benchmarking on my iPhone XS. I’ve heard some people mention a 10x performance improvement over the iPhone X. Love your blog, by the way.

machinethink · May 5, 2019, 10:16am

This isn’t the kind of thing you can use Core ML for.

akrimedes · May 21, 2019, 2:39pm

Hi James, I am unable to access your notebook on GitHub (I get a 404 error). Could you share the correct link?

regrettable-username · May 24, 2019, 5:26pm

Hey, my GitHub account got flagged by their automated system with no indication as to why. It took support a while to respond, but I believe it should be fixed now.

akrimedes · May 24, 2019, 7:37pm

Thanks!

tinhb · May 25, 2019, 8:58am

Can’t handle the awesomeness?

utkb · November 7, 2019, 5:04pm

Hi,

Thanks for this. I am very new to Swift and TF, and thus to S4TF. I tried to run your notebook with S4TF 0.5.0 and got some errors. The first one has to do with the Layer and Module protocols, and I think I managed to resolve it by changing func call to func callAsFunction in the relevant struct definitions. However, in the notebook cell that defines the pooling type, I am getting another error that I cannot figure out…

error: <Cell 21>:6:8: error: type 'PoolingLayer<Scalar>.TangentVector' does not conform to protocol 'VectorProtocol'
struct PoolingLayer<Scalar: TensorFlowFloatingPoint>: Layer {    
       ^

Any help / pointers would be much appreciated. Thanks!
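For readers hitting the same first error, the call to callAsFunction rename mentioned above relies on a Swift language feature (Swift 5.2 and later): a method named `callAsFunction` lets a value be called like a function. The sketch below is plain Swift with no TensorFlow import; `Scale` is a hypothetical stand-in for a layer, not a type from the notebook or from S4TF.

```swift
/// A stand-in "layer" used only to show the call syntax.
struct Scale {
    var factor: Double

    // A method with this exact name is what makes `layer(input)` compile.
    func callAsFunction(_ input: Double) -> Double {
        return factor * input
    }
}

let doubler = Scale(factor: 2)
let viaCallSyntax = doubler(21)                  // sugar for the line below
let viaMethod = doubler.callAsFunction(21)       // explicit method call
print(viaCallSyntax, viaMethod) // 42.0 42.0
```

This only covers the rename; it says nothing about the VectorProtocol conformance error shown above.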

utkb · November 7, 2019, 6:07pm

Hi,

I have managed to move on by referencing the code changes in SwiftAI (fastai for S4TF), e.g. by adding the ParameterlessLayer protocol to the PoolingLayer definitions. I got to the point where you define the optimizer with frozen parameters, and after comparing against the current Adam code in S4TF 0.5, it is apparent that I will not be able to do much more here without first delving deeper into Swift coding =P (e.g. your code on kp and model.recursivelyAllWritableKeyPaths). I think I will need to figure something else out. Nevertheless, thanks again for sharing this. It was a very interesting learning experience!
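For anyone else stuck on the `kp` loop, the underlying Swift feature is the writable key path: a value that names a property and can be used to read or write it on any instance. The sketch below is plain Swift and only an analogy. `ToyModel`, `allKeyPaths`, `frozen`, and `sgdStep` are hypothetical names, and the key paths are listed by hand, whereas in S4TF the list comes from model.recursivelyAllWritableKeyPaths.

```swift
/// A toy "model" with three scalar parameters.
struct ToyModel {
    var image: Double = 5.0
    var kernel: Double = 2.0
    var bias: Double = 1.0
}

// Hand-listed key paths standing in for the ones S4TF would enumerate.
let allKeyPaths: [WritableKeyPath<ToyModel, Double>] = [\.image, \.kernel, \.bias]
// Key paths that the update step must leave untouched.
let frozen: [WritableKeyPath<ToyModel, Double>] = [\.kernel, \.bias]

/// One plain gradient-descent step that skips frozen key paths.
/// `gradient` reuses ToyModel to hold one gradient value per parameter.
func sgdStep(_ model: inout ToyModel, gradient: ToyModel, learningRate: Double) {
    for kp in allKeyPaths where !frozen.contains(kp) {
        model[keyPath: kp] -= learningRate * gradient[keyPath: kp]
    }
}

var model = ToyModel()
let gradient = ToyModel(image: 2.0, kernel: 2.0, bias: 2.0)
sgdStep(&model, gradient: gradient, learningRate: 0.5)
print(model.image, model.kernel, model.bias) // 4.0 2.0 1.0
```

Filtering by key path rather than by position avoids depending on the order in which parameters are visited.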