We defined our own version of vgg in part 1, which included that Lambda layer (see vgg16.py in part 1’s repo).
Oh I see. Thanks!
Thanks Jeremy. By down-sampling, do you mean the decrease in res-block sizes in the CNN? How is it done?
From the supplement, I see that they discuss removing padding from the res blocks. How would I implement that in Keras?
I’ve changed the last line of the res_block into:
return merge([x, ip], mode='sum')[:,2:-2,2:-2,:]
hope that will do the trick…
No, I mean add a few stride 2 conv layers. See the paper’s supplemental notes for details.
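As a rough sketch of what a few stride 2 conv layers do to the spatial size: assuming 'same' padding, each layer outputs ceil(input / stride) pixels per side. The 256-pixel input and the helper name `same_conv_out` are illustrative assumptions, not values taken from the paper.

```python
import math

def same_conv_out(size, stride):
    """Spatial output size of a 'same'-padded convolution."""
    return math.ceil(size / stride)

size = 256          # hypothetical input height/width
sizes = [size]
for _ in range(2):  # two stride-2 conv layers, each halving the resolution
    size = same_conv_out(size, stride=2)
    sizes.append(size)

print(sizes)
```

Each stride 2 layer halves the height and width, so the res blocks that follow work on a much smaller feature map.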
In the supplementary materials, the authors write the following:
For style transfer, we found that standard zero-padded convolutions resulted in severe artifacts around the borders of the generated image. We therefore remove padding from the convolutions in residual blocks.
I’m interpreting this as modifying the res_block code Jeremy provided in the super-resolution demo to use border_mode='valid' for each of the two Conv2D layers. Is this a good interpretation, or am I missing something?
Update - Confirmed this is true from the lecture video.
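One consequence worth spelling out, as I understand it: with unpadded ('valid') 3x3 convolutions each conv trims one pixel per side, so after two convs the block output is 4 pixels smaller in each dimension, and the identity input has to be centre-cropped by 2 pixels per side before the sum. That is the same `2:-2` slice that appears earlier in the thread. The sketch below only checks the arithmetic; the 72-pixel size and the helper names are assumptions for illustration.

```python
def valid_conv_out(size, kernel=3, stride=1):
    """Spatial output size of an unpadded ('valid') convolution."""
    return (size - kernel) // stride + 1

def res_block_out(size, n_convs=2, kernel=3):
    """Size after the stacked valid convolutions inside one res block."""
    for _ in range(n_convs):
        size = valid_conv_out(size, kernel)
    return size

size_in = 72                       # hypothetical feature-map height/width
size_out = res_block_out(size_in)  # two 3x3 valid convs
crop = (size_in - size_out) // 2   # pixels to trim from each side of the skip input

row = list(range(size_in))         # stand-in for one spatial axis of the input
cropped = row[crop:-crop]          # same idea as the [:, 2:-2, 2:-2, :] slice

print(size_out, crop, len(cropped))
```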
As a general handy tip for people who may not have been around for session 1, I’m finding the graph visualization of models to be a really effective way of feeling like I understand what Keras is doing under the hood; Jeremy explains it in this post from last November, so I won’t re-invent the wheel there, except to ‘bump’ it here:
Weird problem: is anyone else having issues with Python 3’s f-strings not being able to close properly? In my Python 3 notebook, whenever I switch from a string to an f-string (i.e. when I put an ‘f’ in front), my notebook then treats the remainder of the cell as a single string. It seems to be able to evaluate it properly (insofar as it doesn’t throw a syntax error when I execute the cell), but it’s bothering me, and I wonder if anyone else has run into a similar problem?
Yup, that’s fixed in the latest notebook version. It’s not in conda’s main channel yet, so ‘pip install notebook’ should grab it. You’ll need to restart jupyter notebook.
Is there any way to use f-strings with 3.5, i.e. from future? There are a few useful modules that aren’t available for 3.6 yet (e.g. Mayavi), so I’ve been sticking with 3.5.
The syntax is definitely a lot less verbose than .format(), but for now I’ve just been removing f-strings from every notebook (and replacing with
That’s a clever trick!
No, I’m not aware of a backwards-compatible way to use f-strings. You can install Mayavi with a little bother: http://www.math.univ-paris13.fr/~cuvelier/mainsu25.html
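For anyone staying on 3.5, here is a small sketch of how an f-string can be rewritten with str.format(), which works on both versions. The variable names and values are made up for illustration.

```python
name = "block2_conv1"  # hypothetical layer name
loss = 0.123456        # hypothetical loss value

# Python 3.6+ only:  f"{name}: loss={loss:.3f}"

# Equivalent on Python 3.5, using keyword arguments:
msg_kw = "{name}: loss={loss:.3f}".format(name=name, loss=loss)

# Positional form, slightly shorter:
msg_pos = "{}: loss={:.3f}".format(name, loss)

# format_map takes a dict, handy when the values are already collected in one:
msg_map = "{name}: loss={loss:.3f}".format_map({"name": name, "loss": loss})

print(msg_kw)
```

The format specifiers after the colon (such as `.3f`) are the same in both syntaxes, so the conversion is mostly mechanical.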
For the super-resolution network dataset, I got a memory error (on CPU, locally). To get past it I had to increase my VM’s memory, which slows down the host.
So I tried to reduce the dataset size (as I have already done for some other projects), but after deleting some files to shrink it, bcolz still sees the data indexes.
Is it possible to do this with this dataset, and if so, how?
We’ll learn how to do that in today’s lesson!
Ok, thank you Jeremy !
Apologies if this question has been addressed somewhere else, but why do Tensors lose their knowledge of their shape when they’re passed through a deconv block? If all of the aspects of the deconvolution transformation are defined, and the input shape is known (as I’ve been able to confirm it is, by only applying the conv and resnet layers and then examining the resultant tensor), why do the height and width dimensions get set to “None” after the deconv layers?
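To make the question concrete, here is the arithmetic I would expect a deconv (transposed convolution) layer to follow, assuming the usual TensorFlow-style rules. This does not explain why the shape is reported as None; it only shows that the height and width are computable from the input size, kernel, stride and padding. The function name and the 72-pixel input are assumptions for illustration.

```python
def deconv_out(size, kernel, stride, padding):
    """Expected spatial output size of a transposed convolution."""
    if padding == "same":
        return size * stride
    if padding == "valid":
        return size * stride + max(kernel - stride, 0)
    raise ValueError("padding must be 'same' or 'valid'")

size = 72  # hypothetical height/width coming out of the res blocks
up1 = deconv_out(size, kernel=3, stride=2, padding="same")
up2 = deconv_out(up1, kernel=3, stride=2, padding="same")

print(size, up1, up2)
```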
I’m struggling a bit with implementing fast style transfer, so I want to try to describe my solution in pseudocode and see where I went wrong. Here goes… any suggestions would be helpful:
1. Take a batch of images, pass them through a network to produce a batch of output images.
   - The network is similar to the super resolution network with some modifications to the beginning and the details of the resblock() function
   - Call this network the “style transfer network”
2. Instantiate a pretrained image classification network (VGG or otherwise) and make all layers untrainable. This network will be used to produce activations we can compare between the synthesized image (output of step 1) and the “correct” images containing style/content.
3. Take the original content images and pass them through our pre-trained image network from 2, grab the outputs at some step (block2_conv1).
4. Take the outputs of the style transfer network (step 1) and pass those images through the same process as step 2.
5. Pass the style image through VGG and get the activations of some subset of the layers. These are our correct style activations.
6. Pass the outputs of step 1 through VGG and get the activations from the layers used in step 5.
7. Calculate gradients and loss
   - Compare the outputs of step 3 and step 4 to produce content loss (using mse).
   - Compare the outputs of steps 5 and 6 to produce style loss (using gram matrix technique from Gatys et al)
8. Update the weights of the style transfer network and repeat with next batch working to minimize the overall loss.
How does this sound as a process? What did I miss? If this is it, then the devil is in the (implementation) details…
That sounds about right, and hopefully you’ll find that it’s exactly the same as the super-resolution code, except that the loss function has to be updated to include the style loss.
Note that there’s a ‘tips’ thread with some code snippets that may be of assistance.
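To illustrate just the loss arithmetic from the steps above (not the Keras wiring), here is a minimal pure-Python sketch: content loss is the MSE between activations, and style loss is the MSE between Gram matrices. The feature layout (a list of channels, each a flat list of activations), the normalisation by the number of positions, `style_weight`, and the toy numbers are all assumptions for illustration.

```python
def mse(a, b):
    """Mean squared error between two equal-length flat lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def flatten(rows):
    """Flatten a list of lists into one flat list."""
    return [v for row in rows for v in row]

def gram(features):
    """Gram matrix of channel activations.

    features: list of C channels, each a flat list of N activations.
    Returns a C x C matrix of channel dot products divided by N.
    """
    n = len(features[0])
    return [[sum(p * q for p, q in zip(fi, fj)) / n for fj in features]
            for fi in features]

def total_loss(gen_content, target_content, gen_style, target_style,
               style_weight=1.0):
    """Content MSE plus weighted MSE between Gram matrices."""
    content_loss = mse(flatten(gen_content), flatten(target_content))
    style_loss = mse(flatten(gram(gen_style)), flatten(gram(target_style)))
    return content_loss + style_weight * style_loss

# Toy activations: 2 channels, 2 spatial positions each (made-up numbers).
generated = [[1.0, 2.0], [3.0, 4.0]]
target = [[1.0, 2.0], [3.0, 5.0]]

print(gram(generated))
print(total_loss(generated, target, generated, target))
```

In the real network the activations would come from the frozen VGG layers and several style layers would be summed, but the structure of the loss is the same.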
Super resolution seems to be working fine on training data but does not work well on test data, not sure why? Seems like something else is going on apart from overfitting.
Some results on training data:
Some results on test data:
Will upload the code gist shortly. But I’m curious: is anyone else seeing this pattern?
It might be because you don’t have the black cropping bar at the bottom of your test images?
Has anyone else run into this issue when trying to use the precomputed style targets from Jeremy’s tips post?
TypeError: Output tensors to a Model must be Keras tensors. Found: Tensor("Mean_7:0", shape=(), dtype=float32)
This error surfaces when I try to define a Model() with my inputs and the style function using said precomputed targets.
Yes, I hit this when I tried using loss = content_loss + style_loss, where both losses are Keras tensors. I had to do a merge instead of sum to get rid of it.