Part 2 Lesson 13 Wiki

yggg · April 26, 2018, 9:15am

@jeremy Thank you for the pointer to the NIN paper.

As a quick summary: the paper proposed a new structure called MLPConvBlock to replace the “vanilla conv block”, and MLPConvBlock can be efficiently implemented using 1x1 convolutions in the following fashion:

import torch.nn as nn

class MlpConvBlock(nn.Module):

    def __init__(self, in_chs, out_chs, **kw):
        # nn.Module must be initialised before submodules are assigned
        super().__init__()
        # Hidden widths: h1_chs is twice h2_chs
        h2_chs = (in_chs + out_chs) // 3
        h1_chs = 2 * h2_chs
        # One 3x3 conv followed by two 1x1 convs (the per-pixel "MLP" part)
        self.conv1 = nn.Conv2d(in_chs, h1_chs, kernel_size=3, stride=1, padding=1, **kw)
        self.conv2 = nn.Conv2d(h1_chs, h2_chs, kernel_size=1, stride=1, padding=0)
        self.conv3 = nn.Conv2d(h2_chs, out_chs, kernel_size=1, stride=1, padding=0)
        self.bn1 = nn.BatchNorm2d(h1_chs)
        self.bn2 = nn.BatchNorm2d(h2_chs)
        self.bn3 = nn.BatchNorm2d(out_chs)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # conv -> batchnorm -> ReLU, three times; spatial size is unchanged
        y = self.relu(self.bn1(self.conv1(x)))
        y = self.relu(self.bn2(self.conv2(y)))
        y = self.relu(self.bn3(self.conv3(y)))
        return y

Since MLPConvBlock itself looks like a mini-network, a network composed of many MLPConvBlocks can be seen as a “Network in Network”.
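To see why 1x1 convs can stand in for the "MLP" part, here is a minimal NumPy sketch (not from the paper or the lesson notebook; the function names and shapes are made up for illustration). It checks that a 1x1 convolution gives the same result as applying one fully connected layer to the channel vector at every pixel.

```python
import numpy as np

def conv1x1(x, weight, bias):
    # x: (channels_in, height, width); weight: (channels_out, channels_in)
    # A 1x1 conv mixes channels at each pixel and never looks at neighbours.
    return np.einsum("oc,chw->ohw", weight, x) + bias[:, None, None]

def per_pixel_linear(x, weight, bias):
    # Same computation written as a dense layer applied to every pixel.
    c, h, w = x.shape
    pixels = x.reshape(c, h * w).T      # one row of channels per pixel
    out = pixels @ weight.T + bias      # (h*w, channels_out)
    return out.T.reshape(-1, h, w)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 5, 5))          # hypothetical 4-channel 5x5 input
weight = rng.normal(size=(3, 4))        # 4 channels in, 3 channels out
bias = rng.normal(size=3)

conv_out = conv1x1(x, weight, bias)
mlp_out = per_pixel_linear(x, weight, bias)
same = bool(np.allclose(conv_out, mlp_out))
print(same)
```

Stacking two of these with a nonlinearity in between is, as far as I can tell, what conv2 and conv3 do in the block above.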

jeremy · April 26, 2018, 11:35pm

I’ve added the lesson video to the top post. It’s encoding now and will be live in about 15 mins.

Chris_Palmer · April 27, 2018, 2:09am

Hi Jeremy

I see this was posted 3 hours ago with an estimate of 15 minutes for video availability - but I have just tried it and there is no video available…

Moody · April 27, 2018, 5:56am

I managed to implement GAN and Cycle GAN in Excel. It is a good way to learn the maths behind them with visualization.

In GAN, the L2 loss (bottom right corner) works better than L1 loss (top right corner).

In Cycle GAN, the most extreme single value in the “Deconv filter” (in blue) could go beyond -53 million, even though the values of the “Conv Filter” ranged from -1 to 1.

ganeshk · April 27, 2018, 6:29am

This is awesome! I’m curious as to how you implemented this, though. For the GAN, how did you do the min-max optimization? In particular, I’m having a hard time wrapping my head around how you implemented the discriminator. Did you just choose an arbitrary convolutional filter for the discriminator?

Elfayoumi · April 27, 2018, 1:41pm

Hello
I cannot see the video. I was away this week and am trying to view lesson 13, but the video is not available.
Would you please help?
Regards
Ibrahim

kmatsuda · April 27, 2018, 2:53pm

Hi @Elfayoumi, you can view the unedited version at this link:

@jeremy, it looks like the edited video posted in the wiki section doesn’t play back:

Moody · April 27, 2018, 6:02pm

Set up a deconvolutional filter as usual. Then use “Solver” to minimize the sum of the loss function (L1 or L2) by “changing variable cells” (i.e. the deconvolutional filter).
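For anyone who wants to try the same idea outside Excel, here is a rough NumPy sketch. It is not the spreadsheet itself: Solver is swapped for plain gradient descent, the deconvolution is assumed to be the simple non-overlapping kind (stride equal to the filter size), only the L2 loss is shown, and all names and numbers are hypothetical.

```python
import numpy as np

def deconv(x, w):
    # Non-overlapping transposed convolution: each input pixel x[i, j]
    # stamps a copy of the filter w, scaled by x[i, j], into the output.
    return np.kron(x, w)

def fit_filter(x, target, steps=200, lr=0.05):
    # The filter entries play the role of Solver's "changing variable cells".
    k = target.shape[0] // x.shape[0]
    w = np.zeros((k, k))
    losses = []
    for _ in range(steps):
        err = deconv(x, w) - target
        losses.append(float((err ** 2).sum()))      # summed L2 loss
        # Gradient of the L2 loss with respect to each filter entry
        blocks = err.reshape(x.shape[0], k, x.shape[1], k)
        grad = 2.0 * np.einsum("ij,iajb->ab", x, blocks)
        w -= lr * grad
    return w, losses

x = np.array([[0.5, -1.0], [1.0, 0.25]])            # made-up 2x2 input
true_w = np.array([[1.0, -2.0], [0.5, 3.0]])        # filter we hope to recover
target = deconv(x, true_w)                          # made-up 4x4 target

w_fit, losses = fit_filter(x, target)
print(np.round(w_fit, 3))
```

The learning rate is picked to suit this tiny input; a different input would probably need a different value.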

jeremy · April 27, 2018, 6:17pm

Apologies - my computer went to sleep just before the upload completed! Fixed now; the full video is in the top post.

Kasianenko · April 27, 2018, 6:56pm

I have a question about the loss function. Jeremy said that the numbers were too small and it didn’t learn. But when I see block 84 in the Jupyter notebook with *1e6 at the end, I recall school lessons about precision, where the teacher told us to put the big numbers at the beginning of an equation. Does this rule apply here? Would it work better, with better precision, if we rewrote it as 1e6*torch.mm(x, x.t())/input.numel() ?

jeremy · April 27, 2018, 7:02pm

That may be better - although in this case the problem didn’t occur until later in the optimization when it calculated the gradient and step size, so it doesn’t really matter.

jfl · April 27, 2018, 9:46pm

Has anyone run into this issue when trying to rerun the notebook shown in class? I am stuck at the last part of the style transfer, when trying to run the iterations on the comb_loss function. Thanks in advance!

alenas · April 27, 2018, 9:50pm

Remove line:

for sf in sfs: sf.close()

before style transfer section.

jfl · April 27, 2018, 10:44pm

I blended my cat + flowers - pretty fun course!

alenas · April 28, 2018, 5:51am

Experimenting with portraits:

sg1791 · April 28, 2018, 11:32am

Hi everyone,

I have a question regarding the architecture that you are using to generate these new pictures. Did anyone try using ResNet, as Jeremy suggested in class?

I am struggling to get any good output there. I have yet to try any other architecture.

Meanwhile, I read this: https://medium.com/mlreview/getting-inception-architectures-to-work-with-style-transfer-767d53475bf8

“Style transfer typically requires neural networks with a well-developed hierarchy of features for calculating the loss. Thus for this purpose, the good old vgg-16 and vgg-19 architectures work very well. But inception architectures (unlike resnet architectures) also have the same property.”

I guess it makes sense. Since ResNet uses skip connections to combine different layers, it does not maintain an interpretable coarse-to-fine feature hierarchy from the last layer to the early layers as VGGNet does. Each layer of ResNet contains both semantic abstraction and spatial details from earlier layers.

Let me know if anyone tries any other network.

Interogativ · April 28, 2018, 7:38pm

Haven’t run into that issue, but the “standard” solution is to do a

git pull

jeremy · April 29, 2018, 3:30am

It’s a really interesting issue. I’m not sure I agree with this post’s reasoning, however. Because ResNet has occasional downsampling layers which don’t have skip connections, it does still have a coarse-to-fine feature hierarchy.

I think the issue may be due to the fully connected layers.

Bodhi94 · April 29, 2018, 3:41pm

Hello,
I created a small blog post about neural style transfer using what I learned from Lesson 13. Please check that out and give me some feedback to improve the content.
Thanks…!!

ranakj · April 29, 2018, 7:36pm

Congratulations on getting the paper published!