Lesson 7 further discussion ✅

(Nick Switanek) #16

In the lecture and resnet-mnist notebook, Jeremy’s basic CNN example uses Conv2D, BatchNorm, ReLU sequences. Then the refactor uses fastai’s conv_layer, which wraps a Conv2D, ReLU, BatchNorm sequence.

Is the order of the ReLU and BatchNorm operations not consequential?

I would have thought the composition of the sequence matters, especially the location of the ReLU. Further, I’d have thought that BatchNorm on the raw Conv2D weights rather than on the ReLU-truncated weights would make more sense, in accordance with the basic CNN example, rather than with the conv_layer function.

1 Like

(Jeremy Howard (Admin)) #17

It is somewhat consequential - experiments show that BatchNorm after ReLU is generally a bit better.


(Kofi Asiedu Brempong) #18

When implementing the unet paper, I noticed that in the case of the example in the paper, the features from the encoder are 64x64 and the upsampled features are 56x56.
My question is how do concatenate them, do you pad the upsampled features to 64x64 or do you crop the features from the encoder to 56x56.


(Francisco Ingham) #19

You crop the features from the encoder. Quoting the paper:

Every step in the expansive path consists of an upsampling of the
feature map followed by a 2x2 convolution (“up-convolution”) that halves the
number of feature channels, a concatenation with the correspondingly cropped
feature map from the contracting path, and two 3x3 convolutions, each followed by a ReLU. The cropping is necessary due to the loss of border pixels in
every convolution.

The ‘contracting path’ is the encoder which ‘contracts’/‘encodes’ the input into a representation of smaller dimensions.


(Kofi Asiedu Brempong) #20


1 Like

(Pushkar) #21

Hi All,

I am getting an error while doing learn.fit(40,lr) in lesson7-superres-gan notebook.

NameError Traceback (most recent call last)
----> 1 learn.fit(40,lr)

~/anaconda3/envs/fastai/lib/python3.7/site-packages/fastai/basic_train.py in fit(self, epochs, lr, wd, callbacks)
176 callbacks = [cb(self) for cb in self.callback_fns] + listify(callbacks)
177 fit(epochs, self.model, self.loss_func, opt=self.opt, data=self.data, metrics=self.metrics,
–> 178 callbacks=self.callbacks+callbacks)
180 def create_opt(self, lr:Floats, wd:Floats=0.)->None:

~/anaconda3/envs/fastai/lib/python3.7/site-packages/fastai/utils/mem.py in wrapper(*args, **kwargs)
102 try:
–> 103 return func(*args, **kwargs)
104 except Exception as e:
105 if (“CUDA out of memory” in str(e) or

~/anaconda3/envs/fastai/lib/python3.7/site-packages/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
78 cb_handler = CallbackHandler(callbacks, metrics)
79 pbar = master_bar(range(epochs))
—> 80 cb_handler.on_train_begin(epochs, pbar=pbar, metrics=metrics)
82 exception=False

~/anaconda3/envs/fastai/lib/python3.7/site-packages/fastai/callback.py in on_train_begin(self, epochs, pbar, metrics)
213 self.state_dict[‘n_epochs’],self.state_dict[‘pbar’],self.state_dict[‘metrics’] = epochs,pbar,metrics
214 names = [(met.name if hasattr(met, ‘name’) else camel2snake(met.class.name)) for met in self.metrics]
–> 215 self(‘train_begin’, metrics_names=names)
217 def on_epoch_begin(self)->None:

~/anaconda3/envs/fastai/lib/python3.7/site-packages/fastai/callback.py in call(self, cb_name, call_mets, **kwargs)
199 “Call through to all of the CallbakHandler functions.”
200 if call_mets: [getattr(met, f’on_{cb_name}’)(**self.state_dict, **kwargs) for met in self.metrics]
–> 201 return [getattr(cb, f’on_{cb_name}’)(**self.state_dict, **kwargs) for cb in self.callbacks]
203 def set_dl(self, dl:DataLoader):

~/anaconda3/envs/fastai/lib/python3.7/site-packages/fastai/callback.py in (.0)
199 “Call through to all of the CallbakHandler functions.”
200 if call_mets: [getattr(met, f’on_{cb_name}’)(**self.state_dict, **kwargs) for met in self.metrics]
–> 201 return [getattr(cb, f’on_{cb_name}’)(**self.state_dict, **kwargs) for cb in self.callbacks]
203 def set_dl(self, dl:DataLoader):

~/anaconda3/envs/fastai/lib/python3.7/site-packages/fastai/vision/gan.py in on_train_begin(self, **kwargs)
91 “Create the optimizers for the generator and critic if necessary, initialize smootheners.”
92 if not getattr(self,‘opt_gen’,None):
—> 93 self.opt_gen = self.opt.new([nn.Sequential(*flatten_model(self.generator))])
94 else: self.opt_gen.lr,self.opt_gen.wd = self.opt.lr,self.opt.wd
95 if not getattr(self,‘opt_critic’,None):

~/anaconda3/envs/fastai/lib/python3.7/site-packages/fastai/callback.py in new(self, layer_groups)
28 “Create a new OptimWrapper from self with another layer_groups but the same hyper-parameters.”
29 opt_func = getattr(self, ‘opt_func’, self.opt.class)
—> 30 split_groups = split_bn_bias(layer_groups)
31 opt = opt_func([{‘params’: trainable_params(l), ‘lr’:0} for l in split_groups])
32 return self.create(opt_func, self.lr, layer_groups, wd=self.wd, true_wd=self.true_wd, bn_wd=self.bn_wd)

NameError: name ‘split_bn_bias’ is not defined

I looked at the fastai source code, and it does not seem to be defined anywhere.
However, it is defined in the old fastai source code.

Anyone else face the same issue?


Developer chat
(Pierre Guillou) #22

Hi @pushkarneo, I’ve got the same error with fastai 1.0.43. Did you solve it?


(Pushkar) #23

Hi @pierreguillou, I did not.
I couldn’t spend much time on it.
But since the code for that function is present in old source code.
Try copying and pasting it in the new source code and running.

If you do try the above, lemme know how it goes.


(Pierre Guillou) #24

Hi @pushkarneo, @sgugger made the correction (cf post).



Hi, I would like to recheck with you the Model0 from the Human Numbers resource.

The code for Model0 on video differs from the code in Jupiter notebook with a tiny bit:

  • if x.shape[0]>1: (in video)
  • if x.shape[1]>1: (in Jupiter notebook)

As it turns out, I may set if True:, or I may completely remove the if branches and the result will be the same.

I created the counter to check how many times we enter the branch like this:

class Model0(nn.Module):
    def __init__(self):
        self.i_h = nn.Embedding(nv,nh)  # green arrow
        self.h_h = nn.Linear(nh,nh)     # brown arrow
        self.h_o = nn.Linear(nh,nv)     # blue arrow
        self.bn = nn.BatchNorm1d(nh)
        self.counter0 =0
        self.counter1 =0
        self.counter2 =0
    def forward(self, x):
        self.counter0 +=1
        h = self.bn(F.relu(self.i_h(x[:,0])))
        if x.shape[0]>1:            
            h = h + self.i_h(x[:,1])            
            h = self.bn(F.relu(self.h_h(h)))
        if x.shape[0]>2:
            h = h + self.i_h(x[:,2])
            h = self.bn(F.relu(self.h_h(h)))
        return self.h_o(h)

As it turns out these lines:


Will return:

torch.Size([64, 3])

Any feedback?


(Pierre Guillou) #26

Hi @jeremy. In the lesson 7, you show in the lesson7-wgan.ipynb notebook how to generate fake images of bathroom by training a WGAN.

The training set you use has 303125 images and you train your GAN within 30 epochs with a lr of 2e-4.

I did try to use the exact same code with mango images from ImageNet dataset that has only 1305 images (and about 500 after cleaning).

However, even after 100 epochs, the result is bad. I guess my issue is the size of my training dataset?
With you experience, what would be the minimum size for the training dataset of a WGAN? And how to choose the right lr? Thank you.

After 100 epochs (lr = 2e-4)

After 100 epochs (lr = 2e-3)




Any feedback on this tiny little issue?


(Thomas) #28

I would like to know how to use the models exposed in the human-numbers notebook to generate predictions.
As I can see, the batch_size is hard coded in the model, so If I want to predict a batch of size 1 (one prediction) it is not possible.
I am building an equivalent model to forecast time series, so given the N past values I would like to predict value N+1



I’m curious to know if anyone tried to pass that black hole picture through a superres network…


(Andreas) #30

I tried it, and the result is not too impressive:

However, it makes sense. The loss function was based on a pretrained ResNET34, where the training data didn’t have a lot of black hole images (or anything remotely similar). Training a model on pictures of planets, stars, etc. instead of cats might help too :stuck_out_tongue:

(It did however work extremely well on cats.)



Thanks for trying!


(Divyanshu Sharma) #32

Has any one used models from this GAN zoo https://github.com/eriklindernoren/PyTorch-GAN ? Is the implementation reliable and error free ?




In the notebook lesson7_superres, I am looking at the gram_matrix function output and have a hard time understanding why a certain result occurs on the diagonal. If I have a single unit vectors v_1 and and I do it’s inner product with itself <v_1, v_1>=cos\theta, knowing that the angle is 0, the value should be 1, mainly, the diagonal should be full of ones. Why is that not the case ?

def gram_matrix(x):
    n,c,h,w = x.size()
    x = x.view(n, c, -1)
    return (x @ x.transpose(1,2))/(c*h*w)

> tensor([[[0.0759, 0.0711, 0.0643],
           [0.0711, 0.0672, 0.0614],
           [0.0643, 0.0614, 0.0573]],

          [[0.0759, 0.0711, 0.0643],
           [0.0711, 0.0672, 0.0614],
           [0.0643, 0.0614, 0.0573]]])

Thank you in advance for your help :slight_smile:



sorry didnt look at the competition, but reading Jeremy’s comment…can spark be helpful in this situation? downsample?



Does anyone know where the lesson 7 Kaggle notebooks for superres, superres-imagenet, superres-gan and wgan are?