Lesson 7 further discussion ✅

pushkarneo · February 13, 2019, 8:36am

Hi All,

I am getting an error when running learn.fit(40,lr) in the lesson7-superres-gan notebook.

NameError Traceback (most recent call last)
in
----> 1 learn.fit(40,lr)

~/anaconda3/envs/fastai/lib/python3.7/site-packages/fastai/basic_train.py in fit(self, epochs, lr, wd, callbacks)
176 callbacks = [cb(self) for cb in self.callback_fns] + listify(callbacks)
177 fit(epochs, self.model, self.loss_func, opt=self.opt, data=self.data, metrics=self.metrics,
--> 178 callbacks=self.callbacks+callbacks)
179
180 def create_opt(self, lr:Floats, wd:Floats=0.)->None:

~/anaconda3/envs/fastai/lib/python3.7/site-packages/fastai/utils/mem.py in wrapper(*args, **kwargs)
101
102 try:
--> 103 return func(*args, **kwargs)
104 except Exception as e:
105 if ("CUDA out of memory" in str(e) or

~/anaconda3/envs/fastai/lib/python3.7/site-packages/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
78 cb_handler = CallbackHandler(callbacks, metrics)
79 pbar = master_bar(range(epochs))
---> 80 cb_handler.on_train_begin(epochs, pbar=pbar, metrics=metrics)
81
82 exception=False

~/anaconda3/envs/fastai/lib/python3.7/site-packages/fastai/callback.py in on_train_begin(self, epochs, pbar, metrics)
213 self.state_dict['n_epochs'],self.state_dict['pbar'],self.state_dict['metrics'] = epochs,pbar,metrics
214 names = [(met.name if hasattr(met, 'name') else camel2snake(met.__class__.__name__)) for met in self.metrics]
--> 215 self('train_begin', metrics_names=names)
216
217 def on_epoch_begin(self)->None:

~/anaconda3/envs/fastai/lib/python3.7/site-packages/fastai/callback.py in __call__(self, cb_name, call_mets, **kwargs)
199 "Call through to all of the CallbakHandler functions."
200 if call_mets: [getattr(met, f'on_{cb_name}')(**self.state_dict, **kwargs) for met in self.metrics]
--> 201 return [getattr(cb, f'on_{cb_name}')(**self.state_dict, **kwargs) for cb in self.callbacks]
202
203 def set_dl(self, dl:DataLoader):

~/anaconda3/envs/fastai/lib/python3.7/site-packages/fastai/callback.py in <listcomp>(.0)
199 "Call through to all of the CallbakHandler functions."
200 if call_mets: [getattr(met, f'on_{cb_name}')(**self.state_dict, **kwargs) for met in self.metrics]
--> 201 return [getattr(cb, f'on_{cb_name}')(**self.state_dict, **kwargs) for cb in self.callbacks]
202
203 def set_dl(self, dl:DataLoader):

~/anaconda3/envs/fastai/lib/python3.7/site-packages/fastai/vision/gan.py in on_train_begin(self, **kwargs)
91 "Create the optimizers for the generator and critic if necessary, initialize smootheners."
92 if not getattr(self,'opt_gen',None):
---> 93 self.opt_gen = self.opt.new([nn.Sequential(*flatten_model(self.generator))])
94 else: self.opt_gen.lr,self.opt_gen.wd = self.opt.lr,self.opt.wd
95 if not getattr(self,'opt_critic',None):

~/anaconda3/envs/fastai/lib/python3.7/site-packages/fastai/callback.py in new(self, layer_groups)
28 "Create a new OptimWrapper from self with another layer_groups but the same hyper-parameters."
29 opt_func = getattr(self, 'opt_func', self.opt.__class__)
---> 30 split_groups = split_bn_bias(layer_groups)
31 opt = opt_func([{'params': trainable_params(l), 'lr':0} for l in split_groups])
32 return self.create(opt_func, self.lr, layer_groups, wd=self.wd, true_wd=self.true_wd, bn_wd=self.bn_wd)

NameError: name 'split_bn_bias' is not defined

I looked at the fastai source code, and it does not seem to be defined anywhere.
However, it is defined in the old fastai source code.

Has anyone else faced the same issue?

pierreguillou · February 14, 2019, 7:00pm

Hi @pushkarneo, I’ve got the same error with fastai 1.0.43. Did you solve it?

pushkarneo · February 15, 2019, 10:41am

Hi @pierreguillou, I did not.
I couldn’t spend much time on it.
But since the code for that function is present in the old source code, you could try copying it into the new source code and running it.

If you do try the above, lemme know how it goes.

pierreguillou · February 16, 2019, 6:02pm

Hi @pushkarneo, @sgugger made the correction (cf post).

prosti · February 23, 2019, 8:39pm

Hi, I would like to recheck with you the Model0 from the Human Numbers resource.

The code for Model0 in the video differs slightly from the code in the Jupyter notebook:

if x.shape[0]>1: (in video)
if x.shape[1]>1: (in Jupyter notebook)

As it turns out, I can set the condition to if True:, or remove the if branches completely, and the result is the same.

I added counters to check how many times we enter each branch, like this:

import torch.nn as nn
import torch.nn.functional as F

# nv (vocabulary size) and nh (hidden size) are defined earlier in the notebook.
class Model0(nn.Module):
    def __init__(self):
        super().__init__()
        self.i_h = nn.Embedding(nv,nh)  # green arrow
        self.h_h = nn.Linear(nh,nh)     # brown arrow
        self.h_o = nn.Linear(nh,nv)     # blue arrow
        self.bn = nn.BatchNorm1d(nh)
        # counter0 counts forward calls; counter1 and counter2 count branch entries
        self.counter0 = 0
        self.counter1 = 0
        self.counter2 = 0

    def forward(self, x):
        self.counter0 += 1

        h = self.bn(F.relu(self.i_h(x[:,0])))
        if x.shape[0]>1:
            self.counter1 += 1
            h = h + self.i_h(x[:,1])
            h = self.bn(F.relu(self.h_h(h)))

        if x.shape[0]>2:
            self.counter2 += 1
            h = h + self.i_h(x[:,2])
            h = self.bn(F.relu(self.h_h(h)))
        return self.h_o(h)

As it turns out these lines:

print(x.shape)
print(m.counter0)
print(m.counter1)
print(m.counter2)

Will return:

torch.Size([64, 3])
1974
1974
1974

Any feedback?
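
One possible reading of these counters, sketched in plain Python: with the printed shape torch.Size([64, 3]), index 0 is the batch size and index 1 is the number of tokens per row, so both versions of the condition are true on every full batch. The tuple `shape` and the helper `branches_taken` below are illustrative stand-ins, not part of the notebook.

```python
def branches_taken(shape, dim):
    """Return which of the two `if` branches in Model0.forward would run
    when the conditions test shape[dim] > 1 and shape[dim] > 2."""
    return shape[dim] > 1, shape[dim] > 2

shape = (64, 3)  # (batch size, tokens per row), as printed above

video = branches_taken(shape, dim=0)     # x.shape[0] > 1, x.shape[0] > 2
notebook = branches_taken(shape, dim=1)  # x.shape[1] > 1, x.shape[1] > 2
print(video, notebook)

# The two versions only diverge for unusual inputs, e.g. a batch of one row.
single_row = branches_taken((1, 3), dim=0), branches_taken((1, 3), dim=1)
print(single_row)
```

Under this reading, the x.shape[1] check is the one that guards against indexing a column that does not exist, which would explain why all three counters are equal here.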

pierreguillou · February 25, 2019, 5:59pm

Hi @jeremy. In lesson 7, you show in the lesson7-wgan.ipynb notebook how to generate fake images of bedrooms by training a WGAN.

The training set you use has 303125 images and you train your GAN for 30 epochs with a lr of 2e-4.

I tried the exact same code with mango images from the ImageNet dataset, which has only 1305 of them (and about 500 after cleaning).

However, even after 100 epochs, the result is bad. I guess my issue is the size of my training dataset?
With your experience, what would be the minimum size for the training dataset of a WGAN? And how should I choose the right lr? Thank you.

[Image: results after 100 epochs (lr = 2e-4)]

[Image: results after 100 epochs (lr = 2e-3)]

[Image: Databunch]

prosti · March 5, 2019, 4:32pm

Any feedback on this tiny little issue?

tcapelle · April 3, 2019, 11:54am

I would like to know how to use the models exposed in the human-numbers notebook to generate predictions.
As far as I can see, the batch size is hard-coded in the model, so if I want to predict on a batch of size 1 (one prediction), it is not possible.
I am building an equivalent model to forecast time series, so given the N past values I would like to predict value N+1.

Seb · April 12, 2019, 1:55pm

I’m curious to know if anyone tried to pass that black hole picture through a superres network…

andreasl · April 15, 2019, 7:11am

I tried it, and the result is not too impressive:

However, it makes sense. The loss function was based on a pretrained ResNet34, where the training data didn’t have a lot of black hole images (or anything remotely similar). Training a model on pictures of planets, stars, etc. instead of cats might help too.

(It did however work extremely well on cats.)

Seb · April 15, 2019, 12:12pm

Thanks for trying!

hotessy · April 20, 2019, 6:35pm

Has anyone used models from this GAN zoo https://github.com/eriklindernoren/PyTorch-GAN? Is the implementation reliable and error-free?

yohan · April 21, 2019, 4:50pm

Hi,

In the notebook lesson7_superres, I am looking at the gram_matrix function output and have a hard time understanding why a certain result occurs on the diagonal. If I have a unit vector v_1 and take its inner product with itself, <v_1, v_1>=cos\theta, then since the angle is 0 the value should be 1; that is, the diagonal should be all ones. Why is that not the case?

def gram_matrix(x):
    n,c,h,w = x.size()
    x = x.view(n, c, -1)
    return (x @ x.transpose(1,2))/(c*h*w)

gram_matrix(t)
> tensor([[[0.0759, 0.0711, 0.0643],
           [0.0711, 0.0672, 0.0614],
           [0.0643, 0.0614, 0.0573]],

          [[0.0759, 0.0711, 0.0643],
           [0.0711, 0.0672, 0.0614],
           [0.0643, 0.0614, 0.0573]]])

Thank you in advance for your help
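
A small sketch of why the diagonal is not all ones, written in plain Python so it runs without PyTorch: each diagonal entry of the Gram matrix is the squared length of a flattened channel divided by c*h*w, and it only becomes cos θ = 1 if the channels are first scaled to unit length. The toy `channels` values and the helper names are made up for illustration.

```python
import math

def gram(rows, scale):
    """Pairwise dot products of the rows, divided by `scale`."""
    return [[sum(a * b for a, b in zip(r1, r2)) / scale for r2 in rows]
            for r1 in rows]

def to_unit(row):
    """Rescale a row to Euclidean length 1."""
    norm = math.sqrt(sum(a * a for a in row))
    return [a / norm for a in row]

# Three flattened channels of a toy 2x2 image (c=3, h*w=4); values are arbitrary.
channels = [[0.2, 0.4, 0.1, 0.3],
            [0.3, 0.3, 0.2, 0.2],
            [0.1, 0.5, 0.2, 0.2]]
c, hw = len(channels), len(channels[0])

# Same scaling as gram_matrix above: the diagonal is |v|^2 / (c*h*w).
raw = gram(channels, scale=c * hw)

# With unit-length rows and no extra scaling, the entries are cosines.
cosine = gram([to_unit(r) for r in channels], scale=1)

print([round(raw[i][i], 4) for i in range(c)])
print([round(cosine[i][i], 4) for i in range(c)])
```

In other words, the activations passed to gram_matrix are not unit vectors, so the diagonal reports their (scaled) squared magnitudes rather than the cosine of a zero angle.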

angelinayy · April 25, 2019, 10:52pm

Sorry, I didn’t look at the competition, but reading Jeremy’s comment… could Spark be helpful in this situation? Or downsampling?

Nitron · May 10, 2019, 5:23pm

Does anyone know where the lesson 7 Kaggle notebooks for superres, superres-imagenet, superres-gan and wgan are?

leviritchie · May 17, 2019, 6:28pm

I suspect, beyond just regular pictures of planets and stars, you’d need a large set of high-resolution images of celestial objects taken with the same kind of black hole spectroscopy. Especially gaseous objects, since I think that’s what we’re seeing in the black hole picture. Then, rather than blurring them with image compression, you’d need a function that approximates the blur of objects in space across a vast distance.

Unfortunately, I’m a few million bucks shy of a good radio telescope.

eljas1 · June 12, 2019, 12:42pm

Edit: Duh, I feel stupid. Everything worked all along and the picture was just within an iterator. The correct output is in pred_img[0].

What would be the simplest way to run inference on the Superres GAN with new images after it has been trained? I have tried feeding a new image with the learn.predict method but the outcome doesn’t seem to make sense or I’m using it wrong. I’m unable to view the outcome as an image. Below is what I’ve tried so far:

Train learner

lr = 1e-4
bs,size=32, 128
switcher = partial(AdaptiveGANSwitcher, critic_thresh=0.65)
learn = GANLearner.from_learners(learn_gen, learn_crit, weights_gen=(1.,50.), show_img=True, switcher=switcher,
                                 opt_func=partial(optim.Adam, betas=(0.,0.99)), wd=wd)
learn.callback_fns.append(partial(GANDiscriminativeLR, mult_lr=5.))

learn.fit(4, lr/2)

epoch	train_loss	valid_loss	gen_loss
0	1.525757	1.570713	04:49
1	1.394629	1.828694	04:51
2	1.385353	1.615942	04:50
3	1.397209	1.349061	04:51

Switch to generative mode

learn.gan_trainer.switch(gen_mode=True)

Open new image

infer_img = open_image('infer.png')
infer_img.shape

torch.Size([3, 224, 224])

Try inference but fail

pred_img = learn.predict(infer_img)
pred_img.shape

AttributeError: 'tuple' object has no attribute 'shape'

Investigate the output

pred_img

Image (3, 128, 128),
tensor([[[1.0036, 1.0291, 1.0049, …, 1.0012, 1.0105, 0.9864],
[1.0247, 1.0235, 0.9945, …, 0.9945, 1.0167, 1.0068],
[1.0018, 1.0116, 1.0061, …, 0.9979, 0.9967, 0.9928],
…,

pred_img = fastai.vision.Image(pred_img)
pred_img.show()

AttributeError: 'tuple' object has no attribute 'cpu'

Any help is appreciated!
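
As the edit at the top of this post says, the image is the first element of what learn.predict returns. A minimal sketch of that unpacking, using a hypothetical stand-in function in place of a trained learner (the three-element layout is an assumption, based on the three-way unpacking of learn.predict used elsewhere in this thread):

```python
def fake_predict(item):
    """Hypothetical stand-in for learn.predict: returns a tuple rather than
    a single image. The element names here are illustrative only."""
    image_like = {"kind": "Image", "shape": (3, 128, 128)}
    raw_tensor = [[1.0036, 1.0291], [1.0247, 1.0235]]
    extra = raw_tensor
    return image_like, raw_tensor, extra

pred = fake_predict("infer.png")

# Asking the tuple itself for image attributes fails, as in the errors above.
tuple_has_shape = hasattr(pred, "shape")

# Index (or unpack) first, then work with the image object.
pred_img = pred[0]
print(tuple_has_shape, pred_img["shape"])
```

With a real learner this would presumably correspond to pred_img = learn.predict(infer_img)[0], followed by pred_img.show().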

umba3abp · June 16, 2019, 8:10pm

Hey guys - I’ve got an interesting question beyond the exercise. I noticed that GANs require much more compute than let’s say CV or NLP. I’m curious, what are the most compute-hungry ML tasks? Would GANs be the top one? Followed by what?

Just being curious:)

xnet · June 17, 2019, 5:13pm

Is there a way to “un-normalize” an image’s prediction (for the superresolution model)?

Here’s what I have.

# load_img is my custom function using openCV that outputs a numpy array
img_ori_1 = load_img(tst_data.train_ds.x.items[i])
img_ori_2 = load_img(tst_data.train_ds.y.items[i])

# Make predictions
p, img_pred_1, b = learn.predict(tst_data.train_ds[i][0])
p, img_pred_2, b = learn.predict(tst_data.train_ds[i][1])

# Permute axis and convert to numpy
img_pred_1 = img_pred_1.permute(1, 2, 0).numpy()
img_pred_2 = img_pred_2.permute(1, 2, 0).numpy()

After this, the img_pred_1 values are normalized and cannot be compared with my img_ori_1 values. How do I “un-imagenet-normalize” my predicted images? Thanks!
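
If the predictions really are still ImageNet-normalized (an assumption; whether learn.predict already undoes this can depend on the fastai version), the normalization can be inverted per channel with x * std + mean. The sketch below uses the standard ImageNet channel statistics and plain Python on a single RGB pixel; the helper names are illustrative.

```python
# Standard ImageNet per-channel statistics (RGB order, pixel values in [0, 1]).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)

def normalize(rgb):
    """Forward step applied before the model: (x - mean) / std per channel."""
    return tuple((x - m) / s
                 for x, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD))

def denormalize(rgb):
    """Inverse step: x * std + mean per channel."""
    return tuple(x * s + m
                 for x, s, m in zip(rgb, IMAGENET_STD, IMAGENET_MEAN))

pixel = (0.8, 0.5, 0.2)
restored = denormalize(normalize(pixel))
print(restored)
```

For an (H, W, 3) NumPy array such as img_pred_1 after the permute, the same formula is img * std + mean with std and mean as length-3 arrays, which broadcasts over the last axis. If the originals were loaded by OpenCV as BGR uint8, the channel order and the 0-255 range would also need to match before comparing.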

MaiSaid · June 28, 2019, 10:36pm

Hi,

I am a little confused about GANs. When should I use freeze instead of unfreeze?

Thank you,