Problem with lr_find when using both bidirectional and qrnn

ctn · June 5, 2019, 11:50pm

Hi, as described in the title, I’m facing the following error when enabling both bidirectional and qrnn in the awd_lstm_lm_config file.
RuntimeError: The expanded size of the tensor (1150) must match the existing size (575) at non-singleton dimension 1. Target sizes: [64, 1150]. Tensor sizes: [64, 575]
All other parameters are left at their defaults.

Does fastai support bidirectional qrnn?

sgugger · June 6, 2019, 7:28pm

This bug was recently fixed in master. Note that, as explained in this topic, it is impossible to train a bidirectional language model. You can train a bidirectional classifier on the other hand.

ctn · June 6, 2019, 7:48pm

Thanks for the response. When I click on the link you mentioned, I get ‘Sorry, you don’t have access to that topic!’.
If possible, I would like to understand why it’s impossible to create a bidirectional language model. I also have a few more questions:
1) Would bi_directional qrnn work with a language classifier?
2) A language model with bi_directional LSTM gives no errors, so is the error specific to qrnn?

sgugger · June 6, 2019, 8:07pm

Oh sorry, it’s not public yet; it will be when the latest part 2 of the course is released as a MOOC. A language model has for targets the input shifted by one word: to the left for a forward model, to the right for a backward model. You can’t have both at the same time, hence no bidirectional LM. Note that there won’t be any bug and the code can run, but the model won’t train properly.
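To make the shifted-target point concrete, here is a minimal sketch in plain Python. It is not fastai code: the helper names `forward_lm_pairs` and `backward_lm_pairs` are made up for illustration. It builds the (input, target) pairs for a forward and a backward language model from the same token sequence.

```python
def forward_lm_pairs(tokens):
    """Forward LM: at each position, the target is the next token."""
    return list(zip(tokens[:-1], tokens[1:]))


def backward_lm_pairs(tokens):
    """Backward LM: read the text right to left, so the target is the previous token."""
    rev = tokens[::-1]
    return list(zip(rev[:-1], rev[1:]))


tokens = ["the", "cat", "sat", "down"]
print(forward_lm_pairs(tokens))   # pairs such as ("the", "cat")
print(backward_lm_pairs(tokens))  # pairs such as ("down", "sat")
```

The two directions need different targets for the same input position. A bidirectional encoder would also let each position see the tokens on the side it is supposed to predict, which is presumably why such a model runs without errors but does not train properly.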

For the second point, it means it’s not the same error as before. Could you give the full stack trace?

ctn · June 7, 2019, 5:34pm

RuntimeError Traceback (most recent call last)
in ()
----> 1 learn.lr_find()

~/punchh/sensus/notebooks/chetan/fastai/fastai/train.py in lr_find(learn, start_lr, end_lr, num_it, stop_div, wd)
30 cb = LRFinder(learn, start_lr, end_lr, num_it, stop_div)
31 epochs = int(np.ceil(num_it/len(learn.data.train_dl)))
---> 32 learn.fit(epochs, start_lr, callbacks=[cb], wd=wd)
33
34 def to_fp16(learn:Learner, loss_scale:float=None, max_noskip:int=1000, dynamic:bool=True, clip:float=None,

~/punchh/sensus/notebooks/chetan/fastai/fastai/basic_train.py in fit(self, epochs, lr, wd, callbacks)
198 callbacks = [cb(self) for cb in self.callback_fns + listify(defaults.extra_callback_fns)] + listify(callbacks)
199 if defaults.extra_callbacks is not None: callbacks += defaults.extra_callbacks
--> 200 fit(epochs, self, metrics=self.metrics, callbacks=self.callbacks+callbacks)
201
202 def create_opt(self, lr:Floats, wd:Floats=0.)->None:

~/punchh/sensus/notebooks/chetan/fastai/fastai/basic_train.py in fit(epochs, learn, callbacks, metrics)
99 for xb,yb in progress_bar(learn.data.train_dl, parent=pbar):
100 xb, yb = cb_handler.on_batch_begin(xb, yb)
--> 101 loss = loss_batch(learn.model, xb, yb, learn.loss_func, learn.opt, cb_handler)
102 if cb_handler.on_batch_end(loss): break
103

~/punchh/sensus/notebooks/chetan/fastai/fastai/basic_train.py in loss_batch(model, xb, yb, loss_func, opt, cb_handler)
24 if not is_listy(xb): xb = [xb]
25 if not is_listy(yb): yb = [yb]
---> 26 out = model(*xb)
27 out = cb_handler.on_loss_begin(out)
28

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
491 result = self._slow_forward(*input, **kwargs)
492 else:
--> 493 result = self.forward(*input, **kwargs)
494 for hook in self._forward_hooks.values():
495 hook_result = hook(self, input, result)

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/container.py in forward(self, input)
90 def forward(self, input):
91 for module in self._modules.values():
---> 92 input = module(input)
93 return input
94

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
491 result = self._slow_forward(*input, **kwargs)
492 else:
--> 493 result = self.forward(*input, **kwargs)
494 for hook in self._forward_hooks.values():
495 hook_result = hook(self, input, result)

~/punchh/sensus/notebooks/chetan/fastai/fastai/text/models/awd_lstm.py in forward(self, input, from_embeddings)
116 new_hidden,raw_outputs,outputs = [],[],[]
117 for l, (rnn,hid_dp) in enumerate(zip(self.rnns, self.hidden_dps)):
--> 118 raw_output, new_h = rnn(raw_output, self.hidden[l])
119 new_hidden.append(new_h)
120 raw_outputs.append(raw_output)

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
491 result = self._slow_forward(*input, **kwargs)
492 else:
--> 493 result = self.forward(*input, **kwargs)
494 for hook in self._forward_hooks.values():
495 hook_result = hook(self, input, result)

~/punchh/sensus/notebooks/chetan/fastai/fastai/text/models/qrnn.py in forward(self, inp, hid)
200 if self.bidirectional: inp_bwd = inp.clone()
201 for i, layer in enumerate(self.layers):
--> 202 inp, h = layer(inp, None if hid is None else hid[2*i if self.bidirectional else i])
203 new_hid.append(h)
204 if self.bidirectional:

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
491 result = self._slow_forward(*input, **kwargs)
492 else:
--> 493 result = self.forward(*input, **kwargs)
494 for hook in self._forward_hooks.values():
495 hook_result = hook(self, input, result)

~/punchh/sensus/notebooks/chetan/fastai/fastai/text/models/qrnn.py in forward(self, inp, hid)
148 if self.backward: forget_mult = dispatch_cuda(BwdForgetMultGPU, partial(forget_mult_CPU, backward=True), inp)
149 else: forget_mult = dispatch_cuda(ForgetMultGPU, forget_mult_CPU, inp)
--> 150 c_gate = forget_mult(z_gate, f_gate, hid, self.batch_first)
151 output = torch.sigmoid(o_gate) * c_gate if self.output_gate else c_gate
152 if self.window > 1 and self.save_prev_x:

~/punchh/sensus/notebooks/chetan/fastai/fastai/text/models/qrnn.py in forward(ctx, x, f, hidden_init, batch_first)
59 output = f.new_zeros(batch_size, seq_size + 1, hidden_size)
60
---> 61 if hidden_init is not None: output[:, 0] = hidden_init
62 else: output.zero_()
63 else:

RuntimeError: The expanded size of the tensor (1150) must match the existing size (575) at non-singleton dimension 1. Target sizes: [64, 1150]. Tensor sizes: [64, 575]

This is the full trace. I installed the fastai developer package into my directory and am calling the functions from there.
I’m not sure why the tensor size is being halved.

sgugger · June 7, 2019, 6:01pm

Oh, there was a bug in the way the hidden size was computed in the case of QRNN. It should be fixed now.
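One way to read the sizes in the error is sketched below in plain Python. This is an illustration, not the fastai fix: `per_direction_size` and `can_assign` are hypothetical helpers, and the even split of hidden units across the two directions is an assumption based only on 575 being half of 1150.

```python
def per_direction_size(n_hid, bidir):
    """Hidden units per direction, ASSUMING n_hid is split evenly across directions."""
    n_dir = 2 if bidir else 1
    return n_hid // n_dir


def can_assign(target_shape, source_shape):
    """True if a tensor of source_shape can be copied into a slot of target_shape.

    Follows the usual broadcasting rule on trailing dimensions: each source
    size must equal the target size or be 1.
    """
    if len(source_shape) > len(target_shape):
        return False
    for t, s in zip(reversed(target_shape), reversed(source_shape)):
        if s != t and s != 1:
            return False
    return True


batch_size, n_hid = 64, 1150
slot = (batch_size, n_hid)  # shape of output[:, 0] in the trace
hidden = (batch_size, per_direction_size(n_hid, bidir=True))
print(hidden, can_assign(slot, hidden))  # (64, 575) False
```

Under that assumption, the initial hidden state was sized for one direction (575) while the layer's output buffer used the full size (1150), which matches the "Target sizes: [64, 1150]. Tensor sizes: [64, 575]" message raised at `output[:, 0] = hidden_init`.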

ctn · June 7, 2019, 6:23pm

Now there’s a different error.