I have a college assignment to train a language model and then use it for regression, and I've been having problems transferring the trained model to the regression task. After a day of going through the text/learner.py code, I understood that we are supposed to batch the whole input sentence in a single batch and pass it to the new tail network we attach for regression. I don't want to use PoolingLinearClassifier for now, since I don't understand it well yet.
In Lesson 12 Jeremy talks about this, and I found the SentenceEncoder code, which is very similar to the MultiBatchEncoder in the source repository. This version kind of worked in my code because of the different way it concatenates at the end of the forward method. I still don't understand how the concat method of SentenceEncoder works, or how the pad_tensor method works. Another thing: since we concatenate all the texts together (appending xxbos and xxeos), wouldn't the input size always be bptt? I don't understand how it can return sl, which I'm assuming is the sentence length. Is batching for classification done differently?
def pad_tensor(t, bs, val=0.):
    # if t's first dimension is shorter than bs, append rows of val
    # (val + zeros == val) until it reaches bs
    if t.size(0) < bs:
        return torch.cat([t, val + t.new_zeros(bs - t.size(0), *t.shape[1:])])
    return t
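As far as I can trace it, pad_tensor just tops up a tensor's first dimension to bs with val; here's a quick toy check (the shapes are my own example):

import torch

t = torch.ones(2, 3)
padded = pad_tensor(t, bs=4)   # two rows of zeros appended along dim 0
print(padded.shape)            # torch.Size([4, 3])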
class SentenceEncoder(nn.Module):
    def __init__(self, module, bptt, pad_idx=1):
        super().__init__()
        self.bptt, self.module, self.pad_idx = bptt, module, pad_idx

    def concat(self, arrs, bs):
        # arrs holds one output per bptt chunk; for each index si, pad the
        # pieces up to bs and re-join them along dim 0
        return [torch.cat([pad_tensor(l[si], bs) for l in arrs], dim=0)
                for si in range(len(arrs[0]))]

    def forward(self, input):
        bs, sl = input.size()
        self.module.bs = bs
        self.module.reset()
        outputs = []
        # feed the full sequence through the encoder bptt tokens at a time,
        # keeping the hidden state across chunks
        for i in range(0, sl, self.bptt):
            o = self.module(input[:, i:min(i + self.bptt, sl)])
            outputs.append(o)
        ops = self.concat(outputs, bs)
        return torch.stack(ops)
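To check my reading of the shapes, I traced it with a dummy stand-in for the encoder (DummyEncoder and all the sizes below are my own made-up example; I'm assuming the wrapped module returns a (bs, chunk_len, hidden) tensor per chunk):

import torch
import torch.nn as nn

class DummyEncoder(nn.Module):
    def __init__(self, hidden=50):
        super().__init__()
        self.hidden = hidden
    def reset(self):   # SentenceEncoder calls this at every forward
        pass
    def forward(self, x):   # x: (bs, chunk_len) of token ids
        return torch.zeros(x.size(0), x.size(1), self.hidden)

enc = SentenceEncoder(DummyEncoder(), bptt=70)
x = torch.zeros(64, 134, dtype=torch.long)   # bs=64, sl=134 -> chunks of 70 and 64
print(enc(x).shape)                          # torch.Size([64, 134, 50])

So sl comes back because concat re-joins each sample's bptt chunks along the time dimension: the loop feeds the encoder bptt tokens at a time, it never forces the whole input to be bptt long. (One thing I did notice: pad_tensor pads each chunk's first dimension up to bs, so a final chunk shorter than bs gets padded and the output can end up longer than sl.)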
Now what I don't understand is that there is no mention of a max_len for the input, so how do we fix the input size of the next Linear layer?
My code for the regression model is:
class RegModel(nn.Module):
    def __init__(self, learn_lm, y_range=[-0.5, 3.5]):
        super(RegModel, self).__init__()
        self.y_range = y_range   # forward needs this, it was never stored before
        self.encoder = SentenceEncoder(learn_lm.model.encoder, learn_lm.data.bptt)
        layers = [1200, 50, 1]
        ps = [0.12, 0.1]
        # I'll add more layers for this part once this starts to work
        self.plc = nn.Sequential(
            nn.Linear(134*64, 1),
            # the shape of self.encoder(x) was (64*134) on the first
            # iteration, so I used that. Still gives an error.
        )

    def forward(self, x):
        x = self.encoder(x)
        x = self.plc(x)
        # scale the sigmoid output into y_range (my targets lie in [0, 3])
        x = torch.sigmoid(x)
        x = x * (self.y_range[1] - self.y_range[0]) + self.y_range[0]
        return x
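In case it helps frame the question: the only way I can see to make the head independent of sl (and I think this is essentially what PoolingLinearClassifier is doing) is to pool over the time dimension, so the Linear layer's input size depends only on the hidden size. Here's a rough sketch of what I mean, with my own made-up names, assuming the encoder output is (bs, sl, hidden):

import torch
import torch.nn as nn

class PooledHead(nn.Module):
    def __init__(self, hidden=50, y_range=(-0.5, 3.5)):
        super().__init__()
        self.y_range = y_range
        # last step + mean-pool + max-pool -> 3*hidden features,
        # whatever sl turns out to be
        self.lin = nn.Linear(3 * hidden, 1)

    def forward(self, x):   # x: (bs, sl, hidden)
        pooled = torch.cat([x[:, -1], x.mean(dim=1), x.max(dim=1)[0]], dim=1)
        out = torch.sigmoid(self.lin(pooled))
        return out * (self.y_range[1] - self.y_range[0]) + self.y_range[0]

With concat pooling like this, the nn.Linear size is fixed by hidden alone and no max_len is needed, but I'm not sure that's how it's meant to be done here.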
This is my encoder:
LabEncoder(
  (rnn): my_gru(
    (rnn): GRU(50, 50, batch_first=True)
  )
  (encoder): Sequential(
    (0): Embedding(7400, 50)
    (1): my_gru(
      (rnn): GRU(50, 50, batch_first=True)
    )
  )
)
Thanks a lot