I have a college assignment to train a language model and then use it for regression, and I've been having problems transferring the trained model to the regression task. After a day of going through the text/learner.py code, I understood that we are supposed to batch the whole input sentence in a single batch and pass it to the new tail network we attach for regression. I don't want to use PoolingLinearClassifier for now, since I don't understand it well yet.
In Lesson 12 Jeremy talks about this, and I found the SentenceEncoder code, which is very similar to the MultiBatchEncoder in the source repository. This version kind of worked in my code because of the different way it concatenates at the end of the forward method. I still don't understand how the concat method of SentenceEncoder works, or how the pad_tensor method works. Another thing: since we concatenate all the texts together (appending xxbos and xxeos), wouldn't the input size always be bptt? I don't understand how it can return sl, which I'm assuming is the sentence length. Is batching for classification done differently?
def pad_tensor(t, bs, val=0.):
    # if t's first dimension is shorter than bs, append rows of val
    # (val + zeros == val) until it reaches bs
    if t.size(0) < bs:
        return torch.cat([t, val + t.new_zeros(bs - t.size(0), *t.shape[1:])])
    return t
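As far as I can trace it, pad_tensor just tops up a tensor's first dimension to bs with val; here's a quick toy check (the shapes are my own example):

import torch

t = torch.ones(2, 3)
padded = pad_tensor(t, bs=4)   # two rows of zeros appended along dim 0
print(padded.shape)            # torch.Size([4, 3])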
class SentenceEncoder(nn.Module):
    def __init__(self, module, bptt, pad_idx=1):
        super().__init__()
        self.bptt, self.module, self.pad_idx = bptt, module, pad_idx

    def concat(self, arrs, bs):
        # arrs holds one output per bptt chunk; for each index si, pad the
        # pieces up to bs and re-join them along dim 0
        return [torch.cat([pad_tensor(l[si], bs) for l in arrs], dim=0)
                for si in range(len(arrs[0]))]

    def forward(self, input):
        bs, sl = input.size()
        self.module.bs = bs
        self.module.reset()
        outputs = []
        # feed the full sequence through the encoder bptt tokens at a time,
        # keeping the hidden state across chunks
        for i in range(0, sl, self.bptt):
            o = self.module(input[:, i:min(i + self.bptt, sl)])
            outputs.append(o)
        ops = self.concat(outputs, bs)
        return torch.stack(ops)
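To check my reading of the shapes, I traced it with a dummy stand-in for the encoder (DummyEncoder and all the sizes below are my own made-up example; I'm assuming the wrapped module returns a (bs, chunk_len, hidden) tensor per chunk):

import torch
import torch.nn as nn

class DummyEncoder(nn.Module):
    def __init__(self, hidden=50):
        super().__init__()
        self.hidden = hidden
    def reset(self):   # SentenceEncoder calls this at every forward
        pass
    def forward(self, x):   # x: (bs, chunk_len) of token ids
        return torch.zeros(x.size(0), x.size(1), self.hidden)

enc = SentenceEncoder(DummyEncoder(), bptt=70)
x = torch.zeros(64, 134, dtype=torch.long)   # bs=64, sl=134 -> chunks of 70 and 64
print(enc(x).shape)                          # torch.Size([64, 134, 50])

So sl comes back because concat re-joins each sample's bptt chunks along the time dimension: the loop feeds the encoder bptt tokens at a time, it never forces the whole input to be bptt long. (One thing I did notice: pad_tensor pads each chunk's first dimension up to bs, so a final chunk shorter than bs gets padded and the output can end up longer than sl.)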
Now what I don't understand is that there is no mention of a max_len for the input, so how do we fix the input size of the next Linear layer?
My code for the regression model is:
class RegModel(nn.Module):
    def __init__(self, learn_lm, y_range=[-0.5, 3.5]):
        super(RegModel, self).__init__()
        self.y_range = y_range   # forward needs this, it was never stored before
        self.encoder = SentenceEncoder(learn_lm.model.encoder, learn_lm.data.bptt)
        layers = [1200, 50, 1]
        ps = [0.12, 0.1]
        # I'll add more layers for this part once this starts to work
        self.plc = nn.Sequential(
            nn.Linear(134*64, 1),
            # the shape of self.encoder(x) was (64*134) on the first
            # iteration, so I used that. Still gives an error.
        )

    def forward(self, x):
        x = self.encoder(x)
        x = self.plc(x)
        # scale the sigmoid output into y_range (my targets lie in [0, 3])
        x = torch.sigmoid(x)
        x = x * (self.y_range[1] - self.y_range[0]) + self.y_range[0]
        return x
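In case it helps frame the question: the only way I can see to make the head independent of sl (and I think this is essentially what PoolingLinearClassifier is doing) is to pool over the time dimension, so the Linear layer's input size depends only on the hidden size. Here's a rough sketch of what I mean, with my own made-up names, assuming the encoder output is (bs, sl, hidden):

import torch
import torch.nn as nn

class PooledHead(nn.Module):
    def __init__(self, hidden=50, y_range=(-0.5, 3.5)):
        super().__init__()
        self.y_range = y_range
        # last step + mean-pool + max-pool -> 3*hidden features,
        # whatever sl turns out to be
        self.lin = nn.Linear(3 * hidden, 1)

    def forward(self, x):   # x: (bs, sl, hidden)
        pooled = torch.cat([x[:, -1], x.mean(dim=1), x.max(dim=1)[0]], dim=1)
        out = torch.sigmoid(self.lin(pooled))
        return out * (self.y_range[1] - self.y_range[0]) + self.y_range[0]

With concat pooling like this, the nn.Linear size is fixed by hidden alone and no max_len is needed, but I'm not sure that's how it's meant to be done here.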
This is my encoder:
LabEncoder(
  (rnn): my_gru(
    (rnn): GRU(50, 50, batch_first=True)
  )
  (encoder): Sequential(
    (0): Embedding(7400, 50)
    (1): my_gru(
      (rnn): GRU(50, 50, batch_first=True)
    )
  )
)
Thanks a lot