I’m looking for tips on how to create a sequence-to-sequence DataBunch that fits well into the existing functionality of the fastai library. My goal is something that works with TextDataBunch, but if there were a way to abstract out the NLP bits, that would be even better. What I’ve tried:
1. Subclass TextDataBunch. I can make this work with a custom create(...) method, but many of the convenient loaders (from_*) expect either class or LM labels.
2. Subclass the item list classes. In theory a SeqTextList(TextList) and something like TextSequence(Text) should be similar to the example covered in the custom item list tutorial, but I haven’t found a way to get the processors to work well with the target sequence.
3. Use a basic Dataset. This was demonstrated in lesson 11, but there are many caveats with this approach in fastai v1.
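To make that last option concrete, here’s a minimal sketch of the plain-Dataset route. The class name and the pre-numericalized token ids are hypothetical; in practice these would come from a fastai tokenizer/numericalizer:

```python
import torch
from torch.utils.data import Dataset

class Seq2SeqDataset(Dataset):
    """Hypothetical minimal dataset: parallel lists of token-id sequences."""
    def __init__(self, src_ids, trg_ids):
        assert len(src_ids) == len(trg_ids)
        self.src_ids, self.trg_ids = src_ids, trg_ids

    def __len__(self):
        return len(self.src_ids)

    def __getitem__(self, i):
        # return an (x, y) pair of LongTensors, as a DataLoader/DataBunch expects
        return torch.tensor(self.src_ids[i]), torch.tensor(self.trg_ids[i])

# dummy data: three "sentences" of made-up token ids
ds = Seq2SeqDataset([[1, 2, 3], [4, 5], [6]], [[7, 8], [9], [10, 11, 12]])
x, y = ds[0]
```

The caveat mentioned above is that batching variable-length pairs like these still needs a padding collate function before fastai v1 will train on it.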
Any and all ideas are appreciated!
I’ve been successfully doing a lot of seq2seq training using a Seq2Seq Dataset and TextLists. I think the only things I had to do to adapt the example code from that lesson were to transpose the inputs in the loss function and in the model’s forward:
import torch
import torch.nn.functional as F

def seq2seq_loss(input, target):
    # fastai yields batch-first targets (bs, sl); the model outputs (sl, bs, nc)
    target = torch.transpose(target, 0, 1).contiguous()
    sl, bs = target.size()
    sl_in, bs_in, nc = input.size()
    if sl > sl_in:
        # pad the sequence dimension of the predictions up to the target length
        input = F.pad(input, (0, 0, 0, 0, 0, sl - sl_in))
    input = input[:sl]
    return F.cross_entropy(input.view(-1, nc), target.view(-1))
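To sanity-check that loss, here’s a quick self-contained run with dummy tensors (the shapes and the 10-token vocab are made up), deliberately making the target longer than the predictions so the padding branch is exercised:

```python
import torch
import torch.nn.functional as F

def seq2seq_loss(input, target):
    # model output: (sl_in, bs, nc); fastai target: (bs, sl), batch-first
    target = torch.transpose(target, 0, 1).contiguous()
    sl, bs = target.size()
    sl_in, bs_in, nc = input.size()
    if sl > sl_in:
        input = F.pad(input, (0, 0, 0, 0, 0, sl - sl_in))
    input = input[:sl]
    return F.cross_entropy(input.view(-1, nc), target.view(-1))

# dummy batch: model predicted 5 steps, target has 7 steps -> padding kicks in
preds = torch.randn(5, 4, 10)          # (sl_in=5, bs=4, nc=10)
targs = torch.randint(0, 10, (4, 7))   # (bs=4, sl=7), batch-first
loss = seq2seq_loss(preds, targs)      # 0-dim scalar tensor
```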
def forward(self, inp, y=None, ret_attn=False):
    # fastai yields batch-first inputs (bs, sl); the RNN runs sequence-first
    inp = torch.transpose(inp, 0, 1)
    ...  # rest of the forward unchanged from the lesson code
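For context on that transpose: fastai v1’s text DataLoader yields batch-first tensors, while the lesson’s RNN modules run sequence-first, so the swap at the top of forward bridges the two conventions. A toy shape check (the sizes are arbitrary):

```python
import torch

bs, sl = 64, 30                        # arbitrary batch size / sequence length
xb = torch.randint(0, 100, (bs, sl))   # batch-first, as fastai's DataLoader yields
inp = torch.transpose(xb, 0, 1)        # sequence-first, as the RNN expects
```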