Custom (Tensor + non-Tensor) dataloader

koenvdv · August 12, 2019, 4:29pm

Hi everyone!

I am using fastai v1 for a semantic parsing task. The task is essentially seq2seq, but the output sequence is a query. For every input example, I extract the entities in the text and use them to build a “Language object”. This object defines all valid transitions in the output space for that example. To prune the output space, I need to be able to sample these objects together with their corresponding input texts in the dataloader. To make this work, however, I ended up changing a lot of core functionality, such as proc_batch and to_cpu, to stop fastai from calling tensor functions on my non-tensor language objects. This does not feel very elegant, and I was wondering if anyone has ideas for a cleaner approach that does not throw away all the cool fastai features.
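
To make the setup concrete, here is a minimal sketch of the kind of batch I am after, written with plain PyTorch rather than fastai. The names Language, ParsingDataset, and collate_with_languages are hypothetical stand-ins (my real language object is more involved), and the token ids are made up. It only shows the batch shape I want; it does not touch the fastai side (proc_batch, to_cpu).

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader, Dataset


class Language:
    """Hypothetical stand-in for the per-example language object."""

    def __init__(self, entities):
        self.entities = entities


class ParsingDataset(Dataset):
    """Pairs each tokenized input text with its (non-tensor) Language."""

    def __init__(self, token_ids, languages):
        self.token_ids = token_ids
        self.languages = languages

    def __len__(self):
        return len(self.token_ids)

    def __getitem__(self, idx):
        return torch.tensor(self.token_ids[idx]), self.languages[idx]


def collate_with_languages(samples):
    """Pad the token tensors; pass the Language objects through as a list."""
    tokens, languages = zip(*samples)
    # Assumption: 0 is the padding index.
    padded = pad_sequence(list(tokens), batch_first=True, padding_value=0)
    return padded, list(languages)


dataset = ParsingDataset(
    token_ids=[[5, 6, 7], [8, 9]],
    languages=[Language(["city"]), Language(["river", "state"])],
)
loader = DataLoader(dataset, batch_size=2, collate_fn=collate_with_languages)
batch_tokens, batch_languages = next(iter(loader))
```

Only batch_tokens would ever need to move to the GPU; batch_languages stays an ordinary Python list that the decoder can consult when pruning the output space.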

Any help would be appreciated,

Koen