Single variable regression for text input

I am wondering whether there is a neat, easy way to use TextClasDataBunch together with MultiBatchRNNCore, and maybe PoolingLinearClassifier, to load text along with a single target value per sample (ranging from -1 to 10) and do a linear regression on it, without assembling the sequence of layers myself.

Is there a possibility of doing that in fastai without many workarounds?

The goal would be to feed the text into the RNN via the embedding layer and get a single output value between -1 and 10.

I’d appreciate any hint :)


If I understand your question correctly, you can do the following:

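from fastai.text import *  # fastai v1; provides TextList and FloatList

# path, bs and data_lm (the language-model DataBunch) are assumed to be defined already
data_regr = (TextList.from_folder(path, vocab=data_lm.vocab)
             .split_by_folder(valid='test')
             .label_from_folder(label_cls=FloatList)  # folder names become float targets
             .databunch(bs=bs))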

All you need to do is pass label_cls=FloatList (the label class) so that the labels are cast to floats instead of being treated as categories. In this case I’ve stored the samples in one folder per integer label (ratings 1 to 5). If you have actual floats as labels, you could for example use label_from_df on a DataFrame, pointing it at the column that holds the values (by name or index) together with label_cls=FloatList, so that the column is cast to float and used as the label.
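
To make the folder-to-float idea concrete, here is a minimal plain-Python sketch of what the casting amounts to. This is not fastai's actual implementation; the helper name float_label_from_folder and the file paths are made up for illustration, and it only assumes that each sample sits in a folder named after its numeric label.

```python
from pathlib import PurePosixPath


def float_label_from_folder(file_path: str) -> float:
    """Hypothetical helper: read the parent folder name and cast it to float."""
    return float(PurePosixPath(file_path).parent.name)


# Made-up layout: one sub-folder per rating, as in the folder setup described above
paths = ["train/1/a.txt", "train/5/b.txt", "test/3/c.txt"]
labels = [float_label_from_folder(p) for p in paths]
print(labels)  # [1.0, 5.0, 3.0]
```

Since the folder name is parsed as a number, a folder called -1 would work the same way, so the -1 to 10 range from the question should pose no problem for the labels themselves.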