Hey
I’ve been trying to use ULMFiT for multi-label classification. I collected the labels for each row into a comma-separated list and fed them into a TextClasDataBunch like so:
data_clas = TextClasDataBunch.from_csv('./', 'collected_combined_data.csv', valid_pct=0.3, vocab=data.vocab, text_cols='comment', label_cols='mot', label_delim=',', bs=16)
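One thing I checked while debugging (just a guess on my part): whether whitespace or casing around the delimiter could be inflating the class count, since `label_delim=','` does a plain split and would treat `'class2'` and `' class2'` as different classes. A minimal sketch with made-up labels:

```python
# Hypothetical label strings: stray spaces after the delimiter create
# "new" classes, which might explain 21 real labels ballooning to 82.
rows = ["class1,class2", "class1, class2", "CLASS1,class3"]

classes = set()
for row in rows:
    classes.update(row.split(","))  # the same split label_delim=',' performs

print(sorted(classes))
# note 'class2' and ' class2' are counted as distinct classes
```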
The model mostly looks right, except that the final layer is 50x82. I don’t understand where the 82 comes from, since I only have 21 classes; ideally it should just take the sigmoid over a 50x21 final layer.
My preds and y_true from model.validate are the same shape (3765x82), but 1. how did it arrive at 82 in the first place, and 2. how do I interpret the preds to recover the original labels? Here’s the full learner:
RNNLearner(data=TextClasDataBunch;
Train: LabelList (8782 items)
x: TextList
y: MultiCategoryList
['class1'; 'class2'],['class3']...
Path: .;
Valid: LabelList (3765 items)
x: TextList
<text>
y: MultiCategoryList
<classes in the same format>
Path: .;
Test: None, model=SequentialRNN(
(0): MultiBatchEncoder(
(module): AWD_LSTM(
(encoder): Embedding(4616, 400, padding_idx=1)
(encoder_dp): EmbeddingDropout(
(emb): Embedding(4616, 400, padding_idx=1)
)
(rnns): ModuleList(
(0): WeightDropout(
(module): LSTM(400, 1152, batch_first=True)
)
(1): WeightDropout(
(module): LSTM(1152, 1152, batch_first=True)
)
(2): WeightDropout(
(module): LSTM(1152, 400, batch_first=True)
)
)
(input_dp): RNNDropout()
(hidden_dps): ModuleList(
(0): RNNDropout()
(1): RNNDropout()
(2): RNNDropout()
)
)
)
(1): PoolingLinearClassifier(
(layers): Sequential(
(0): BatchNorm1d(1200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): Dropout(p=0.2)
(2): Linear(in_features=1200, out_features=50, bias=True)
(3): ReLU(inplace)
(4): BatchNorm1d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(5): Dropout(p=0.1)
(6): Linear(in_features=50, out_features=82, bias=True)
)
)
), opt_func=functools.partial(<class 'torch.optim.adam.Adam'>, betas=(0.9, 0.99)), loss_func=FlattenedLoss of BCEWithLogitsLoss(), metrics=[], true_wd=True, bn_wd=True, wd=0.01, train_bn=True, path=PosixPath('.'), model_dir='models', callback_fns=[functools.partial(<class 'fastai.basic_train.Recorder'>, add_time=True, silent=False)], callbacks=[RNNTrainer
learn: ...
alpha: 2.0
beta: 1.0], layer_groups=[Sequential(
(0): Embedding(4616, 400, padding_idx=1)
(1): EmbeddingDropout(
(emb): Embedding(4616, 400, padding_idx=1)
)
), Sequential(
(0): WeightDropout(
(module): LSTM(400, 1152, batch_first=True)
)
(1): RNNDropout()
), Sequential(
(0): WeightDropout(
(module): LSTM(1152, 1152, batch_first=True)
)
(1): RNNDropout()
), Sequential(
(0): WeightDropout(
(module): LSTM(1152, 400, batch_first=True)
)
(1): RNNDropout()
), Sequential(
(0): PoolingLinearClassifier(
(layers): Sequential(
(0): BatchNorm1d(1200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(1): Dropout(p=0.2)
(2): Linear(in_features=1200, out_features=50, bias=True)
(3): ReLU(inplace)
(4): BatchNorm1d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(5): Dropout(p=0.1)
(6): Linear(in_features=50, out_features=82, bias=True)
)
)
)], add_time=True, silent=None)
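For question 2, here’s what I was planning to try once the class count is sorted out: threshold the sigmoid outputs and map each column back to its class name via the databunch’s class list (in fastai v1 that order lives in `data_clas.classes`, if I understand correctly). A minimal sketch with stand-in values:

```python
# Stand-ins: classes would come from data_clas.classes, and preds from
# the (already sigmoid-ed) first element of learn.get_preds().
classes = ["class1", "class2", "class3"]
preds = [[0.9, 0.1, 0.7],
         [0.2, 0.8, 0.3]]

threshold = 0.5  # arbitrary cutoff; would need tuning per label
decoded = [[c for c, p in zip(classes, row) if p > threshold]
           for row in preds]
print(decoded)  # [['class1', 'class3'], ['class2']]
```

Does that look like the right way to go from the 3765x82 preds matrix back to label strings?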