I have a problem I have never encountered before in text classification with fastai.
This exact same code works perfectly fine with a different dataset.
I also checked whether some labels appear only in the test data and not in the training data, or vice versa. That was not the issue.
from fastai.text.all import *

# DataLoaders for the language model
dls = TextDataLoaders.from_df(
    df_training,
    text_col="text",
    label_col="label",
    bs=64 * 2,
    is_lm=True,
    seq_len=72)
# Take the pre-trained language model
learn = language_model_learner(
    dls, AWD_LSTM, drop_mult=0.25, metrics=[accuracy, Perplexity])
# Train and save the encoder
learn.fit_one_cycle(6, slice(1e-3))
learn.export('coder.pth')
# DataLoaders for classification
dls_class = dls = TextDataLoaders.from_df(
    df_training, text_col="text", label_col="label",
    bs=64 * 2, is_lm=False, seq_len=72, valid_pct=0)
dls_class = dls = TextDataLoaders.from_df(
    df_training, text_col="ha_question", label_col="ctd_section",
    bs=64 * 2, is_lm=False, seq_len=72, valid_pct=0)
class_learn.load_encoder('/SFS/user/ry/hrobar/msd_projects/nlp/coder')
The last line raises an error:
ModuleAttributeError: 'SequentialRNN' object has no attribute 'keys'
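
The error message suggests that whatever was loaded from the file is a whole model object rather than a dictionary of weights, since `.keys()` is being called on a `SequentialRNN`. A quick way to narrow this down is to check what kind of object the saved file contains. The sketch below is only an illustration: `describe_checkpoint` is a hypothetical helper (not part of fastai or PyTorch), and the `SequentialRNN` class here is an empty stand-in so the snippet runs without any dependencies.

```python
from collections.abc import Mapping


def describe_checkpoint(obj):
    """Say whether a loaded checkpoint looks like a weights mapping.

    Hypothetical helper for debugging; not a fastai or PyTorch function.
    """
    if isinstance(obj, Mapping):
        # State dicts are mappings (name -> tensor), so they have .keys()
        return f"mapping with {len(obj)} key(s): looks like a state dict"
    # Anything else (e.g. a whole pickled model) has no .keys() to iterate
    return f"{type(obj).__name__} object: not a state dict"


class SequentialRNN:
    """Empty stand-in for a whole model object; assumption for illustration."""


# Stand-in for weights saved as a dictionary
print(describe_checkpoint({"encoder.weight": [0.1, 0.2]}))
# Stand-in for a file that contains an entire model object
print(describe_checkpoint(SequentialRNN()))
```

In the real setup, the argument would be whatever `torch.load` returns for the saved file. If it turns out not to be a mapping, the file was probably not written in the format `load_encoder` expects; in fastai, `save_encoder` is the counterpart that writes the encoder weights for `load_encoder`, whereas `export` pickles the whole `Learner`.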