I trained a simple language model for musical chords (that’s why the sequences look weird).
This is the simple code I’m running (df is a pandas DataFrame with the input documents, my_vocabulary is the vocabulary that I pre-calculated):
dls = TextDataLoaders.from_df(df,
                              text_col='text',
                              seq_len=seq_length,
                              is_lm=True,
                              bs=1024,
                              text_vocab=my_vocabulary)
learn = language_model_learner(dls, AWD_LSTM, metrics=[accuracy, Perplexity(), top_k_accuracy])
learn.fine_tune(50, freeze_epochs=2)
Then when I try to run it like this:
learn.predict('a|m e|m a|m e|m a|m g| em| a|m', n_words=1, no_unk=False)
I get this output:
'c| a|m e|m a|m e|m a|m g| c| a|m e|m'
which adds a token at the beginning as well as at the end. Why? Is this a bug, or am I missing something?
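
One thing that might be worth checking (a guess, not something confirmed above): if my_vocabulary contains no special tokens such as the beginning-of-sequence marker, and if tokens missing from the vocabulary fall back to index 0, then both the prepended marker and any unknown chord would decode as whatever sits at index 0. The sketch below is plain Python and does not call fastai. toy_vocab, token_to_index and round_trip are hypothetical names, and the position of 'c|' at index 0 is an assumption made only for illustration.

```python
from collections import defaultdict

# Hypothetical vocabulary WITHOUT special tokens (no unknown or BOS marker).
# Assumption: 'c|' happens to sit at index 0.
toy_vocab = ['c|', 'a|m', 'e|m', 'g|']

# Assumed lookup behaviour: any token missing from the vocabulary maps to index 0.
token_to_index = defaultdict(int, {tok: i for i, tok in enumerate(toy_vocab)})

def round_trip(text, bos='xxbos'):
    """Prepend a beginning-of-sequence marker, numericalize, then decode."""
    tokens = [bos] + text.split()
    ids = [token_to_index[tok] for tok in tokens]
    decoded = [toy_vocab[i] for i in ids]
    # Try to drop the BOS marker on output. It is no longer recognisable
    # once it has been mapped to index 0 and decoded as 'c|'.
    return ' '.join(tok for tok in decoded if tok != bos)

result = round_trip('a|m e|m a|m e|m a|m g| em| a|m')
print(result)
```

Under these assumptions the marker and the out-of-vocabulary 'em|' both come back as 'c|', and the input otherwise round-trips unchanged; the trailing chord in the real output would then simply be the one predicted token requested with n_words=1. If this turns out to be the cause, putting the special tokens at the start of the vocabulary may be enough to remove the extra leading chord.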