I have an LSTM that returns one-hot vectors over the vocabulary (the output dimension equals the vocabulary size). The problem is that it's randomly generating SOS tokens and PADD tokens where I don't want them, for example in the middle of a sentence.
How are LSTM models fixed to avoid this?
Note that simply removing them from the index list creates a new problem: the first token I pass to the LSTM can no longer be SOS.
It will be difficult to answer this without seeing the code you're using. I assume you're using a trained language model to generate text. If it's trained well, and the text generation is implemented correctly (with proper temperature, etc.), then the SOS token shouldn't appear randomly in the middle of a sequence. Similarly, it should be fairly easy for the model to learn that PADD only follows EOS and other PADD tokens, so if the model didn't learn that, something is likely wrong with the training setup.
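That said, a common practical safeguard (separate from fixing the training) is to mask the special-token logits at sampling time, so SOS/PADD can never be drawn during generation. Here's a minimal sketch with NumPy; `SOS_IDX` and `PAD_IDX` are hypothetical indices you would replace with the ones from your vocabulary:

```python
import numpy as np

# Hypothetical special-token indices; replace with your vocabulary's values.
SOS_IDX, PAD_IDX = 0, 1

def sample_token(logits, temperature=1.0, banned=(SOS_IDX, PAD_IDX), rng=None):
    """Sample a token id from raw logits, forbidding the banned special tokens."""
    rng = rng if rng is not None else np.random.default_rng(0)
    logits = np.asarray(logits, dtype=float).copy()
    logits[list(banned)] = -np.inf          # banned tokens get zero probability
    logits = logits / temperature           # apply sampling temperature
    probs = np.exp(logits - logits.max())   # numerically stable softmax
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

# Even with large logits on SOS/PADD, they can never be sampled:
logits = [5.0, 5.0, 1.0, 2.0, 0.5]
token = sample_token(logits)
assert token not in (SOS_IDX, PAD_IDX)
```

This doesn't conflict with feeding SOS as the first *input* token: the mask only applies to the model's *output* distribution, so you can still prime generation with SOS as usual.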