I have documented my problem here
Looks like you forgot to limit the vocabulary to 5000 when encoding your test input!
# Limit the vocabulary size to 5000
textWordsIdxArray = [np.array([i if i < vocabsize - 1 else vocabsize - 1 for i in s]) for s in textWordsIdxArray]
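If it helps to see what that line does, here is a minimal sketch on made-up data. The sample ids are hypothetical, and `np.minimum` is just an equivalent way of writing the same clipping, not something the course notebook necessarily uses.

```python
import numpy as np

vocabsize = 5000  # vocabulary limit used in this thread

# Hypothetical encoded reviews: some word ids are above the limit
textWordsIdxArray = [
    np.array([1, 14, 4999, 5000, 20321]),
    np.array([7, 88000, 3]),
]

# Map every id >= vocabsize - 1 to the single "rare word" id vocabsize - 1.
# np.minimum(s, vocabsize - 1) gives the same result as the list comprehension.
clipped = [np.minimum(s, vocabsize - 1) for s in textWordsIdxArray]

for s in clipped:
    print(s)
```

After this step no id is larger than `vocabsize - 1`, so every id has a row in the embedding.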
Please note that it is generally not a good idea to share your machine credentials. Someone could (intentionally or unintentionally) cause damage to your files or your machine.
A good way to share your notebook is to create a gist at http://gist.github.com
Keep going, you’re doing well!
Thanks @niazangels. Earlier, the model could not find embeddings for ids above the vocabulary limit, and that is why it failed to run. Am I right?
I think it makes intuitive sense now.
Exactly. When you tried to make the prediction, the model tried to look up the latent factors for every word id. Some of those ids didn’t exist in the embedding, because they were outside the vocabulary of 5000 words we limited ourselves to.
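A rough way to picture the failed lookup is to treat the embedding as a plain NumPy matrix with one row of latent factors per word id. This is only a stand-in: `n_factors` and the sample ids are made up, and the exact error you get from a real embedding layer depends on the framework and backend.

```python
import numpy as np

vocabsize = 5000
n_factors = 32  # hypothetical number of latent factors per word

# Stand-in for a trained embedding: one row per word id in the vocabulary
embeddings = np.random.rand(vocabsize, n_factors)

ids_ok = np.array([1, 14, 4999])    # all inside the vocabulary
ids_bad = np.array([1, 14, 20321])  # last id has no row in the matrix

vectors = embeddings[ids_ok]  # works: one row of factors per id

try:
    embeddings[ids_bad]
    lookup_failed = False
except IndexError:
    # id 20321 is past the last row (4999), so there is nothing to look up
    lookup_failed = True

print(vectors.shape, lookup_failed)
```

Clipping the ids to `vocabsize - 1` before predicting avoids the out-of-range lookup.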