Best way is to just try it in your model and see if the results are good. Also, you can look at language model research papers to see what kind of numbers they report for “perplexity”. Earlier this year anything <80 was state of the art IIRC! Although some datasets are easier than others, of course.
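In case it helps connect the numbers: perplexity is just the exponential of the mean per-token cross-entropy loss, so you can sanity-check your own runs with a one-liner (assuming your framework reports the usual negative log-likelihood in nats per token):

```python
import math

def perplexity(mean_nll: float) -> float:
    # Perplexity = exp of the mean per-token negative log-likelihood
    # (natural-log base, matching most framework loss functions).
    return math.exp(mean_nll)

# e.g. a mean cross-entropy loss of 4.3 nats/token:
print(perplexity(4.3))  # ≈ 73.7, i.e. under that ~80 figure
```

If your framework reports loss in bits per token instead, use `2 ** mean_loss` rather than `math.exp`.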
When I played around with generating a few sentences with the language model, I noticed a good deal of repetition (esp. with smaller primer sentences). Would this incline you to believe the model has a ways to go?
No, it wouldn’t make me think that. Your perplexity looks great to me. As mentioned in class, we haven’t actually tried to create a good generator - our goal was to create a good classifier! To make a better generator you’ll probably want to use beam search and other tricks. One simple step is configuring a stateful LSTM cell in the language model.
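To make the beam search suggestion concrete, here’s a minimal sketch of the idea - this isn’t tied to any particular library, and `next_token_probs` / `toy_probs` are hypothetical stand-ins for a real model’s next-token distribution:

```python
import heapq
import math

def beam_search(next_token_probs, start, beam_width=3, max_len=10):
    """Keep the beam_width highest-scoring partial sequences at each step.

    next_token_probs(seq) -> list of (token, prob) pairs for the next position.
    Scores are summed log-probs, which avoids underflow from multiplying
    many small probabilities together.
    """
    beams = [(0.0, [start])]  # (log-prob score, token sequence)
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            for tok, p in next_token_probs(seq):
                candidates.append((score + math.log(p), seq + [tok]))
        # Retain only the top beam_width candidates by score.
        beams = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
    return max(beams, key=lambda b: b[0])[1]

def toy_probs(seq):
    # Toy "model": after 'a', prefer 'b'; otherwise prefer 'a'.
    if seq[-1] == "a":
        return [("b", 0.7), ("a", 0.3)]
    return [("a", 0.6), ("b", 0.4)]

print(beam_search(toy_probs, "a", beam_width=2, max_len=4))
# → ['a', 'b', 'a', 'b', 'a']
```

Compared with greedy decoding, keeping several hypotheses alive is exactly what helps break out of the repetitive loops you’re seeing - a sequence that looks slightly worse for one token can end up scoring higher overall.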