Has anyone been able to train a decent language model with the Lesson 4 notebook?
I get to a validation loss of about 4.19, but the generated sentences are really awful. I’ve been tweaking the learning rate and weight decay (wds), but the quality hasn’t improved, so I was wondering whether anyone had found a good training schedule.
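For reference, a loss like that can be read as a perplexity. This is only a sketch: it assumes the notebook reports the mean cross-entropy per token in nats (the usual convention, but an assumption here), and `perplexity_from_loss` is a hypothetical helper name, not something from the notebook.

```python
import math

def perplexity_from_loss(loss: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) into a perplexity."""
    return math.exp(loss)

# 4.19 is the validation loss quoted above; treating it as nats per token is an assumption.
val_loss = 4.19
print(f"perplexity ~ {perplexity_from_loss(val_loss):.1f}")
```

Under that assumption the model is about as uncertain as if it were picking uniformly among roughly 66 words at each step.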
One of the students, Charin, has done it for the Thai language; here is the link…
You can improve the model further by using cache pointers, as discussed in this article here…
Even with a good language model, you won’t necessarily get a decent prediction for every sequence of words; that can depend on the corpus you used. Remember, it’s just a basic language model. But such models could be coupled with GANs to build better chatbots, or just with fully connected (FC) layers to build a world-class classifier (like Jeremy)…
Thanks @Vishucyrus. Yes, agreed: depending on your goal, you don’t necessarily need good predictions. But I was still surprised that it did not perform a bit better than this, especially compared to the output from the char-rnn, which is quite amazing.
Oh, in that case I strongly recommend incorporating the cache pointers technique into your model. I guess that should get you the desired results…
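To give a rough idea of what that involves, here is a minimal sketch. It assumes "cache pointers" means the continuous cache idea (Grave et al., "Improving Neural Language Models with a Continuous Cache"): keep the recent hidden states together with the word that followed each one, score them against the current hidden state, and blend the resulting distribution with the model's own. The function name, the toy vectors, and the values of `theta` and `lam` are all made up for illustration; in practice those two would be tuned on a validation set.

```python
import math

def cached_next_token_probs(p_model, query, past_hiddens, next_tokens,
                            theta=1.0, lam=0.5):
    """Blend the model's next-token distribution with a cache distribution.

    p_model      : list of probabilities over the vocabulary from the language model
    query        : current hidden state (list of floats)
    past_hiddens : hidden states stored at earlier positions
    next_tokens  : token id that followed each stored hidden state
    theta, lam   : illustrative values only (sharpness of the cache, mixing weight)
    """
    if not past_hiddens:
        # Nothing cached yet: fall back to the plain model distribution.
        return list(p_model)

    # Similarity of the current state to each stored state (scaled dot product).
    scores = [theta * sum(q * h for q, h in zip(query, h_i)) for h_i in past_hiddens]

    # Softmax over cache entries, shifted by the max score for numerical stability.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)

    # Each cache entry votes for the token that followed it.
    p_cache = [0.0] * len(p_model)
    for w, tok in zip(weights, next_tokens):
        p_cache[tok] += w / total

    # Linear interpolation between the model and the cache.
    return [(1 - lam) * pm + lam * pc for pm, pc in zip(p_model, p_cache)]

# Toy example: 4-word vocabulary, 2-dimensional hidden states.
p_model = [0.25, 0.25, 0.25, 0.25]
probs = cached_next_token_probs(
    p_model,
    query=[1.0, 0.0],
    past_hiddens=[[1.0, 0.0], [0.0, 1.0]],
    next_tokens=[2, 3],
)
print(probs)
```

In the toy example, token 2 gets the biggest boost because its stored hidden state is the closest match to the current one. The technique is applied on top of an already trained model at prediction time, so it does not change the training schedule.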
Thanks, the cache pointers paper does look interesting; I had never heard of it! (It’s kind of a poor name for an ML method; it sounds more like a C programming technique.)