I recall Jeremy doing an interesting demo where he created a new document from just a few keywords, using a pre-trained language model fine-tuned on a large corpus of academic paper abstracts. I can't find it in the course material. Can someone please let me know which lesson that was in and, ideally, whether there was a demo notebook? I've looked through all the 2018 course #1 and #2 notebooks and can't find it.
The demo results looked something like what might be produced by OpenAI's GPT-2. I want to fine-tune a ULMFiT model on my own corpus.
You can find it here in this notebook.
Thanks Seemant. It looks like that notebook deals exclusively with the IMDb dataset, but maybe I haven't looked closely enough. It's also possible that Jeremy did the academic paper demo without sharing the notebook. I'll keep digging.
.predict will work with any language model that you train or load. You give it a word or a set of words, and it predicts the next one, then continues doing that until you tell it to stop (via the number of words). The notebook above is one example of that.
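To make the loop concrete, here is a minimal sketch of the "predict the next word, then repeat" mechanism described above. The bigram dictionary is a toy stand-in (an assumption, not fastai code); in fastai, `.predict` wraps this same loop around a trained neural language model.

```python
# Toy "model": maps the current word to its most likely next word.
# This dictionary is a placeholder for a real trained language model.
TOY_MODEL = {
    "the": "model",
    "model": "predicts",
    "predicts": "the",
}

def generate(start_word, n_words):
    """Repeatedly predict the next word until n_words have been produced."""
    words = [start_word]
    for _ in range(n_words):
        # Look up the most likely continuation; fall back for unseen words.
        next_word = TOY_MODEL.get(words[-1], "the")
        words.append(next_word)
    return " ".join(words)

print(generate("the", 4))  # -> "the model predicts the model"
```

The stopping condition is just the word count you pass in, exactly as with fastai's `.predict`.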
You can also find an example of using a beam search to predict the next word. It's all in the docs. This should all work "out of the box" with the Wikipedia pre-trained model, and you can fine-tune it to your particular needs.
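For reference, here is a small sketch of what beam search does compared to the greedy loop: instead of committing to the single most likely next word, it keeps the top few scored continuations at each step. The probability table is a toy assumption standing in for a language model's output distribution.

```python
import math

# Toy next-word probabilities (an assumption, standing in for the LM's softmax).
PROBS = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.9, "ran": 0.1},
    "dog": {"ran": 0.7, "sat": 0.3},
    "sat": {"down": 1.0},
    "ran": {"away": 1.0},
}

def beam_search(start, n_words, beam_width=2):
    """Keep the beam_width highest log-probability sequences at each step."""
    beams = [([start], 0.0)]  # (word sequence, cumulative log-probability)
    for _ in range(n_words):
        candidates = []
        for seq, score in beams:
            for word, p in PROBS.get(seq[-1], {}).items():
                candidates.append((seq + [word], score + math.log(p)))
        if not candidates:
            break  # no known continuations
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return " ".join(beams[0][0])

print(beam_search("the", 3))  # -> "the cat sat down"
```

Summing log-probabilities over the whole sequence is what lets beam search recover a good overall continuation even when the single best next word at one step would lead somewhere worse later.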
Great info Bobak. Thank you very much!