Part 2 Lesson 10 wiki

Wow. I think I’ve found my calling!

2 Likes

screen has a similar issue, but when you reconnect you at least have access to the data, and processing continues in the background. You just have to be sure to run jupyter in the background (via &) once you’ve started screen.

I’ve done this before for jobs that ran over a week and I was able to reconnect multiple times.

I also used flowmatters to send me a slack notification when the job was done.
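
If anyone wants to roll their own notification instead, here’s a minimal sketch using a Slack incoming webhook (the webhook URL is one you create in your Slack workspace; everything below is a placeholder):

# post a message to a Slack channel once the job finishes
curl -X POST -H 'Content-type: application/json' \
  --data '{"text": "Training job finished"}' \
  https://hooks.slack.com/services/YOUR/WEBHOOK/URL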

3 Likes

@snagpaul please do check hindi2vec out!

Would love to hear what you find interesting! What works, and what does not work for you :slight_smile:

3 Likes

What’s the advantage of using subword units vs. using character-level encoding/tokenization?
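
To make the question concrete, here’s a sketch using the sentencepiece command-line tools (assuming they’re installed; corpus.txt and the example sentence are made-up placeholders):

# train a subword (unigram) model and a character-level model on the same corpus
spm_train --input=corpus.txt --model_prefix=subword --vocab_size=8000 --model_type=unigram
spm_train --input=corpus.txt --model_prefix=chars --vocab_size=200 --model_type=char --hard_vocab_limit=false

# encode the same sentence with each: the subword model emits multi-character pieces,
# the character model one token per character
echo "unsupervised pretraining" | spm_encode --model=subword.model
echo "unsupervised pretraining" | spm_encode --model=chars.model

My understanding so far is that subword units give much shorter sequences (fewer timesteps for the RNN to carry state over) while still avoiding word-level out-of-vocabulary problems - but I’d love to hear other takes.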

I’m looking up how to get what Jeremy did working with screen right now - if you or anyone else knows and could explain, that’d be great.

Thanks for sharing the tool that helps automate ablation studies.

I’m interested in hearing about any other tools people use to automate or speed up NLP work, including parallelization tools, automated hyperparameter tuning, etc.

4 Likes

The 2 models in TF Hub’s universal sentence encoder are not language models. Jeremy’s claim is that language models learn to encode language sequences better than other backbone models (which seems like a reasonable claim to me as well) - @jeremy please correct me if I’m wrong.

So I asked him about this yesterday: link

My hunch was that, while they might have used a weaker objective function (merely producing sentence embeddings), they still use the same attention and LSTM/GRU building blocks; the difference is that they trained with a lot more data - Google scale!

As you saw in his reply, Jeremy wants us to test this out by comparing the universal encoder with the wikitext103 LM encoder - I think we should give it a try…let me know if anyone is interested.

P.S.: We could start with an apples-to-apples comparison and later add the sentencepiece stuff to the mix…to tokenize more carefully.

6 Likes

You will still have access to your data and will be able to see the output of anything you run after reconnecting.
However, say you are printing the training loss per epoch and you disconnect midway. When you reconnect, you will not be able to see the lines that were printed while you were disconnected, nor any new lines that are still being printed.

I’ve done it before. You type:

screen                # start a new screen session
jupyter notebook &    # launch jupyter in the background inside the session
# press ctrl-a, then d, to detach and leave everything running

This keeps your jupyter notebook running in the background on a screen wrapper.

If you need to reconnect to the screen session running jupyter (for example, to shut it down):

screen -r

Hope that helps. As mentioned above, if you disconnect you lose output, but at least you don’t lose data. And if you want a notification when the job is done, I recommend flowmatters or some other system.

2 Likes

That’s true. If you need the output, you can dump it to a file as you go or pipe it through a notification service.

I do wish it had a better way of reconnecting to output though.
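
For example, if the long-running piece can live in a script rather than a notebook cell, something like this (train.py is just a stand-in for whatever you’re running) keeps everything on disk:

# run this inside screen; tee shows output and appends it to a log file
python train.py 2>&1 | tee -a train.log

# after reconnecting in a fresh ssh session, catch up on what you missed
tail -f train.log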

My takeaways:

  1. In part 1 we trained the LM on the IMDB dataset text itself. Here we are using a more powerful backbone.
  2. I think the multi-core CPU stuff will be super useful
  3. Lots of cool tokenization tricks
  4. Training the whole thing in reverse for the bidirectional model
  5. He just gave us a full language model + WEIGHTS to download and use for free :slight_smile:
10 Likes

Any way of keeping the jupyter process running in the background, like nohup, should also work, right?
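
For reference, the nohup version would be something like this (the log file name is arbitrary):

# keep jupyter alive after the ssh session closes; output goes to the file instead of the terminal
nohup jupyter notebook > jupyter.log 2>&1 &

The trade-off versus screen is that there’s no terminal to reattach to later - you only get the log file.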

1 Like

I’d also add to the list

  • the “semantic markup” technique
  • triangular LRs work better than cyclical LRs
  • works-out-of-the-box RNN dropout ratios
  • paper references (AWD-LSTM and Leslie’s paper)
  • and of course the new fastai text module
5 Likes

IMO that was the crux - transfer learning in NLP, possibly a game changer very soon. Plus an example of successfully pursuing an idea without much support from the community, and tips and tricks on how to write papers (and how to find the right co-author for that :-)).

2 Likes

The fast.ai documentation project!

1 Like

Yeah, the most exciting lecture to me so far. It also leaves a few untested hypotheses for us to try out like you mentioned.

1 Like

Does anyone have a way to extract text from emails, Word, Excel, PDFs, or images?
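
In case it helps, here’s a rough sketch with common command-line converters (poppler’s pdftotext, tesseract for OCR, and LibreOffice in headless mode - all assumed to be installed; file names are placeholders):

pdftotext report.pdf report.txt                         # PDF -> plain text
tesseract scan.png scan                                 # image -> scan.txt via OCR
libreoffice --headless --convert-to txt document.docx   # Word -> plain text

Emails and Excel would need another step first (e.g. exporting to mbox or CSV).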

1 Like

No, I’m talking about “Screen Sharing”, a utility on Mac OS X. It’s a VNC client, but it doesn’t say VNC anywhere, so people don’t realize.

2 Likes

You don’t need to background the jupyter notebook process - you might as well keep it foregrounded. (Backgrounding is what the & bit does, for anyone wondering.)

As far as reconnecting goes, screen -r only needs a pid (or session name) when more than one session is running.
I’ve always used screen -xRR to reconnect.
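
If there are several sessions, you can list them and pick one by pid:

screen -ls        # list running screen sessions with their pids
screen -r 12345   # reattach to a specific session (12345 is an example pid)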

That’s awesome! Thanks! Going to leave a link here and maybe in the wiki if the gods approve.

https://support.apple.com/kb/PH25554?locale=en_US