Part 2 Lesson 10 wiki

Wow. I think I’ve found my calling!

2 Likes

screen has a similar issue, but when you reconnect you at least have access to the data, and processing continues in the background. You just have to be sure to run jupyter in the background (via &) once you’ve started screen.

I’ve done this before for jobs that ran over a week and I was able to reconnect multiple times.

I also used flowmatters to send me a slack notification when the job was done.
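
If anyone wants to roll their own notification instead, here’s a minimal sketch using a Slack incoming webhook (the webhook URL is one you create in your Slack workspace; everything below is a placeholder):

# post a message to a Slack channel once the job finishes
curl -X POST -H 'Content-type: application/json' \
  --data '{"text": "Training job finished"}' \
  https://hooks.slack.com/services/YOUR/WEBHOOK/URL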

3 Likes

@snagpaul please do check hindi2vec out!

Would love to hear what you find interesting! What works, and what does not work for you :slight_smile:

3 Likes

What’s the advantage of using subword units vs. using character-level encoding/tokenization?
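
To make the question concrete, here’s a sketch using the sentencepiece command-line tools (assuming they’re installed; corpus.txt and the example sentence are made-up placeholders):

# train a subword (unigram) model and a character-level model on the same corpus
spm_train --input=corpus.txt --model_prefix=subword --vocab_size=8000 --model_type=unigram
spm_train --input=corpus.txt --model_prefix=chars --vocab_size=200 --model_type=char --hard_vocab_limit=false

# encode the same sentence with each: the subword model emits multi-character pieces,
# the character model one token per character
echo "unsupervised pretraining" | spm_encode --model=subword.model
echo "unsupervised pretraining" | spm_encode --model=chars.model

My understanding so far is that subword units give much shorter sequences (fewer timesteps for the RNN to carry state over) while still avoiding word-level out-of-vocabulary problems - but I’d love to hear other takes.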

I’m looking up how to get what Jeremy did working with screen right now - if you or anyone else knows and could explain, that’d be great.

Thanks for sharing the tool that helps automate ablation studies.

I’m interested in hearing about any other tools people use to automate or speed up NLP work, including parallelization tools, automated hyperparameter tuning, etc.

4 Likes

The 2 models in TF Hub’s universal sentence encoder are not language models. Jeremy’s claim is that language models learn to encode language sequences better than other backbone models (which seems like a reasonable claim to me as well) - @jeremy please correct me if I’m wrong.

So I asked him about this yesterday: link

My hunch was that, while they might have used a weaker objective function (merely producing sentence embeddings), they still use the same attention and LSTM/GRU building blocks; the difference is that they trained with a lot more data - Google scale!

As you saw in his reply, Jeremy wants us to test this out by comparing the universal encoder with the wikitext103 LM encoder - I think we should give it a try…let me know if anyone is interested.

P.S.: We could start with an apples-to-apples comparison and later add the sentencepiece stuff to the mix…to tokenize more carefully.

6 Likes

You will still have access to your data and will be able to see the output of anything you run after reconnecting.
However, say you are printing the training loss per epoch and you disconnect midway. When you reconnect, you will not be able to see the lines that were printed while you were disconnected, nor any new lines that are still being printed.

I’ve done it before. You type:

screen                # start a new screen session
jupyter notebook &    # launch jupyter in the background inside the session
# press ctrl-a, then d, to detach and leave everything running

This keeps your jupyter notebook running in the background on a screen wrapper.

If you need to reconnect to the screen session running jupyter (for example, to shut it down):

screen -r

Hope that helps. As mentioned above, if you disconnect you lose output, but at least you don’t lose data. And if you want a notification when the job is done, I recommend flowmatters or some other system.

2 Likes

That’s true. If you need the output, you can dump it to a file as you go or pipe it through a notification service.

I do wish it had a better way of reconnecting to output though.
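
For example, if the long-running piece can live in a script rather than a notebook cell, something like this (train.py is just a stand-in for whatever you’re running) keeps everything on disk:

# run this inside screen; tee shows output and appends it to a log file
python train.py 2>&1 | tee -a train.log

# after reconnecting in a fresh ssh session, catch up on what you missed
tail -f train.log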

My takeaways:

  1. In part 1 we trained the LM on the IMDB dataset text itself. Here we are using a more powerful backbone.
  2. I think the multi-core CPU stuff will be super useful
  3. Lots of cool tokenization tricks
  4. Training the whole thing in reverse for the bidirectional model
  5. He just gave us a full language model + WEIGHTS to download and use for free :slight_smile:
10 Likes

Any way of keeping the jupyter process running in the background, like nohup, should also work, right?
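
For reference, the nohup version would be something like this (the log file name is arbitrary):

# keep jupyter alive after the ssh session closes; output goes to the file instead of the terminal
nohup jupyter notebook > jupyter.log 2>&1 &

The trade-off versus screen is that there’s no terminal to reattach to later - you only get the log file.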

1 Like

I’d also add to the list

  • the “semantic markup” technique
  • triangular LRs work better than cyclical LRs
  • works-out-of-the-box RNN dropout ratios
  • paper references (AWD-LSTM and Leslie’s paper)
  • and of course the new fastai text module
5 Likes

IMO that was the crux - transfer learning in NLP, possibly a game changer very soon. Plus an example of successfully pursuing an idea without much support from the community, and tips and tricks on how to write papers (and how to find the right co-author for that :-)).

2 Likes

The fast.ai documentation project!

1 Like

Yeah, the most exciting lecture to me so far. It also leaves a few untested hypotheses for us to try out like you mentioned.

1 Like

Does anyone have a way to extract text from emails, Word, Excel, PDFs, or images?
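
In case it helps, here’s a rough sketch with common command-line converters (poppler’s pdftotext, tesseract for OCR, and LibreOffice in headless mode - all assumed to be installed; file names are placeholders):

pdftotext report.pdf report.txt                         # PDF -> plain text
tesseract scan.png scan                                 # image -> scan.txt via OCR
libreoffice --headless --convert-to txt document.docx   # Word -> plain text

Emails and Excel would need another step first (e.g. exporting to mbox or CSV).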

1 Like

No, I’m talking about “Screen Sharing”, a utility on Mac OS X. It’s a VNC client, but it doesn’t say VNC anywhere, so people don’t realize.

2 Likes

You don’t need to background the jupyter notebook process - you might as well keep it foregrounded. (Backgrounding is what the & bit does, for anyone wondering.)

As far as reconnecting goes, screen -r only needs a pid (or session name) when more than one session is running.
I’ve always used screen -xRR to reconnect.
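
If there are several sessions, you can list them and pick one by pid:

screen -ls        # list running screen sessions with their pids
screen -r 12345   # reattach to a specific session (12345 is an example pid)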

That’s awesome! Thanks! Going to leave a link here and maybe in the wiki if the gods approve.

https://support.apple.com/kb/PH25554?locale=en_US