Lesson transcriptions (2020) - help wanted

I’m not confident that we should be removing the “um’s” and the “like’s” and the repeated words and the misspoken words.
From a machine learning perspective, if anyone were going to use these transcriptions as labeled training data, they would be very frustrated to find editorial license being taken.
From a human consumption perspective, I would say edit away!

Thanks for checking - best to keep the words exactly as I said them (but remove “ummm” and similar), so that YouTube can sync it to the video correctly.

No, just plain text please.

Yes.

I don’t mind either way.

Yes leaving most stuff in is best, although “umm” and “like” can be removed.
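For anyone who wants to automate the filler clean-up, here is a minimal sketch in Python. It is an illustration rather than anything used for these transcripts: the `FILLER` pattern and the `strip_fillers` name are assumptions. It only strips hesitation sounds such as "um", "umm" and "uh". It deliberately leaves "like" alone, since "like" is often a real word and is probably safer to review by hand.

```python
import re

# Hypothetical pattern: standalone "um", "umm", "ummm", "uh", "uhh", ...
# plus an optional trailing comma and any following whitespace.
# "like" is intentionally NOT matched, because it is frequently a real word.
FILLER = re.compile(r"\b(?:u+m+|u+h+)\b,?\s*", re.IGNORECASE)

def strip_fillers(text: str) -> str:
    """Remove standalone hesitation sounds and collapse leftover whitespace."""
    cleaned = FILLER.sub("", text)
    cleaned = re.sub(r"\s{2,}", " ", cleaned)
    return cleaned.strip()

if __name__ == "__main__":
    print(strip_fillers("So, umm, this is like a ResNet, uh, model."))
```

Every other word is left exactly as spoken, which matters if the text is later synced to the audio.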

If it’s going to be sync’ed to the video, maybe we could make a style guide for our use, so that the viewer gets a consistent experience.

Staying consistent with other editors, I used italics for function names. But some editors use italics and parentheses, some just italics. Sometimes entire code blocks are inserted.

Sometimes bolding is used – when should we bold?

For consistency, maybe we could make a glossary – for words like ResNet, Jupyter, fastai, Python, README …

It’s plain text in the end, so no formatting will end up in the final result.

Code blocks shouldn’t be inserted - the transcription should only contain what I actually say.

A glossary could certainly be added to the top post if anyone would find that helpful - thanks!
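If a glossary does get added, a small script could apply it to a finished transcript. The following is only a sketch under assumptions: the `GLOSSARY` list contains just the terms mentioned in this thread, and `apply_glossary` is a hypothetical helper name. It rewrites any capitalisation of a glossary term to its canonical spelling and does not try to fix misspellings.

```python
import re

# Hypothetical glossary: canonical spellings of the terms named in this thread.
GLOSSARY = ["ResNet", "Jupyter", "fastai", "Python", "README"]

def apply_glossary(text: str, glossary=GLOSSARY) -> str:
    """Replace case variants of each glossary term with its canonical form."""
    for term in glossary:
        # Whole-word, case-insensitive match of the term.
        pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
        text = pattern.sub(term, text)
    return text

if __name__ == "__main__":
    print(apply_glossary("we use resnet in jupyter with FastAI and python"))
```

Because the result is still plain text, this stays compatible with the no-formatting rule above.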

Hi @jeremy,

Sorry I was a bit late; the situation here is complicated (but getting better!). So, I actually kept all the “like” and “right” (no “ummm” in my part :slightly_smiling_face:) but I can quickly remove them if you think that would help for YouTube syncing. Anyway, there’s a piece at 51:12 which I couldn’t parse no matter how much I tried: I highlighted the text, so could you please have a look at it?

Tip for transcribers: I found it useful to slow down the video and have the video window and the Google Doc window side by side. BTW, Google Docs is a great collaboration tool! Really good UX :+1:

No that’s fine - I don’t mind much either way :slight_smile:

I had some extra time so I did some extra ones. I hope you lot don’t mind :slight_smile:

Somehow we should make sure they are all in the same format. They should all either include “like” and “umm” or not, use the same kind of punctuation, and agree on the capitalisation of words like Jupyter and Python.

I am also not really sure about where to put commas. I would appreciate it if someone could double-check them. But comma style is part of the overall style, so perhaps this should be done after that is decided?

Looks like the video already has captions, and they are very close to being correct. Not sure how that happened …

Hadus,

Glossary is a great idea! Things like Python, Jupyter, etc. need to be consistent.

I looked yours over … the biggest change was that I used American (rather than UK) spelling, to remain consistent, and changed some semicolons to commas.

Auto-generated by YouTube. I tried looking at them while transcribing, but it didn’t increase my productivity, so I turned the captions off.

Also 0.75 playback speed on the video really helped.

How are we to deal with changing speakers? If this is going to be automatically sync’ed to the video, not sure that it makes sense to include a name before the speaker (e.g., when dialog is ongoing between Jeremy and Rachel).

Should we just ignore changing speakers and simply include the audio text?

Yes please.

@transcribe-1v4: Just wanted to post a quick note to say - you folks are absolute heroes! Thank you so much for your great work on this! :slight_smile:

Happy to get stuck into the lesson 2 transcription. Is there a Google Doc available yet?

Not yet @morgan - I might wait until the YouTube auto transcription is available, since it is so much better than the Nuance one. Then I’ll set up the Google Doc.

@transcribe-1v4: The Google Doc is now available:

I’ve split it into 25 equal-sized paragraphs, but haven’t put names above each paragraph - you can do that yourself when you start transcribing a paragraph of your choice. See the Google Doc for updated instructions.
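For reference, one way to cut a raw transcript into 25 roughly equal pieces is sketched below. This is only an illustration and not necessarily how the doc was actually split: `split_into_chunks` is a hypothetical helper, and it counts words rather than characters or seconds of audio.

```python
def split_into_chunks(words, n_chunks=25):
    """Split a list of words into n_chunks contiguous pieces.

    Chunk sizes differ by at most one word; earlier chunks get the extras.
    """
    base, extra = divmod(len(words), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)
        chunks.append(words[start:start + size])
        start += size
    return chunks

if __name__ == "__main__":
    transcript = "word " * 103  # stand-in for an auto-generated transcript
    paragraphs = [" ".join(c) for c in split_into_chunks(transcript.split())]
    print(len(paragraphs))
```

Splitting on word counts keeps the workload per volunteer similar, though it may break a paragraph in the middle of a sentence.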

Thanks so much everyone! :slight_smile:

Setting the playback speed to 0.75 certainly helps!

@jeremy I’ve done a few more. :slight_smile:

Not sure if this is the place to ask this, but:

I created a Quizlet set of questions for Francois Chollet’s Deep Learning With Python book (first few chapters, about 120 terms/questions). You can see it here. I thought about doing the same thing for fastai, and began it here. Questions:

  1. This is currently public, but I can make it password-protected, or private-only (to myself). Which would you prefer?
  2. If it’s password-protected or public, I can create a Google spreadsheet that can be imported into the app, so that others could contribute to it. I find Quizlet to be extremely helpful. It has various clever learning techniques built in, showing flashcards and teaching you in a number of ways, tracking your progress and recording the ones you missed. This would allow students to review (for instance) the terms in lesson 2 that are critical to understanding your presentations going forward. I also created a few just based on questions asked and answered during Lesson 2.

You should be able to download the Quizlet app on your phone and search for “fastai” or “Deep Learning With Python” and have those sets come up.

Edit: I upgraded to the paid version, which lets me insert images and text formatting.

Thanks!
