Lesson transcriptions (2020) - help wanted

Could you edit this post to link to the recording as well?

You’ll find the video here!

A few notes:

I wasn’t sure how much to edit it for grammar, sense, and flow. I just used my judgement (I’ve worked as a writer and editor in various contexts). For instance, where you said:

“But you can see so if you train this for a few minutes, it’s nearly perfect.”

I changed to:

“But you can see that if you train this for a few minutes, it’s nearly perfect.”

It’s less conversational and has a little less of the implicit ‘Hey help I need editing in order to read well!’ Don’t know if you want that or whether you want it left, so to speak, warts and all. I also added a couple of small screenshots where you were pointing to code, just so it’s more comprehensible and whole.

edit: based on @jeremy’s comments I removed the screenshots.

In the document preferences I turned off ‘automatically capitalize words’ because it’s irritating to constantly have to (for instance) undo the automatically capitalized Untar into untar.

I’m omitting the utterance ‘umm’ … assuming that is the right thing to do? I also ditto @quantum’s question.

Questions for Jeremy:

  1. So this will be in Markdown. That means that if we are referring to code, we should put it in Markdown format. Example: `import *`. Is this right? [bold should be bold, italics]

  2. I separated mine into paragraphs, which makes it easier to read. Is that OK?

  3. It seems like it would be a good idea if people retained their timestamp in the Google Doc, instead of deleting it after they have reviewed the transcript. Is this correct?

I’m not confident that we should be removing the “um’s” and the “like’s” and the repeated words and the misspoken words.
From a machine learning perspective, if anyone were going to use these transcriptions as labeled training data, they would be very frustrated to find editorial license being taken.
From a human consumption perspective, I would say edit away!

Thanks for checking - best to keep the words exactly as I said (but remove “ummm” and similar), so that YouTube can sync it to the video correctly.

No, just plain text please.


I don’t mind either way.

Yes, leaving most stuff in is best, although “umm” and “like” can be removed.
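
For anyone who wants to strip the “umm”s in bulk rather than by hand, here is a minimal sketch in Python. It is only an illustration, not something used for these transcripts: the function name `strip_fillers` is made up, and it deliberately leaves “like” alone, since a simple pattern cannot tell the filler “like” from the real word.

```python
import re

# Matches "um", "umm", "ummm", ... as a whole word, plus an optional
# trailing comma and any whitespace that follows it.
FILLER = re.compile(r"\bum+\b,?\s*", flags=re.IGNORECASE)

def strip_fillers(text):
    """Remove 'umm'-style fillers; every other word is kept exactly as spoken."""
    cleaned = FILLER.sub("", text)
    # Collapse any double spaces left behind by a removal.
    return re.sub(r"\s{2,}", " ", cleaned).strip()

print(strip_fillers("So, umm, this is ummm the model"))
```

Words that merely contain “um” (such as “sum” or “umbrella”) are not touched, because the pattern only matches whole words.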

If it’s going to be sync’ed to the video, maybe we could make a style guide for our use, so that the viewer gets a consistent experience.

Staying consistent with other editors, I used italics for function names. But some editors use italics and parentheses, some just italics. Sometimes entire code blocks are inserted.

Sometimes bolding is used – when should we bold?

For consistency, maybe we could make a glossary – for words like ResNet, Jupyter, fastai, Python, README …

It’s plain text in the end, so no formatting will end up in the final result.

Code blocks shouldn’t be inserted - the transcription should only contain what I actually say.

A glossary could certainly be added to the top post if anyone would find that helpful - thanks!
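
If a glossary does get added, applying it could be partly automated. The following is a rough sketch only, assuming the glossary is just a list of canonical spellings; the list contents come from the terms suggested in this thread, and the function name `apply_glossary` is hypothetical.

```python
import re

# Hypothetical glossary: canonical spellings for terms mentioned in this thread.
GLOSSARY = ["ResNet", "Jupyter", "fastai", "Python", "README"]

def apply_glossary(text, glossary=GLOSSARY):
    """Replace any case variant of each glossary term with its canonical spelling."""
    for term in glossary:
        # Whole-word, case-insensitive match, so "python" and "PYTHON" both
        # become "Python" but longer words containing the term are left alone.
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", flags=re.IGNORECASE)
        text = pattern.sub(term, text)
    return text

print(apply_glossary("open jupyter and import FastAI in python"))
```

This only fixes capitalisation. It would not catch a misspelling such as “Jupiter” for “Jupyter”, so a human pass is still needed.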

Hi @jeremy,

Sorry I was a bit late, the situation here is complicated (but getting better!). So, I actually kept all the “like” and “right” (no “ummm” in my part :slightly_smiling_face:) but I can quickly remove them if you think that would help for YouTube syncing. Anyway, there’s a piece at 51:12 which I couldn’t parse no matter how much I tried: I highlighted the text, so could you please have a look at it?

Tip for transcribers: I found it useful to slow down the video and have the video window and the Google Doc window side by side. BTW, Google Docs is a great collaboration tool! Really good UX :+1:

No that’s fine - I don’t mind much either way :slight_smile:

I had some extra time, so I did some extra ones. I hope you lot don’t mind :slight_smile:

Somehow we should make sure they are all in the same format. So they should all include “like” and “umm”, or not… Same kind of punctuation. Capitalisation of words like Jupyter, Python or not…

I am also not really sure about where to put commas. I would appreciate it if someone could double-check them. But of course comma style is also part of the overall style, so this should be done after that is decided?

Looks like the video already has captions, and they are very close to being correct. Not sure how that happened …


Glossary is a great idea! Things like Python, Jupyter, etc. need to be consistent.

I looked yours over … biggest change was that I used the American (rather than UK) spelling (to remain consistent), and transitioned some semicolons to commas.

Auto-generated by YouTube. I tried looking at them while transcribing but it didn’t increase my productivity, so I turned the captions off.

Also 0.75 playback speed on the video really helped.

How are we to deal with changing speakers? If this is going to be automatically sync’ed to the video, not sure that it makes sense to include a name before the speaker (e.g., when dialog is ongoing between Jeremy and Rachel).

Should we just ignore changing speakers and simply include the audio text?

Yes please.

@transcribe-1v4: Just wanted to post a quick note to say - you folks are absolute heroes! Thank you so much for your great work on this! :slight_smile:


Happy to get stuck into lesson 2 transcription. Is there a Google Doc available yet?