Crowdsourcing lecture transcriptions

I am exploring the idea of creating a fast, reliable, crowd-powered method to accurately transcribe lectures. This has been well explored in the past, and fortunately, thanks to Google’s auto-transcription, we have a fairly easy way to extract near-perfect transcripts – the draft for yesterday’s workshop is here.

For starters, one powerful use case is the search functionality this doc can provide. For instance, I can quickly retrieve the places where @jeremy talks about list comprehensions. However, the document is far from perfect: there are typos, some innocuous and a few others that may impede the reader’s flow of thought. Here is a sample:

10:03, your laptop probably doesn’t have they
10:06, deep learning compatible GPU in it this
10:09, is something and you know most people I
10:11, know including most of the most serious
10:12, researchers in breakfast
10:13, used AWS for most of their work back so
10:18, that’ll make you you know it’s a kind of
10:21, getting familiar with AWS is something

I am looking for suggestions on systematic ways to clean this up. Ideally, we would come up with a method that works for all future lectures. We can probably add the improved version to the wiki thread. Happy learning!
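
One possible starting point, as a rough sketch rather than a finished method: parse the auto-generated lines into (time, text) pairs, search them for a phrase, and flag immediately repeated words such as "you you" for a human to review. It assumes every line looks like the `MM:SS, text` sample above. The function names (`parse_captions`, `search`, `repeated_words`, `format_time`) are made up for illustration and do not come from any existing tool.

```python
import re

# The sample from the post, copied as-is (errors included).
SAMPLE = """\
10:03, your laptop probably doesn’t have they
10:06, deep learning compatible GPU in it this
10:09, is something and you know most people I
10:11, know including most of the most serious
10:12, researchers in breakfast
10:13, used AWS for most of their work back so
10:18, that’ll make you you know it’s a kind of
10:21, getting familiar with AWS is something
"""

# Assumed line format: "MM:SS, caption text"
LINE_RE = re.compile(r"^(\d+):(\d{2}),\s*(.*)$")


def parse_captions(raw):
    """Return a list of (seconds, text) pairs; lines that do not match are skipped."""
    captions = []
    for line in raw.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            minutes, seconds, text = match.groups()
            captions.append((int(minutes) * 60 + int(seconds), text.strip()))
    return captions


def format_time(seconds):
    """Turn a number of seconds back into an MM:SS string."""
    return f"{seconds // 60}:{seconds % 60:02d}"


def search(captions, phrase):
    """Return the start times of fragments in which `phrase` begins.

    Each fragment is joined with the following one so that a phrase split
    across two caption lines is still found. This is a plain case-insensitive
    substring match, so "AWS" would also match inside a word like "laws".
    """
    needle = phrase.strip().lower()
    hits = []
    for i, (seconds, text) in enumerate(captions):
        following = captions[i + 1][1] if i + 1 < len(captions) else ""
        window = (text + " " + following).lower()
        position = window.find(needle)
        # Report the hit only if it starts inside the current fragment.
        if 0 <= position < len(text):
            hits.append(seconds)
    return hits


def repeated_words(captions):
    """Flag words that immediately repeat the previous word (e.g. "you you")."""
    flagged = []
    previous = None
    for seconds, text in captions:
        for word in text.split():
            key = word.lower()
            if key == previous:
                flagged.append((seconds, word))
            previous = key
    return flagged


captions = parse_captions(SAMPLE)
print([format_time(s) for s in search(captions, "AWS")])   # ['10:13', '10:21']
print([(format_time(s), w) for s, w in repeated_words(captions)])  # [('10:18', 'you')]
```

This only surfaces candidates. Errors like "researchers in breakfast" still need a human reader, which is where the crowd comes in.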

The same feature is available in Azure as the Video Indexer API, but it would be nice if we could build something similar as an outcome of what we learn in fast.ai.

I know of https://meetscribe.io, which a friend is working on for transcribing meetings. I feel it could be used for recording the lectures, although the product is in its very early stages right now.

I tried it on the workshop video and the process was very slow.

A transcription of each lesson would be much appreciated. They don’t need to be time-coded at all - YouTube does that automatically. In the previous courses @lin.crampton was kind enough to transcribe the whole lot! Some kind of more crowd-sourced approach would be cool. It’s important for students where English isn’t their first language, and of course for those with hearing difficulties.

Another thing that’s helpful is to create a timeline: e.g. http://wiki.fast.ai/index.php/Lesson_1_Timeline . If you do this, you can put it straight into the wiki lesson post.

Thanks to all of you who are thinking about how to help your fellow students! :slight_smile: Let me know if you need anything from me to make things easier.

Thanks @jeremy! :slight_smile:

How did you create the timeline? Did you use any specific tool or does YouTube do that for you?

[Edit]: I think I got the gist from the View Source tab on that wiki link you shared.

Wasn’t me! Was participants in the course :slight_smile: