Video Browser with Searchable Transcripts

Hi all,

I’ve posted a prototype of a fastai course video browser.


source code

You can watch the lessons and instantly search the lesson transcript using the search box at the bottom. If you click on a phrase in the results, it will take you to that moment in the video.

It’s password protected and @jeremy has requested that no one share it outside the forums. The password is: deeplearningSF2018

If you have feedback on the UI/UX please feel free to share it here.

Known Issues:

  1. It was hosted on a small, free instance, so it could crash or become very slow when many people were using it. Update: it’s now on S3!
  2. In the future, you’ll be able to browse by chapter/section as well.
  3. The auto-generated transcripts are pretty bad. If anyone wants to share a better one, I can update it.

20 Likes

This is awesome

1 Like

Wow amazing work! Thanks @zachcaceres

I’m impressed by the search speed! Is it based on Elasticsearch?

I find the keyword results below the video less informative without timestamps. Maybe instead of showing results in 2 rows below the video, show a column to the right side of the video and display results top down from the earliest to the latest mention along with timestamps? Just an idea and I’m not a designer…

2 Likes

@zachcaceres just in case you hadn’t noticed - the animated gif in your post isn’t showing since it’s too big; also the frame rate is too fast to see what’s going on.

1 Like

Thanks, yeah, that’s down to CloudApp! I’ll see about converting it.

awesome idea

1 Like

I am impressed too. Not based on ElasticSearch. The search function is searching the transcripts in JSON format cached in the browser (client).

The UX certainly can use some of our help. I am also not a visual/creative designer :slight_smile:

3 Likes

Hey Zach. Cool app! Thank you for open sourcing the project.

As the project currently lacks technical documentation, I did a quick review of the source code, and here are some of my findings:

  • this is a fully JavaScript front-end web app
  • core technologies used to build this app:
    • React.js
    • modern JavaScript (ES6)
    • Create React App (CRA)
  • CRA is a tool built on webpack for compiling and bundling web assets such as static HTML, CSS, images, etc.
  • UI styled using plain CSS (no Bootstrap or 3rd-party CSS framework dependency, which is nice)

This is the kind of web stack I use a lot in my work.

Questions:

  • Where can I find these lesson transcript files: assets/dl-1-1/transcript.json?

The auto-generated transcripts are pretty bad. If anyone wants to share a better one, I can update it.

How did you generate these? I suggest you take a look at:

Thank you for starting this project. You just turned an idea I had at the back of my mind into reality :slight_smile: I was looking for a better way to watch the lesson video with the transcript side by side when I am in deep study mode. Sort of like the Coursera video player. If I have a bit of free time, I will contribute to this project. No promises, though.

1 Like

Brilliant! The way course material should be. Was just talking about something like this yesterday over lunch. Think of having this fully automated in the classroom.
Thanks for the great work!

1 Like

Hi Cedric,

Not based on ElasticSearch. The search function is searching the transcripts in JSON format cached in the browser (client).

You are correct. I am a big believer in making technology only as complex as the problem warrants! Since these JSON objects are never that large, we can do blazing-fast search right on the client without any problems.
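
For anyone curious what that kind of in-memory search looks like, here is a minimal sketch. It is written in Python for brevity (the app itself is JavaScript), and the transcript shape, a list of `{"start", "text"}` entries, is an assumption made for illustration; the real `transcript.json` layout is not shown in this thread. The `search` helper is hypothetical as well.

```python
import json

# Hypothetical transcript shape: a list of {"start": seconds, "text": caption}.
# The actual transcript.json used by the app may look different.
SAMPLE = json.loads("""
[
  {"start": 0.0,  "text": "Welcome to the first lesson"},
  {"start": 12.5, "text": "We will train an image classifier"},
  {"start": 47.0, "text": "The learning rate matters a lot"},
  {"start": 95.2, "text": "Pick a learning rate with the finder"}
]
""")


def search(entries, query):
    """Return the entries whose text contains the query, ignoring case."""
    needle = query.strip().lower()
    if not needle:
        # An empty search box should match nothing rather than everything.
        return []
    return [entry for entry in entries if needle in entry["text"].lower()]


if __name__ == "__main__":
    for hit in search(SAMPLE, "Learning Rate"):
        print(hit["start"], hit["text"])
```

A linear scan like this touches every caption on each keystroke, which is fine for a transcript of a few thousand short lines; an index would only start to pay off for much larger data.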

The UX certainly can use some of our help. I am also not a visual/creative designer

I am open to anyone’s PRs! We already had some yesterday from @rramphal. If you have a good idea around timestamping the search results, please let me know.
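
One possible starting point for the timestamp idea, sketched in Python rather than the app's JavaScript: sort the hits from earliest to latest mention and prefix each with a readable time. The `start` and `text` field names and both helper functions are assumptions for illustration, not the app's actual data model.

```python
def format_timestamp(seconds):
    """Format an offset in seconds as M:SS, or H:MM:SS past one hour."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def label_results(hits):
    """Order hits by start time and prefix each with its timestamp."""
    ordered = sorted(hits, key=lambda hit: hit["start"])
    return [f"[{format_timestamp(hit['start'])}] {hit['text']}" for hit in ordered]


if __name__ == "__main__":
    # Hypothetical search hits, deliberately out of order.
    hits = [
        {"start": 95.2, "text": "Pick a learning rate with the finder"},
        {"start": 47.0, "text": "The learning rate matters a lot"},
    ]
    for line in label_results(hits):
        print(line)
```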

I see the current iteration as a proof of concept. Hopefully we can all collaborate to make something both beautiful and useful for everyone in fastai :slight_smile:

How did you generate these?

These are straight from YouTube, formatted manually into easy-to-parse JSON. A good project for someone would be to fix up these transcripts so we can offer the highest quality search possible.

Where can I find these lesson transcript files: assets/dl-1-1/transcript.json ?

Send me a forum note and I’ll pass you a zip of the assets. They’re too large to check into source control, and we want to be careful to respect @jeremy’s wishes to keep the videos private for now.

Technical notes:
Your notes are spot on. I do use a few CSS classes from Tachyons, but it’s deliberately quite minimal as a dependency and we could in-line those styles and get rid of it.

Want to make a PR under a heading “Developer Notes” with your excellent analysis of the stack?

1 Like

videos.fast.ai has been updated with Lessons 4, 5, and 6. It also features some tweaks and refactoring by @cjwinslow and @rramphal!

For some reason, YouTube did not provide a transcript for Lesson 4. If someone out there wants to send one along (.JSON preferred, but .SRT is OK as well), I’m happy to add it.
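
If you only have an .SRT file, a conversion along these lines might help. This is a rough Python sketch under assumptions: the output shape (a list of `{"start", "text"}` entries) is a guess, since the JSON layout the app expects is not documented in this thread, and the function names are made up for illustration.

```python
import json
import re

# SRT timestamps look like 00:01:02,500 (some tools write a dot instead of a comma).
TIME_RE = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def to_seconds(stamp):
    """Convert an SRT timestamp string to seconds as a float."""
    match = TIME_RE.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"bad SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(part) for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def srt_to_entries(srt_text):
    """Parse SRT text into a list of {"start", "text"} dicts (assumed shape)."""
    entries = []
    normalized = srt_text.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = [line.strip() for line in block.split("\n") if line.strip()]
        timing = next((i for i, line in enumerate(lines) if "-->" in line), None)
        if timing is None:
            continue  # not a cue, skip it
        start, _, _end = lines[timing].partition("-->")
        text = " ".join(lines[timing + 1:])
        if text:
            entries.append({"start": to_seconds(start), "text": text})
    return entries


if __name__ == "__main__":
    sample = (
        "1\n00:00:01,000 --> 00:00:04,000\nWelcome to lesson four\n\n"
        "2\n00:01:02,500 --> 00:01:05,000\nToday we look at\nembeddings\n"
    )
    print(json.dumps(srt_to_entries(sample), indent=2))
```

Only the start time of each cue is kept here, on the assumption that the player just needs a point to seek to; the end time is parsed off and discarded.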

PRs welcome at the repo

Is this still working?