Automatic subtitles for video using deep learning

I would like to know if there is any pre-trained model for getting a transcript from a video.

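Transcribing a video is no different from any other audio transcription job: once you extract the audio track, it is ordinary speech recognition. If you want to build your own dataset, you can download subtitles for many movies from sites like https://www.opensubtitles.org, extract the audio from the movies, and pair each subtitle with the matching stretch of audio.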
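To make the dataset idea concrete, here is a rough sketch of the pairing step. It assumes the subtitles are in SRT format and that ffmpeg is installed. The function names, file names, and the 16 kHz mono WAV target are my own choices for illustration, not anything required. Keep in mind that movie subtitles are often paraphrased and their timings can drift, so pairs built this way are noisy labels.

```python
import re

# SRT timing lines look like: 00:01:02,500 --> 00:01:04,000
TIME_RE = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def parse_srt(text):
    """Return a list of (start_seconds, end_seconds, caption) tuples."""
    cues = []
    text = text.replace("\r\n", "\n").strip()
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", text):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = TIME_RE.search(line)
            if not match:
                continue
            h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, match.groups())
            start = h1 * 3600 + m1 * 60 + s1 + ms1 / 1000
            end = h2 * 3600 + m2 * 60 + s2 + ms2 / 1000
            # Everything after the timing line is caption text.
            caption = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
            if caption:
                cues.append((start, end, caption))
            break
    return cues


def ffmpeg_clip_command(video_path, start, end, out_path):
    """Build (but do not run) an ffmpeg command for one mono 16 kHz WAV clip."""
    return [
        "ffmpeg", "-y",
        "-i", video_path,
        "-ss", f"{start:.3f}",   # clip start, in seconds
        "-to", f"{end:.3f}",     # clip end, in seconds
        "-vn",                   # drop the video stream
        "-ac", "1",              # mono
        "-ar", "16000",          # 16 kHz sample rate (an assumed target)
        out_path,
    ]


# Hypothetical two-cue subtitle file, inlined for the example.
sample = (
    "1\n00:00:01,000 --> 00:00:02,500\nHello there.\n\n"
    "2\n00:00:03,000 --> 00:00:04,250\nGeneral\nKenobi.\n"
)
cues = parse_srt(sample)
commands = [
    ffmpeg_clip_command("movie.mp4", start, end, f"clip_{n:05d}.wav")
    for n, (start, end, _) in enumerate(cues)
]
```

Each command can be passed to subprocess.run to cut the clip, and the caption from the same cue becomes its transcript. It is worth spot-checking some clips by ear before training on them.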