Word boundary detection in a .wav file

Hi All,

I am working on a word boundary detection problem. The dataset contains .wav files, each with one spoken sentence. Each .wav file has a corresponding .wrd file that lists the words spoken in the sentence along with their boundaries (start and end).
Our task is to identify the word boundaries in a test .wav file (the words spoken will also be given).
I want to do this with sequential models, so how should I start? Any thoughts are appreciated.
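
For reference, here is a minimal sketch of how the .wrd annotations might be turned into per-frame targets for a sequence model. It assumes each .wrd line has the form "<start_sample> <end_sample> <word>" and a hop of 160 samples per frame (10 ms at 16 kHz). The thread does not confirm either of these, and the function names parse_wrd and frame_labels are hypothetical.

```python
from typing import List, Tuple


def parse_wrd(text: str) -> List[Tuple[int, int, str]]:
    """Parse lines of the ASSUMED form '<start_sample> <end_sample> <word>'."""
    segments = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        segments.append((int(parts[0]), int(parts[1]), parts[2]))
    return segments


def frame_labels(segments: List[Tuple[int, int, str]],
                 num_samples: int,
                 hop: int = 160) -> List[int]:
    """Return one label per frame: 1 if a word starts in that frame, else 0.

    hop=160 is an assumption (10 ms frames at a 16 kHz sampling rate).
    """
    num_frames = (num_samples + hop - 1) // hop  # ceiling division
    labels = [0] * num_frames
    for start, _end, _word in segments:
        idx = start // hop
        if idx < num_frames:
            labels[idx] = 1
    return labels


# Usage with made-up annotation text (not taken from any real dataset)
example = "0 320 she\n320 800 had\n"
segments = parse_wrd(example)
labels = frame_labels(segments, num_samples=800)
```

Labels like these could serve as targets for a frame-level tagger such as an LSTM over MFCC frames. Since the words are known at test time, the labels could also mark word identity instead of only word onsets.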

Hi chunky,

I am working on a similar problem, but I have not been able to find a dataset for it. If you don’t mind, could you please share the dataset?

Hi Ankur,
I think you can use the well-known TIMIT dataset.

But it is paid, right? Isn’t there a free dataset?