Hi All,
I am working on a word boundary detection problem. The dataset contains .wav files, each with one spoken sentence, and for each .wav file there is a corresponding .wrd file that lists the words spoken in the sentence along with their boundaries (start and end).
Our task is to identify the word boundaries in a test .wav file (the words spoken will also be given).
I want to do this with sequential models, so how should I start? Any thoughts are appreciated.
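
To make the setup concrete, here is a minimal sketch of how the .wrd annotations could be turned into per-frame boundary targets that a sequence model could be trained on. This is only an illustration under assumptions: I am assuming each .wrd line looks like "start_sample end_sample word" (TIMIT-style), 16 kHz audio, and a 10 ms hop (160 samples). The function names parse_wrd and boundary_labels are hypothetical, not from any library.

```python
from typing import List, Tuple


def parse_wrd(text: str) -> List[Tuple[int, int, str]]:
    """Parse .wrd content, assuming lines of 'start_sample end_sample word'."""
    words = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        start, end = int(parts[0]), int(parts[1])
        words.append((start, end, " ".join(parts[2:])))
    return words


def boundary_labels(words: List[Tuple[int, int, str]],
                    num_samples: int,
                    hop: int = 160) -> List[int]:
    """Return one label per frame: 1 if a word starts or ends in it, else 0.

    hop=160 assumes 16 kHz audio with a 10 ms frame shift.
    """
    num_frames = (num_samples + hop - 1) // hop  # ceiling division
    labels = [0] * num_frames
    if num_frames == 0:
        return labels
    for start, end, _ in words:
        for sample in (start, end):
            # Clamp so a boundary at the very last sample stays in range.
            frame = min(sample // hop, num_frames - 1)
            labels[frame] = 1
    return labels


# Example with made-up annotations (not from the real dataset).
example = "0 320 she\n320 800 had\n"
print(boundary_labels(parse_wrd(example), num_samples=960))
```

With targets like these, the acoustic features of each frame (e.g. MFCCs) would be the input sequence and the 0/1 labels the output sequence, so the problem becomes frame-level sequence tagging.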