Counting unique individuals in an audio file

(Hiromi Suenaga) #1

I am looking for a way to count how many unique voices are in a short audio file. I am not interested in what they are saying; I just want to identify the unique individuals (e.g. "there are 2 unique voices in this audio"). Could anybody point me to good resources?

Thank you!


I do not know the answer. It is worth noting that sound is usually studied in the frequency domain (or the time-frequency domain), since each voice has a specific timbre. So the input to the neural network model can be a Fourier transform (FFT, STFT) of the signal.
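To make the STFT idea concrete, here is a minimal sketch of turning a waveform into a magnitude spectrogram with plain NumPy. The function name `stft_magnitude`, the frame length of 512, the hop of 256, and the 16 kHz sample rate are all assumptions for illustration, and the sine wave is only a synthetic stand-in for a real recording.

```python
import numpy as np

def stft_magnitude(signal, frame_len=512, hop=256):
    """Magnitude spectrogram from a Hann-windowed short-time Fourier transform."""
    window = np.hanning(frame_len)
    # Number of full frames that fit in the signal (no padding).
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # rfft keeps only non-negative frequencies: frame_len // 2 + 1 bins per frame.
    return np.abs(np.fft.rfft(frames, axis=1))

# Synthetic stand-in for a recording: one second at an assumed 16 kHz rate.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
audio = np.sin(2 * np.pi * 440 * t)  # pure 440 Hz tone

spec = stft_magnitude(audio)                      # shape: (time frames, frequency bins)
freqs = np.fft.rfftfreq(512, d=1 / sample_rate)   # center frequency of each bin in Hz
peak_hz = freqs[spec.mean(axis=0).argmax()]       # strongest bin, averaged over time
```

The resulting 2-D array (time by frequency) is the kind of input one could feed to a model. Real speech would show several harmonics per voice rather than a single peak.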

(Hiromi Suenaga) #3

Thanks for the input! I will report back on what I find :slight_smile:

(Jeremy Howard (Admin)) #4

Here’s a paper that may be helpful

(Hiromi Suenaga) #5

Wonderful! Thank you so much! I will study the paper right away :slight_smile:

(Jeremy Howard (Admin)) #6

I also came across this today:
This is the paper:

(Hiromi Suenaga) #7

Thank you for keeping an eye out for me! I added them to my list :grinning:


What about fine-tuning an existing audio model like WaveNet? Another topic for research :slight_smile: Just brainstorming.

(Hiromi Suenaga) #9

This is great! I am so happy to have all the resources and the community!!