Deep Learning with Audio Thread

I noticed that the transform to spectrogram wasn’t expanding the channel dimension to 3, as the library does for the MNIST dataset. I’ve added that into my fork of the project. You were replacing the first Conv2d layer, which would lose all pretrained learning? I may be wrong about this.

After making these tweaks I checked how well this approach performed on the Free ST American English Corpus dataset (10 classes of male and female speakers), and I was able to get these results:


98.3% accuracy

Here is the Notebook; it is derived from your AWS_LSTM notebook.

3 Likes

Great job @baz!!

Actually we don’t completely “lose” the first layer: we’ve copied the “red” channel:

...
    # save the original first-layer weights and keep only the "red" channel
    original_weights = src_model[0][0].weight.clone()
    new_weights = original_weights[:, 0:1, :, :]

    # create a new first layer initialised with the copied weights
    new_layer = nn.Conv2d(nChannels, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    new_layer.weight = nn.Parameter(new_weights)
...
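For anyone who wants to try the same trick outside that code path, here’s a minimal, self-contained sketch of the idea against a plain torchvision ResNet-34; note that conv1 is torchvision’s name for the first layer, and the single-input-channel setup is my assumption, not part of the snippet above:

import torch.nn as nn
import torchvision

# Load a pretrained ResNet-34 and grab its first conv layer's weights.
model = torchvision.models.resnet34(pretrained=True)
original_weights = model.conv1.weight.clone()   # shape [64, 3, 7, 7]

# Keep only the "red" channel so the layer accepts 1-channel spectrograms.
new_weights = original_weights[:, 0:1, :, :]    # shape [64, 1, 7, 7]

# Build the replacement first layer and copy the weights in.
new_layer = nn.Conv2d(1, 64, kernel_size=(7, 7), stride=(2, 2),
                      padding=(3, 3), bias=False)
new_layer.weight = nn.Parameter(new_weights)
model.conv1 = new_layer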

I’m working on a “multi spectrogram example” that shows how to create multiple images for the same sample.

I’ll commit as I finish :wink:

1 Like

I think there is too much silence in these samples.

A simple little hack would be to slice up the audio based on silence.

I’ve created a notebook describing how to do this.

I’ll try to create a method that is more configurable.
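For anyone who wants the gist without opening the notebook, here’s a rough sketch of silence-based slicing using librosa; the top_db threshold and the output file naming are placeholder choices of mine, not necessarily what the notebook does:

import librosa
import soundfile as sf

def split_on_silence(path, top_db=40):
    """Slice an audio file into non-silent chunks and save each as its own wav."""
    y, sr = librosa.load(path, sr=None)                   # keep the native sample rate
    intervals = librosa.effects.split(y, top_db=top_db)   # (start, end) sample indices
    stem = path.rsplit(".", 1)[0]
    for i, (start, end) in enumerate(intervals):
        sf.write(f"{stem}_chunk{i}.wav", y[start:end], sr)

split_on_silence("sample.wav")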

Timely…
(You can always just play with the dataset if not competitive)

3 Likes

I’ve published two notebooks:

  • SINGLE SPEC.: one focused on tuning your transformations to be sure that they won’t mess up your original data
  • MULTI SPEC: an example of generating multiple spectrograms from augmented data :wink:

3 Likes

It seems the kernel is showing a 3d spectrogram, but it is really just a convenient visualization: x-y is time-frequency, and the z axis is amplitude, which carries the same information as colour in our 2d spectrograms… So no extra info is added by using 3d…

I can imagine a useful 3d spectrogram though: multiple channels of sound along the z axis, like stereo, or more interestingly, spectrograms from a microphone array concatenated along the z axis for sound localization… I have thought about this before…

1 Like

I’d suggest using the tools/run-after-git-clone stuff in the fastai repo to avoid this. See the fastai docs for details. Also try the ReviewNB service - it’s great.

Totally agree, the 3d spectrogram is very much a “human” vis and not a different representation of the data.

Check out @ste’s branch of the fastai-audio repo for an example of using not quite a different representation, but a kind of multi-scale representation, (ab)using the fact that we’re really only ever training on tensors, whether us puny humans can see them or not.

I was thinking of another way of aggregating - and then visualising - the information in an audio clip besides a spectrogram, but it really does capture most of what’s there.

It would be interesting to experiment with different kinds of spectrogram (“raw” vs. mel, power vs. amp vs. db) and different values for the params (number of FFTs, number of bins, bin size…). Honestly we’re just trying to find what looks “good” to our (puny) human eyes; there’s no guarantee that the prettiest image does the best job of helping a NN discriminate. For my next experiment I want to try making really “high def” spectros to train on.
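As a concrete starting point for that comparison, here’s a sketch of a few of those spectrogram variants using librosa; the parameter values are arbitrary placeholders, not recommendations:

import numpy as np
import librosa

y, sr = librosa.load("sample.wav", sr=None)

# "Raw" (linear-frequency) magnitude spectrogram.
linear_spec = np.abs(librosa.stft(y, n_fft=1024, hop_length=256))

# Mel power spectrogram with the same FFT settings.
mel_spec = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=1024,
                                          hop_length=256, n_mels=128)

# Amplitude vs. power vs. dB views of the same data.
linear_db = librosa.amplitude_to_db(linear_spec, ref=np.max)
mel_db = librosa.power_to_db(mel_spec, ref=np.max)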

And also interesting to play with the effects of audio signal transforms vs spectrogram params on the accuracy of the network. For example, if you augment your data by downsampling, but keep your spectrogram #ffts & #num_mels constant, will the “image” presented to the network actually be substantially different?
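One quick way to sanity-check that question is to hold the spectrogram params constant and compare the outputs before and after resampling; a sketch, with the rates and params as examples only:

import librosa

y, sr = librosa.load("sample.wav", sr=44100)
y_lo = librosa.resample(y, orig_sr=sr, target_sr=8000)

# Same n_fft / hop_length / n_mels for both versions.
spec_hi = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=1024, hop_length=256, n_mels=64)
spec_lo = librosa.feature.melspectrogram(y=y_lo, sr=8000, n_fft=1024, hop_length=256, n_mels=64)

# The mel axis stays at 64 bins, but the time axis shrinks with the sample count
# and each bin now spans a different frequency range - so the "image" the network
# sees does change.
print(spec_hi.shape, spec_lo.shape)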

We haven’t tackled normalisation yet, either; and that could cancel out some assumed audio transforms, eg if you add white noise to every sample, and then normalise, you’re basically removing the noise you added…

There’s a lot to be learned here - and now a Kaggle comp to learn it on :wink:

5 Likes

Awesome, I’ve been meaning to ask how you handle this workflow internally. We’ve just been using liberal branching + splatting + cleanup.

1 Like

Hey Baz, if you check out the “doc_notebooks” branch on the main git repo you’ll find a few changes. I changed all the notebooks to use the public dataset, changed the exception handling, made a few other cleanups, optimised some of the transforms, fixed a pretty critical bug with the spectrogram transform step, etc.

We’ll merge this in with the cool stuff Zac and Stefano have been doing on Monday, but thought I’d let you know in case you’re playing with it over the weekend.

And the “baseline” demo workbook now gets 98.4% accuracy :slight_smile: maybe it will go even higher with your model layer modification!

We’re still actively working on this; you’ll be able to see that Stefano has been testing ideas “manually” in his notebooks when we merge them (or it might already be there in a branch!), and the DataAugmentation notebook in the doc_notebooks branch has a slightly improved comparison helper.

Ideally I think we’d want a pretty rich display widget that lets you hear the original & transformed audio, see the original & transformed waveforms, and see the final post-transform spectrogram that the network is actually seeing. It’s a little tricky: because of the way we’ve handled the transforms (using the wav-to-spectrogram as the final step), it’s hard to access the “transform - 1” state, i.e. the audio just before that last step. We’re debating whether it’s best to change the way the AudioItem handles being transformed (eg. adding a concept like “transform groups”) or change the way it __repr__s itself. I’m wondering whether it’s best to dig into Jupyter’s custom display() handling (particularly _repr_html_) to make a richer show.

We’ve even thought of a sub-project to make an ipython-widget-based tool to help you test and select transforms on the fly! I think it’s definitely something to focus on: there is so much to experiment with that we’ll get high leverage from making tools that ease experimentation. Feel free to help!
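To make that last idea a little more concrete, here’s a very rough sketch of the kind of display hook I mean; the AudioClip class and its fields are hypothetical, not the library’s actual AudioItem API:

import matplotlib.pyplot as plt
from IPython.display import Audio, display

class AudioClip:
    """Hypothetical wrapper: the raw signal plus the spectrogram the network sees."""
    def __init__(self, signal, sr, spectrogram):
        self.signal, self.sr, self.spectrogram = signal, sr, spectrogram

    def _repr_html_(self):
        # Jupyter calls this automatically when the object is the last
        # expression in a cell; embed a playable audio widget.
        return Audio(self.signal, rate=self.sr)._repr_html_()

    def show(self):
        # Waveform and final spectrogram side by side, plus the player.
        fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 3))
        ax1.plot(self.signal)
        ax1.set_title("waveform")
        ax2.imshow(self.spectrogram, origin="lower", aspect="auto")
        ax2.set_title("spectrogram")
        display(Audio(self.signal, rate=self.sr))

# In a notebook: AudioClip(y, sr, spec) renders a player; .show() adds the plots.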

3 Likes

FWIW, I did this last night, and it wasn’t good. It took ages to train because I had to use a tiny batch size, and was overall less accurate than using the relatively lo-res ones. So, not recommended.

There are many other variables to consider as well; for example, the naive “pad to max” we use in the current demo notebook adds a LOT of zeros to the vast majority of samples, so a smarter uniformity strategy would probably be advantageous (something like “pad from the end to the average length”). I suspect the reason the higher-res spectros were worse is that the relative amount of zero bins was higher.
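Here’s a sketch of the kind of uniform-length step I mean, assuming a 1-D tensor of samples and a target length picked in advance (e.g. the dataset’s average length); the function name is mine, not the library’s:

import torch
import torch.nn.functional as F

def pad_or_trim(signal: torch.Tensor, target_len: int) -> torch.Tensor:
    """Truncate or zero-pad a 1-D signal so every sample has the same length."""
    if signal.shape[-1] >= target_len:
        return signal[..., :target_len]     # truncate long clips
    pad = target_len - signal.shape[-1]
    return F.pad(signal, (0, pad))          # zero-pad short clips at the end

sig = torch.randn(16000 * 3)                # pretend 3 s of 16kHz audio
print(pad_or_trim(sig, 16000 * 2).shape)    # torch.Size([32000])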

I’d also be interested to try progressive resizing - train the model on low-res spectros first, then generate higher and higher res ones to see if it made a difference. It would be interesting to do this at the audio level too (i.e. downsample to 8KHz first).
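A sketch of what that could look like at the spectrogram level, generating progressively finer spectrograms from the same clip (the sizes are placeholder guesses):

import librosa

y, sr = librosa.load("sample.wav", sr=None)

# Progressive "resizing": start training on coarse spectrograms, finish on fine ones.
for n_mels, n_fft in [(32, 512), (64, 1024), (128, 2048)]:
    spec = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=n_fft,
                                          hop_length=n_fft // 4, n_mels=n_mels)
    print(n_mels, spec.shape)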

1 Like

hey @kmo, of course! Please PR away.

Keep in mind though that we are trying to adhere as closely as possible to the recommended FastAI workflow. There are things we’re doing, such as writing the code in notebooks and certain aspects of code style, that may break with typical software engineering practices and PEP 8.

But this has been a conscious decision – because the hope is that this work can be harmoniously brought into the course somehow. Hopefully that doesn’t discourage you from PR’ing and collaborating :slight_smile:

Some ideas for possible audio augmentation from a new paper:

Adversarial data augmentation and Stratified data augmentation

1 Like

Creating a Microphone Recording

import pyaudio
import wave

FORMAT = pyaudio.paInt16
CHANNELS = 2
RATE = 44100
CHUNK = 1024
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = "file.wav"

audio = pyaudio.PyAudio()

# start recording
stream = audio.open(format=FORMAT, channels=CHANNELS,
                    rate=RATE, input=True,
                    frames_per_buffer=CHUNK)
print("recording...")
frames = []

for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
    data = stream.read(CHUNK)
    frames.append(data)
print("finished recording")

# stop recording
stream.stop_stream()
stream.close()
audio.terminate()

# write the captured frames to a wav file
waveFile = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
waveFile.setnchannels(CHANNELS)
waveFile.setsampwidth(audio.get_sample_size(FORMAT))
waveFile.setframerate(RATE)
waveFile.writeframes(b''.join(frames))
waveFile.close()
4 Likes

How does your current audio work handle stereo wav files?

I was exploring https://www.kaggle.com/c/freesound-audio-tagging-2019 which seems like a good dataset to learn about multilabel audio classification.

Last weekend I found the older fastai-audio package someone was working on and tried to build a simple pipeline to classify this dataset.
I kept getting tensor size mismatches and suspect this is because the library didn’t handle stereo audio.

1 Like

Actually we’re not supporting stereo audio - I’ll take a look at it later…
Btw there are multiple ways to cope with a “pair” of sounds instead of one:

  1. mixing them in the “SoundSpace”:
  • take the average
  • sum them
  • concatenate them
  • ?..
  2. mixing them in the “SpectrogramSpace”:
  • generate two separate spectrograms (one per channel)
  • concatenate the spectrograms
  • ?..

What do you think is the best one?
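As a reference point, here’s a minimal sketch of two of those options - averaging the channels in the signal domain versus keeping one spectrogram per channel - using torchaudio; the parameter values are arbitrary:

import torchaudio

waveform, sr = torchaudio.load("stereo_sample.wav")     # shape [2, num_samples]
to_spec = torchaudio.transforms.MelSpectrogram(sample_rate=sr, n_fft=1024,
                                               hop_length=256, n_mels=64)

# Option 1: mix in the "SoundSpace" - average the channels down to mono.
mono = waveform.mean(dim=0)                              # shape [num_samples]
mono_spec = to_spec(mono)                                # shape [64, frames]

# Option 2: mix in the "SpectrogramSpace" - one spectrogram per channel,
# stacked like the colour channels of an image.
stereo_spec = to_spec(waveform)                          # shape [2, 64, frames]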

Please also note the fastai style suggestions: https://docs.fast.ai/dev/style.html

1 Like

These are all awesome questions.
Any audio experts have suggestions?
I understand that we’re using spectrograms because resnet is already trained on images, and we want to leverage that pretraining.
I personally wonder if we could get better results by working with raw waveforms. I know there’s been a lot of work with stuff like WaveNet, and recently I saw a neural vocoder [0] that seems to deal with audio in a fundamentally different way.

Also, when first approaching this problem I read [1], which suggested in its abstract that feeding raw waveforms worked better than spectrograms, and that was from 2009 and seems to have some … pretty important authors.

[0] https://gfx.cs.princeton.edu/pubs/Jin_2018_FAR/
[1] https://papers.nips.cc/paper/3674-unsupervised-feature-learning-for-audio-classification-using-convolutional-deep-belief-networks.pdf

Thread forming here discussing just that. There are a few good links to papers that model raw audio.

3 Likes

Re. Stereo, I’m currently working on modifying the paradigm to use a “preprocessor” concept, like the fastai.text library, in order to ensure your model can take arbitrary audio inputs and have them normalised before you start applying your transforms. Supporting stereo data would be part of that - in the first instance I assume I would just downmix to mono if it detects a stereo input. Everything after that should “just work” once the LoadAudioData preprocessor has had its way with the inputs. We could later enable this option once the rest of the processes can handle stereo.

As for @ste’s question about how to “deal” with stereo data, I think the simplest way would be to store them as a rank 2 tensor where the stereo channels are handled akin to RGB channels in images, both in the signal space and the spectrogram space. So your input signal would be of shape [2, <number of samples>] and your spectrogram would have 2 channels instead of 1 (or 3 like an RGB image).

This would take some tweaking to the display of spectrograms but that’s simple enough, just by cat’ing them together either in array-space or in plt.add space. Here’s what SOX does, for example.

I’d be very keen to explore working with raw waveforms, I think a lot of us want to try it out. For example you could “just” do convolutions with a kernel shape of [3,1] instead of [3,3] to start with… or have your first layer be [16000, 1] to “convolve” across a second at a time of 16KHz audio. I’d be keen to try this… as soon as the audio framework is “built enough” :wink:
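To make the raw-waveform idea concrete, here’s a rough sketch of a front end that convolves directly over samples; I’ve used Conv1d rather than a literal [k, 1] Conv2d kernel, and all the sizes are placeholder guesses:

import torch
import torch.nn as nn

# A tiny "stem" for raw audio: the first conv sees windows of samples
# directly instead of spectrogram pixels.
raw_audio_stem = nn.Sequential(
    nn.Conv1d(1, 32, kernel_size=400, stride=160),   # ~25 ms windows, ~10 ms hop at 16kHz
    nn.ReLU(),
    nn.Conv1d(32, 64, kernel_size=3, stride=2),
    nn.ReLU(),
)

batch = torch.randn(8, 1, 16000)        # 8 clips of one second of 16kHz audio
features = raw_audio_stem(batch)        # shape [8, 64, 48]
print(features.shape)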

The twitter thread @MadeUpMasters pointed out will no doubt bring up some great material!

4 Likes