Idea: creating spatial perception of surrounding objects using stereo sound

Hello everyone, I am new to programming. I have learnt the concepts of machine learning and deep learning, and I am trying to figure out how to implement an idea that can help blind people. The idea is to create spatial perception using different intensities of sound. For example, let us say the device is like a pair of audio sunglasses, but with cameras in place of the lenses.


We process the data and perform object detection. In each frame we can get the centroid of each detected object; say the object "chair" is on the left of the frame. If we then generate speech saying "chair" but deliver it at different intensities to the two ears, loudly in the left ear and softly in the right (with the balance calculated from the centroid's position), couldn't we convey the perception that the chair is on the left? This is just an example to explain the idea. The core concept stretches further, with the final goal of painting the world with different sounds, not confined to speech. I am trying to build an open-source platform for the development of this device, but I have just started and have lots of questions, e.g. about creating and managing open-source discussions, repositories, etc. These are all new to me, so I am looking for guidance and support.
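A minimal sketch of the intensity idea, assuming the detector gives a normalized horizontal centroid position in [0, 1] (0 = left edge of the frame, 1 = right edge). The constant-power panning law and all signal parameters below are my own illustrative choices, not part of any existing device:

```python
import numpy as np

def pan_gains(x_norm):
    """Constant-power stereo gains for a normalized horizontal
    position x_norm in [0, 1] (0 = far left, 1 = far right)."""
    theta = x_norm * (np.pi / 2)        # 0 -> hard left, pi/2 -> hard right
    return np.cos(theta), np.sin(theta)  # (left gain, right gain)

def spatialize(mono, x_norm):
    """Turn a mono signal into a stereo pair whose level balance
    encodes the object's horizontal position in the frame."""
    gl, gr = pan_gains(x_norm)
    return np.stack([mono * gl, mono * gr], axis=1)

# Example: a short 440 Hz beep for an object detected at x ~ 0.2
# (left of frame) -- louder in the left channel than the right.
sr = 16000                               # sample rate, arbitrary choice
t = np.arange(sr // 4) / sr              # 0.25 s of samples
beep = 0.5 * np.sin(2 * np.pi * 440 * t)
stereo = spatialize(beep, 0.2)
```

Constant-power panning (gains on a quarter circle, so left² + right² = 1) keeps the perceived overall loudness roughly steady as the object moves across the frame, which is why it is preferred here over simple linear gain splitting.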

Thank you


It is a good idea, and it falls within the field called "sonification". You may search for that term to find other examples, techniques, etc.

EDIT: by the way, if you consider just the object detection part (which is relevant for fastai), there is a Microsoft app, Seeing AI, that describes surrounding objects, including facial expressions. However, as far as I remember, it does not exploit stereo effects to help with localization.


Thank you sir, I will go through that.
The core concern behind the idea is this: if we generate captions and continuously comment on the surroundings, it may still fall short of translating vision into perception; I mean it may not be a true replacement for eyes. Simply put, could dogs use this solution? Even though this is a blunt argument, and I understand the issues behind it, what I am trying to convey is that I believe stereo sound for localization can take us further in solving the problem of overcoming blindness with technology.

What I implicitly meant is that if you look at relevant work in the field, you may find that part of what you have in mind has already been proposed by someone… e.g., from a very quick search: ("The sonification scheme is using stereo panning for azimuth angle localization of scene objects, loudness for their size and frequency for distance encoding…")
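The scheme quoted above can be sketched as a simple per-object parameter mapping. Every range and constant below is an illustrative assumption of mine, not taken from the paper being quoted:

```python
def sonify_object(azimuth_deg, size_norm, distance_m, max_distance=10.0):
    """Map one detected object to audio parameters, following the
    quoted scheme: stereo panning for azimuth, loudness for size,
    frequency for distance. All ranges here are illustrative guesses."""
    # Azimuth (-90..+90 degrees) -> pan position (-1 = left, +1 = right)
    pan = max(-1.0, min(1.0, azimuth_deg / 90.0))
    # Object size (0..1 share of the frame) -> loudness (0..1)
    loudness = max(0.0, min(1.0, size_norm))
    # Distance -> pitch: nearer objects get higher tones (arbitrary choice)
    d = max(0.0, min(1.0, distance_m / max_distance))
    freq_hz = 200.0 + (1.0 - d) * 800.0  # 200 Hz (far) .. 1000 Hz (near)
    return pan, loudness, freq_hz
```

The three outputs could then drive any synthesizer or panner; the point of the sketch is only that each perceptual dimension (left/right balance, level, pitch) carries one independent scene attribute.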


You may be interested in VizWiz


Wonderful idea, @akshay_1994


This is cool, thanks :+1: @digitalspecialists