How do babies and blind people learn to localise sound without labelled data? We propose that innate mechanisms can provide coarse-grained error signals to bootstrap learning. New preprint from @yang-chu.bsky.social. 🤖🧠🧪 arxiv.org/abs/2001.10605

Dan Goodman (@neuralreckoning.bsky.social) 2025-04-24T16:57:37.637Z

The acoustic cues we use to localise sounds are specific to each individual and change throughout our lifetimes, so we need to be able to learn them. But most of the time we don't get precise feedback: not as babies, not for blind people, and not for sounds outside our visual field. So how do we learn?

Dan Goodman (@neuralreckoning.bsky.social) 2025-04-24T16:57:37.638Z

Babies have an innate mechanism - the auditory orienting response (AOR) - that makes them turn their head towards the side (left or right) a sound came from. It's not precise, but it means they are born able to tell left from right with at least some accuracy.

Dan Goodman (@neuralreckoning.bsky.social) 2025-04-24T16:57:37.639Z

We propose that this innate response is used both to kickstart or 'bootstrap' the learning of a more detailed auditory spatial map, and to reduce the amount of precise error feedback needed by replacing it with coarse-grained feedback.

Dan Goodman (@neuralreckoning.bsky.social) 2025-04-24T16:57:37.640Z

The mechanism is this: if we hear a sound and turn our head towards it, our innate mechanism can tell us whether we undershot or overshot. It turns out this gives us enough information to construct an approximate gradient, which lets a neural network learn by gradient descent.

Dan Goodman (@neuralreckoning.bsky.social) 2025-04-24T16:57:37.641Z
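To make this concrete, here is a minimal sketch of the idea (our illustration, not the paper's implementation: the linear cue-to-angle model, dimensions, and learning rate are all assumptions). Only the sign of the post-turn error is used to update the map:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (illustrative, not the paper's model): a linear readout maps a
# vector of acoustic cues to a head-turn angle.
n_cues = 20
w_true = rng.normal(size=n_cues)   # the mapping the learner must recover
w = np.zeros(n_cues)               # the learner's current spatial map

lr = 0.01
for step in range(20000):
    x = rng.normal(size=n_cues)    # acoustic cues for one sound
    guess = x @ w                  # turn the head by this amount
    # Innate coarse feedback after the turn: undershot (+1) or overshot (-1).
    # Only the sign of the residual error is available, not its size.
    feedback = np.sign(x @ w_true - guess)
    # Sign-based update: the same direction as the exact L2-loss gradient
    # step, with the unknown error magnitude replaced by +/-1.
    w += lr * feedback * x

print("mean absolute weight error:", np.abs(w - w_true).mean())
```

This is just gradient descent on an L1 loss: the error magnitude is unknown, but sign-only feedback still moves the map the right way on average.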

In our paper we show that in a setting that is fairly realistic in terms of neural tuning curves, noise, and possible hearing loss, this does indeed let us learn to localise sounds with a tiny number of labels (sometimes zero), including full 3D localisation (front/back, up/down).

Dan Goodman (@neuralreckoning.bsky.social) 2025-04-24T16:57:37.642Z

We find that there are many different algorithms that let you do this, including reinforcement learning with a noisy innate internal 'reward' (turning your head in the right general direction). We suggest that multiple mechanisms may work together, and individuals may use different strategies.

Dan Goodman (@neuralreckoning.bsky.social) 2025-04-24T16:57:37.643Z
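As one concrete possibility, here is a REINFORCE-style sketch (again our illustration with assumed parameters, not the paper's implementation) in which the only teaching signal is a binary reward for turning in the right general direction, occasionally flipped to model noise:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative setup: the same linear cue-to-angle map as before, now trained
# by policy gradient from a coarse, unreliable innate reward.
n_cues = 20
w_true = rng.normal(size=n_cues)
w = np.zeros(n_cues)

sigma = 0.5     # exploration noise on the motor command
lr = 0.02
flip_p = 0.1    # probability the innate reward signal is wrong
baseline = 0.0  # running-average reward baseline to reduce variance

for step in range(50000):
    x = rng.normal(size=n_cues)
    noise = sigma * rng.normal()
    action = x @ w + noise         # noisy head turn
    # Innate 'reward': 1 if we turned in the right general direction
    # (left vs right), occasionally flipped to model an unreliable signal.
    correct = np.sign(action) == np.sign(x @ w_true)
    reward = float(correct) if rng.random() > flip_p else 1.0 - float(correct)
    baseline += 0.01 * (reward - baseline)
    # REINFORCE: reinforce the exploration noise when reward beats baseline.
    w += lr * (reward - baseline) * noise * x

print("cosine similarity to true map:",
      w @ w_true / (np.linalg.norm(w) * np.linalg.norm(w_true)))
```

Because this reward only grades left versus right, the sketch recovers the direction of the map rather than its overall gain; a handful of labelled examples, or another coarse mechanism, could then calibrate the gain, consistent with the idea that multiple mechanisms work together.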

Our paper gives a framework for further experimental studies, but we can't answer the question of which mechanisms we actually use.

Dan Goodman (@neuralreckoning.bsky.social) 2025-04-24T16:57:37.644Z

In conclusion: we shouldn't assume that visual feedback is the only way we learn to localise. We're adept at learning in a multitude of ways with surprisingly little signal, and this should inform experimental design. neural-reckoning.org/pub_learning...

Dan Goodman (@neuralreckoning.bsky.social) 2025-04-24T16:57:37.645Z

Learning spatial hearing via innate mechanisms

Chu Y, Luk W, Goodman D
Preprint

Abstract

The acoustic cues used by humans and other animals to localise sounds are subtle, and change during and after development. This means that we need to constantly relearn or recalibrate the auditory spatial map throughout our lifetimes. This is often thought of as a "supervised" learning process where a "teacher" (for example, a parent, or your visual system) tells you whether or not you guessed the location correctly, and you use this information to update your map. However, there is not always an obvious teacher (for example, in babies or blind people). Using computational models, we show that approximate feedback from a simple innate circuit, such as one that can distinguish left from right (e.g. the auditory orienting response), is sufficient to learn an accurate full-range spatial auditory map. Moreover, using this mechanism in addition to supervised learning can more robustly maintain the adaptive neural representation. We find several possible neural mechanisms that could underlie this type of learning, and hypothesise that multiple mechanisms may be present and interact with each other. We conclude that when studying spatial hearing, we should not assume that the only source of learning is the visual system or some other supervisory signal. Further study of the proposed mechanisms could allow us to design better rehabilitation programmes to accelerate relearning/recalibration of spatial maps.
