How do babies and blind people learn to localise sound without labelled data? We propose that innate mechanisms can provide coarse-grained error signals to bootstrap learning. New preprint from @yang-chu.bsky.social. 🤖🧠🧪 arxiv.org/abs/2001.10605

Dan Goodman (@neural-reckoning.org) 2025-04-24T16:57:37.637Z

The acoustic cues we use to localise sounds are very specific to each individual, and change through our lifetimes, so we need to be able to learn them. But most of the time we don't get precise feedback: as babies, for blind people, or for sounds outside our visual field. So how do we learn?

Dan Goodman (@neural-reckoning.org) 2025-04-24T16:57:37.638Z

Babies have an innate mechanism - the auditory orienting response (AOR) - that lets them turn their head towards the left/right direction of a sound. It's not precise, but it means they are born with the ability to tell left from right with at least some accuracy.

Dan Goodman (@neural-reckoning.org) 2025-04-24T16:57:37.639Z

We propose that we use this both to kickstart or 'bootstrap' the learning of a more detailed auditory spatial map, and to reduce the amount of precise error feedback we need by replacing it with coarse-grained feedback.

Dan Goodman (@neural-reckoning.org) 2025-04-24T16:57:37.640Z

The mechanism is this: if we hear a sound and turn our heads towards it, our innate mechanism can tell us whether we undershot or overshot. It turns out this gives us enough information to construct an approximate gradient for a neural network to learn by gradient descent (rough sketch below).

Dan Goodman (@neural-reckoning.org) 2025-04-24T16:57:37.641Z
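To make the idea concrete, here is a minimal sketch under the assumption of a simple linear readout (the paper's actual networks and cue representations will differ): the learner never sees the true angle, only the innate sign of its error, which is exactly the negative gradient of the L1 loss with respect to the prediction.

```python
import numpy as np

# Minimal sketch (assumed linear model, not the paper's network):
# a readout of acoustic cues x predicts a head-turn angle. After the
# turn, an innate left/right detector reports only sign(true - predicted),
# i.e. undershoot vs overshoot. That sign is the negative gradient of the
# L1 loss |true - predicted|, so we can descend it without ever seeing
# the true angle.

rng = np.random.default_rng(0)
n_cues = 20
w_true = rng.normal(size=n_cues)   # hypothetical cue-to-azimuth mapping
w = np.zeros(n_cues)               # learned weights
lr = 0.01

for _ in range(5000):
    x = rng.normal(size=n_cues)    # cue vector for one heard sound
    theta = w_true @ x             # true azimuth (never given as a label)
    theta_hat = w @ x              # predicted head turn
    feedback = np.sign(theta - theta_hat)  # +1 undershot, -1 overshot
    w += lr * feedback * x         # descend the L1 surrogate gradient

test_x = rng.normal(size=(1000, n_cues))
print("mean angular error:", np.mean(np.abs(test_x @ (w_true - w))))
```

Because only the sign of the error enters the update, any feedback channel as coarse as "too far left / too far right" is enough to drive it.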

In our paper we show that in a fairly realistic setting in terms of neural tuning curves, noise, and possible hearing loss, this does indeed let us learn to localise sounds with a tiny number of labels (sometimes zero), including full 3D localisation (front/back, up/down).

Dan Goodman (@neural-reckoning.org) 2025-04-24T16:57:37.642Z
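For intuition about the input side, here is a toy population code in the same spirit, with all parameter values invented for illustration rather than taken from the paper: Gaussian tuning curves over azimuth, Poisson spiking noise, and hearing loss simulated by silencing a random subset of units.

```python
import numpy as np

def population_response(azimuth_deg, n_neurons=50, width_deg=30.0,
                        peak_rate=20.0, lost=None, rng=None):
    """Noisy spike counts from neurons tuned to azimuth (toy model)."""
    rng = rng if rng is not None else np.random.default_rng()
    preferred = np.linspace(-90, 90, n_neurons)   # preferred azimuths
    rates = peak_rate * np.exp(-0.5 * ((azimuth_deg - preferred) / width_deg) ** 2)
    if lost is not None:
        rates = np.where(lost, 0.0, rates)        # simulated hearing loss
    return rng.poisson(rates)                     # Poisson spiking noise

rng = np.random.default_rng(1)
lost = rng.random(50) < 0.3                       # silence ~30% of units
print(population_response(20.0, lost=lost, rng=rng))
```

A localiser then has to be learned on top of noisy, possibly degraded responses like these, which is what makes the sign-only feedback result non-trivial.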

We find that there are many different algorithms that let you do this, including reinforcement learning with a noisy innate internal 'reward' (turning your head in the right general direction). We suggest that multiple mechanisms may work together, and individuals may use different strategies.

Dan Goodman (@neural-reckoning.org) 2025-04-24T16:57:37.643Z
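As one hedged example of the reinforcement-learning variant: a Gaussian policy samples a head turn, the only reward is a binary "right general direction" signal that is occasionally flipped, and a standard REINFORCE update still pushes the cue-to-angle mapping the right way. This illustrates the class of algorithm, not the paper's implementation; all names and parameters are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cues, lr, sigma, p_flip = 20, 0.02, 5.0, 0.1
w_true = rng.normal(size=n_cues)      # hidden cue-to-azimuth mapping
w = np.zeros(n_cues)                  # policy weights

for _ in range(20000):
    x = rng.normal(size=n_cues)
    theta = w_true @ x                # true azimuth (never observed)
    mu = w @ x                        # policy mean head turn
    action = mu + sigma * rng.normal()  # sampled head turn
    correct = np.sign(action) == np.sign(theta)
    if rng.random() < p_flip:         # the innate reward is itself noisy
        correct = not correct
    reward = 1.0 if correct else -1.0
    # REINFORCE: d/dw log N(action; mu, sigma) = (action - mu) / sigma**2 * x
    w += lr * reward * (action - mu) / sigma**2 * x
```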

Our paper gives a framework for further experimental studies, but on its own it can't answer the question of which mechanisms we actually use.

Dan Goodman (@neural-reckoning.org) 2025-04-24T16:57:37.644Z

In conclusion: we shouldn't assume that visual feedback is the only way we learn to localise. We're adept at learning in a multitude of ways with surprisingly little signal, and this should inform experimental design. neural-reckoning.org/pub_learning...

Dan Goodman (@neural-reckoning.org) 2025-04-24T16:57:37.645Z

Learning spatial hearing via innate mechanisms

Chu Y, Luk W, Goodman DFM
PLoS Computational Biology (2025) 21(10): e1013543
doi: 10.1371/journal.pcbi.1013543

Abstract

The acoustic cues used by humans and other animals to localise sounds are subtle, and change throughout our lifetime. This means that we need to constantly relearn or recalibrate our sound localisation circuit. This is often thought of as a “supervised” learning process where a “teacher” (for example, a parent, or your visual system) tells you whether or not you guessed the location correctly, and you use this information to update your localiser. However, there is not always an obvious teacher (for example in babies or blind people). Using computational models, we showed that approximate feedback from a simple innate circuit, such as one that can distinguish left from right (e.g. the auditory orienting response), is sufficient to learn an accurate full-range sound localiser. Moreover, using this mechanism in addition to supervised learning can more robustly maintain the adaptive neural representation. We find several possible neural mechanisms that could underlie this type of learning, and hypothesise that multiple mechanisms may be present; we provide examples in which these mechanisms can interact with each other. We conclude that when studying spatial hearing, we should not assume that the only source of learning is from the visual system or other supervisory signals. Further study of the proposed mechanisms could allow us to design better rehabilitation programmes to accelerate relearning/recalibration of spatial hearing.
