New preprint 🤖🧠🧪! With @swathianil.bsky.social and @marcusghosh.bsky.social. If you want to get the most out of a multisensory signal, you should take its temporal structure into account. But which neural architectures do this best? 🧵👇 www.biorxiv.org/content/10.1...

Dan Goodman (@neuralreckoning.bsky.social) 2025-01-14T12:57:15.179Z

In previous work, we found that when multimodal information arrives sparsely in time (e.g. prey hiding from predator), nonlinear fusion of different modalities gives a big improvement over linear fusion. journals.plos.org/ploscompbiol...

Dan Goodman (@neuralreckoning.bsky.social) 2025-01-14T12:57:15.180Z
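For intuition, here is a minimal sketch of that distinction in Python (the weights and the multiplicative functional form are illustrative choices, not the model from the paper): linear fusion just sums the two channels, while nonlinear fusion lets them interact.

```python
def linear_fusion(x_a, x_v, w_a=1.0, w_v=1.0):
    """Linear fusion: a weighted sum of the two channel signals."""
    return w_a * x_a + w_v * x_v

def nonlinear_fusion(x_a, x_v, w_a=1.0, w_v=1.0, w_av=1.0):
    """One simple nonlinear scheme (illustrative): add a multiplicative
    interaction term on top of the weighted sum, so weak cues that agree
    across channels are boosted more than either cue is on its own."""
    return w_a * x_a + w_v * x_v + w_av * x_a * x_v

# Toy usage: two weak, agreeing cues.
x_a, x_v = 0.4, 0.5
print(linear_fusion(x_a, x_v))     # 0.9
print(nonlinear_fusion(x_a, x_v))  # 1.1
```

The advantage of the interaction term shows up when informative signals are sparse, so that single-channel cues are ambiguous on their own.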

In this paper, we looked at what happens when, in addition to being sparse, information arrives in contiguous bursts (e.g. prey scurrying from hiding spot to hiding spot). In general, the optimal algorithm is computationally intractable, so how far can you get with simple neural architectures?

Dan Goodman (@neuralreckoning.bsky.social) 2025-01-14T12:57:15.181Z
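Roughly, the task can be pictured with a toy generator like the one below (a sketch with made-up parameters, not the paper's stimulus code): each trial contains the same total amount of task-relevant signal, but that signal is grouped into contiguous bursts of varying length.

```python
import numpy as np

def make_trial(n_steps=200, n_signal=40, burst_len=5, p_correct=0.8,
               target=1, rng=None):
    """Toy trial: `n_signal` informative time steps in total, grouped into
    contiguous bursts of length `burst_len` (many short bursts vs. a few
    long ones carry the same total information). Outside the bursts the
    two channels are silent. Illustrative only."""
    rng = rng or np.random.default_rng()
    x = np.zeros((n_steps, 2))                       # (time, channels)
    n_bursts = n_signal // burst_len
    starts = rng.choice(n_steps - burst_len, size=n_bursts, replace=False)
    for s in starts:
        # During a burst, each channel reports the target with prob. p_correct
        # (bursts may occasionally overlap in this simplified sketch).
        x[s:s + burst_len] = np.where(
            rng.random((burst_len, 2)) < p_correct, target, -target)
    return x, target

x, y = make_trial(burst_len=10)
print(x.shape, int(np.abs(x).sum()))  # (200, 2), ~80 informative values
```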

We compared the performance of linear and nonlinear algorithms that ignore temporal structure to two architectures that can use it. The first just uses a sliding window, i.e. a fixed-length short-term memory. The second is a recurrent neural network, which in principle can have a much longer memory.

Dan Goodman (@neuralreckoning.bsky.social) 2025-01-14T12:57:15.182Z
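As a concrete (hypothetical) sketch of these two families in PyTorch — layer sizes, window length, and the use of a vanilla RNN are illustrative choices, not the exact architectures from the paper:

```python
import torch
import torch.nn as nn

class SlidingWindowNet(nn.Module):
    """At each time step, feed the last `window` multisensory samples to a
    small MLP: a fixed-length short-term memory."""
    def __init__(self, n_channels=2, window=10, hidden=32, n_out=2):
        super().__init__()
        self.window = window
        self.mlp = nn.Sequential(
            nn.Linear(n_channels * window, hidden), nn.ReLU(),
            nn.Linear(hidden, n_out))

    def forward(self, x):                            # x: (batch, time, channels)
        pad = x.new_zeros(x.shape[0], self.window - 1, x.shape[2])
        x = torch.cat([pad, x], dim=1)               # pad so early steps see a full window
        win = x.unfold(1, self.window, 1)            # (batch, time, channels, window)
        return self.mlp(win.flatten(start_dim=2))    # per-step class scores

class RecurrentNet(nn.Module):
    """A recurrent network whose hidden state can, in principle, carry
    information over arbitrarily long timescales."""
    def __init__(self, n_channels=2, hidden=64, n_out=2):
        super().__init__()
        self.rnn = nn.RNN(n_channels, hidden, batch_first=True)
        self.readout = nn.Linear(hidden, n_out)

    def forward(self, x):                            # x: (batch, time, channels)
        h, _ = self.rnn(x)
        return self.readout(h)

x = torch.randn(8, 200, 2)                           # batch of toy trials
print(SlidingWindowNet()(x).shape, RecurrentNet()(x).shape)  # both (8, 200, 2)
```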

We were expecting the RNN to hugely outperform the sliding window approach, as it has a potentially longer memory and orders of magnitude more trainable parameters. However, if the bursts of information are not too long, the much simpler network does better.

Dan Goodman (@neuralreckoning.bsky.social) 2025-01-14T12:57:15.183Z

They also differ in how they generalise. If you train on one burst length and test on other burst lengths, the sliding window algorithms generalise well to longer bursts than they were trained on, and poorly to shorter bursts. The RNNs simply generalise worse the bigger the difference.

Dan Goodman (@neuralreckoning.bsky.social) 2025-01-14T12:57:15.184Z
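The protocol behind this comparison can be sketched as a simple train/test grid (illustrative pseudo-harness, not the paper's code); the hypothetical `train_and_eval(a, b)` callable is assumed to train a fresh model at burst length `a` and return its accuracy at burst length `b`.

```python
import numpy as np

def generalisation_matrix(train_and_eval, burst_lens=(2, 5, 10, 20, 50)):
    """Accuracy matrix: rows index the training burst length,
    columns the testing burst length."""
    acc = np.zeros((len(burst_lens), len(burst_lens)))
    for i, train_len in enumerate(burst_lens):
        for j, test_len in enumerate(burst_lens):
            acc[i, j] = train_and_eval(train_len, test_len)
    return acc
```

In a matrix like this, the thread's observation corresponds to sliding-window networks staying accurate above the diagonal (test bursts longer than training) and the RNNs' accuracy falling off with distance from the diagonal.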

When we tested more realistic mixed distributions of burst lengths using either a uniform or naturalistic Lévy flight distribution, the simpler algorithms tended to perform better.

Dan Goodman (@neuralreckoning.bsky.social) 2025-01-14T12:57:15.185Z
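For these mixed conditions, burst lengths can be drawn per trial from either a uniform distribution or a heavy-tailed, Lévy-flight-like one. A generic sampler for both (the exponent and truncation bounds are illustrative, not the paper's values):

```python
import numpy as np

def sample_burst_lengths(n, kind="levy", alpha=1.5, min_len=1, max_len=100,
                         rng=None):
    """Draw `n` burst lengths. 'uniform' samples integers uniformly;
    'levy' samples from a truncated power law p(L) ~ L**(-alpha), the
    heavy-tailed shape characteristic of Lévy flights."""
    rng = rng or np.random.default_rng()
    if kind == "uniform":
        return rng.integers(min_len, max_len + 1, size=n)
    u = rng.random(n)
    a, b = min_len ** (1 - alpha), max_len ** (1 - alpha)
    # Inverse-CDF sampling from the truncated power law.
    return np.round((a + u * (b - a)) ** (1 / (1 - alpha))).astype(int)

print(sample_burst_lengths(10, kind="uniform"))
print(sample_burst_lengths(10, kind="levy"))    # mostly short, occasionally long
```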

We can't say there is a single best network, but the simple sliding window network does nearly as well as, or better than, the RNN across a wide range of training/testing setups, with the RNN outperforming the simpler network when information burst lengths get much longer than the window length.

Dan Goodman (@neuralreckoning.bsky.social) 2025-01-14T12:57:15.186Z

In conclusion: ⭐ A relatively simple modification of classic multisensory algorithms can give rise to substantially better performance in more realistic environments. ⭐ Studying the temporal structure in multisensory environments may help explain multisensory neural architectures.

Dan Goodman (@neuralreckoning.bsky.social) 2025-01-14T12:59:43.041Z

Fusing multisensory signals across channels and time

Preprint
 

Abstract

Animals continuously combine information across sensory modalities and time, and use these combined signals to guide their behaviour. Picture a predator watching their prey sprint and screech through a field. To date, a range of multisensory algorithms have been proposed to model this process, including linear and nonlinear fusion, which combine the inputs from multiple sensory channels via either a sum or a nonlinear function. However, many multisensory algorithms treat successive observations independently, and so cannot leverage the temporal structure inherent to naturalistic stimuli. To investigate this, we introduce a novel multisensory task in which we provide the same number of task-relevant signals per trial but vary how this information is presented: from many short bursts to a few long sequences. We demonstrate that multisensory algorithms that treat different time steps as independent perform sub-optimally on this task. However, simply augmenting these algorithms to integrate across sensory channels and short temporal windows allows them to perform surprisingly well, and comparably to fully recurrent neural networks. Overall, our work highlights the benefits of fusing multisensory information across channels and time, shows that small increases in circuit/model complexity can lead to significant gains in performance, and provides a novel multisensory task for testing the relevance of this in biological systems.
