Psst - neuromorphic folks. Did you know that you can solve the SHD dataset with 90% accuracy using only 22 kB of parameter memory by quantising weights and delays? Check out our preprint with @pengfei-sun.bsky.social and @danakarca.bsky.social, or read the TLDR below. 👇🤖🧠🧪 arxiv.org/abs/2510.27434

Dan Goodman (@neural-reckoning.org) 2025-11-13T17:40:46.232Z

We've known for a while now that adding learnable delays is a great way to improve performance for this and other temporal datasets. Just check out how often the word 'delay' crops up on @fzenke.bsky.social's leaderboard for the top performers at SHD.

Most people (including us) are using axonal delays, so the number of parameters scales well as you increase network size - O(n) for axonal delays compared to O(n²) for weights. You get a lot more bang for your buck if you add in delays.
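
As a rough illustration of that scaling (a sketch with a made-up hidden layer size, not the architecture from the paper):

```python
# Parameter counts for a fully connected recurrent spiking layer with
# n_hidden neurons: weights scale as O(n^2), per-neuron axonal delays as O(n).
def param_counts(n_in, n_hidden):
    weights = n_in * n_hidden + n_hidden * n_hidden  # feedforward + recurrent
    axonal_delays = n_hidden                         # one delay per neuron's axon
    return weights, axonal_delays

w, d = param_counts(n_in=700, n_hidden=128)  # 700 SHD input channels, hypothetical hidden size
print(f"weights: {w}, axonal delays: {d}")   # 105984 weights vs 128 delays (~0.1% extra)
```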

So how small (in terms of memory usage) could you make your network while keeping good performance? We applied learnable quantisation to both weights and delays with different bit budgets and measured accuracy. Here are the results for SHD. The point marked I has almost optimal performance but uses hardly any bits.
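
One common way to make quantisation learnable is uniform quantisation trained with a straight-through estimator. Here's a minimal PyTorch sketch of that idea (not the exact scheme from the preprint): the forward pass rounds a parameter onto a grid set by the bit budget, while gradients flow through as if no rounding had happened.

```python
import torch

class RoundSTE(torch.autograd.Function):
    """Round in the forward pass; pass gradients straight through in the backward pass."""
    @staticmethod
    def forward(ctx, x):
        return torch.round(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output

def quantise(x, bits, x_min, x_max):
    """Uniformly quantise x onto 2**bits levels spanning [x_min, x_max]."""
    levels = 2 ** bits - 1
    step = (x_max - x_min) / levels
    x = torch.clamp(x, x_min, x_max)
    return x_min + RoundSTE.apply((x - x_min) / step) * step

# e.g. delays kept in [0, 30] ms and stored with a 3-bit budget (illustrative numbers)
delays = torch.nn.Parameter(torch.rand(128) * 30.0)
quantised_delays = quantise(delays, bits=3, x_min=0.0, x_max=30.0)
```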

Optimal performance is at point II on the graph above, where 4-bit weights and 5-bit delays give 94% accuracy - only 2% off the best known performance on this dataset while using just 4% of the memory footprint. Or, with 1.58-bit weights and 3-bit delays, we get 90% accuracy using only 22 kB of memory.
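
For anyone wanting to redo the back-of-envelope numbers, the memory footprint is just parameters times bits. A quick sketch (the parameter counts below are placeholders, not the actual network sizes from the paper):

```python
def memory_kB(n_weights, n_delays, bits_per_weight, bits_per_delay):
    """Total parameter memory in kilobytes (1 kB = 1024 bytes = 8192 bits)."""
    total_bits = n_weights * bits_per_weight + n_delays * bits_per_delay
    return total_bits / 8 / 1024

# Hypothetical counts, just to show the calculation:
print(memory_kB(n_weights=100_000, n_delays=256,
                bits_per_weight=1.58, bits_per_delay=3))  # ~19.4 kB
```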

Just a note here: the 1.58 bits for weights is because each weight can take one of three values (positive, negative, or zero), which is log₂(3) ≈ 1.58 bits per weight. Depending on the sparsity level, a sparse matrix format might be able to use even fewer bits on average.
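
To make the sparse-format point concrete: if a fraction p of the ternary weights are exactly zero, the entropy per weight drops below log₂(3), so an ideal encoding needs fewer than 1.58 bits on average. A quick sketch of that bound (an information-theoretic lower bound, not a concrete storage format):

```python
import math

def bits_per_ternary_weight(p_zero):
    """Shannon entropy of a ternary weight that is zero with probability p_zero
    and +1 or -1 with equal probability otherwise (ideal-coding lower bound)."""
    p_nonzero = (1.0 - p_zero) / 2.0
    probs = [p_zero, p_nonzero, p_nonzero]
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(bits_per_ternary_weight(1 / 3))  # ≈ 1.58 bits: the uniform ternary case
print(bits_per_ternary_weight(0.8))    # ≈ 0.92 bits with 80% of weights zero
```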

What we find is that the most important delays are the long ones: selectively pruning the longest delays leads to a sharper fall in performance than pruning shorter ones. This is actually a problem for devices where long delays are expensive. We can reduce the reliance on long delays with regularisation, but more work is needed.
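
For a flavour of what such a regularisation might look like, here's a minimal sketch of an L1 penalty on delay length added to the task loss (an illustrative choice, not necessarily the penalty used in the paper):

```python
import torch

def delay_penalty(delays, strength=1e-3):
    """L1 penalty on delays: pushes the network towards shorter delays
    wherever long ones aren't actually needed for the task."""
    return strength * delays.abs().sum()

# Hypothetical training step:
# loss = task_loss + delay_penalty(model.delays)
```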

So here's our challenge to the community: how few bits do you need to solve the SHD dataset at 90%+ accuracy? Let us know if you can beat 22 kB. To whet your appetite, here's something you can do with just 16 kB of code: youtu.be/oITx9xMrAcM?... Time to bring the PC demoscene spirit to SNNs?

Exploiting heterogeneous delays for efficient computation in low-bit neural networks

Sun P, Achterberg J, Su Z, Goodman DFM, Akarca D
Preprint
 

Abstract

Neural networks rely on learning synaptic weights. However, this overlooks other neural parameters that can also be learned and may be utilized by the brain. One such parameter is the delay: the brain exhibits complex temporal dynamics with heterogeneous delays, where signals are transmitted asynchronously between neurons. It has been theorized that this delay heterogeneity, rather than a cost to be minimized, can be exploited in embodied contexts where task-relevant information naturally sits contextually in the time domain. We test this hypothesis by training spiking neural networks to modify not only their weights but also their delays at different levels of precision. We find that delay heterogeneity enables state-of-the-art performance on temporally complex neuromorphic problems and can be achieved even when weights are extremely imprecise (1.58-bit ternary precision: just positive, negative, or absent). By enabling high performance with extremely low-precision weights, delay heterogeneity allows memory-efficient solutions that maintain state-of-the-art accuracy even when weights are compressed over an order of magnitude more aggressively than typically studied weight-only networks. We show how delays and time constants adaptively trade off, and reveal through ablation that task performance depends on task-appropriate delay distributions, with temporally complex tasks requiring longer delays. Our results suggest temporal heterogeneity is an important principle for efficient computation, particularly when task-relevant information is temporal - as in the physical world - with implications for embodied intelligent systems and neuromorphic hardware.
