The Great Divorce: How AI Abandoned Neuroscience — And Why It's Coming Back
Artificial intelligence was born as brain simulation. In 2026, the two fields barely talk to each other. AI conferences and neuroscience conferences have almost zero overlap in attendees, citations, or methodology.
This wasn't inevitable. It was a series of choices — some pragmatic, some political, some accidental — that separated two fields studying the same problem: how does intelligence emerge from networks of simple elements?
Understanding this divergence matters because the ideas that were abandoned are precisely the ideas that memory systems need. And there are signs the divorce is ending.
Act I: The Marriage (1943–1969)
McCulloch-Pitts Neurons (1943)
It started with a paper by a neurophysiologist (Warren McCulloch) and a mathematician (Walter Pitts). They showed that networks of simplified neurons — binary threshold units — could compute any logical function. The artificial neuron was born as a direct model of the biological neuron.
There was no distinction between "AI" and "neuroscience" at this point. Understanding the brain and building intelligent machines were the same project.
Hebb's Rule (1949)
Donald Hebb, a psychologist, proposed that when two neurons repeatedly fire together, the connection between them strengthens. This wasn't experimentally confirmed until 1998 (Bi & Poo), but it immediately became the theoretical foundation for learning in neural networks.
The implications were profound: a network could learn from experience by adjusting connection weights based on activity patterns. No teacher. No error signal. No global optimization. Just local co-activation driving structural change.
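In its simplest rate-based form, the rule is a one-liner: the weight change is proportional to the product of pre- and post-synaptic activity. A minimal NumPy sketch (illustrative learning rate, not tied to any specific model):

```python
import numpy as np

def hebbian_update(w, pre, post, lr=0.1):
    """Strengthen weights in proportion to co-activation of pre- and
    post-synaptic neurons. Purely local: each weight sees only the
    two neurons it connects."""
    return w + lr * np.outer(post, pre)

# Two presentations of the same pattern strengthen the same connections.
pre = np.array([1.0, 0.0, 1.0])    # presynaptic activity
post = np.array([1.0, 1.0])        # postsynaptic activity
w = np.zeros((2, 3))
w = hebbian_update(w, pre, post)
w = hebbian_update(w, pre, post)
# Weights linking co-active pairs have grown; the rest stay at zero.
```

Note what is absent: no loss function, no error signal, no second pass over the network. Co-activation alone drives the change.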
Rosenblatt's Perceptron (1958)
Frank Rosenblatt built the Mark I Perceptron — a physical machine with photocells, potentiometers, and electric motors that could learn to classify visual patterns. It was explicitly designed as a brain model. The New York Times headline read: "New Navy Device Learns By Doing."
The perceptron learned through a biologically-inspired rule: if the output was correct, do nothing; if wrong, adjust the weights that contributed to the error. Simple, local, effective for linearly separable problems.
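The whole update loop fits in a few lines. A minimal sketch of Rosenblatt's rule on the linearly separable AND function (illustrative hyperparameters):

```python
import numpy as np

def perceptron_train(X, y, epochs=10, lr=1.0):
    """Rosenblatt's rule: leave correct outputs alone; on an error,
    nudge the weights toward (or away from) the offending input."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            pred = 1 if xi @ w + b > 0 else 0
            if pred != target:                 # only errors drive learning
                w += lr * (target - pred) * xi
                b += lr * (target - pred)
    return w, b

# AND is linearly separable, so the rule converges in a few epochs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])
w, b = perceptron_train(X, y)
preds = [1 if xi @ w + b > 0 else 0 for xi in X]
```

Swap `y` for the XOR targets `[0, 1, 1, 0]` and no number of epochs will converge, which is exactly the limitation Minsky and Papert formalized.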
The Assassination (1969)
Marvin Minsky and Seymour Papert published *Perceptrons*, a mathematical analysis proving that single-layer perceptrons couldn't solve XOR — or any non-linearly-separable problem. The proof was correct but the implication was overstated: they suggested (without proving) that multi-layer networks wouldn't overcome this limitation.
The effect was devastating. DARPA and other funding agencies pulled support for neural network research almost entirely. The field entered what's now called the "AI winter." Researchers who continued working on neural networks did so at personal career risk.
Minsky and Papert knew multi-layer networks might work. Their book's impact was partly intentional — they were championing symbolic AI (GOFAI: Good Old-Fashioned AI) and saw neural networks as a competitor for limited funding.
For 15 years, AI meant rule-based expert systems, logic programming, and symbolic reasoning. The brain-inspired approach was effectively dead.
Act II: The Revival and the Split (1986–2012)
Backpropagation Changes Everything
In 1986, Rumelhart, Hinton, and Williams published the backpropagation algorithm for training multi-layer networks. (The math had been independently discovered several times before, but this paper made it accessible and demonstrated its power.)
Backprop solved Minsky's objection: multi-layer networks *could* learn non-linear functions. With enough layers and enough data, neural networks could approximate any function.
But backprop introduced a philosophical split. It worked brilliantly as engineering — but it was biologically impossible:
| What backprop requires | What brains do |
|------------------------|----------------|
| Global loss function | No central error signal |
| Error flows backward | Synapses are unidirectional |
| Symmetric weights (the transpose) | No weight-transport mechanism |
| Stored activations | No caching of the forward pass |
| Synchronous updates | Asynchronous firing |
| Separate train/inference phases | Learning during operation |
The field had a choice: pursue biologically plausible learning rules (slower progress, harder math, less impressive demos) or pursue backpropagation (fast progress, clean math, impressive results).
It chose backprop. This was the moment AI and neuroscience diverged.
The Roads Not Taken
Several biologically-plausible approaches were active in the 1990s. All were abandoned — not because they were wrong, but because backprop was good enough:
**Boltzmann Machines (Hinton & Sejnowski, 1983).** Stochastic networks that learn through a process resembling thermal equilibrium. More biologically plausible than backprop — learning uses only local information. But training was slow (required sampling) and results were worse on benchmarks.
**Hopfield Networks (1982).** Content-addressable associative memory using energy minimization. Mathematically elegant but capacity-limited. Became a textbook curiosity. Won the Nobel Prize 42 years later.
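The mechanism is easy to demonstrate. A minimal sketch of classical Hopfield storage and recall, using outer-product Hebbian storage and synchronous sign updates (the original model updates neurons asynchronously, but the energy-descent behavior is the same for this toy case):

```python
import numpy as np

def hopfield_store(patterns):
    """Hebbian outer-product storage: each +/-1 pattern carves an
    energy minimum into the weight matrix."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)
    np.fill_diagonal(W, 0)          # no self-connections
    return W / n

def hopfield_recall(W, probe, steps=10):
    """Repeated threshold updates descend the energy landscape until
    the state settles near the closest stored pattern."""
    s = probe.copy()
    for _ in range(steps):
        s = np.sign(W @ s)
        s[s == 0] = 1               # break ties consistently
    return s

rng = np.random.default_rng(0)
stored = rng.choice([-1.0, 1.0], size=(2, 64))   # two random patterns
W = hopfield_store(stored)
noisy = stored[0].copy()
noisy[:8] *= -1                                   # corrupt 8 of 64 bits
recalled = hopfield_recall(W, noisy)
# recalled settles back onto stored[0]: content-addressable recovery
```

The capacity limit mentioned above is visible in this formulation: reliable recall holds only up to roughly 0.14 patterns per neuron, after which stored memories interfere.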
**Spike-Timing Dependent Plasticity (STDP).** The most biologically realistic learning rule — synaptic changes depend on precise spike timing, not average firing rates. Extensively studied in neuroscience but never competitive with backprop for practical AI tasks. The hardware couldn't simulate it efficiently.
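The rule itself is simple to state. A sketch of one common double-exponential STDP window (illustrative constants, not a fit to any particular experiment): potentiate when the presynaptic spike precedes the postsynaptic spike, depress when it follows.

```python
import numpy as np

def stdp_dw(dt, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Weight change as a function of spike-timing difference.
    dt = t_post - t_pre, in milliseconds."""
    if dt > 0:      # pre before post: causal pairing -> strengthen
        return a_plus * np.exp(-dt / tau)
    elif dt < 0:    # post before pre: anti-causal pairing -> weaken
        return -a_minus * np.exp(dt / tau)
    return 0.0

# A 5 ms causal pairing strengthens; a 5 ms anti-causal one weakens.
ltp = stdp_dw(+5.0)   # long-term potentiation, positive
ltd = stdp_dw(-5.0)   # long-term depression, negative
```

The catch for conventional hardware is plain from the signature: the rule needs per-spike timestamps, so simulating it on rate-based matrix-multiply accelerators is wasteful.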
**Competitive Learning / Self-Organizing Maps (Kohonen, 1982).** Networks that learn topographic representations without supervision. Used in some industrial applications but never matched supervised backprop on accuracy benchmarks.
Each of these approaches had real advantages over backpropagation: real-time learning, biological plausibility, unsupervised operation, low power consumption. But none of them won ImageNet. In the AI community, benchmarks determine what gets funded.
The Neuroscience Community Moves On
As AI pursued backprop, neuroscience pursued its own path. Computational neuroscience developed detailed models of neural circuits, synaptic plasticity, and brain dynamics. But these models weren't trying to beat benchmarks — they were trying to explain experimental data.
The result was two parallel fields studying intelligence with almost no communication: same question, different languages, different incentives, different audiences.
Act III: Scale Wins, Biology Loses (2012–2023)
The Deep Learning Revolution
AlexNet (2012) proved that deep convolutional networks + GPUs + large datasets could demolish the previous state of the art in image recognition. The result was unambiguous: scale the model, scale the data, and performance improves.
This triggered a gold rush. The entire AI field pivoted to making networks bigger. The recipe was simple: more layers, more parameters, more data, more compute. Biological plausibility became irrelevant — the only question was whether the approach worked on benchmarks.
Transformers (2017)
"Attention is All You Need" introduced the transformer architecture. Pure attention mechanisms, no recurrence, no convolution, nothing that resembles biological neural circuits. It was designed for engineering efficiency (parallelizable on GPUs), not biological fidelity.
By 2020, transformers dominated language, vision, speech, protein folding, and code generation. The architecture was so successful that questioning it seemed absurd.
The Scaling Hypothesis
The dominant belief by 2023: intelligence emerges from scale. Make the model bigger, give it more data, and capabilities emerge. No architectural innovation needed. No neuroscience needed. Just scale.
This is the maximum point of divergence between AI and neuroscience. The brain — the only system known to produce general intelligence — operates on 20 watts with 86 billion neurons connected by 100 trillion synapses. The scaling hypothesis says none of that architecture matters — just make a matrix multiplication engine big enough.
Act IV: The Reunion (2024–present)
Several developments suggest the divorce is ending:
The Nobel Prize (2024)
Hopfield and Hinton receiving the Physics Nobel legitimized the connection between neural networks and physical principles. It said: these aren't just engineering tricks. They're fundamental science. The energy-based view of memory and learning is a physical principle.
Hinton's Forward-Forward Algorithm (2022)
The godfather of backpropagation published an alternative to it, motivated by backprop's biological implausibility. The Forward-Forward algorithm uses local learning rules — each layer has its own objective, no backward pass, no weight transport. It performed nearly as well as backprop on initial benchmarks.
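A heavily simplified sketch of the core idea: each layer does gradient ascent on its own local "goodness" (sum of squared activities) for positive data, with no backward pass through other layers. This omits the negative-data pass, the thresholded logistic loss, and the normalization that Hinton's actual recipe uses; it only shows that a single layer can improve its objective from purely local quantities.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def goodness(h):
    """Forward-Forward 'goodness': sum of squared layer activities."""
    return float(np.sum(h * h))

def local_step(W, x, positive, lr=0.01):
    """One local update: raise goodness for positive data, lower it
    for negative data, using only this layer's own input, pre-
    activations, and activations."""
    z = W @ x
    h = relu(z)
    sign = 1.0 if positive else -1.0
    # d(goodness)/dW = 2 * h * relu'(z) outer x, with relu'(z) = (z > 0)
    grad = 2.0 * np.outer(h * (z > 0), x)
    return W + sign * lr * grad

rng = np.random.default_rng(1)
W = rng.normal(scale=0.1, size=(8, 4))
x = rng.normal(size=4)
before = goodness(relu(W @ x))
W2 = local_step(W, x, positive=True)
after = goodness(relu(W2 @ x))
# after > before: the layer improved without any global error signal
```

Compare this with the backprop column of the table above: no stored activations from other layers, no symmetric backward weights, no global loss.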
When the most prominent advocate of an approach starts looking for alternatives, the field takes notice.
Neuromorphic Hardware
Intel's Loihi 2, IBM's NorthPole, and SynSense's Speck — chips designed around spiking neural networks rather than matrix multiplication. These chips consume milliwatts instead of megawatts. They learn using local rules (STDP). They process information asynchronously, like biological neurons.
The hardware is catching up to the algorithms. When you can run spike-timing dependent plasticity natively in silicon, the "backprop is faster" argument dissolves.
Modern Hopfield Networks (2020–2025)
The discovery that transformer attention equals Hopfield recall blew a hole in the wall between the two fields. If the most successful AI architecture is secretly an associative memory system from 1982, then maybe the other ideas from that era deserve re-examination.
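The equivalence is concrete. The modern Hopfield update of Ramsauer et al. is `xi_new = X^T softmax(beta * X @ xi)`, which is exactly the attention operation with the probe as query and the stored patterns as both keys and values. A minimal sketch (illustrative `beta` and pattern sizes):

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def modern_hopfield_recall(X, xi, beta=16.0, steps=3):
    """One step of this loop IS attention: similarities X @ xi play
    the role of query-key scores, softmax weights them, and X^T
    mixes the stored patterns (the values) into the new state."""
    for _ in range(steps):
        xi = X.T @ softmax(beta * (X @ xi))
    return xi

rng = np.random.default_rng(2)
X = rng.normal(size=(5, 16))                  # five stored patterns
X /= np.linalg.norm(X, axis=1, keepdims=True)
probe = X[3] + 0.1 * rng.normal(size=16)      # noisy cue for pattern 3
out = modern_hopfield_recall(X, probe)
# out lands near X[3]: attention as one-step associative retrieval
```

Unlike the classical binary model sketched earlier, this continuous version has exponential storage capacity in the pattern dimension, which is the result that revived the field.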
Active research now explores modern Hopfield networks with exponential capacity, continuous-time dynamics, connections to diffusion models, and applications to immune system repertoire classification. The field that was dead is suddenly producing papers at NeurIPS and ICLR.
Predictive Coding and Active Inference
Karl Friston's free energy principle — the idea that the brain is fundamentally a prediction engine that minimizes surprise — is increasingly influencing AI. Predictive coding networks learn through local prediction errors, not global gradients. They're biologically plausible *and* increasingly competitive on benchmarks.
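A toy illustration of the principle (not Friston's full hierarchical formulation): a latent estimate is refined using only the local error between what the model predicts and what it observes. Orthonormal weights are used here purely to make convergence clean.

```python
import numpy as np

def predictive_coding_infer(W, x, steps=100, lr=0.1):
    """Refine a latent estimate mu using only the local prediction
    error eps = x - W @ mu. No global loss, no backward pass through
    other layers: each update is driven by the error at this layer."""
    mu = np.zeros(W.shape[1])
    for _ in range(steps):
        eps = x - W @ mu           # prediction error (a local quantity)
        mu += lr * (W.T @ eps)     # adjust the estimate to explain it
    return mu

rng = np.random.default_rng(3)
W, _ = np.linalg.qr(rng.normal(size=(6, 2)))   # orthonormal columns
true_mu = np.array([1.0, -0.5])
x = W @ true_mu                                 # observation to explain
mu = predictive_coding_infer(W, x)
residual = float(np.linalg.norm(x - W @ mu))    # shrinks toward zero
```

The "minimize surprise" framing corresponds to driving `residual` down; stacking such layers, each predicting the one below, gives the hierarchical version.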
The Abandoned Ideas That Memory Systems Need
The ideas abandoned during the divergence are exactly what AI memory systems lack: Hebbian learning, activation decay, spreading activation, consolidation, interference-based forgetting. Every one of them is implemented in shodh-memory. Not because we set out to be contrarian, but because these are the right solutions to the engineering problems of persistent agent memory.
When you need a memory system built on those ideas, the neuroscience literature has had the answers for decades. The AI field just wasn't looking.
Where This Goes
We don't know whether the next major AI architecture will come from neuroscience. It might come from pure mathematics, or physics, or an approach nobody has considered. Sam Altman recently said he believes there's "another architecture to find" — something as transformative as transformers were over LSTMs.
What we do know is that the exploration has been asymmetric. The AI field has spent enormous resources optimizing one approach (gradient-based learning on transformer architectures) while leaving entire branches of neuroscience-inspired computation unexplored at scale.
Hopfield networks were ignored for 35 years and then won a Nobel Prize. Hinton spent his career on backprop and then published an alternative. Transformers turned out to be associative memory in disguise. The neuroscience wasn't wrong — it was early.
The gap between current AI architectures and biological neural systems isn't a sign that biology is irrelevant. It's a sign of how much remains unexplored.
For memory systems specifically, the reunion is already happening. The principles that govern how brains remember — Hebbian learning, activation decay, spreading activation, consolidation, interference — are the same principles that practical AI memory systems need. Not as metaphors. As engineering solutions.
The great divorce lasted 40 years. The ideas survived. The question now is what we build with them.
References
1. McCulloch, W.S. & Pitts, W. (1943). A Logical Calculus of the Ideas Immanent in Nervous Activity. *Bulletin of Mathematical Biophysics*, 5(4), 115-133.
2. Hebb, D.O. (1949). The Organization of Behavior. Wiley.
3. Rosenblatt, F. (1958). The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain. *Psychological Review*, 65(6), 386-408.
4. Minsky, M. & Papert, S. (1969). Perceptrons. MIT Press.
5. Rumelhart, D.E., Hinton, G.E. & Williams, R.J. (1986). Learning Representations by Back-Propagating Errors. *Nature*, 323, 533-536.
6. Hopfield, J.J. (1982). Neural Networks and Physical Systems with Emergent Collective Computational Abilities. *PNAS*, 79(8), 2554-2558.
7. Vaswani, A. et al. (2017). Attention Is All You Need. *NeurIPS*, 30.
8. Hinton, G. (2022). The Forward-Forward Algorithm: Some Preliminary Investigations. *arXiv:2212.13345*.
9. Ramsauer, H. et al. (2021). Hopfield Networks is All You Need. *ICLR*.
10. Bi, G.Q. & Poo, M.M. (1998). Synaptic Modifications in Cultured Hippocampal Neurons. *J. Neuroscience*, 18(24), 10464-10472.
11. Markram, H. et al. (2012). A History of Spike-Timing-Dependent Plasticity. *Frontiers in Synaptic Neuroscience*, 3, 4.
12. Anderson, J.R. & Pirolli, P.L. (1984). Spread of Activation. *J. Experimental Psychology: Learning, Memory, and Cognition*, 10(4), 791-798.
13. Wixted, J.T. (2004). The Psychology and Neuroscience of Forgetting. *Annual Review of Psychology*, 55, 235-269.
14. Magee, J.C. & Grienberger, C. (2020). Synaptic Plasticity Forms and Functions. *Annual Review of Neuroscience*, 43, 95-117.