Shodh-Memory: A Cognitive Memory System for Edge-Native AI Agents

Q: How is shodh-memory different from a vector database?

Vector databases give you similarity search. Shodh-memory gives you cognition — memories strengthen when accessed together (Hebbian learning), decay naturally over time (power-law forgetting), and form associative networks via a knowledge graph. It's the difference between storage and memory.

Q: Does shodh-memory require an internet connection?

No. Shodh-memory runs 100% offline. The embeddings, vector index, knowledge graph — everything runs locally. Perfect for edge devices, air-gapped systems, or anywhere you need data privacy.

Q: What's the memory overhead?

The binary is ~30MB. Models add ~50MB (22MB MiniLM embeddings + 14MB NER model + 14MB ONNX runtime). Each memory entry uses roughly 2-5KB, so a system with 10,000 memories uses approximately 20-50MB of storage.
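
The arithmetic behind that answer can be checked in a few lines. This is only a back-of-the-envelope sketch: the sizes come from the answer above, while the function name and the 1 MB = 1000 KB convention are assumptions made for illustration.

```python
# Sizes quoted in the answer above (approximate).
BINARY_MB = 30
MODELS_MB = 22 + 14 + 14  # MiniLM embeddings + NER model + ONNX runtime


def storage_estimate_mb(n_memories: int, kb_per_memory: float) -> float:
    """Approximate storage for n memories, assuming 1 MB = 1000 KB."""
    return n_memories * kb_per_memory / 1000


low = storage_estimate_mb(10_000, 2)   # lower bound: 2 KB per entry
high = storage_estimate_mb(10_000, 5)  # upper bound: 5 KB per entry
print(f"fixed footprint: ~{BINARY_MB + MODELS_MB} MB")
print(f"10,000 memories: ~{low:.0f}-{high:.0f} MB")
```

The fixed cost (binary plus models) dominates until the store grows well past tens of thousands of entries.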

Q: Can shodh-memory run on a Raspberry Pi?

Yes. Shodh-memory is designed for edge deployment. It runs on Raspberry Pi Zero, Jetson Nano, industrial PCs, and other resource-constrained devices. Graph lookups are under 1 microsecond.

Q: How does memory decay work?

Shodh-memory uses a hybrid model: exponential decay for the first 3 days (consolidation phase), then power-law decay for long-term retention. Memories accessed 10+ times become potentiated and decay 10x slower. The model is based on Wixted & Ebbesen (1991).
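
A minimal sketch of what such a hybrid curve could look like follows. It is not shodh-memory's implementation: the 3-day boundary, the 10-access threshold, and the 10x slowdown come from the answer above, but the decay rate, the power-law exponent, the function name, and the reading of "10x slower" as dividing both rates by 10 are all assumptions chosen for illustration.

```python
import math

CONSOLIDATION_DAYS = 3.0      # from the text: exponential phase lasts 3 days
POTENTIATION_ACCESSES = 10    # from the text: 10+ accesses potentiate a memory
POTENTIATION_SLOWDOWN = 10.0  # from the text: potentiated memories decay 10x slower
EXP_RATE = 0.2                # assumed per-day decay rate (illustrative)
POWER_EXPONENT = 0.5          # assumed power-law exponent (illustrative)


def retention(age_days: float, access_count: int = 0) -> float:
    """Return a retention score in (0, 1] for a memory of the given age."""
    potentiated = access_count >= POTENTIATION_ACCESSES
    slowdown = POTENTIATION_SLOWDOWN if potentiated else 1.0
    rate = EXP_RATE / slowdown
    exponent = POWER_EXPONENT / slowdown

    if age_days <= CONSOLIDATION_DAYS:
        # Consolidation phase: exponential decay.
        return math.exp(-rate * age_days)

    # Long-term phase: power-law tail anchored at the day-3 value,
    # so the curve stays continuous at the boundary.
    at_boundary = math.exp(-rate * CONSOLIDATION_DAYS)
    return at_boundary * (age_days / CONSOLIDATION_DAYS) ** (-exponent)


for days in (1, 3, 30, 365):
    print(days, round(retention(days), 3), round(retention(days, access_count=12), 3))
```

The power-law tail flattens over time, so under these assumptions an old memory loses proportionally less per day than a new one.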

Q: What is Hebbian learning in AI agent memory?

Cells that fire together, wire together. When memories are accessed together, their connection strengthens. When memories compete, interference effects occur. It's how biological brains work, now applied to AI agent memory.
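
As a rough illustration, co-access strengthening can be sketched with a weighted graph. The class name, the learning rate, and the saturating update rule below are assumptions, not shodh-memory's API, and interference between competing memories is not modeled here.

```python
from collections import defaultdict
from itertools import combinations

LEARNING_RATE = 0.1  # assumed value, for illustration only


class AssociationGraph:
    """Toy undirected graph whose edge weights grow with co-access."""

    def __init__(self):
        self.weights = defaultdict(float)

    def co_access(self, memory_ids):
        """Strengthen every pair of memories retrieved together."""
        for a, b in combinations(sorted(set(memory_ids)), 2):
            w = self.weights[(a, b)]
            # Saturating Hebbian-style update: move part of the way toward 1.0.
            self.weights[(a, b)] = w + LEARNING_RATE * (1.0 - w)

    def strength(self, a, b):
        """Return the current association weight (0.0 if never co-accessed)."""
        key = (a, b) if a <= b else (b, a)
        return self.weights.get(key, 0.0)


graph = AssociationGraph()
graph.co_access(["coffee", "morning", "standup"])
graph.co_access(["coffee", "morning"])
print(graph.strength("morning", "coffee"))  # stronger than coffee/standup
```

The saturating form keeps weights bounded in [0, 1), which is one common way to stop frequently paired memories from growing without limit.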

Q: Is there a cloud version?

No, and that's intentional. Shodh-memory is built for local-first, privacy-preserving AI. Your agent's memories stay on your hardware. If you need multi-device sync, you can replicate the RocksDB storage yourself.

Q: What languages and frameworks does shodh-memory support?

The core is Rust. We provide: an MCP server (for Claude, Cursor, and other AI agents), Python bindings (via PyO3/maturin), and a REST API. The Rust crate can be embedded directly in your application.

Q: How do I contribute?

Check out github.com/varun29ankuS/shodh-memory. Open issues, submit PRs, or join discussions. The codebase is well-documented with 688+ tests. All constants have neuroscience citations.

Varun Sharma

doi:10.5281/zenodo.18668709

2026-06-13 • 12 min read

Language Models Are Few-Shot Learners — But They're Amnesiacs

agentic-ai · architecture · neuroscience

In 2020, the GPT-3 paper carried one of the most quietly consequential titles in machine learning: "Language Models are Few-Shot Learners." The claim was that a large enough model could learn a new task from a handful of examples placed directly in its prompt: no fine-tuning, no gradient updates, just a few demonstrations and a question. It was true, it was surprising, and it reorganized the field around in-context learning.

It is also only half a sentence. The full version is: language models are few-shot learners who forget everything the moment the context window scrolls.

Few-shot learning is real, and it is ephemeral

In-context learning is genuine learning in one narrow sense: the model's behavior adapts to the examples you show it. But that adaptation has the lifespan of a single context window. Scroll past the examples, start a new session, or exceed the token budget, and the learning is gone. The model does not retain what it learned from you. It re-derives your context from scratch every time, or it forgets it entirely.

This is the central limitation that the scaling story papers over. A bigger model is a better few-shot learner (it needs fewer examples and handles harder tasks), but it is not a more persistent one. Doubling the parameters does not give the model a memory of last Tuesday. The few-shot learner is brilliant and amnesiac at the same time, and the amnesia is structural, not a matter of scale.

Why this matters more in the agent era

For a chatbot answering one-off questions, ephemeral few-shot learning is fine. For an agent (something that acts over days, coordinates sub-tasks, and accumulates context about a codebase or a customer or a household) it is crippling. An agent that re-learns its entire context every session is doing the equivalent of the movie Groundhog Day: maximally capable in the moment, incapable of compounding. The intelligence is real but it does not accumulate, and accumulation is the entire point of an agent.

The instinct in the field has been to fix this with more model: longer context windows, retrieval that stuffs more text into the prompt, and various schemes that try to bake experience back into the weights. These are serious and interesting. They also share an assumption worth questioning: that the place experience should live is inside the model.

The alternative: put the learning in the memory, not the model

There is a different architecture. Instead of making the model remember, give the agent a memory: a persistent, structured store that accumulates experience and retrieves the right fragment when it is needed. The model stays a few-shot learner; the memory is what makes the few-shot context persist across sessions, so the agent learns you once and keeps it.

This reframes the famous title. Language models are few-shot learners, so feed them the right few shots, every time, from a memory that has been watching. The few examples that make in-context learning work do not have to be re-supplied by a human each session; they can be recalled from an accumulated memory of everything the agent has experienced. Few-shot learning plus memory is what ephemeral few-shot learning was always supposed to become.
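
To make the idea concrete, here is a toy sketch of recalling few-shot examples from a store and assembling them into a prompt. Everything in it is hypothetical: the function names, the in-memory list, and the prompt layout are invented for illustration, and plain word overlap (Jaccard similarity) stands in for the embedding and graph retrieval a real memory system would use.

```python
def tokens(text):
    """Lowercased set of whitespace-separated words."""
    return set(text.lower().split())


def recall(memory, query, k=2):
    """Return the k stored (question, answer) pairs most similar to the query."""
    q = tokens(query)

    def score(item):
        t = tokens(item[0])
        union = q | t
        return len(q & t) / len(union) if union else 0.0

    return sorted(memory, key=score, reverse=True)[:k]


def build_prompt(memory, query, k=2):
    """Prepend recalled examples to the new question as few-shot demonstrations."""
    shots = [f"Q: {q}\nA: {a}" for q, a in recall(memory, query, k)]
    shots.append(f"Q: {query}\nA:")
    return "\n\n".join(shots)


# Experience accumulated in earlier sessions (made-up data).
memory = [
    ("how do I deploy the api service", "run make deploy"),
    ("what is the team standup time", "10am"),
    ("how do I roll back the api service", "run make rollback"),
]
print(build_prompt(memory, "how do I deploy the billing service"))
```

The model receives an ordinary few-shot prompt; the only change is that the demonstrations come from the agent's stored experience instead of being retyped by a human.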

Memory that learns, without an LLM in the loop

Here is the key design choice. shodh-memory's faculties are designed as a frozen seed plus a small online adapter: the entity recognizer, the relation typer, even the embedder are conceived as a pre-trained seed that personalizes to a specific user, agent, or robot from the memory's own accumulated experience. The memory has a feedback loop (a dopaminergic prediction-error signal, in the spirit of Schultz's reward-learning work) that scales how much it learns from a retrieval by how surprising the outcome was. The memory learns from the consequences of its own recalls.
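
One simple way to picture surprise-scaled learning is the classic delta rule, where the size of an update is proportional to the prediction error. The sketch below uses that standard rule as a stand-in; it is not shodh-memory's actual update, and the class name, the learning rate, and the neutral starting value are assumptions.

```python
from dataclasses import dataclass

LEARNING_RATE = 0.3  # assumed value, for illustration only


@dataclass
class MemoryTrace:
    """A stored memory with a running estimate of how useful recalling it is."""
    expected_value: float = 0.5  # assumed neutral prior


def reinforce(trace: MemoryTrace, outcome: float) -> float:
    """Update the trace after a retrieval and return the prediction error.

    outcome is 1.0 if the recall helped and 0.0 if it did not.
    """
    error = outcome - trace.expected_value  # surprise: observed minus expected
    trace.expected_value += LEARNING_RATE * error
    return error


trace = MemoryTrace()
first = reinforce(trace, 1.0)   # surprising success: large update
second = reinforce(trace, 1.0)  # less surprising now: smaller update
print(first, second, trace.expected_value)
```

As the estimate converges on what actually happens, the error shrinks and so does the update, which is the sense in which learning is scaled by surprise.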

That is a form of self-improvement from experience — but located in the memory substrate rather than in a large model's weights, and crucially with no LLM in the loop. The recognizers are small. The whole thing fits in tens of megabytes of RAM and runs on-device. The learning is private, it is yours, and it compounds with use in a way a re-reading model never can, because the model re-derives you while the memory has been adapting to you.

Why "no LLM in the loop" is the unlock, not a constraint

If your strategy for persistence is to bake experience into a large model's weights, you inherit the large model's costs: you pay tokens or training compute per unit of experience, you cannot run it on the edge, and the learned state is coupled to one specific model that will be obsolete in a year. If instead the learning lives in a small, LLM-free memory, the cost scales with CPU, it runs anywhere, and the accumulated memory outlives the model — you can swap your generation model and keep everything the agent has learned about you. The memory is the durable asset; the few-shot learner is the swappable part.

The takeaway

Language models are few-shot learners. That sentence is true and it is famous for good reason. But the missing half — that the learning evaporates without memory — is where the next frontier actually is. The agents that compound will not be the ones with the biggest few-shot learner; they will be the ones with the best memory feeding it. And the best memory is small, local, learning, and has no language model anywhere in its loop.

Related reading: [Why Your AI Agent's Memory Is Broken](/blog/why-your-ai-agents-memory-is-broken) · [Hebbian Learning for AI Agents](/blog/hebbian-learning-ai-agents) · [RAG Is Not Memory](/blog/rag-is-not-memory) · [LLM-Free Memory](/llm-free-memory).
