Shodh-Memory: A Cognitive Memory System for Edge-Native AI Agents

Q: How is shodh-memory different from a vector database?

Vector databases give you similarity search. Shodh-memory gives you cognition — memories strengthen when accessed together (Hebbian learning), decay naturally over time (power-law forgetting), and form associative networks via a knowledge graph. It's the difference between storage and memory.

Q: Does shodh-memory require an internet connection?

No. Shodh-memory runs 100% offline. The embeddings, vector index, knowledge graph — everything runs locally. Perfect for edge devices, air-gapped systems, or anywhere you need data privacy.

Q: What's the memory overhead?

The binary is ~30MB. Models add ~50MB (22MB MiniLM embeddings + 14MB NER model + 14MB ONNX runtime). Each memory entry uses roughly 2-5KB, so a system with 10,000 memories uses approximately 20-50MB of storage.
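As a rough check on the figures above, the arithmetic can be sketched in a few lines of Python. The function names are hypothetical and not part of shodh-memory; the sizes are the ones quoted in the answer, and 1MB is taken as 1000KB.

```python
# Figures quoted in the FAQ answer (approximate).
BINARY_MB = 30
MODELS_MB = 22 + 14 + 14  # MiniLM embeddings + NER model + ONNX runtime

def storage_mb(memory_count: int, kb_per_memory: float) -> float:
    """Storage used by the memories alone, in MB (1 MB = 1000 KB)."""
    return memory_count * kb_per_memory / 1000

def storage_range_mb(memory_count: int) -> tuple[float, float]:
    """Low and high estimates using the quoted 2-5KB per memory entry."""
    return storage_mb(memory_count, 2), storage_mb(memory_count, 5)

low, high = storage_range_mb(10_000)
print(f"fixed footprint: {BINARY_MB + MODELS_MB} MB")
print(f"10,000 memories: {low:.0f}-{high:.0f} MB")
```

For 10,000 memories this gives a range whose upper end is the 50MB quoted above; the lower end applies when entries are closer to 2KB each.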

Q: Can shodh-memory run on a Raspberry Pi?

Yes. Shodh-memory is designed for edge deployment. It runs on Raspberry Pi Zero, Jetson Nano, industrial PCs, and other resource-constrained devices. Graph lookups are under 1 microsecond.

Q: How does memory decay work?

Shodh-memory uses a hybrid model: exponential decay for the first 3 days (consolidation phase), then power-law decay for long-term retention. Memories accessed 10+ times become potentiated and decay 10x slower. The model is based on Wixted & Ebbesen (1991).
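The shape of such a hybrid curve can be sketched as follows. This is an illustration under stated assumptions, not the shodh-memory implementation: the 3-day crossover, the 10-access threshold, and the 10x slowdown come from the answer above, while the rate constants `LAMBDA` and `BETA`, the function name, and the reading of "10x slower" as dividing both rates by 10 are assumptions made for the example.

```python
import math

CONSOLIDATION_DAYS = 3.0      # from the text: exponential phase lasts 3 days
POTENTIATION_ACCESSES = 10    # from the text: 10+ accesses potentiate a memory
POTENTIATION_SLOWDOWN = 10.0  # from the text: potentiated memories decay 10x slower
LAMBDA = 0.2                  # ASSUMED exponential rate per day
BETA = 0.5                    # ASSUMED power-law exponent

def retention(age_days: float, access_count: int = 0) -> float:
    """Fraction of initial strength remaining after age_days."""
    age_days = max(age_days, 0.0)
    slowdown = POTENTIATION_SLOWDOWN if access_count >= POTENTIATION_ACCESSES else 1.0
    lam = LAMBDA / slowdown
    beta = BETA / slowdown
    if age_days <= CONSOLIDATION_DAYS:
        # Consolidation phase: exponential decay.
        return math.exp(-lam * age_days)
    # Long-term phase: power-law decay, scaled so the curve is
    # continuous at the crossover point.
    at_crossover = math.exp(-lam * CONSOLIDATION_DAYS)
    return at_crossover * (age_days / CONSOLIDATION_DAYS) ** (-beta)

for days in (1, 3, 30, 365):
    print(days, round(retention(days), 4), round(retention(days, access_count=12), 4))
```

Scaling the power-law branch by the value at the crossover keeps the curve continuous at day 3. The power-law tail falls off far more slowly than an exponential would, which is the point of switching models for long-term retention.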

Q: What is Hebbian learning in AI agent memory?

Hebbian learning is often summarized as "cells that fire together, wire together." When memories are accessed together, their connection strengthens. When memories compete, interference effects occur. It's how biological brains work, now applied to AI agent memory.
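A minimal sketch of co-access strengthening, assuming a simple saturating update rule. The class name, the learning rate, and the update formula are illustrative assumptions rather than shodh-memory's actual code, and interference between competing memories is not modeled here.

```python
from itertools import combinations

class AssociationGraph:
    """Hypothetical edge-weight store: weights lie in [0, 1)."""

    def __init__(self, learning_rate: float = 0.1):
        self.learning_rate = learning_rate  # ASSUMED value
        self.weights: dict[tuple[str, str], float] = {}

    def co_access(self, memory_ids: list[str]) -> None:
        """Strengthen the edge between every pair accessed together."""
        for a, b in combinations(sorted(set(memory_ids)), 2):
            w = self.weights.get((a, b), 0.0)
            # Saturating Hebbian-style update: each co-access closes a
            # fixed fraction of the remaining gap to 1.0.
            self.weights[(a, b)] = w + self.learning_rate * (1.0 - w)

    def strength(self, a: str, b: str) -> float:
        key = (a, b) if a <= b else (b, a)
        return self.weights.get(key, 0.0)

graph = AssociationGraph()
graph.co_access(["coffee", "morning"])
graph.co_access(["morning", "coffee", "standup"])
print(graph.strength("coffee", "morning"))
print(graph.strength("coffee", "standup"))
```

Pairs that are recalled together repeatedly end up with stronger edges than pairs seen together once, and the saturating form keeps weights bounded without a separate normalization step.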

Q: Is there a cloud version?

No, and that's intentional. Shodh-memory is built for local-first, privacy-preserving AI. Your agent's memories stay on your hardware. If you need multi-device sync, you can replicate the RocksDB storage yourself.

Q: What languages and frameworks does shodh-memory support?

The core is Rust. We provide: an MCP server (for Claude, Cursor, and other AI agents), Python bindings (via PyO3/maturin), and a REST API. The Rust crate can be embedded directly in your application.

Q: How do I contribute?

Check out github.com/varun29ankuS/shodh-memory. Open issues, submit PRs, or join discussions. The codebase is well documented and has 688+ tests. All constants have neuroscience citations.

Varun Sharma

doi:10.5281/zenodo.18668709

[llm-free]

Agent memory with zero LLM in the loop

shodh-memory forms, stores, and retrieves memories without a large language model anywhere in the loop. Extraction, ranking, and recall are deterministic code; small frozen models handle perception only. The result is memory you can audit, that never invents what it stores, and that never leaves the machine.

[00]

What "no LLM in the loop" actually means

It is a precise claim, not a slogan. The difference is where the language model sits — and whether it is allowed to decide what you remember.

where-the-llm-sits.md

# Most memory systems put an LLM inside the loop

  ingest -> [ LLM extracts "facts" ] -> store -> [ LLM decides recall ] -> agent
              ^ can fabricate                       ^ opaque, unauditable

  whatever the model invents or distorts is baked into storage as ground truth.

# shodh-memory keeps the loop deterministic

  ingest -> [ rules + frozen NER / embeddings ] -> store (with provenance)
              ^ perception only, no generation      ^ every memory -> a source
        -> [ deterministic ranking + decay ] -> agent
              ^ inspectable, reproducible

  the language model, if any, lives in the agent on top -- never in the memory.

[01..04]

Why it matters

01

Nothing is fabricated into storage

A system that uses an LLM to extract memories can write down things that were never said — a hallucination, stored as if it were ground truth. shodh-memory does not generate; it records. Every memory traces back to the source it came from, so you can always check where a belief originated.
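One way to picture "records, does not generate" is a memory that is a verbatim span of its source plus enough metadata to check it later. The sketch below is an assumption-laden illustration: `MemoryRecord`, `record_span`, and `verify` are hypothetical names, not shodh-memory's API, and the offset-plus-hash scheme is just one possible form of provenance.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class MemoryRecord:
    text: str       # verbatim span copied from the source, never generated
    source_id: str  # hypothetical identifier of the originating message or file
    start: int      # character offsets of the span within the source
    end: int
    digest: str     # SHA-256 of the span, for later integrity checks

def record_span(source_id: str, source_text: str, start: int, end: int) -> MemoryRecord:
    span = source_text[start:end]
    digest = hashlib.sha256(span.encode("utf-8")).hexdigest()
    return MemoryRecord(span, source_id, start, end, digest)

def verify(record: MemoryRecord, source_text: str) -> bool:
    """True if the stored text still matches the source at the recorded offsets."""
    span = source_text[record.start:record.end]
    digest = hashlib.sha256(span.encode("utf-8")).hexdigest()
    return span == record.text and digest == record.digest

source = "The pump was serviced on 2024-03-01 by Asha."
memory = record_span("log-1", source, 4, 8)
print(memory.text, verify(memory, source))
```

Because the stored text is sliced from the source rather than produced by a model, answering "where did this belief come from?" reduces to following `source_id` and the offsets.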

02

Deterministic and auditable

What gets stored, strengthened, or surfaced is decided by inspectable rules with documented constants — not a model's opaque judgement. The same input produces the same memory every time, so you can audit exactly why anything surfaced, and reproduce it.
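Reproducible recall mostly comes down to a pure scoring function and an explicit tie-break. The following sketch assumes a score of similarity multiplied by current strength and breaks ties by memory ID; both choices, and the `rank` function itself, are illustrative assumptions rather than the ranking shodh-memory actually uses.

```python
def rank(candidates: list[tuple[str, float, float]], top_k: int = 3) -> list[str]:
    """Rank (memory_id, similarity, strength) tuples deterministically.

    ASSUMED score: similarity * strength. Ties are broken by memory_id,
    so the result does not depend on the order of the input list.
    """
    scored = [(round(sim * strength, 12), mem_id) for mem_id, sim, strength in candidates]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [mem_id for _, mem_id in scored[:top_k]]

candidates = [("m2", 0.9, 0.5), ("m1", 0.9, 0.5), ("m3", 0.4, 1.0)]
print(rank(candidates))
print(rank(list(reversed(candidates))))
```

Both calls print the same list. With no sampling and no hidden state, an auditor can recompute the score for any surfaced memory and see exactly why it outranked the others.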

03

Nothing leaves the device

No cloud, no API keys, no external calls. Memory is formed and queried entirely on the machine. Nothing is transmitted to, cached by, or retained by a third party — there is no remote copy to lose.

04

No extraction attack surface

When an LLM reads untrusted text to build memory, that text can steer what gets stored — a poisoning vector hidden in ordinary input. Deterministic extraction has no prompt to hijack, so a whole class of memory-poisoning attacks simply has nowhere to land.

[where]

Where these properties tend to matter

We build the memory layer. Whether these properties are decisive for your system is your call — but they tend to matter most in places like these.

Regulated & audited environments

When you have to answer “why does the system believe this?”, memory with provenance and reproducible behaviour is auditable by construction, rather than after the fact.

Air-gapped & edge deployments

Where there is no network — or none is permitted — memory that forms and recalls entirely on-device keeps working through disconnection and restarts.

Security-sensitive agents

When the text an agent ingests cannot be trusted, an extraction path with no LLM to hijack removes the prompt-injection route into long-term memory.

Personal & on-device assistants

When memory holds personal or health data, keeping it on the machine means it is never somewhere else to be exposed.

[next]

Go deeper

Memory the agent can use — without handing the agent control over what it remembers.

One binary, one command, no LLM in the loop.

$ npm install