Shodh-Memory: A Cognitive Memory System for Edge-Native AI Agents

Q: How is shodh-memory different from a vector database?

Vector databases give you similarity search. Shodh-memory gives you cognition — memories strengthen when accessed together (Hebbian learning), decay naturally over time (power-law forgetting), and form associative networks via a knowledge graph. It's the difference between storage and memory.

Q: Does shodh-memory require an internet connection?

No. Shodh-memory runs 100% offline. The embeddings, vector index, knowledge graph — everything runs locally. Perfect for edge devices, air-gapped systems, or anywhere you need data privacy.

Q: What's the memory overhead?

The binary is ~30MB. Models add ~50MB (22MB MiniLM embeddings + 14MB NER model + 14MB ONNX runtime). Each memory entry uses roughly 2-5KB, so a system with 10,000 memories needs roughly 20-50MB of storage.
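A quick arithmetic check of these figures, as a sketch only. It assumes decimal units (1 MB = 1000 KB), and the constant and function names are made up for illustration.

```python
# Figures quoted in the answer above; names are illustrative only.
BINARY_MB = 30
MODEL_MB = {"minilm_embeddings": 22, "ner_model": 14, "onnx_runtime": 14}
ENTRY_KB_RANGE = (2, 5)  # per-memory footprint, low and high estimate


def storage_mb(n_memories: int, kb_per_entry: float) -> float:
    """Estimated storage for n_memories, assuming 1 MB = 1000 KB."""
    return n_memories * kb_per_entry / 1000


models_total = sum(MODEL_MB.values())
low, high = (storage_mb(10_000, kb) for kb in ENTRY_KB_RANGE)
print(f"binary + models: {BINARY_MB + models_total} MB")
print(f"10,000 memories: {low:.0f} to {high:.0f} MB")
```

The model sizes sum to the quoted ~50MB, and the per-entry range puts 10,000 memories between 20 and 50MB.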

Q: Can shodh-memory run on a Raspberry Pi?

Yes. Shodh-memory is designed for edge deployment. It runs on Raspberry Pi Zero, Jetson Nano, industrial PCs, and other resource-constrained devices. Graph lookups are under 1 microsecond.

Q: How does memory decay work?

Shodh-memory uses a hybrid model: exponential decay for the first 3 days (consolidation phase), then power-law decay for long-term retention. Memories accessed 10+ times become potentiated and decay 10x slower. The model is based on Wixted & Ebbesen (1991).
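A minimal sketch of what such a hybrid curve could look like. The 3-day crossover, the 10-access threshold, and the 10x factor come from the answer above. The decay rate, the power-law exponent, and the choice to apply "10x slower" by dividing both by 10 are assumptions for illustration, not shodh-memory's actual constants.

```python
import math

CROSSOVER_DAYS = 3.0        # from the text: exponential phase lasts 3 days
POTENTIATION_ACCESSES = 10  # from the text: 10+ accesses potentiates a memory
POTENTIATION_FACTOR = 10.0  # from the text: potentiated memories decay 10x slower
EXP_RATE = 0.2              # assumed per-day rate, illustrative only
POWER_EXPONENT = 0.5        # assumed power-law exponent, illustrative only


def retention(age_days: float, access_count: int = 0) -> float:
    """Fraction of initial strength remaining after age_days."""
    slow = POTENTIATION_FACTOR if access_count >= POTENTIATION_ACCESSES else 1.0
    rate = EXP_RATE / slow
    exponent = POWER_EXPONENT / slow
    if age_days <= CROSSOVER_DAYS:
        return math.exp(-rate * age_days)
    # Anchor the power-law tail at the crossover so the curve stays continuous.
    at_crossover = math.exp(-rate * CROSSOVER_DAYS)
    return at_crossover * (age_days / CROSSOVER_DAYS) ** (-exponent)


for days in (1, 3, 30, 365):
    print(days, round(retention(days), 3), round(retention(days, access_count=12), 3))
```

The power-law tail falls off much more slowly than the exponential would, which is what lets old memories linger instead of vanishing.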

Q: What is Hebbian learning in AI agent memory?

Cells that fire together, wire together. When memories are accessed together, their connection strengthens. When memories compete, interference effects occur. It's how biological brains work, now applied to AI agent memory.
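A minimal sketch of the strengthening half of this idea, assuming a simple saturating update rule. The class name, the learning rate, and the memory IDs are hypothetical, and interference between competing memories is not modeled here.

```python
from collections import defaultdict
from itertools import combinations

LEARNING_RATE = 0.1  # assumed value, illustrative only
MAX_WEIGHT = 1.0


class CoAccessGraph:
    """Undirected association weights between memory IDs (hypothetical structure)."""

    def __init__(self) -> None:
        self.weights = defaultdict(float)

    def record_access(self, memory_ids: list[str]) -> None:
        """Strengthen every pair of memories retrieved in the same access."""
        for a, b in combinations(sorted(set(memory_ids)), 2):
            key = frozenset((a, b))
            w = self.weights[key]
            # Saturating update: each co-access closes part of the gap to MAX_WEIGHT.
            self.weights[key] = w + LEARNING_RATE * (MAX_WEIGHT - w)

    def weight(self, a: str, b: str) -> float:
        return self.weights.get(frozenset((a, b)), 0.0)


graph = CoAccessGraph()
for _ in range(3):
    graph.record_access(["rust-decision", "latency-incident"])
graph.record_access(["rust-decision", "team-offsite"])
```

After these calls the pair accessed together three times carries a stronger link than the pair accessed together once, and pairs never accessed together have no link at all.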

Q: Is there a cloud version?

No, and that's intentional. Shodh-memory is built for local-first, privacy-preserving AI. Your agent's memories stay on your hardware. If you need multi-device sync, you can replicate the RocksDB storage yourself.

Q: What languages and frameworks does shodh-memory support?

The core is Rust. We provide: an MCP server (for Claude, Cursor, and other AI agents), Python bindings (via PyO3/maturin), and a REST API. The Rust crate can be embedded directly in your application.

Q: How do I contribute?

Check out github.com/varun29ankuS/shodh-memory. Open issues, submit PRs, or join discussions. The codebase is well-documented with 688+ tests. All constants have neuroscience citations.

Varun Sharma

doi:10.5281/zenodo.18668709

2026-06-13 • 11 min read

Causal Retrieval: The Memory Problem Vector Search Can't Solve

knowledge-graph · architecture · neuroscience

Ask your AI agent: "Why did we decide to use Rust for the backend?"

A retrieval-augmented system embeds that question, searches its vector store for the nearest passages, and hands back whatever text is semantically similar to the words "decide," "Rust," and "backend." Sometimes that works, if the decision and its reasoning happen to sit in one passage that looks like the question. Often it does not, because the reason for a decision is rarely phrased like the decision itself, and the chain that led to it is scattered across many memories laid down at different times.

This is not a tuning problem. It is a structural one. Semantic similarity answers a specific question, "what looks like this?", and it answers it well. But "why did this happen?" is a different question entirely, and no amount of better embeddings will turn a nearest-neighbor search into a causal explanation. Recent surveys of agent memory name this gap directly, listing hybrid retrievers that blend semantic similarity with temporal ordering and causal graph traversal as one of the field's largely unexplored frontiers. This post is about why that gap exists and what it actually takes to close it.

Two different questions

It helps to be precise about the distinction:

• "What looks like this?" is a similarity query. Given a point, find the nearest points. This is what vector search does. It is associative, content-addressable, and order-free: it does not care when things happened or what led to what.

• "What caused this?" is a causal query. Given an outcome, find the chain of antecedents that produced it: the decision behind the result, the incident behind the decision, the observation behind the incident. This is a walk backward through a structure, not a lookup of nearby points.

These two questions need two different data structures. Similarity needs a metric space, a place where "near" is defined. Causality needs a directed graph, a place where "X led to Y" is an edge you can follow, in a direction. A vector store is a metric space. It has no edges and no direction. You cannot traverse a causal chain in a structure that has no chains in it.

Why flat retrieval cannot fake it

A natural objection: can't you just retrieve a big enough neighborhood and let the language model reconstruct the causality from the text? Sometimes, at small scale, with a strong model and a short history — yes. But it degrades exactly where it matters:

• The links are not local. Cause and effect are often many hops and many sessions apart. "We chose Rust because the Go garbage-collector pauses hurt the edge latency we'd measured during the incident in March." The cause (a latency measurement), the context (an incident), and the decision (Rust) may be three separate memories with little surface similarity. A neighborhood around the question reaches the decision and misses the chain.

• Similarity actively misleads. The passages most similar to "why did we choose Rust" are often other discussions of Rust, not the reasoning that preceded the choice. Similarity pulls toward the topic and away from the cause.

• It does not scale. Reconstructing causality by stuffing a large neighborhood into a model's context is expensive, slow, and gets worse as the memory grows and the relevant chain gets buried deeper in the pool.

The honest summary: flat retrieval can sometimes surface a cause when it happens to be similar and nearby. It cannot reliably reconstruct a causal chain, because it has no representation of causality to reconstruct from.
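The pull toward topic can be seen in a toy example. The sketch below uses bag-of-words cosine similarity as a crude stand-in for an embedding model, and the three memories are invented for illustration. A real embedding model captures paraphrase and would not score the cause at exactly zero, so treat this as showing the direction of the bias, not its size.

```python
import math
from collections import Counter


def bow(text: str) -> Counter:
    """Bag-of-words term counts; a crude stand-in for an embedding."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical memories: the decision, an on-topic note, and the actual cause.
memories = {
    "decision": "we decided to use rust for the backend",
    "topic": "rust backend compile times are slow",
    "cause": "go garbage collector pauses hurt edge latency",
}

query = bow("why did we decide to use rust for the backend")
scores = {name: cosine(query, bow(text)) for name, text in memories.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

The decision and the on-topic note share words with the question and rank first. The cause shares none, so it ranks last even though it is the answer.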

What causal retrieval actually requires

To answer "what caused this?" you need three things that a vector store does not have:

1. Typed, directed edges. Not a generic "these two memories are related," but "this memory caused that one": a relation with a direction. The arrow matters: "the incident caused the decision" is true; "the decision caused the incident" is false. Direction is the whole point of causality.

2. A graph you can walk backward. Given an outcome node, follow its incoming causal edges to antecedents, then their antecedents, building the chain. This is a graph traversal, and it requires the graph to exist.

3. A way to score the walk. Real histories are noisy; not every "because" is load-bearing. The walk has to be scored and bounded (favoring the strongest causal paths, decaying with distance, stopping before it floods) so it returns the origin, not every weakly-connected ancestor.
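The first two requirements can be sketched in a few lines. This is a minimal illustration under stated assumptions, not shodh-memory's data model: the `Edge` type, the relation labels, and the memory IDs are all hypothetical, and scoring is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str    # the antecedent
    target: str    # the outcome
    relation: str  # hypothetical labels: "caused" or "related_to"


EDGES = [
    Edge("latency-measurement", "march-incident", "caused"),
    Edge("march-incident", "rust-decision", "caused"),
    Edge("rust-decision", "rust-style-guide", "related_to"),
]


def causes_of(node: str) -> list[str]:
    """Follow incoming 'caused' edges only: direction and type both matter."""
    return [e.source for e in EDGES if e.target == node and e.relation == "caused"]


def causal_chain(outcome: str) -> list[str]:
    """Walk backward from outcome, listing antecedents nearest-first."""
    chain, frontier, seen = [], [outcome], {outcome}
    while frontier:
        node = frontier.pop(0)
        for cause in causes_of(node):
            if cause not in seen:  # guard against cycles
                seen.add(cause)
                chain.append(cause)
                frontier.append(cause)
    return chain


print(causal_chain("rust-decision"))
```

Walking from the decision reaches the incident and then the measurement. Walking from the measurement reaches nothing, and the untyped "related_to" edge is never followed, which is the difference direction and typing make.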

How shodh-memory does it

shodh-memory is built on a knowledge graph where memories and the entities in them are nodes, and the relationships between them are typed, directed edges — including causal ones. When an experience carries causal structure (an outcome, a "because," a decision following an incident), that structure is recorded as directed edges at ingest, without sending anything to a large language model.

Recall over that graph includes a causal-origin walk: given a memory or a query, the system walks backward along causal edges to reconstruct the chain of antecedents that led to it. The walk is a scored, bounded traversal: it follows the strongest causal paths, attenuates with each hop, and returns the top origins rather than every distant ancestor. The result is not "here are some passages that look like your question." It is "here is the decision, here is the incident behind it, here is the observation behind that": the actual chain.
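One way a scored, bounded backward walk could be written is as a best-first traversal. The sketch below is an assumption-laden illustration, not shodh-memory's implementation: the edge strengths, the per-hop decay, the cutoff, the hop limit, and the function name are all hypothetical.

```python
import heapq

# effect -> list of (cause, edge strength in (0, 1]); hypothetical data
INCOMING = {
    "rust-decision": [("march-incident", 0.9), ("blog-post-on-rust", 0.2)],
    "march-incident": [("latency-measurement", 0.8)],
    "latency-measurement": [("new-edge-hardware", 0.3)],
}

HOP_DECAY = 0.7  # assumed per-hop attenuation
MIN_SCORE = 0.1  # assumed cutoff below which paths are dropped
MAX_HOPS = 4     # assumed depth bound


def causal_origins(start: str, top_k: int = 3) -> list[tuple[str, float]]:
    """Best-first backward walk; returns up to top_k (antecedent, score) pairs."""
    best: dict[str, float] = {}
    # Max-heap via negated scores; the node name breaks ties deterministically.
    heap = [(-1.0, start, 0)]
    while heap:
        neg_score, node, hops = heapq.heappop(heap)
        score = -neg_score
        if node != start:
            if node in best:
                continue  # already reached by a stronger path
            best[node] = score
        if hops == MAX_HOPS:
            continue
        for cause, strength in INCOMING.get(node, []):
            s = score * strength * HOP_DECAY
            if s >= MIN_SCORE and cause not in best and cause != start:
                heapq.heappush(heap, (-s, cause, hops + 1))
    ranked = sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


for name, score in causal_origins("rust-decision"):
    print(f"{name}: {score:.3f}")
```

With these made-up weights the strong path through the incident and the measurement ranks first, the weak blog-post edge ranks last, and the distant hardware node falls below the cutoff and is dropped. Because scores only shrink along a path and ties break on the node name, the same graph yields the same ranking on every run.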

Two properties fall out of doing it this way. First, it is explainable: the answer to "why did I decide X" is a path you can read, not a ranking you have to trust. Second, it is deterministic and auditable: the same query against the same memory walks the same chain every time, which is exactly what a regulated or safety-critical deployment needs when it asks "show me what the decision was based on, and prove it."

The honest scope

Causal retrieval is a capability, not a leaderboard number. It is the thing flat retrieval structurally cannot do, and the thing the literature flags as underexplored. It is not a claim that our overall recall beats every system on every benchmark, which is a separate and ongoing effort we report honestly elsewhere. What causal retrieval gives you is a different kind of answer, the why behind the what, that a nearest-neighbor search cannot produce at any quality. The two are complementary: similarity finds what is relevant; the causal walk explains how it came to be.

Where this matters

• Agent debugging and trust. "Why did the agent do that?" answered by the chain of memories it acted on, not a post-hoc guess.

• Decision provenance. Months later, reconstruct why a choice was made and what it ruled out — across many sessions.

• Root-cause analysis. From a failure, walk back to the precursor conditions, the way a good engineer reconstructs an incident.

• Regulated and safety-critical systems. An inspectable, reproducible causal lineage is something an opaque, stochastic retrieval pipeline cannot offer.

The takeaway

"What looks like this?" and "what caused this?" are different questions that need different machinery. The first is a similarity search; the second is a backward walk through a directed, typed graph. Most AI memory systems only have the first, which is why "why did we decide X?" so often returns a plausible-looking near-miss. Causal retrieval is the missing half — and building it does not take a bigger model, it takes a graph with causality in it and a walk that knows which way the arrows point.

Related reading: [Knowledge Graphs and Spreading Activation](/blog/knowledge-graph-spreading-activation) · [RAG Is Not Memory](/blog/rag-is-not-memory) · [Why Not Just Vector Search?](/blog/why-not-just-vector-search) · [LLM-Free Memory](/llm-free-memory).
