How to Make Your AI Agent Remember Between Sessions
Your AI agent is smart — for exactly one session. Then it forgets everything. Your preferences. The decisions you made. The bugs you already debugged together. Every new session is a cold start.
This isn't a minor annoyance. For coding agents working on the same codebase for weeks, research agents building understanding across papers, or workflow agents coordinating multi-step processes — statelessness is a dealbreaker.
Here's how to fix it. Not with chat history. Not with RAG. With actual persistent memory.
Why Chat History Isn't Memory
The most common "fix" for agent amnesia is appending every message to a growing list and sending it back as context on the next turn. This breaks in three ways:
**1. It doesn't scale.** After 50 sessions, you have 200K tokens of undifferentiated noise. There's no prioritization — "user prefers Rust" gets the same weight as "thanks, looks good."
**2. It doesn't learn.** The 100th time your agent retrieves a specific piece of knowledge, it has exactly the same retrieval confidence as the first time. There's no reinforcement, no strengthening of useful connections.
**3. It doesn't forget.** Everything is retained forever. That one-time discussion about a library you didn't end up using? Still there, consuming context budget, potentially misleading future retrievals.
Real memory is selective. It strengthens what matters and lets noise fade away.
The Three Layers of Agent Memory
Effective persistent memory has three components, each solving a different part of the problem:
Layer 1: Semantic Storage
Every piece of knowledge gets embedded into a vector and stored with metadata (timestamp, importance, access count, emotional valence). This enables similarity search — when the agent needs context, it can find semantically related memories.
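In code, Layer 1 is little more than records carrying an embedding plus metadata, searched by cosine similarity. A minimal sketch — the names (`MemoryRecord`, `search`) are illustrative, not shodh-memory's actual schema, and the embeddings would come from whatever model you already use:

```python
from dataclasses import dataclass, field
from datetime import datetime
import math

@dataclass
class MemoryRecord:
    content: str
    embedding: list[float]            # produced by any embedding model
    created_at: datetime = field(default_factory=datetime.now)
    importance: float = 0.5
    access_count: int = 0

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def search(store: list[MemoryRecord], query_emb: list[float], k: int = 3):
    """Return the k records most semantically similar to the query."""
    ranked = sorted(store, key=lambda r: cosine(r.embedding, query_emb),
                    reverse=True)
    return ranked[:k]
```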
But semantic storage alone is just a vector database. The next two layers are what make it memory.
Layer 2: Knowledge Graph
Entities extracted from memories form a graph. When you discuss "authentication" and then "JWT tokens," the system creates an edge between those concepts. Over time, the graph captures the relationships between everything your agent has learned.
The graph enables spreading activation — when you query for "authentication," related concepts like "JWT," "session cookies," and "OAuth middleware" get a partial activation boost. The agent retrieves not just direct matches, but contextually related knowledge it wouldn't find with vector search alone.
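A toy version of spreading activation over a weighted adjacency list — the graph contents, edge weights, and decay constants below are illustrative assumptions, not shodh-memory's internals:

```python
from collections import defaultdict

# Toy knowledge graph: concept -> list of (neighbor, edge_weight)
graph = {
    "authentication": [("JWT", 0.9), ("session cookies", 0.7)],
    "JWT": [("OAuth middleware", 0.6)],
    "session cookies": [],
    "OAuth middleware": [],
}

def spread_activation(graph, seed, initial=1.0, decay=0.5, threshold=0.1):
    """Propagate activation outward from a seed concept.

    Each hop multiplies the energy by the edge weight and a global
    decay factor; propagation stops below the threshold.
    """
    activation = defaultdict(float)
    activation[seed] = initial
    frontier = [(seed, initial)]
    while frontier:
        node, energy = frontier.pop()
        for neighbor, weight in graph.get(node, []):
            boost = energy * weight * decay
            if boost > threshold and boost > activation[neighbor]:
                activation[neighbor] = boost
                frontier.append((neighbor, boost))
    return dict(activation)
```

Querying "authentication" activates "JWT" and "session cookies" directly, and "OAuth middleware" one hop further out at a lower strength — exactly the partial-activation boost described above.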
Layer 3: Cognitive Dynamics
This is the layer most systems skip entirely. It includes:

- **Reinforcement** — each retrieval strengthens a memory, raising its future retrieval confidence.
- **Decay** — memories that are never revisited gradually fade, instead of consuming context budget forever.
- **Promotion** — frequently accessed knowledge moves up through tiers, from working memory to session memory to long-term memory.
Implementation: Adding Persistent Memory to Your Agent
Option 1: MCP Server (Claude Code, Cursor, any MCP client)
If you're using an MCP-compatible tool, this is the fastest path:
```bash
# Install
npm install -g @shodh/memory-mcp

# Run
shodh-memory serve
```
Add to your MCP client configuration:
```json
{
  "mcpServers": {
    "shodh-memory": {
      "command": "shodh-memory",
      "args": ["serve"]
    }
  }
}
```
The MCP server exposes 45 tools that the AI agent can call: `remember`, `recall`, `forget`, `add_todo`, `list_todos`, and many more. Memory happens automatically through hooks — when the agent uses tools, reads files, or has conversations, the system captures and stores relevant context.
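Under the hood, an MCP client invokes these tools via JSON-RPC `tools/call` requests. A sketch of what a `remember` call might look like on the wire — the argument names here are assumptions modeled on the REST examples below, so check the server's published tool schema:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "remember",
    "arguments": {
      "content": "User prefers Rust for systems code",
      "tags": ["preference"]
    }
  }
}
```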
Option 2: REST API (Any Language)
For custom integrations, shodh-memory exposes a REST API on port 3030:
```bash
# Store a memory
curl -X POST http://localhost:3030/api/remember \
  -H 'Content-Type: application/json' \
  -d '{
    "content": "User prefers Rust for systems code, TypeScript for web",
    "tags": ["preference", "languages"],
    "source_type": "user"
  }'
```
```bash
# Recall memories
curl 'http://localhost:3030/api/recall?query=language+preferences&limit=5'

# Get context for the current conversation
curl -X POST http://localhost:3030/api/proactive_context \
  -H 'Content-Type: application/json' \
  -d '{"context": "setting up a new web project"}'
```
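If you'd rather not shell out to curl, the same endpoints wrap easily with Python's standard library. A minimal sketch — the paths come from the examples above, but the response shapes are assumptions:

```python
import json
import urllib.request
from urllib.parse import urlencode

BASE = "http://localhost:3030"

def build_recall_url(query: str, limit: int = 5) -> str:
    """URL for GET /api/recall, mirroring the curl example above."""
    return f"{BASE}/api/recall?" + urlencode({"query": query, "limit": limit})

def post(path: str, payload: dict) -> dict:
    """POST a JSON body to the memory server and decode the reply."""
    req = urllib.request.Request(
        BASE + path,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Usage (requires a running server):
# post("/api/remember", {"content": "Auth uses JWT", "tags": ["auth"]})
# post("/api/proactive_context", {"context": "setting up a new web project"})
```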
Option 3: Python Bindings
```bash
pip install shodh-memory
```

```python
from shodh_memory import ShodhMemory

memory = ShodhMemory()

# Store
memory.remember("The auth service uses JWT with 24h expiry")

# Recall
results = memory.recall("authentication tokens")

# Proactive context
context = memory.proactive_context("debugging the login flow")
```
Option 4: Docker (Zero Install)
```bash
docker run -d -p 3030:3030 -v shodh-data:/data ghcr.io/varun29ankus/shodh-memory:latest
```
One command. Persistent volume. Full API available on localhost:3030.
What Changes After Adding Memory
**Day 1:** The agent stores your preferences, project structure, and initial decisions in working memory.
**Week 1:** Frequently accessed knowledge promotes to session memory. The knowledge graph develops clusters around your most discussed topics. When you ask about authentication, the agent automatically surfaces related memories about your session handling, JWT configuration, and that CORS issue you fixed on Tuesday.
**Month 1:** The agent has accumulated long-term knowledge about your codebase, your preferences, your team's conventions. Context surfaces before you ask for it. Memories you never revisited have naturally decayed. The knowledge graph has strong, well-tested connections between related concepts.
The difference is qualitative, not just quantitative. The agent goes from a smart tool that needs constant context-setting to a collaborator that knows your project.
The Key Insight
Making an AI agent remember between sessions isn't about storage. It's about cognition. Any database can store text. The hard part is deciding what to strengthen, what to let decay, how to connect related knowledge, and when to surface context automatically.
That's why shodh-memory is built on neuroscience research, not database research. The brain solved this problem hundreds of millions of years ago. We just translated the solution to code.