Cognitive Architectures for AI Agents: From ACT-R to Modern Memory Systems
GPT-4 has no architecture. It has a context window. There is a difference.
When cognitive scientists design intelligent systems, they do not start with "how much text can we fit in a prompt." They start with how does cognition work — perception, attention, memory encoding, retrieval, decay, learning. These principles produced architectures like SOAR, ACT-R, and Global Workspace Theory that have modeled human cognition for decades.
Modern AI agents have none of this. They have a language model, a system prompt, and maybe a vector database. When the context window fills up, memories are silently dropped. There is no decay function. There is no working memory distinct from long-term memory. There is no learning — every session starts from zero.
This post traces the lineage from classical cognitive architectures to modern AI agent memory systems, explains why the limited memory AI problem persists, and shows how shodh-memory implements a cognitive architecture grounded in Cowan's embedded processes model, Hebbian learning, and Wixted's hybrid decay.
---
What Is a Cognitive Architecture?
A cognitive architecture is a blueprint for how an intelligent system perceives, reasons, remembers, and acts. It is not a model — it is the scaffolding on which models operate.
Three architectures have dominated cognitive science for decades:
SOAR (State, Operator, And Result) — Originally developed at Carnegie Mellon by Allen Newell, John Laird, and Paul Rosenbloom, and continued at the University of Michigan. Models cognition as search through a problem space. Learns by "chunking" successful problem-solving sequences into rules.
ACT-R (Adaptive Control of Thought — Rational) — Developed at Carnegie Mellon. Models cognition as interaction between declarative memory (facts) and procedural memory (rules). Memory items have activation levels that decay over time and are boosted by retrieval.
Global Workspace Theory — Proposed by Bernard Baars. Consciousness is a "global workspace" that broadcasts information to specialized processors.
┌─────────────────────────────────────────────────────────────────────────┐
│ CLASSICAL COGNITIVE ARCHITECTURE (ACT-R style) │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌──────────────────┐ ┌─────────────────┐ │
│ │ Perception │────▶│ Working Memory │────▶│ Motor Output │ │
│ │ Module │ │ (limited capacity)│ │ Module │ │
│ └─────────────┘ └────────┬─────────┘ └─────────────────┘ │
│ │ │
│ ┌────────────┴────────────┐ │
│ ▼ ▼ │
│ ┌──────────────────┐ ┌──────────────────┐ │
│ │ Declarative │ │ Procedural │ │
│ │ Memory │ │ Memory │ │
│ │ (facts, events) │ │ (rules, skills) │ │
│ │ │ │ │ │
│ │ activation decay │ │ utility learning │ │
│ │ retrieval boost │ │ conflict resoln │ │
│ └──────────────────┘ └──────────────────┘ │
│ │
│ Key principle: Memory activation = base_level + context_boost │
│ Base level decays with time, increases with frequency of access │
│ │
└─────────────────────────────────────────────────────────────────────────┘
These architectures share a critical insight: memory is not a database lookup. It is a dynamic process with encoding, consolidation, decay, interference, and retrieval-dependent strengthening.
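The activation rule in the diagram above can be made concrete. ACT-R's base-level learning equation sets a memory's base activation to the log of summed, power-decayed access recencies; the sketch below is illustrative (function names and the combined `activation` helper are assumptions for this post, not ACT-R's reference implementation), though the decay parameter d = 0.5 is the conventional default in the ACT-R literature:

```python
import math

def base_level_activation(access_times, now, d=0.5):
    """ACT-R base-level learning: B_i = ln( sum_j (now - t_j)^(-d) ).

    access_times: timestamps (seconds) of past retrievals of this memory.
    d: decay parameter; 0.5 is the conventional ACT-R default.
    """
    return math.log(sum((now - t) ** -d for t in access_times if now > t))

def activation(access_times, now, context_boost=0.0, d=0.5):
    # Total activation = base level + context boost, as in the diagram.
    return base_level_activation(access_times, now, d) + context_boost

# A memory retrieved recently and often outranks the same memory gone stale.
recent = activation([10.0, 60.0, 90.0], now=100.0)
stale = activation([10.0, 60.0, 90.0], now=100_000.0)
assert recent > stale
```

Two properties fall out of this one equation: frequently accessed memories accumulate more terms in the sum (frequency boost), and each term shrinks as time passes (decay), which is exactly the "base level decays with time, increases with frequency of access" principle stated above.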
Classical Architecture vs Modern LLM Agent
┌─────────────────────────────────────────────────────────────────────────┐
│ COGNITIVE ARCHITECTURE vs MODERN LLM AGENT │
├──────────────────────────────┬──────────────────────────────────────────┤
│ Cognitive Architecture │ Typical LLM Agent (2024-2025) │
├──────────────────────────────┼──────────────────────────────────────────┤
│ Sensory buffer │ User message (current turn) │
│ │ │ │ │
│ ▼ │ ▼ │
│ Attention filter │ System prompt (static) │
│ │ │ │ │
│ ▼ │ ▼ │
│ Working memory (4±1 items) │ Context window (128K tokens) │
│ │ │ (no capacity management) │
│ ┌────┴────┐ │ │ │
│ ▼ ▼ │ ▼ │
│ Episodic Semantic │ Vector DB (optional RAG) │
│ Memory Memory │ (no decay, no learning) │
│ │ │ │ │ │
│ ▼ ▼ │ ▼ │
│ Decay Consolidation │ LLM generates response │
│ Retrieval boost │ (stateless, no memory update) │
│ │ │
├──────────────────────────────┼──────────────────────────────────────────┤
│ Memory is dynamic, │ Memory is static storage. │
│ self-modifying, adaptive. │ Append-only. No learning. │
└──────────────────────────────┴──────────────────────────────────────────┘
Memory Types: Cognitive Science → AI Mapping
┌───────────────────┬──────────────────────┬──────────────────────────────┐
│ Cognitive Type │ Human Brain │ AI Agent Equivalent │
├───────────────────┼──────────────────────┼──────────────────────────────┤
│ Sensory memory │ ~250ms buffer │ Current message/input │
│ Working memory │ 4±1 items, seconds │ Context window (imprecise) │
│ Episodic memory │ Personal experiences │ Conversation history │
│ Semantic memory │ General knowledge │ Training data (frozen) │
│ Procedural memory │ Skills, habits │ Fine-tuning (rare) │
│ Prospective memory│ Future intentions    │ Usually missing entirely     │
│ Metamemory │ Knowing what you know│ Missing entirely │
└───────────────────┴──────────────────────┴──────────────────────────────┘
Most AI agents have a crude approximation of working memory (the context window) and nothing else.
Cowan's Embedded Processes Model
Nelson Cowan's embedded processes model (2001) describes human memory as a set of nested activation states. shodh-memory implements this model directly:
┌─────────────────────────────────────────────────────────────────────────┐
│ COWAN'S MODEL → shodh-memory IMPLEMENTATION │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─── Long-Term Memory (inactive but available) ───────────────────┐ │
│ │ │ │
│ │ shodh-memory: All stored memories (RocksDB) │ │
│ │ Decay: hybrid exponential→power-law (Wixted 2004) │ │
│ │ │ │
│ │ ┌─── Activated Long-Term Memory (primed) ──────────────────┐ │ │
│ │ │ │ │ │
│ │ │ shodh-memory: Session-tier memories │ │ │
│ │ │ Graph edges with spreading activation │ │ │
│ │ │ │ │ │
│ │ │ ┌─── Focus of Attention (4±1 items) ───────────────┐ │ │ │
│ │ │ │ │ │ │ │
│ │ │ │ shodh-memory: Working-tier memories │ │ │ │
│ │ │ │ Current query context │ │ │ │
│ │ │ │ Top-K retrieval results │ │ │ │
│ │ │ │ │ │ │ │
│ │ │ └───────────────────────────────────────────────────┘ │ │ │
│ │ └───────────────────────────────────────────────────────────┘ │ │
│ └───────────────────────────────────────────────────────────────────┘ │
│ │
│ Key mapping: │
│ Focus of attention → Working tier (minutes, high activation) │
│ Activated LTM → Session tier (hours, medium activation) │
│ Inactive LTM → Long-term tier (days+, low activation) │
│ │
└─────────────────────────────────────────────────────────────────────────┘
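The tier mapping above can be sketched as a toy store with a capacity-limited focus of attention that demotes, rather than drops, displaced items. This is a minimal illustrative sketch under assumed names (`TieredStore`, `attend`, `consolidate`, and the 0.3 threshold are inventions for this post, not shodh-memory's actual API):

```python
import time
from dataclasses import dataclass, field

WORKING_CAPACITY = 4  # Cowan's focus of attention: ~4 items

@dataclass
class Memory:
    content: str
    activation: float = 1.0
    last_access: float = field(default_factory=time.time)

class TieredStore:
    """Three nested tiers mirroring Cowan's embedded processes model."""
    def __init__(self):
        self.working = []    # focus of attention (4±1 items)
        self.session = []    # activated long-term memory
        self.long_term = []  # inactive but available

    def attend(self, memory):
        # Bringing a memory into focus may displace the least active item.
        self.working.append(memory)
        self.working.sort(key=lambda m: m.activation, reverse=True)
        while len(self.working) > WORKING_CAPACITY:
            displaced = self.working.pop()  # lowest activation
            self.session.append(displaced)  # demoted, not deleted

    def consolidate(self, threshold=0.3):
        # Session memories whose activation has decayed move to long-term.
        keep, demote = [], []
        for m in self.session:
            (keep if m.activation >= threshold else demote).append(m)
        self.session = keep
        self.long_term.extend(demote)

store = TieredStore()
for i in range(6):
    store.attend(Memory(f"fact {i}", activation=0.5 + 0.1 * i))
assert len(store.working) == WORKING_CAPACITY  # focus stays bounded
assert len(store.session) == 2                 # displaced, not dropped
```

The key design point is in `attend`: exceeding capacity triggers demotion into the next nested tier, never silent deletion — the opposite of what happens when a context window overflows.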
Hebbian Learning: "Cells That Fire Together, Wire Together"
Donald Hebb proposed in 1949 that when two neurons repeatedly fire at the same time, the synaptic connection between them strengthens. shodh-memory applies the same principle to the edges of its knowledge graph.
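Applied to a memory graph, the Hebbian rule is simple: memories retrieved in the same context get their connecting edge strengthened, and edges that go unreinforced slowly fade. The sketch below is an assumed toy implementation (the `HebbianGraph` class, learning rate, and decay rate are illustrative choices, not shodh-memory's internals):

```python
from collections import defaultdict
from itertools import combinations

class HebbianGraph:
    """Co-activation strengthens edges; unused edges slowly weaken."""
    def __init__(self, learning_rate=0.1, decay_rate=0.01):
        self.weights = defaultdict(float)  # (node_a, node_b) -> strength
        self.lr = learning_rate
        self.decay = decay_rate

    def co_activate(self, nodes):
        # "Fire together, wire together": every pair retrieved in the
        # same context has its edge strengthened, saturating toward 1.0.
        for a, b in combinations(sorted(nodes), 2):
            w = self.weights[(a, b)]
            self.weights[(a, b)] = w + self.lr * (1.0 - w)

    def tick(self):
        # The anti-Hebbian counterpart: edges fade when not reinforced.
        for edge in self.weights:
            self.weights[edge] *= (1.0 - self.decay)

g = HebbianGraph()
for _ in range(5):
    g.co_activate({"rust", "borrow-checker"})  # recurring association
g.co_activate({"rust", "cargo"})               # one-off association
assert g.weights[("borrow-checker", "rust")] > g.weights[("cargo", "rust")]
```

The saturating update `w + lr * (1 - w)` keeps weights bounded, so a frequently co-retrieved pair asymptotes toward a strong edge instead of growing without limit.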
Memory Decay: Ebbinghaus → Wixted → shodh-memory
Hermann Ebbinghaus (1885) discovered that forgetting follows a curve. John Wixted (2004) showed the best fit is hybrid: exponential decay for the first few days, transitioning to power-law decay for longer periods.
┌─────────────────────────────────────────────────────────────────┐
│ MEMORY DECAY MODEL │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Strength │
│ 1.0 ┤██ │
│ │ ██ │
│ 0.8 ┤ ███ │
│ │ ████ Exponential phase (0-3 days) │
│ 0.6 ┤ ████ s(t) = e^(-λt) │
│ │ ████ │
│ 0.4 ┤ ····· │
│ │ ········ │
│ 0.3 ┤ ········· Power-law (3+ days)│
│       │                   ···········  s(t)=t^(-α)│
│ 0.2 ┤ ·········· │
│ └─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┬──▶ │
│ 1d 3d 1w 2w 1m 2m 3m 6m Time │
│ ▲ │
│ │ transition point │
│ │
│ Each retrieval resets the clock (retrieval-dependent boost) │
│ │
└─────────────────────────────────────────────────────────────────┘
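The hybrid curve above can be written as a piecewise function, scaled so the two phases join without a discontinuity at the transition point. The constants below (λ, α, and the 3-day switch) are illustrative values chosen to match the figure, not shodh-memory's tuned parameters:

```python
import math

LAMBDA = 0.3    # exponential rate per day (illustrative)
ALPHA = 0.5     # power-law exponent (illustrative)
T_SWITCH = 3.0  # transition point in days, per the figure

def strength(days_since_access):
    """Hybrid Wixted-style decay: exponential early, power-law later.

    Each retrieval resets days_since_access to 0, which is the
    retrieval-dependent boost noted in the figure.
    """
    if days_since_access <= T_SWITCH:
        return math.exp(-LAMBDA * days_since_access)
    # Anchor the power-law tail to the exponential's value at the
    # transition point so the curve is continuous.
    join = math.exp(-LAMBDA * T_SWITCH)
    return join * (days_since_access / T_SWITCH) ** -ALPHA

assert strength(0) == 1.0                          # fresh memory
assert strength(30) > math.exp(-LAMBDA * 30)       # tail forgets slower
assert abs(strength(3.0) - strength(3.0 + 1e-9)) < 1e-6  # smooth join
```

The last assertion captures why the hybrid matters: at 30 days the pure exponential has collapsed to near zero, while the power-law tail retains a usable strength — which matches Wixted's finding that long-term human forgetting is heavier-tailed than exponential decay predicts.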
The Limited Memory AI Problem
The "limited memory AI" problem is not theoretical. It is the daily reality of every AI agent in production.
The root cause is architectural: these systems have no cognitive architecture. They have a language model and a context window.
Cognitive Capability Comparison
┌──────────────────────┬───────────────┬───────────────┬────────────────┐
│ Capability │ cognee │ mem0 │ shodh-memory │
├──────────────────────┼───────────────┼───────────────┼────────────────┤
│ Memory tiers │ No │ No │ 3 (Cowan) │
│ Memory decay │ No │ No │ Hybrid (Wixted)│
│ Retrieval boost │ No │ No │ Yes (ACT-R) │
│ Knowledge graph │ Yes (Neo4j) │ Graph layer │ Embedded graph │
│ Hebbian learning │ No │ No │ Yes (3-tier) │
│ Spreading activation │ No │ No │ Yes │
│ Prospective memory │ No │ No │ Yes (reminders)│
│ LLM dependency │ Required │ Required │ None (local) │
│ Cognitive model │ None cited │ None cited │ Cowan 2001 │
│ Decay model │ None │ None │ Wixted 2004 │
│ Academic citations │ Few │ None │ 200+ constants │
│ Privacy │ Cloud-first │ Cloud/local │ 100% local │
└──────────────────────┴───────────────┴───────────────┴────────────────┘
Getting Started
shodh-memory ships as a single Rust binary with no external dependencies:
Docker
docker run -d -p 3030:3030 -v shodh-data:/data ghcr.io/varun29ankus/shodh-memory:latest
npm (MCP server)
npm install -g @shodh/memory-mcp
Rust crate
cargo install shodh-memory
Python
pip install shodh-memory
---
A context window is not a cognitive architecture. It is a buffer. The difference between an AI that uses a buffer and an AI that uses a cognitive architecture is the difference between a calculator and a mind.