
Vector Databases Explained: How Semantic Search Powers AI Memory


Ctrl+F finds exact matches. Vector search finds meaning. Here is how.

When you search your codebase for connection_pool_timeout, you find every file that contains that exact string. But what about the file that discusses "how long to wait before dropping idle database connections"? That is the same concept, described in different words. Traditional search misses it. Vector search finds it.

Vector databases are the backbone of modern AI applications — RAG pipelines, semantic search, recommendation engines, and memory systems. This post explains what vectors are, how similarity search works, why indexing algorithms matter, and how shodh-memory uses Vamana and SPANN to power its semantic memory layer.

---

What Is a Vector? What Is an Embedding?

A vector is a list of numbers. An embedding is a vector produced by a neural network that captures the meaning of its input. Texts with similar meanings produce vectors that are close together in vector space.

```
┌─────────────────────────────────────────────────────────────────┐
│                 TEXT → EMBEDDING → VECTOR SPACE                 │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  "The database connection timed out"                            │
│                  │                                              │
│                  ▼                                              │
│   ┌──────────────────────┐                                      │
│   │   Embedding Model    │  (MiniLM-L6-v2, 384 dimensions)      │
│   │   (neural network)   │                                      │
│   └──────────┬───────────┘                                      │
│              │                                                  │
│              ▼                                                  │
│  [0.12, -0.45, 0.78, 0.03, ..., -0.22]   (384 numbers)          │
│                                                                 │
│  Vector Space (simplified to 2D):                               │
│                                                                 │
│    ▲ dim 2                                                      │
│    │                                                            │
│    │   ● "DB connection failed"                                 │
│    │    ● "connection timed out"                                │
│    │   ● "database timeout error"                               │
│    │                                                            │
│    │                     ● "deploy to staging"                  │
│    │                      ● "push to production"                │
│    │                                                            │
│    │                                   ● "the weather is nice"  │
│    │                                                            │
│    └──────────────────────────────────▶ dim 1                   │
│                                                                 │
│  Nearby points = semantically similar text                      │
└─────────────────────────────────────────────────────────────────┘
```

shodh-memory uses MiniLM-L6-v2 (via ONNX Runtime), which produces 384-dimensional embeddings — no GPU required, no API calls, everything runs locally.

How Vector Search Works

Given a query, vector search has three steps:

1. Embed the query — convert query text to a vector using the same model

2. Find nearest neighbors — search the database for vectors closest to the query vector

3. Return results — return the original texts associated with those vectors

"Closest" is measured by a distance metric:

```
┌────────────────────┬────────────────────────────┬────────────────────┐
│ Metric             │ Formula                    │ Best For           │
├────────────────────┼────────────────────────────┼────────────────────┤
│ Cosine Similarity  │ dot(A,B) / (|A| × |B|)     │ Text similarity    │
│ Dot Product        │ Σ(Aᵢ × Bᵢ)                 │ Normalized vectors │
│ L2 (Euclidean)     │ √Σ(Aᵢ - Bᵢ)²               │ Dense clustering   │
└────────────────────┴────────────────────────────┴────────────────────┘
```

Cosine similarity is the most common for text. It measures the angle between vectors, ignoring magnitude.
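All three metrics are one-liners in NumPy. A quick sketch; the vectors below are made up for illustration, not real embeddings:

```python
import numpy as np

a = np.array([0.12, -0.45, 0.78, 0.03])  # toy 4-dim "embeddings"
b = np.array([0.10, -0.40, 0.80, 0.00])  # (real ones have 384 dims)

cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))  # angle only
dot = np.dot(a, b)                                               # magnitude-sensitive
l2 = np.linalg.norm(a - b)                                       # straight-line distance

print(round(float(cosine), 3))  # 0.997: nearly identical direction
```

For unit-length vectors (sentence embeddings are typically normalized), cosine and dot product produce identical rankings, which is why libraries often use the cheaper dot product internally.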

Indexing Algorithms: From Brute Force to Vamana

The naive approach is to compare the query against every stored vector: O(n) work per query. At 1,000,000 vectors that is a million distance computations per search, far too slow for interactive use. This is where approximate nearest neighbor (ANN) indexing comes in.
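For intuition, brute-force search really is just one dot product per stored vector. A sketch over random data (any array library would do):

```python
import numpy as np

def brute_force_top_k(query: np.ndarray, corpus: np.ndarray, k: int = 3):
    """Exact nearest-neighbor search: O(n) dot products per query."""
    # Normalize rows so a dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    scores = c @ q                        # one score per stored vector
    top = np.argsort(scores)[::-1][:k]    # indices of the k best matches
    return top, scores[top]

rng = np.random.default_rng(0)
corpus = rng.normal(size=(10_000, 384))   # 10k fake 384-dim embeddings
query = corpus[42]                        # the query is a known row...
top, scores = brute_force_top_k(query, corpus)
print(top[0])                             # ...so index 42 must rank first
```

This is exact and fine at small scale; the indexing algorithms below exist to avoid touching all n vectors as the corpus grows.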

```
┌─────────────────────────────────────────────────────────────────┐
│             NEAREST NEIGHBOR SEARCH (2D simplified)             │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│    ▲                                                            │
│    │        ·                                                   │
│    │    ·       ·                                               │
│    │       ·        ·                                           │
│    │          · ★ query                                         │
│    │    ·     ◉ ← nearest        ·                              │
│    │        ◉ ← 2nd nearest                                     │
│    │   ·            ·                                           │
│    │      ·    ·        ·   ·                                   │
│    └──────────────────────────────────▶                         │
│                                                                 │
│  With index: only compare against candidates from graph/clusters│
│  Accuracy: ~95-99%        Speed: 10-1000× faster                │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```

IVF — Partition vectors into clusters. Only search nearest clusters.

HNSW — Multi-layer navigable graph. Very fast, very memory-hungry.

Vamana — Single-layer graph with aggressive pruning. HNSW-quality recall with less memory. Developed by Microsoft for DiskANN.

SPANN — Combines IVF with disk-based storage and product quantization. Designed for billion-scale.

```
┌─────────────────────────────────────────────────────────────────┐
│                VAMANA GRAPH STRUCTURE (simplified)              │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Each node = one vector. Edges connect nearby vectors.          │
│  Search: greedy walk from entry point toward query.             │
│                                                                 │
│              ┌─────┐                                            │
│        ┌───▶ │  A  │ ◀───┐                                      │
│        │     └──┬──┘     │                                      │
│        │        │        │                                      │
│    ┌───┴───┐    │    ┌───┴───┐                                  │
│    │   B   │◀───┘    │   C   │                                  │
│    └───┬───┘         └───┬───┘                                  │
│        │                 │                                      │
│    ┌───┴───┐         ┌───┴───┐                                  │
│    │   D   │         │   E   │                                  │
│    └───────┘         └───────┘                                  │
│                                                                 │
│  Max out-degree R (e.g., 64) keeps memory low                   │
│  RobustPrune removes redundant edges                            │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```
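The greedy walk described in the diagram can be sketched on a toy version of that five-node graph. The 2D coordinates are invented for illustration, and real Vamana keeps a beam of candidates rather than a single current node, but the navigation idea is the same:

```python
import numpy as np

# Adjacency list mirroring the diagram: A links to B and C, etc.
graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "E"],
         "D": ["B"], "E": ["C"]}

# Invented 2D positions standing in for each node's embedding.
vectors = {"A": np.array([0.0, 0.0]), "B": np.array([-1.0, -1.0]),
           "C": np.array([1.0, -1.0]), "D": np.array([-1.5, -2.0]),
           "E": np.array([1.5, -2.0])}

def greedy_search(query: np.ndarray, entry: str = "A") -> str:
    """Hop to whichever neighbor is closest to the query; stop at a local minimum."""
    current = entry
    while True:
        candidates = graph[current] + [current]
        best = min(candidates, key=lambda n: np.linalg.norm(vectors[n] - query))
        if best == current:
            return current
        current = best

print(greedy_search(np.array([1.4, -1.9])))  # walks A -> C -> E
```

Each hop only examines the current node's neighbors (at most R of them), which is how the search visits a tiny fraction of the graph instead of all n vectors.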

shodh-memory's Approach: Adaptive Vamana + SPANN

shodh-memory automatically switches based on dataset size:

```
┌─────────────────────────────────────────────────────────────────┐
│             shodh-memory ADAPTIVE INDEX SELECTION               │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   Memory count    Index           Storage       Query latency   │
│   ────────────    ────────────    ──────────    ─────────────   │
│   1 - 1K          Brute force     In-memory     < 0.1ms         │
│   1K - 100K       Vamana graph    In-memory     < 1ms           │
│   100K - 10M+     SPANN + PQ      Disk + RAM    < 5ms           │
│                                                                 │
│   PQ compression: 384 × f32 (1,536 bytes) → 48 bytes (~32×)     │
│   Recall@10: > 95% across all tiers                             │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```
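The compression figure in the table is simple arithmetic. The 48-subvector layout below is an assumed PQ configuration (48 chunks of 8 dimensions, each with a 256-entry codebook); shodh-memory's exact parameters may differ:

```python
dims = 384
raw_bytes = dims * 4          # f32 = 4 bytes per dim -> 1,536 bytes per vector
subvectors = 48               # assumed layout: 48 chunks of 8 dims each
code_bytes = subvectors * 1   # 1 byte per chunk indexes a 256-centroid codebook
ratio = raw_bytes / code_bytes

print(raw_bytes, code_bytes, ratio)  # 1536 48 32.0
```

At that size, codes for 10M vectors fit in under half a gigabyte of RAM, while the full-precision vectors stay on disk for reranking.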

Vector Database Comparison

```
┌────────────────────┬───────────┬───────────┬───────────┬───────────┬──────────────┐
│ Feature            │ Pinecone  │ Milvus    │ Weaviate  │ Qdrant    │ shodh-memory │
├────────────────────┼───────────┼───────────┼───────────┼───────────┼──────────────┤
│ Deployment         │ Cloud     │ Self-host │ Self-host │ Self-host │ Embedded     │
│ Index types        │ Propriet. │ IVF,HNSW  │ HNSW      │ HNSW      │ Vamana,SPANN │
│ Embedding included │ No        │ No        │ Yes       │ No        │ Yes (MiniLM) │
│ Knowledge graph    │ No        │ No        │ No        │ No        │ Yes (Hebbian)│
│ Memory decay       │ No        │ No        │ No        │ No        │ Yes (Wixted) │
│ Latency (p99)      │ ~20ms     │ ~10ms     │ ~15ms     │ ~5ms      │ < 1ms        │
│ Privacy            │ Cloud     │ Optional  │ Optional  │ Optional  │ 100% local   │
│ Pricing            │ Per-query │ Open src  │ Open src  │ Open src  │ Free (Apache)│
│ Purpose            │ General   │ General   │ General   │ General   │ AI memory    │
└────────────────────┴───────────┴───────────┴───────────┴───────────┴──────────────┘
```

Why Vector Search Alone Is Not Memory

Vector databases solve retrieval but miss three critical aspects:

1. Decay

A memory stored a year ago has the same retrieval probability as one stored today. shodh-memory applies hybrid decay (Wixted 2004) that deprioritizes stale memories without deleting them.
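Wixted's analysis favors power-law forgetting; a hybrid weighting can blend a fast exponential component with that long tail. The function below is an illustrative sketch with made-up parameters, not shodh-memory's actual curve:

```python
import math

def decay_weight(age_days: float, tau: float = 7.0,
                 beta: float = 0.5, mix: float = 0.5) -> float:
    """Retrieval weight in [0, 1]: exponential short-term forgetting
    blended with a power-law long tail. All parameters are illustrative."""
    exponential = math.exp(-age_days / tau)    # fades fast over ~a week
    power_law = (1.0 + age_days) ** -beta      # fades slowly, never reaches 0
    return mix * exponential + (1.0 - mix) * power_law

for age in (0, 30, 365):
    print(age, round(decay_weight(age), 3))
```

The key property is the long tail: a year-old memory scores far below a fresh one, but its weight stays positive, so it is deprioritized rather than deleted.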

2. Strengthening

When you retrieve a memory, it should become easier to retrieve next time. Vector databases treat retrieval as read-only. shodh-memory updates access counts, timestamps, and graph edge weights on every recall.
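In code, "retrieval is a write" looks something like the sketch below. The types and fields are hypothetical, illustrating the bookkeeping rather than shodh-memory's internals:

```python
import time
from dataclasses import dataclass, field

@dataclass
class Memory:
    text: str
    access_count: int = 0  # how often this memory has been recalled
    last_accessed: float = field(default_factory=time.time)

def recall(memory: Memory) -> str:
    """Each retrieval strengthens the memory: the updated stats feed the
    decay and ranking layers, so frequently used memories surface sooner."""
    memory.access_count += 1
    memory.last_accessed = time.time()
    return memory.text

m = Memory("increase the connection pool to 50")
recall(m)
recall(m)
print(m.access_count)  # 2
```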

3. Relationships

Vector similarity captures semantic relatedness but not causal, temporal, or structural relationships. Only a knowledge graph can capture "the server crashed" → "we increased the connection pool."

The Full Stack: Vectors + Graph + Decay

```
┌─────────────────────────────────────────────────────────────────┐
│                   THE COGNITIVE MEMORY STACK                    │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │ Layer 5: RANKED RESULTS                                   │  │
│  │ Final ranked memories returned to the agent               │  │
│  └─────────────────────────────┬─────────────────────────────┘  │
│                                │                                │
│  ┌─────────────────────────────┴─────────────────────────────┐  │
│  │ Layer 4: FUSION (Reciprocal Rank Fusion)                  │  │
│  │ Merge vector results + graph results + fact boosts        │  │
│  └──────────────┬─────────────────────────────┬──────────────┘  │
│                 │                             │                 │
│    ┌────────────┴────────────┐   ┌────────────┴────────────┐    │
│    │ Layer 3: GRAPH          │   │ Layer 3: DECAY          │    │
│    │ Spreading activation    │   │ Time-based weighting    │    │
│    └────────────┬────────────┘   └────────────┬────────────┘    │
│                 │                             │                 │
│  ┌──────────────┴─────────────────────────────┴──────────────┐  │
│  │ Layer 2: VECTOR SEARCH                                    │  │
│  │ Vamana / SPANN nearest neighbor search                    │  │
│  └─────────────────────────────┬─────────────────────────────┘  │
│                                │                                │
│  ┌─────────────────────────────┴─────────────────────────────┐  │
│  │ Layer 1: EMBEDDING                                        │  │
│  │ MiniLM-L6-v2 (ONNX, local, no API calls)                  │  │
│  └───────────────────────────────────────────────────────────┘  │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```

Vector search is Layer 2 of a 5-layer retrieval pipeline. Each layer adds intelligence that a standalone vector database cannot provide.
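Layer 4 names Reciprocal Rank Fusion, whose standard formula scores each item as the sum of 1/(k + rank) over every list that contains it. A minimal sketch (k = 60 is the conventional constant; shodh-memory's exact merge may differ):

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists; items ranked well by multiple
    retrievers accumulate the highest fused score."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["pool timeout fix", "deploy checklist", "retry policy"]
graph_hits = ["retry policy", "pool timeout fix"]
print(rrf([vector_hits, graph_hits])[0])  # an item both lists rank highly wins
```

RRF needs only ranks, not raw scores, which is why it can merge cosine similarities, graph activation levels, and fact boosts without normalizing them onto a common scale.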

Getting Started

shodh-memory includes its own vector database — no external service required:

```bash
# Docker
docker run -d -p 3030:3030 -v shodh-data:/data ghcr.io/varun29ankus/shodh-memory:latest

# npm (MCP server)
npm install -g @shodh/memory-mcp

# Rust crate
cargo install shodh-memory

# Python
pip install shodh-memory
```

---

Vector databases answer "what is similar?" That is necessary but not sufficient for memory. Memory requires decay, strengthening, and relationships. shodh-memory gives you all three — vectors, graph, and cognitive decay — in a single embedded system.
