
MCP Memory Server: The Complete Guide to Adding Persistent Memory to Any AI Agent

mcp · tutorial · agentic-ai


Your AI agent resets every session. It forgets your codebase, your preferences, your past decisions, your project context. You explain the same things over and over. The context window is not memory — it is a whiteboard that gets erased every time someone closes the door.

The Model Context Protocol changes that. MCP gives AI agents a standardized way to access external tools and data sources — including persistent memory. Instead of stuffing everything into the context window and hoping it fits, your agent can store and recall information from a dedicated memory system that persists across sessions, across machines, across months.

This guide covers everything: what MCP is, why memory via MCP is the right architecture, how to set up shodh-memory's MCP server, what all 45 tools do, and how to wire automatic memory into Claude Code, Cursor, and any MCP-compatible client.

```
┌─────────────────────────────────────────────────────────────┐
│                  MCP MEMORY ARCHITECTURE                    │
│                                                             │
│  ┌──────────────┐    stdio/SSE     ┌──────────────────┐     │
│  │  AI Client   │ ──────────────→  │   MCP Server     │     │
│  │ (Claude Code,│ ←──────────────  │  (shodh-memory)  │     │
│  │  Cursor,     │   tool results   │                  │     │
│  │  Windsurf)   │                  │  45 tools        │     │
│  └──────────────┘                  │  60+ endpoints   │     │
│                                    │                  │     │
│   Your agent                       └────────┬─────────┘     │
│   stays the same                            │               │
│                                    ┌────────┴─────────┐     │
│                                    │  Memory Engine   │     │
│                                    │ ┌──────────────┐ │     │
│                                    │ │ RocksDB      │ │     │
│                                    │ │ Vector Index │ │     │
│                                    │ │ Knowledge    │ │     │
│                                    │ │ Graph        │ │     │
│                                    │ └──────────────┘ │     │
│                                    └──────────────────┘     │
│                                                             │
│  All data stays on your machine. No cloud. No API keys.     │
└─────────────────────────────────────────────────────────────┘
```

---

What Is MCP?

The Model Context Protocol is an open standard created by Anthropic that defines how AI applications communicate with external tools and data sources. Think of it as USB-C for AI — a universal connector that lets any AI client talk to any tool server using a standardized protocol.

Before MCP, every integration was custom. Want your agent to access a database? Write a custom plugin. Want it to read files? Another plugin. Want memory? Build it yourself. Every AI client had its own plugin format, its own API surface, its own way of describing tools.

MCP standardizes all of this into one protocol:

```
┌─────────────────────────────────────────────────────────────┐
│                      THE MCP PROTOCOL                       │
│                                                             │
│   Client (AI App)              Server (Tool Provider)       │
│   ───────────────              ──────────────────────       │
│                                                             │
│   1. Discovery                                              │
│      Client ──→ "What tools do you have?"                   │
│      Server ←── List of tools + JSON schemas                │
│                                                             │
│   2. Invocation                                             │
│      Client ──→ "Call remember(content='...')"              │
│      Server ←── { success: true, id: 'abc-123' }            │
│                                                             │
│   3. Transport                                              │
│      stdio — local process, piped I/O (fastest)             │
│      SSE   — HTTP server-sent events (remote)               │
│                                                             │
│   No custom plugins. No proprietary APIs.                   │
│   One protocol, any client, any server.                     │
└─────────────────────────────────────────────────────────────┘
```

The key insight is that MCP servers are just processes that expose a set of tools via JSON-RPC. The AI model sees the tool descriptions, decides when to call them, and the MCP client handles the plumbing. The server can be written in any language — TypeScript, Python, Rust — and the client does not care.
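Concretely, the two phases above are just JSON-RPC 2.0 messages. A sketch of what a client sends, using MCP's standard tools/list and tools/call methods (the remember tool and its arguments here are illustrative, not a fixed schema):

```typescript
// The two JSON-RPC messages an MCP client sends. Method names
// ("tools/list", "tools/call") follow the MCP spec; the "remember"
// tool and its arguments are illustrative.

type JsonRpcRequest = {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
};

// 1. Discovery: ask the server what tools it exposes.
const discover: JsonRpcRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/list",
};

// 2. Invocation: call a tool by name with typed arguments.
const invoke: JsonRpcRequest = {
  jsonrpc: "2.0",
  id: 2,
  method: "tools/call",
  params: {
    name: "remember",
    arguments: { content: "We chose RocksDB for storage", tags: ["architecture"] },
  },
};
```

The transport (stdio or SSE) only changes how these bytes move; the message shapes stay the same.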

---

Why Memory via MCP? (vs REST, vs Embedded SDK)

There are three ways to add memory to an AI agent: embed a library directly, call a REST API, or use an MCP server. Each has tradeoffs.

```
┌───────────────────────────────────────────────────────────────┐
│                 MEMORY INTEGRATION APPROACHES                 │
│                                                               │
│   Approach       Setup    Portability    Agent Access   Best  │
│   ─────────      ─────    ───────────    ────────────   ────  │
│   Embedded SDK   Hard     One language   Direct call    Perf  │
│   REST API       Medium   Any language   HTTP client    Web   │
│   MCP Server     Easy     Any client     Native tools   AI    │
│                                                               │
│   KEY DIFFERENCE:                                             │
│   MCP tools appear in the model's tool list automatically.    │
│   The AI decides WHEN to remember and recall — you don't      │
│   write any glue code. REST requires you to manually wire     │
│   every call. SDK requires you to ship memory with your app.  │
└───────────────────────────────────────────────────────────────┘
```

MCP wins for AI agents because the model sees memory tools as first-class capabilities. When you configure an MCP memory server, the AI client discovers the tools at startup, includes their descriptions in the system prompt, and the model calls remember, recall, proactive_context whenever it deems appropriate. No prompt engineering. No manual API calls. The agent just has memory.

REST wins for custom applications where you control the orchestration logic and want fine-grained control over when and how memory is accessed. shodh-memory exposes 60+ REST endpoints for this use case.

Embedded SDK wins for performance-critical paths where you cannot tolerate any IPC overhead. shodh-memory's Python bindings (PyO3) give you direct access to the Rust engine.

---

Setting Up the shodh-memory MCP Server

Option 1: npx (Recommended)

The fastest way to start. No installation, no Docker, no configuration files.

```bash
npx @shodh/memory-mcp@latest
```

This downloads the platform-specific binary (Linux, macOS, Windows, ARM64), starts the Rust memory engine on port 3030, and launches the MCP server over stdio. Everything runs locally.

Option 2: Claude Desktop (claude_desktop_config.json)

Add shodh-memory to Claude Desktop by editing the MCP config file:

```json
{
  "mcpServers": {
    "shodh-memory": {
      "command": "npx",
      "args": ["-y", "@shodh/memory-mcp@latest"]
    }
  }
}
```

On macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

On Windows: %APPDATA%\Claude\claude_desktop_config.json

On Linux: ~/.config/Claude/claude_desktop_config.json

Restart Claude Desktop. The memory tools appear automatically.

Option 3: Cursor

In Cursor, open Settings, navigate to the MCP section, and add a new server:

```json
{
  "mcpServers": {
    "shodh-memory": {
      "command": "npx",
      "args": ["-y", "@shodh/memory-mcp@latest"]
    }
  }
}
```

Cursor discovers the 45 tools and makes them available to the AI in every session.

Option 4: Claude Code

Claude Code supports MCP servers via its settings. Add to your project's .mcp.json:

```json
{
  "mcpServers": {
    "shodh-memory": {
      "command": "npx",
      "args": ["-y", "@shodh/memory-mcp@latest"]
    }
  }
}
```

---

The 45 Tools: What They Do

shodh-memory exposes 45 MCP tools. The six categories below cover the core surface; every tool is available to the AI model as a callable function with typed parameters and structured responses.

Core Memory (6 tools)

```
┌────────────────────────────────────────────────────────────────┐
│                       CORE MEMORY TOOLS                        │
│                                                                │
│   Tool               Description                               │
│   ────────────────   ──────────────────────────────────        │
│   remember           Store a memory with type, tags, emotion   │
│   recall             Semantic search across all memories       │
│   recall_by_tags     Find memories by tag (exact match)        │
│   read_memory        Get full content of a specific memory     │
│   forget             Delete a memory by ID                     │
│   list_memories      List all stored memories (paginated)      │
└────────────────────────────────────────────────────────────────┘
```

The remember tool is the primary write path. It accepts content (the memory text), type (Observation, Decision, Learning, Error, Discovery, Pattern, Context, Task, and more), tags for categorization, emotional valence and arousal, importance override, and optional parent/preceding IDs for building memory chains.

The recall tool performs hybrid retrieval: semantic vector search, associative graph traversal, or a combined hybrid mode. It returns memories ranked by relevance with scores, timestamps, and metadata.
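As a sketch, the argument objects for these two tools might look like the following. Field names mirror the descriptions above, but the authoritative schema is whatever the server advertises at discovery time:

```typescript
// Illustrative argument objects for remember and recall.
// Field names are assumptions based on the tool descriptions;
// the real schema comes from the server's tools/list response.

const rememberArgs = {
  content: "Switched auth middleware to JWT after the session-store outage",
  type: "Decision",              // Observation, Decision, Learning, Error, ...
  tags: ["auth", "infrastructure"],
  importance: 0.9,               // optional importance override
};

const recallArgs = {
  query: "why did we move away from session storage?",
  mode: "hybrid",                // semantic vector search + graph traversal
  limit: 5,
};
```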

Proactive Context (1 tool)

```
┌────────────────────────────────────────────────────────────────┐
│                       PROACTIVE CONTEXT                        │
│                                                                │
│   Tool                 Description                             │
│   ──────────────────   ──────────────────────────────────      │
│   proactive_context    Auto-surface relevant memories for      │
│                        the current conversation turn.          │
│                        Call with EVERY user message.           │
│                        Also auto-ingests the context.          │
└────────────────────────────────────────────────────────────────┘
```

This is the most important tool. It takes the current conversation context, finds semantically relevant memories, and returns them — all in one call. It also automatically stores the context as a Conversation-type memory for future retrieval. The AI should call this first on every turn.
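A minimal sketch of that per-turn pattern, with a hypothetical buildTurnCalls helper standing in for your client's orchestration logic:

```typescript
// Sketch of the per-turn pattern: proactive_context runs first so
// relevant memories are in hand before the model answers. The
// buildTurnCalls helper is hypothetical glue, not part of the server.

type ToolCall = { name: string; arguments: Record<string, unknown> };

function buildTurnCalls(userMessage: string): ToolCall[] {
  return [
    // Always first: retrieves relevant memories AND auto-ingests this turn.
    { name: "proactive_context", arguments: { context: userMessage } },
  ];
}

const calls = buildTurnCalls("The payments webhook is timing out again");
```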

Todo System (10 tools)

```
┌────────────────────────────────────────────────────────────────┐
│                        TODO / GTD TOOLS                        │
│                                                                │
│   Tool                 Description                             │
│   ──────────────────   ──────────────────────────────────      │
│   add_todo             Create a task with priority, project    │
│   list_todos           List/search todos (semantic or filter)  │
│   update_todo          Change status, priority, content       │
│   complete_todo        Mark done (auto-creates next recurring) │
│   delete_todo          Remove permanently                      │
│   reorder_todo         Move up/down within status group        │
│   add_todo_comment     Add progress/resolution notes           │
│   list_todo_comments   View comment history                    │
│   list_subtasks        Get child tasks of a parent             │
│   todo_stats           Counts by status, overdue items         │
└────────────────────────────────────────────────────────────────┘
```

The todo system follows GTD methodology. Tasks have statuses (backlog, todo, in_progress, blocked, done, cancelled), priorities (urgent through none), contexts (@computer, @phone, @errands), due dates, projects, subtasks, and recurrence patterns. The AI agent can manage your task list across sessions without losing track.
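An illustrative add_todo payload exercising the GTD fields above (field names are assumptions; check the tool's JSON schema for the real ones):

```typescript
// Hypothetical add_todo arguments reflecting the GTD fields listed
// above. The field names are illustrative assumptions.

const addTodoArgs = {
  content: "Rotate the staging TLS certificates",
  status: "todo",          // backlog | todo | in_progress | blocked | done | cancelled
  priority: "urgent",      // urgent through none
  context: "@computer",    // GTD context: @computer, @phone, @errands
  project: "infrastructure",
  due: "2026-03-01",
  recurrence: "monthly",   // complete_todo auto-creates the next instance
};
```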

Projects (4 tools)

```
┌────────────────────────────────────────────────────────────────┐
│                         PROJECT TOOLS                          │
│                                                                │
│   Tool               Description                               │
│   ────────────────   ──────────────────────────────────        │
│   add_project        Create a project (optional parent)        │
│   list_projects      All projects with todo counts             │
│   archive_project    Hide project (can restore)                │
│   delete_project     Permanent removal (optional: delete       │
│                      todos too)                                │
└────────────────────────────────────────────────────────────────┘
```

Backup and System (8 tools)

```
┌────────────────────────────────────────────────────────────────┐
│                     BACKUP & SYSTEM TOOLS                      │
│                                                                │
│   Tool             Description                                 │
│   ──────────────   ──────────────────────────────────          │
│   backup_create    Full backup (memories, todos, graph)        │
│   backup_list      List available backups with metadata        │
│   backup_verify    SHA-256 integrity check                     │
│   backup_restore   Restore from backup ID                      │
│   backup_purge     Delete old backups (keep N)                 │
│   memory_stats     Total counts, index health                  │
│   verify_index     Check for orphaned vectors                  │
│   repair_index     Re-index orphaned memories                  │
└────────────────────────────────────────────────────────────────┘
```

Reminders and Advanced (6 tools)

```
┌────────────────────────────────────────────────────────────────┐
│                   REMINDERS & ADVANCED TOOLS                   │
│                                                                │
│   Tool                   Description                           │
│   ────────────────────   ──────────────────────────────────    │
│   set_reminder           Time, duration, or keyword trigger    │
│   list_reminders         View pending/triggered/dismissed      │
│   dismiss_reminder       Acknowledge a triggered reminder      │
│   context_summary        Condensed recent learnings/decisions  │
│   consolidation_report   Memory strengthening/decay events     │
│   token_status           Context window usage tracking         │
└────────────────────────────────────────────────────────────────┘
```

The reminder system supports three trigger types: time (fire at a specific ISO timestamp), duration (fire after N seconds), and context (fire when certain keywords appear in conversation). Context-triggered reminders use semantic similarity matching — you set keywords and a threshold, and the reminder surfaces when the conversation topic is close enough.
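The three trigger shapes, sketched as illustrative set_reminder payloads (the exact field names are assumptions, not the server's schema):

```typescript
// Illustrative payloads for the three reminder trigger types.
// Field names are assumptions for the sketch.

const timeReminder = {
  message: "Stand-up starts",
  trigger: { kind: "time", at: "2026-02-14T09:00:00Z" }, // ISO timestamp
};

const durationReminder = {
  message: "Check on the migration job",
  trigger: { kind: "duration", seconds: 1800 }, // fire in 30 minutes
};

const contextReminder = {
  message: "Mention the index rebuild before deploying",
  // Fires when conversation is semantically close to these keywords.
  trigger: { kind: "context", keywords: ["deploy", "release"], threshold: 0.75 },
};
```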

---

Advanced: Hooks Integration for Automatic Memory

The MCP tools above are reactive — the AI calls them when it decides to. But the best memory is automatic. You should not have to ask your agent to remember; it should happen as a side effect of working.

shodh-memory ships with Claude Code hooks that make memory fully automatic:

```
┌─────────────────────────────────────────────────────────────┐
│                AUTOMATIC MEMORY FLOW (HOOKS)                │
│                                                             │
│   User sends message                                        │
│          │                                                  │
│          ▼                                                  │
│   ┌─────────────────────┐                                   │
│   │  PreToolUse Hook    │──→ proactive_context(message)     │
│   │  (before each turn) │    surfaces relevant memories     │
│   └─────────────────────┘    auto-ingests conversation      │
│          │                                                  │
│          ▼                                                  │
│   Agent works normally                                      │
│   (reads files, edits code, runs commands)                  │
│          │                                                  │
│          ▼                                                  │
│   ┌─────────────────────┐                                   │
│   │  PostToolUse Hook   │──→ remember(tool actions)         │
│   │  (after each tool)  │    stores what the agent did      │
│   └─────────────────────┘                                   │
│          │                                                  │
│          ▼                                                  │
│   ┌─────────────────────┐                                   │
│   │  Session End Hook   │──→ context_summary()              │
│   │  (on exit)          │    consolidates session learnings │
│   └─────────────────────┘                                   │
│                                                             │
│   Result: Agent remembers everything without being asked.   │
└─────────────────────────────────────────────────────────────┘
```

To enable hooks in Claude Code, add the hook configuration to your project's .claude/settings.json:

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "*",
        "hook": "node hooks/memory-hook.ts pre"
      }
    ],
    "PostToolUse": [
      {
        "matcher": "*",
        "hook": "node hooks/memory-hook.ts post"
      }
    ]
  }
}
```

The hooks run as lightweight TypeScript scripts that call the shodh-memory REST API. They add zero latency to the agent's main workflow because they execute asynchronously with timeouts.
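A minimal sketch of what such a hook could look like, assuming a hypothetical /proactive_context REST endpoint on port 3030; the real hooks ship with shodh-memory and will differ in detail:

```typescript
// Sketch of a pre-turn hook posting the user message to the local
// shodh-memory REST API. The endpoint path and body shape are
// assumptions for illustration. Requires Node 18+ for global fetch.

async function preTurnHook(message: string): Promise<void> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), 500); // hard timeout
  try {
    await fetch("http://localhost:3030/proactive_context", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ context: message }),
      signal: controller.signal,
    });
  } catch {
    // Never block the agent's main workflow on memory failures.
  } finally {
    clearTimeout(timer);
  }
}
```

The key design point survives any schema differences: the hook swallows errors and enforces its own timeout, so memory failures degrade silently instead of stalling the agent.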

---

Comparison: MCP Memory Servers

There are several MCP-compatible memory systems in 2026. Here is how they compare:

```
┌──────────────────────────────────────────────────────────────────────────┐
│                       MCP MEMORY SERVER COMPARISON                       │
├────────────────────┬──────────────┬──────────────┬────────────┬──────────┤
│ Feature            │ shodh-memory │ mem0         │ Zep        │ Custom   │
│                    │              │              │            │ REST     │
├────────────────────┼──────────────┼──────────────┼────────────┼──────────┤
│ MCP Tools          │ 45           │ 4            │ 6          │ 0        │
│ Runs Locally       │ Yes          │ Cloud only   │ Cloud only │ Varies   │
│ Vector Search      │ Vamana/SPANN │ External     │ External   │ DIY      │
│ Knowledge Graph    │ Hebbian      │ No           │ Basic      │ DIY      │
│ Decay Model        │ Hybrid       │ No           │ No         │ DIY      │
│ Todo System        │ Full GTD     │ No           │ No         │ DIY      │
│ Reminders          │ 3 triggers   │ No           │ No         │ DIY      │
│ Backup/Restore     │ Full         │ N/A          │ N/A        │ DIY      │
│ Hooks (auto-mem)   │ Yes          │ No           │ No         │ DIY      │
│ Embeddings         │ Local ONNX   │ Cloud API    │ Cloud API  │ Varies   │
│ Privacy            │ 100% local   │ Data leaves  │ Data leaves│ Varies   │
│ Cost               │ Free (OSS)   │ Paid plans   │ Paid plans │ Dev time │
│ Language           │ Rust core    │ Python       │ Python     │ Any      │
│ Research Paper     │ Yes (DOI)    │ No           │ No         │ No       │
└────────────────────┴──────────────┴──────────────┴────────────┴──────────┘
```

shodh-memory is the only MCP memory server that runs entirely on your machine with no cloud dependency. It embeds its own vector search engine (Vamana for <100K memories, SPANN for >100K), generates embeddings locally via ONNX Runtime (MiniLM-L6-v2, 384 dimensions), and stores everything in RocksDB. No API keys, no data leaving your network, no monthly bill.

mem0 offers a clean API but requires cloud infrastructure. Your memories live on their servers. The MCP integration provides basic remember/recall but lacks the cognitive features — no decay, no knowledge graph, no spreading activation.

Zep focuses on conversation memory for chatbots. It extracts facts from conversations and stores them in a graph, but requires external infrastructure (PostgreSQL, embedding API) and is primarily designed for customer-facing applications rather than developer tools.

Custom REST means building your own memory system and wrapping it in an MCP server. This gives maximum control but requires significant engineering effort. You need to implement vector search, embedding generation, relevance scoring, decay, and the MCP protocol itself.
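The "implement vector search" part reduces, at its core, to comparing embedding vectors, typically by cosine similarity. A toy sketch (shodh-memory's MiniLM embeddings are 384-dimensional; three dimensions here for readability):

```typescript
// Cosine similarity over embedding vectors: the standard comparison
// behind semantic search. Real embeddings (e.g. MiniLM-L6-v2) have
// 384 dimensions; the toy vectors below have 3.

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const sim = cosineSimilarity([1, 0, 0], [1, 0, 0]); // identical vectors → 1
```

The hard part of DIY is not this formula but everything around it: building an index that stays fast at scale, generating embeddings locally, and layering decay and relevance scoring on top.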

---

The Cognitive Advantage

MCP gives your agent memory access. But access alone is not enough. What matters is the quality of retrieval — whether the right memories surface at the right time.

shodh-memory uses a multi-stage retrieval pipeline that goes far beyond simple vector similarity:

```
┌─────────────────────────────────────────────────────────────┐
│                 5-LAYER RETRIEVAL PIPELINE                  │
│                                                             │
│   Layer 0.7   Fact pre-fetch                                │
│               Query entities → graph → supporting facts     │
│                      │                                      │
│   Layer 1     Vector search (Vamana/SPANN)                  │
│               Semantic similarity on 384-dim embeddings     │
│                      │                                      │
│   Layer 2     Graph traversal (spreading activation)        │
│               Hebbian-weighted edges, 3-tier LTP            │
│                      │                                      │
│   Layer 3     Reciprocal Rank Fusion                        │
│               Merge vector + graph candidates               │
│                      │                                      │
│   Layer 4     Contextual re-ranking                         │
│               Recency, importance, emotional arousal        │
│                      │                                      │
│   Layer 4.8   Fact boost                                    │
│               High-confidence facts boost source memories   │
│                      │                                      │
│   Layer 5     Diversity filter                              │
│               Prevent redundant results                     │
│                      ▼                                      │
│               Final ranked results                          │
└─────────────────────────────────────────────────────────────┘
```

This pipeline means your agent does not just find memories that mention the right keywords. It finds memories that are semantically related, structurally connected via the knowledge graph, temporally relevant, and emotionally salient. A memory stored three months ago about a critical production bug will surface when you discuss the same system — even if you use completely different words.
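Layer 3's Reciprocal Rank Fusion is a standard technique worth seeing concretely: each candidate's fused score is the sum of 1/(k + rank) across the lists it appears in, so memories found by both vector search and graph traversal rise to the top. A sketch:

```typescript
// Reciprocal Rank Fusion: merge ranked candidate lists into one
// score per memory ID. Standard formula: score(d) = Σ 1/(k + rank(d))
// over the lists containing d, with k conventionally 60.

function rrf(lists: string[][], k = 60): Map<string, number> {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return scores;
}

const vectorHits = ["m1", "m2", "m3"]; // ranked by semantic similarity
const graphHits = ["m2", "m4"];        // ranked by spreading activation
const fused = rrf([vectorHits, graphHits]);
// m2 appears in both lists, so it outscores every single-list hit.
```

RRF needs no score calibration between the two retrieval paths, which is exactly why it works for merging vector similarities with graph activation values that live on different scales.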

---

Getting Started in 60 Seconds

```bash
# Install and run
npx @shodh/memory-mcp@latest
```

That is it. The server starts, discovers your platform, downloads the binary if needed, and begins accepting MCP connections. Add it to your AI client's MCP config, restart, and your agent has persistent memory.

No Docker. No database setup. No API keys. No cloud account. One command.

```
┌─────────────────────────────────────────────────────────────┐
│                       QUICK REFERENCE                       │
│                                                             │
│   Install:     npx @shodh/memory-mcp@latest                 │
│   API Port:    3030 (default)                               │
│   Storage:     ~/.shodh-memory/                             │
│   Binary:      ~30MB (platform-specific)                    │
│   Embeddings:  MiniLM-L6-v2 (local ONNX, 384-dim)           │
│   Latency:     <1ms write, 34-58ms semantic search          │
│   License:     Apache 2.0                                   │
│   Tests:       1089 passing                                 │
└─────────────────────────────────────────────────────────────┘
```

Your AI agent should not start every session from zero. MCP makes memory a first-class capability. shodh-memory makes it work.
