How to Make Your AI Agent Remember Between Sessions
Your AI agent is smart — for exactly one session. Then it forgets everything. Your preferences. The decisions you made. The bugs you already debugged together. Every new session is a cold start.
This isn't a minor annoyance. For coding agents working on the same codebase for weeks, research agents building understanding across papers, or workflow agents coordinating multi-step processes — statelessness is a dealbreaker.
Here's how to fix it. Not with chat history. Not with RAG. With actual persistent memory.
Why Chat History Isn't Memory
The most common "fix" for agent amnesia is appending every message to a growing list and sending it back as context on the next turn. This breaks in three ways:
**1. It doesn't scale.** After 50 sessions, you have 200K tokens of undifferentiated noise. There's no prioritization — "user prefers Rust" gets the same weight as "thanks, looks good."
**2. It doesn't learn.** The 100th time your agent retrieves a specific piece of knowledge, it has exactly the same retrieval confidence as the first time. There's no reinforcement, no strengthening of useful connections.
**3. It doesn't forget.** Everything is retained forever. That one-time discussion about a library you didn't end up using? Still there, consuming context budget, potentially misleading future retrievals.
Real memory is selective. It strengthens what matters and lets noise fade away.
The Three Layers of Agent Memory
Effective persistent memory has three components, each solving a different part of the problem:
Layer 1: Semantic Storage
Every piece of knowledge gets embedded into a vector and stored with metadata (timestamp, importance, access count, emotional valence). This enables similarity search — when the agent needs context, it can find semantically related memories.
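In code, Layer 1 is little more than records carrying an embedding plus metadata, searched by cosine similarity. A minimal sketch — the names (`MemoryRecord`, `search`) are illustrative, not shodh-memory's actual schema, and the embeddings would come from whatever model you already use:

```python
from dataclasses import dataclass, field
from datetime import datetime
import math

@dataclass
class MemoryRecord:
    content: str
    embedding: list[float]            # produced by any embedding model
    created_at: datetime = field(default_factory=datetime.now)
    importance: float = 0.5
    access_count: int = 0

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def search(store: list[MemoryRecord], query_emb: list[float], k: int = 3):
    """Return the k records most semantically similar to the query."""
    ranked = sorted(store, key=lambda r: cosine(r.embedding, query_emb),
                    reverse=True)
    return ranked[:k]
```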
But semantic storage alone is just a vector database. The next two layers are what make it memory.
Layer 2: Knowledge Graph
Entities extracted from memories form a graph. When you discuss "authentication" and then "JWT tokens," the system creates an edge between those concepts. Over time, the graph captures the relationships between everything your agent has learned.
The graph enables spreading activation — when you query for "authentication," related concepts like "JWT," "session cookies," and "OAuth middleware" get a partial activation boost. The agent retrieves not just direct matches, but contextually related knowledge it wouldn't find with vector search alone.
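A toy version of spreading activation over a weighted adjacency list — the graph contents, edge weights, and decay constants below are illustrative assumptions, not shodh-memory's internals:

```python
from collections import defaultdict

# Toy knowledge graph: concept -> list of (neighbor, edge_weight)
graph = {
    "authentication": [("JWT", 0.9), ("session cookies", 0.7)],
    "JWT": [("OAuth middleware", 0.6)],
    "session cookies": [],
    "OAuth middleware": [],
}

def spread_activation(graph, seed, initial=1.0, decay=0.5, threshold=0.1):
    """Propagate activation outward from a seed concept.

    Each hop multiplies the energy by the edge weight and a global
    decay factor; propagation stops below the threshold.
    """
    activation = defaultdict(float)
    activation[seed] = initial
    frontier = [(seed, initial)]
    while frontier:
        node, energy = frontier.pop()
        for neighbor, weight in graph.get(node, []):
            boost = energy * weight * decay
            if boost > threshold and boost > activation[neighbor]:
                activation[neighbor] = boost
                frontier.append((neighbor, boost))
    return dict(activation)
```

Querying "authentication" activates "JWT" and "session cookies" directly, and "OAuth middleware" one hop further out at a lower strength — exactly the partial-activation boost described above.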
Layer 3: Cognitive Dynamics
This is the layer most systems skip entirely. It includes:

- **Reinforcement** — each retrieval strengthens a memory, raising its future retrieval confidence.
- **Decay** — memories that are never revisited gradually fade, instead of consuming context budget forever.
- **Promotion** — frequently accessed knowledge moves up through tiers, from working memory to session memory to long-term memory.
Implementation: Adding Persistent Memory to Your Agent
Option 1: MCP Server (Claude Code, Cursor, any MCP client)
If you're using an MCP-compatible tool, this is the fastest path:
```bash
# Install
npm install -g @shodh/memory-mcp

# Run
shodh-memory serve
```
Add to your MCP client configuration:
```json
{
  "mcpServers": {
    "shodh-memory": {
      "command": "shodh-memory",
      "args": ["serve"]
    }
  }
}
```
The MCP server exposes 45 tools that the AI agent can call: `remember`, `recall`, `forget`, `add_todo`, `list_todos`, and many more. Memory happens automatically through hooks — when the agent uses tools, reads files, or has conversations, the system captures and stores relevant context.
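Under the hood, an MCP client invokes these tools via JSON-RPC `tools/call` requests. A sketch of what a `remember` call might look like on the wire — the argument names here are assumptions modeled on the REST examples below, so check the server's published tool schema:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "remember",
    "arguments": {
      "content": "User prefers Rust for systems code",
      "tags": ["preference"]
    }
  }
}
```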
Option 2: REST API (Any Language)
For custom integrations, shodh-memory exposes a REST API on port 3030:
```bash
# Store a memory
curl -X POST http://localhost:3030/api/remember \
  -H 'Content-Type: application/json' \
  -d '{
    "content": "User prefers Rust for systems code, TypeScript for web",
    "tags": ["preference", "languages"],
    "source_type": "user"
  }'
```
```bash
# Recall memories
curl 'http://localhost:3030/api/recall?query=language+preferences&limit=5'

# Get context for the current conversation
curl -X POST http://localhost:3030/api/proactive_context \
  -H 'Content-Type: application/json' \
  -d '{"context": "setting up a new web project"}'
```
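If you'd rather not shell out to curl, the same endpoints wrap easily with Python's standard library. A minimal sketch — the paths come from the examples above, but the response shapes are assumptions:

```python
import json
import urllib.request
from urllib.parse import urlencode

BASE = "http://localhost:3030"

def build_recall_url(query: str, limit: int = 5) -> str:
    """URL for GET /api/recall, mirroring the curl example above."""
    return f"{BASE}/api/recall?" + urlencode({"query": query, "limit": limit})

def post(path: str, payload: dict) -> dict:
    """POST a JSON body to the memory server and decode the reply."""
    req = urllib.request.Request(
        BASE + path,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Usage (requires a running server):
# post("/api/remember", {"content": "Auth uses JWT", "tags": ["auth"]})
# post("/api/proactive_context", {"context": "setting up a new web project"})
```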
Option 3: Python Bindings
```bash
pip install shodh-memory
```

```python
from shodh_memory import ShodhMemory

memory = ShodhMemory()

# Store
memory.remember("The auth service uses JWT with 24h expiry")

# Recall
results = memory.recall("authentication tokens")

# Proactive context
context = memory.proactive_context("debugging the login flow")
```
Option 4: Docker (Zero Install)
```bash
docker run -d -p 3030:3030 -v shodh-data:/data ghcr.io/varun29ankus/shodh-memory:latest
```
One command. Persistent volume. Full API available on localhost:3030.
What Changes After Adding Memory
**Day 1:** The agent stores your preferences, project structure, and initial decisions in working memory.
**Week 1:** Frequently accessed knowledge promotes to session memory. The knowledge graph develops clusters around your most discussed topics. When you ask about authentication, the agent automatically surfaces related memories about your session handling, JWT configuration, and that CORS issue you fixed on Tuesday.
**Month 1:** The agent has accumulated long-term knowledge about your codebase, your preferences, your team's conventions. Context surfaces before you ask for it. Memories you never revisited have naturally decayed. The knowledge graph has strong, well-tested connections between related concepts.
The difference is qualitative, not just quantitative. The agent goes from a smart tool that needs constant context-setting to a collaborator that knows your project.
The Key Insight
Making an AI agent remember between sessions isn't about storage. It's about cognition. Any database can store text. The hard part is deciding what to strengthen, what to let decay, how to connect related knowledge, and when to surface context automatically.
That's why shodh-memory is built on neuroscience research, not database research. The brain solved this problem hundreds of millions of years ago. We just translated the solution to code.