2026-03-19 · 10 min read

How to Make Your AI Agent Remember Between Sessions

tutorial · agentic-ai · architecture

Your AI agent is smart — for exactly one session. Then it forgets everything. Your preferences. The decisions you made. The bugs you already debugged together. Every new session is a cold start.

This isn't a minor annoyance. For coding agents working on the same codebase for weeks, research agents building understanding across papers, or workflow agents coordinating multi-step processes — statelessness is a dealbreaker.

Here's how to fix it. Not with chat history. Not with RAG. With actual persistent memory.

Why Chat History Isn't Memory

The most common "fix" for agent amnesia is appending every message to a growing list and sending it back as context on the next turn. This breaks in three ways:

**1. It doesn't scale.** After 50 sessions, you have 200K tokens of undifferentiated noise. There's no prioritization — "user prefers Rust" gets the same weight as "thanks, looks good."

**2. It doesn't learn.** The 100th time your agent retrieves a specific piece of knowledge, it has exactly the same retrieval confidence as the first time. There's no reinforcement, no strengthening of useful connections.

**3. It doesn't forget.** Everything is retained forever. That one-time discussion about a library you didn't end up using? Still there, consuming context budget, potentially misleading future retrievals.

Real memory is selective. It strengthens what matters and lets noise fade away.

The Three Layers of Agent Memory

Effective persistent memory has three components, each solving a different part of the problem:

Layer 1: Semantic Storage

Every piece of knowledge gets embedded into a vector and stored with metadata (timestamp, importance, access count, emotional valence). This enables similarity search — when the agent needs context, it can find semantically related memories.

But semantic storage alone is just a vector database. The next two layers are what make it memory.
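
The storage-plus-metadata shape described above can be sketched in a few lines of Python. This is a minimal illustration, not shodh-memory's implementation: the crc32 bag-of-words `embed` is a toy stand-in for a real embedding model, and the metadata fields simply mirror the ones listed above.

```
import math
import time
import zlib

def embed(text, dims=64):
    # Toy bag-of-words embedding: a stand-in for a real embedding model.
    # Each word hashes (crc32) into one of `dims` buckets; the vector is
    # normalized so dot products are cosine similarities.
    vec = [0.0] * dims
    for word in text.lower().split():
        vec[zlib.crc32(word.encode()) % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    # Vectors are unit-length, so the dot product is the cosine similarity
    return sum(x * y for x, y in zip(a, b))

class SemanticStore:
    def __init__(self):
        self.memories = []

    def remember(self, content, importance=0.5):
        self.memories.append({
            "content": content,
            "vector": embed(content),
            "timestamp": time.time(),
            "importance": importance,
            "access_count": 0,
        })

    def recall(self, query, limit=3):
        # Rank by similarity and bump access_count, so later layers
        # (decay, tier promotion) have a usage signal to work with
        ranked = sorted(self.memories,
                        key=lambda m: cosine(embed(query), m["vector"]),
                        reverse=True)
        for m in ranked[:limit]:
            m["access_count"] += 1
        return [m["content"] for m in ranked[:limit]]
```

Note that `recall` already records access counts — that metadata is exactly what the cognitive-dynamics layer below consumes.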

Layer 2: Knowledge Graph

Entities extracted from memories form a graph. When you discuss "authentication" and then "JWT tokens," the system creates an edge between those concepts. Over time, the graph captures the relationships between everything your agent has learned.

The graph enables spreading activation — when you query for "authentication," related concepts like "JWT," "session cookies," and "OAuth middleware" get a partial activation boost. The agent retrieves not just direct matches, but contextually related knowledge it wouldn't find with vector search alone.
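
Spreading activation itself is a short algorithm. Here is a hedged sketch over a plain adjacency dict — the damping factor and hop count are illustrative parameters, not shodh-memory's tuned values:

```
def spread_activation(graph, seeds, decay=0.5, hops=2):
    """graph: {concept: [neighbors]}; seeds: the directly queried concepts.

    Each hop passes a damped fraction of a node's activation to its
    neighbors, so indirectly related concepts get a partial boost."""
    activation = {node: 1.0 for node in seeds}
    frontier = dict(activation)
    for _ in range(hops):
        next_frontier = {}
        for node, energy in frontier.items():
            for neighbor in graph.get(node, []):
                boost = energy * decay
                # Keep only the strongest activation seen for a node
                if boost > activation.get(neighbor, 0.0):
                    next_frontier[neighbor] = boost
        for node, energy in next_frontier.items():
            activation[node] = max(energy, activation.get(node, 0.0))
        frontier = next_frontier
    return activation
```

With a graph where "authentication" links to "jwt" and "session cookies", and "jwt" links to "oauth middleware", seeding on "authentication" activates the direct neighbors at half strength and the two-hop neighbor at a quarter — exactly the partial-boost behavior described above.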

Layer 3: Cognitive Dynamics

This is the layer most systems skip entirely. It includes:

**Memory decay:** Memories that aren't accessed lose strength over time, following biologically plausible forgetting curves. Noise fades naturally without explicit garbage collection.
**Hebbian strengthening:** When two memories are accessed together, the connection between them strengthens. Frequently useful knowledge becomes strongly encoded.
**Tier promotion:** New memories start in working memory (volatile, fast). If accessed again, they promote to session memory, then to long-term storage. Not all memories are equal, and the system reflects that.
**Consolidation:** Background processes replay high-importance memories, strengthening their connections and promoting them through tiers — analogous to how sleep consolidates human memory.
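
The first three mechanisms fit in one small sketch. The half-life, promotion threshold, and Hebbian increment below are illustrative constants chosen for the example, not shodh-memory's actual parameters:

```
import time

class CognitiveMemory:
    """Toy sketch of decay, Hebbian strengthening, and tier promotion."""

    TIERS = ["working", "session", "long_term"]

    def __init__(self, half_life=3600.0):
        self.half_life = half_life  # seconds until strength halves
        self.items = {}             # id -> strength / tier / access metadata
        self.links = {}             # (id_a, id_b) -> Hebbian link weight

    def add(self, mem_id):
        self.items[mem_id] = {"strength": 1.0, "last_access": time.time(),
                              "tier": 0, "accesses": 0}

    def decayed_strength(self, mem_id, now=None):
        item = self.items[mem_id]
        elapsed = (now or time.time()) - item["last_access"]
        # Exponential forgetting curve: unused memories fade without
        # any explicit garbage collection
        return item["strength"] * 0.5 ** (elapsed / self.half_life)

    def access(self, *mem_ids):
        for mem_id in mem_ids:
            item = self.items[mem_id]
            # Reinforce on access: decayed strength plus a fresh boost
            item["strength"] = self.decayed_strength(mem_id) + 1.0
            item["last_access"] = time.time()
            item["accesses"] += 1
            # Tier promotion: repeated access moves a memory up one tier
            if item["accesses"] >= 2 and item["tier"] < 2:
                item["tier"] += 1
        # Hebbian rule: memories accessed together strengthen their link
        for a in mem_ids:
            for b in mem_ids:
                if a < b:
                    self.links[(a, b)] = self.links.get((a, b), 0.0) + 0.1
```

Consolidation would be a background loop over this same structure: replay high-importance items through `access` so their strength and tiers climb even without user queries.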

Implementation: Adding Persistent Memory to Your Agent

Option 1: MCP Server (Claude Code, Cursor, any MCP client)

If you're using an MCP-compatible tool, this is the fastest path:

```
# Install
npm install -g @shodh/memory-mcp

# Run
shodh-memory serve
```

Add to your MCP client configuration:

```
{
  "mcpServers": {
    "shodh-memory": {
      "command": "shodh-memory",
      "args": ["serve"]
    }
  }
}
```

The MCP server exposes 45 tools that the AI agent can call: `remember`, `recall`, `forget`, `add_todo`, `list_todos`, and many more. Memory happens automatically through hooks — when the agent uses tools, reads files, or has conversations, the system captures and stores relevant context.

Option 2: REST API (Any Language)

For custom integrations, shodh-memory exposes a REST API on port 3030:

```
# Store a memory
curl -X POST http://localhost:3030/api/remember \
  -H 'Content-Type: application/json' \
  -d '{
    "content": "User prefers Rust for systems code, TypeScript for web",
    "tags": ["preference", "languages"],
    "source_type": "user"
  }'

# Recall memories
curl 'http://localhost:3030/api/recall?query=language+preferences&limit=5'

# Get context for current conversation
curl -X POST http://localhost:3030/api/proactive_context \
  -H 'Content-Type: application/json' \
  -d '{"context": "setting up a new web project"}'
```

Option 3: Python Bindings

```
pip install shodh-memory
```
```
from shodh_memory import ShodhMemory

memory = ShodhMemory()

# Store
memory.remember("The auth service uses JWT with 24h expiry")

# Recall
results = memory.recall("authentication tokens")

# Proactive context
context = memory.proactive_context("debugging the login flow")
```

Option 4: Docker (Zero Install)

```
docker run -d -p 3030:3030 -v shodh-data:/data ghcr.io/varun29ankus/shodh-memory:latest
```

One command. Persistent volume. Full API available on localhost:3030.

What Changes After Adding Memory

**Day 1:** The agent stores your preferences, project structure, and initial decisions in working memory.

**Week 1:** Frequently accessed knowledge promotes to session memory. The knowledge graph develops clusters around your most discussed topics. When you ask about authentication, the agent automatically surfaces related memories about your session handling, JWT configuration, and that CORS issue you fixed on Tuesday.

**Month 1:** The agent has accumulated long-term knowledge about your codebase, your preferences, your team's conventions. Context surfaces before you ask for it. Memories you never revisited have naturally decayed. The knowledge graph has strong, well-tested connections between related concepts.

The difference is qualitative, not just quantitative. The agent goes from a smart tool that needs constant context-setting to a collaborator that knows your project.

The Key Insight

Making an AI agent remember between sessions isn't about storage. It's about cognition. Any database can store text. The hard part is deciding what to strengthen, what to let decay, how to connect related knowledge, and when to surface context automatically.

That's why shodh-memory is built on neuroscience research, not database research. The brain solved this problem hundreds of millions of years ago. We just translated the solution to code.
