# How to Add Real Memory to a LangChain Agent (Beyond ConversationBufferMemory)
LangChain is the most popular framework for building AI agents. But its built-in memory system — `ConversationBufferMemory`, `ConversationSummaryMemory`, `VectorStoreRetrieverMemory` — solves the wrong problem.
These are chat history managers, not memory systems. They store what was said. They don't learn from it.
Here's how to give your LangChain agent actual persistent memory that strengthens with use, decays naturally, and forms associative networks — while keeping your existing LangChain code intact.
## What's Wrong with LangChain's Built-in Memory

### ConversationBufferMemory
Appends every message to a list. Sends the full list back as context on each turn. After 20 conversations, you're burning 50K+ tokens on undifferentiated chat history. No prioritization, no decay, no learning.
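The growth is easy to quantify with a toy simulation (no LangChain required; the 125-tokens-per-message figure is an assumption for illustration). Because the whole buffer is replayed every turn, per-turn cost grows linearly and cumulative cost grows quadratically:

```python
# Toy model of ConversationBufferMemory: every turn re-sends the full history.
def buffer_tokens_per_turn(turns: int, tokens_per_message: int = 125) -> list[int]:
    """Context tokens sent on each turn when the whole buffer is replayed."""
    history = 0
    sent = []
    for _ in range(turns):
        history += 2 * tokens_per_message  # one user + one assistant message
        sent.append(history)
    return sent

costs = buffer_tokens_per_turn(20)
print(costs[0], costs[-1], sum(costs))  # per-turn cost grows linearly; total is quadratic
```

With these assumed message sizes, 20 turns already accumulate roughly 50K tokens of replayed history in total, which is where the figure above comes from.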
### ConversationSummaryMemory
Uses the LLM to summarize the conversation history. Better than the raw buffer, but you're paying LLM inference to compress what should be a data engineering problem. And the summary loses specifics — the exact error message, the precise config value, the particular file path.
### VectorStoreRetrieverMemory
Stores messages in a vector database and retrieves by similarity. Closer to real memory, but still missing the cognitive layer: no decay (old irrelevant memories persist forever), no learning (retrieved memories don't strengthen connections), no knowledge graph (no associative retrieval).
### The Pattern
All three treat memory as a storage problem. Real memory is a learning problem.
## Architecture: LangChain + shodh-memory
The integration is straightforward: shodh-memory runs as a sidecar server, and your LangChain agent communicates with it via REST API. You keep your existing chain/agent structure and add memory as a tool or as context injection.
```
┌────────────────────┐       ┌──────────────────────┐
│  LangChain Agent   │──────▶│  shodh-memory API    │
│                    │◀──────│  localhost:3030      │
│  - Tools           │       │                      │
│  - Chains          │       │  - Vector index      │
│  - Prompts         │       │  - Knowledge graph   │
│                    │       │  - Hebbian learning  │
│                    │       │  - Decay engine      │
└────────────────────┘       └──────────────────────┘
```
## Step 1: Start shodh-memory

### Via npm

```bash
npm install -g @shodh/memory-mcp
shodh-memory serve
```

### Or via Docker

```bash
docker run -d -p 3030:3030 -v shodh-data:/data ghcr.io/varun29ankus/shodh-memory:latest
```
## Step 2: Create a Memory Tool
Give your LangChain agent tools to interact with shodh-memory:
```python
import requests
from langchain.tools import tool

SHODH_URL = "http://localhost:3030"

@tool
def remember(content: str, tags: list[str] | None = None) -> str:
    """Store an important observation, decision, or learning for future recall."""
    payload = {"content": content}
    if tags:
        payload["tags"] = tags
    r = requests.post(f"{SHODH_URL}/api/remember", json=payload)
    return "Stored in memory" if r.ok else f"Failed: {r.text}"

@tool
def recall(query: str, limit: int = 5) -> str:
    """Search memories for context relevant to the current task."""
    r = requests.get(f"{SHODH_URL}/api/recall",
                     params={"query": query, "limit": limit})
    if not r.ok:
        return "No memories found"
    memories = r.json().get("memories", [])
    return "\n".join(m["content"] for m in memories) or "No relevant memories"

@tool
def get_context(current_task: str) -> str:
    """Get proactive context based on the current task. Call this at the start of each session."""
    r = requests.post(f"{SHODH_URL}/api/proactive_context",
                      json={"context": current_task})
    if not r.ok:
        return "No context available"
    memories = r.json().get("memories", [])
    return "\n".join(m["content"] for m in memories) or "No relevant context"
```
## Step 3: Add Memory to Your Agent
```python
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_openai import ChatOpenAI
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder

llm = ChatOpenAI(model="gpt-4")
tools = [remember, recall, get_context, *your_other_tools]

prompt = ChatPromptTemplate.from_messages([
    ("system", """You are an AI assistant with persistent memory.

At the start of each session, call get_context with a description of the current task.
When you learn something important, call remember to store it.
When you need past context, call recall with a relevant query.
Your memory strengthens with use — frequently recalled knowledge becomes easier to find."""),
    MessagesPlaceholder(variable_name="chat_history"),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])

agent = create_openai_tools_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools)
```
## Step 4: Automatic Memory via Post-Processing
For automatic memory capture (without relying on the agent to call `remember`), add a post-processing step:
```python
def run_with_memory(executor, user_input: str, session_context: str = ""):
    # Inject proactive context
    context = requests.post(f"{SHODH_URL}/api/proactive_context",
                            json={"context": user_input}).json()
    relevant_memories = context.get("memories", [])
    memory_context = "\n".join(m["content"] for m in relevant_memories)

    augmented_input = user_input
    if memory_context:
        augmented_input = f"[Relevant memory context]\n{memory_context}\n\n{user_input}"

    # Run agent
    result = executor.invoke({"input": augmented_input, "chat_history": []})

    # Auto-store the interaction
    requests.post(f"{SHODH_URL}/api/remember", json={
        "content": f"User asked: {user_input}\nAgent responded: {result['output'][:500]}",
        "source_type": "ai_generated",
        "tags": ["conversation", "auto-captured"]
    })
    return result
```
## What You Get That LangChain's Memory Doesn't Provide

### Hebbian Learning
When the agent recalls a memory and it turns out to be useful (you keep working in that direction), the memory strengthens. The next time a related question comes up, that memory surfaces with higher confidence. LangChain's memory treats every retrieval identically.
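The mechanic can be sketched in a few lines. This is an illustrative model, not shodh-memory's actual internals; the learning rate and update rule are assumptions. Each useful recall nudges a memory's weight toward 1.0, so frequently used memories rank higher on future retrievals:

```python
# Hebbian-style strengthening (illustrative): weight moves toward 1.0 on each
# useful recall, with diminishing returns as it saturates.
def strengthen(weight: float, learning_rate: float = 0.1) -> float:
    return weight + learning_rate * (1.0 - weight)

w = 0.5
for _ in range(5):  # five useful recalls
    w = strengthen(w)
print(round(w, 3))  # weight has climbed noticeably but remains below 1.0
```

The saturating update matters: a memory can keep getting stronger, but its weight never exceeds 1.0, so one heavily used memory cannot drown out everything else.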
### Automatic Decay
That one-off discussion about a library you didn't end up using? It fades sharply over the first 3 days (exponential decay), then more gradually over the following weeks (power-law decay). No manual cleanup. LangChain's memory keeps everything forever.
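To make the two-phase shape concrete, here is a sketch of such a decay curve. The half-life, crossover point, and power-law exponent are assumptions for illustration, not shodh-memory's real parameters:

```python
import math

# Two-phase retention curve (illustrative constants): exponential decay for the
# first 3 days, then a slower power-law tail, joined continuously at day 3.
def retention(days: float, half_life: float = 1.5, alpha: float = 0.5) -> float:
    if days <= 3:
        return math.exp(-math.log(2) * days / half_life)   # exponential phase
    at_switch = math.exp(-math.log(2) * 3 / half_life)     # value at day 3
    return at_switch * (days / 3) ** -alpha                # power-law phase

print(round(retention(1), 3), round(retention(3), 3), round(retention(30), 3))
```

An unused memory loses most of its strength in the first few days, but the power-law tail means it never drops to exactly zero, so a strong later cue can still resurface it.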
### Knowledge Graph
Entities are extracted from memories and connected in a graph. When you query "authentication," spreading activation surfaces related memories about JWT tokens, session handling, and that CORS fix — connections that vector similarity alone wouldn't find.
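Spreading activation itself is simple to sketch. The toy graph, entity names, decay factor, and threshold below are all assumptions for illustration; activation starts at the queried node and fans out, weakening with each hop until it falls below a cutoff:

```python
# Minimal spreading activation over a toy knowledge graph (illustrative).
graph = {
    "authentication": ["jwt", "session-handling", "cors-fix"],
    "jwt": ["token-refresh"],
    "session-handling": [],
    "cors-fix": [],
    "token-refresh": [],
}

def activate(start: str, decay: float = 0.5, threshold: float = 0.1) -> dict[str, float]:
    activation = {start: 1.0}
    frontier = [start]
    while frontier:
        node = frontier.pop()
        spread = activation[node] * decay  # activation weakens at each hop
        if spread < threshold:
            continue
        for neighbor in graph.get(node, []):
            if spread > activation.get(neighbor, 0.0):
                activation[neighbor] = spread
                frontier.append(neighbor)
    return activation

print(activate("authentication"))
```

Note that "token-refresh" gets activated even though it shares no vocabulary with "authentication"; it is reached through the graph edge via "jwt", which is exactly the kind of connection pure vector similarity misses.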
### Cross-Session Persistence
Kill your Python process. Restart it tomorrow. Your agent's memories are still there. shodh-memory stores everything in RocksDB on disk. LangChain's memory is in-process and disappears when the process exits (unless you manually configure external storage).
### Multi-Agent Sharing
Multiple agents can share the same shodh-memory instance. A planning agent's decisions are accessible to an execution agent. A research agent's findings are available to a writing agent. LangChain's memory is per-chain.
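The sharing pattern in miniature: two agents read and write one memory store. Here a plain in-process class stands in for the shodh-memory server (the class and its naive substring search are illustrative stand-ins); in production, both agents would simply point their SHODH_URL at the same host and port:

```python
# In-process stand-in for a shared memory service (illustrative only).
class SharedMemory:
    def __init__(self):
        self.items: list[dict] = []

    def remember(self, content: str, agent: str) -> None:
        self.items.append({"content": content, "agent": agent})

    def recall(self, query: str) -> list[str]:
        # Stand-in for semantic search: naive substring match.
        return [m["content"] for m in self.items if query in m["content"]]

memory = SharedMemory()
memory.remember("Decision: use Postgres for the job queue", agent="planner")
results = memory.recall("Postgres")  # the executor agent sees the planner's decision
print(results)
```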
## Migration Path
You don't have to rip out your existing LangChain memory. The recommended path:
1. **Start shodh-memory alongside your existing setup.** Run as a sidecar. Add the tools to your agent.
2. **Let both systems run in parallel.** LangChain's memory handles in-session context. shodh-memory handles cross-session persistence.
3. **Remove LangChain memory when confident.** Once shodh-memory has accumulated enough context (typically 1-2 weeks), you can simplify your chain by removing `ConversationBufferMemory` entirely. shodh-memory's proactive context handles what chat history used to do, but better.
The result: a LangChain agent that actually remembers what it learned, strengthens useful knowledge, forgets noise, and surfaces context before you ask for it.