2025-12-25 · 7 min read

# Why Vector Search Alone Isn't Enough for Agent Memory

architecture · vector-search · graphs


Vector databases are having a moment. But for AI agent memory, they're necessary—not sufficient.

## What Vectors Do Well

Semantic similarity. Given a query, find memories that mean similar things. This is genuinely useful:

```
Query: "How do I optimize database queries?"
Match: "Use EXPLAIN ANALYZE to find slow operations"
```

The query and match share no keywords but are semantically related. Vectors handle this beautifully.
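
Mechanically, this is nearest-neighbor search over embedding vectors. Here's a minimal sketch of the scoring step, assuming the embeddings are already computed; a real index like HNSW gets the same answer without comparing against every memory:

```rust
/// Cosine similarity between two embedding vectors.
/// Brute-force sketch of the scoring step only; an ANN index
/// (e.g. HNSW) avoids scanning every stored memory.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}
```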

## What Vectors Miss

### 1. Relationships

Vectors represent points in space. But memory is a graph. Consider:

"The user prefers Rust" (node A)
"The user is building a web server" (node B)
A → B implies: prefer Axum over Express

Vector search finds A and B independently. It doesn't understand that A should influence recommendations when B is active.
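
To make the gap concrete, here's the kind of edge-following step a graph layer adds. The types are invented for illustration, not Shodh-memory's actual schema:

```rust
/// Toy edge type, invented for this example.
struct Edge { from: u64, to: u64 }

/// Given a hit on node B ("building a web server"), walk its edges to
/// pull in node A ("prefers Rust"), so the preference can shape the
/// recommendation. Point-in-space similarity never makes this hop.
fn neighbors(edges: &[Edge], hit: u64) -> Vec<u64> {
    edges.iter()
        .filter(|e| e.from == hit || e.to == hit)
        .map(|e| if e.from == hit { e.to } else { e.from })
        .collect()
}
```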

### 2. Temporal Context

Vectors are timeless. But memory has sequence:

- 9:00am: User starts debugging auth flow
- 9:15am: User asks about JWT tokens
- 9:30am: User asks about token refresh

A pure vector search for "token" might surface unrelated token memories from weeks ago. Temporal context knows that recent auth-related memories are more relevant.
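
One standard way to encode that bias is exponential decay on memory age. The half-life below is an arbitrary illustration, not a tuned constant from Shodh-memory:

```rust
/// Exponential recency decay: the weight halves every `half_life_minutes`.
fn recency_weight(age_minutes: f64, half_life_minutes: f64) -> f64 {
    0.5_f64.powf(age_minutes / half_life_minutes)
}

// With a 60-minute half-life, the 15-minute-old JWT memory keeps ~0.84
// of its weight, while a weeks-old token memory decays to near zero.
```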

### 3. Importance

All vectors are born equal. But not all memories matter equally:

"User mentioned liking dark mode" (strength: 0.3)
"User's production system uses PostgreSQL" (strength: 0.9)

Vector similarity doesn't capture that the PostgreSQL fact is load-bearing knowledge while dark mode is a preference.
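
A straightforward fix is to fold the stored strength into the ranking score. The multiplicative blend below is one plausible choice, not necessarily the exact formula we use:

```rust
/// A retrieval hit carrying both raw similarity and stored importance.
struct ScoredHit { similarity: f32, strength: f32 }

/// One plausible blend: scale similarity by strength, so the
/// load-bearing PostgreSQL fact (0.9) outranks the dark-mode
/// preference (0.3) at comparable similarity.
fn importance_weighted(hit: &ScoredHit) -> f32 {
    hit.similarity * hit.strength
}
```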

## Our Hybrid Approach

Shodh-memory combines three systems:

```rust
pub struct MemoryCore {
    vectors: HnswIndex,      // Semantic similarity
    graph: KnowledgeGraph,   // Relationships
    temporal: TemporalIndex, // Time-based access
}
```

Retrieval fuses all three signals:

```rust
impl MemoryCore {
    fn retrieve(&self, query: &str) -> Vec<Memory> {
        let semantic = self.vectors.search(query, 20);
        let graph = self.graph.spread_activation(query);
        let temporal = self.temporal.recent_context();

        // Reciprocal rank fusion (RRF) with learned weights
        fuse_results(semantic, graph, temporal)
    }
}
```
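
For the curious, standard reciprocal rank fusion scores each candidate as a weighted sum of 1 / (k + rank) across the lists it appears in. Here's a sketch with placeholder weights; k = 60 is the usual default from the RRF literature, not our tuned value:

```rust
use std::collections::HashMap;

/// Reciprocal rank fusion over ranked ID lists, one per signal.
/// Each entry in `lists` pairs a signal weight with its ranked memory
/// IDs. The weights here are placeholders, not our learned values.
fn rrf_fuse(lists: &[(f64, Vec<u64>)], k: f64) -> Vec<u64> {
    let mut scores: HashMap<u64, f64> = HashMap::new();
    for (weight, ranked) in lists {
        for (rank, id) in ranked.iter().enumerate() {
            // Ranks are 1-based in the RRF formula.
            *scores.entry(*id).or_insert(0.0) += weight / (k + (rank + 1) as f64);
        }
    }
    let mut ids: Vec<u64> = scores.keys().copied().collect();
    ids.sort_by(|a, b| scores[b].total_cmp(&scores[a]));
    ids
}
```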

## Benchmarks

On our agent-task benchmark (coding assistant scenarios):

| Approach | Precision@5 | MRR |
|----------|-------------|-----|
| Vector only | 0.67 | 0.71 |
| Vector + Graph | 0.79 | 0.83 |
| Full hybrid | 0.86 | 0.89 |

The graph adds relationship context; the temporal index adds recency bias. Together they significantly outperform vectors alone.