RocksDB for AI Workloads: Lessons from Building a Memory Engine
When we started building shodh-memory, the storage question seemed straightforward. SQLite? PostgreSQL? Custom B-tree? We ended up with RocksDB, and the reasons are worth understanding.
Why Not SQLite
SQLite is excellent for structured relational data. But AI memory workloads are weird.
SQLite handles most of that weirdness adequately. The dealbreaker is access-pattern isolation: in SQLite, everything shares one B-tree namespace. In RocksDB, column families are independent LSM trees with separate compaction, separate bloom filters, and separate block caches.
When you're scanning the entity-episodes index (prefix scan, sequential reads), you don't want that competing with random-read point lookups on the embeddings column family. Column families give you this isolation.
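As a rough sketch with the rust-rocksdb crate (column family names from our schema, options left at defaults here, per-family tuning shown further down), opening the store looks like this:

use rocksdb::{ColumnFamilyDescriptor, DB, Options};

fn open_store(path: &str) -> Result<DB, rocksdb::Error> {
    let mut db_opts = Options::default();
    db_opts.create_if_missing(true);
    db_opts.create_missing_column_families(true);

    // Each descriptor becomes its own LSM tree: separate memtables,
    // separate SST files, separate compaction schedule.
    let cfs = ["memories", "embeddings", "entities", "entity_episodes"]
        .into_iter()
        .map(|name| ColumnFamilyDescriptor::new(name, Options::default()))
        .collect::<Vec<_>>();

    DB::open_cf_descriptors(&db_opts, path, cfs)
}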
Why Not PostgreSQL
Two words: single binary. shodh-memory ships as one 28MB executable. Adding a PostgreSQL dependency means your users need a database server. On a Raspberry Pi. In an air-gapped factory.
No.
The RocksDB Architecture
shodh-memory uses 12+ column families:
memories — Core memory records (MessagePack)
embeddings — 384-dim float32 vectors
entities — Knowledge graph nodes
edges — Knowledge graph relationships
entity_episodes — Entity-to-episode index
todos — GTD task records
projects — Todo project metadata
reminders — Prospective memory triggers
facts — Extracted factual assertions
files — File access records
feedback — Implicit feedback signals
audit — Operation audit log
Each column family has tuned options. Embeddings use larger block sizes (64KB) because reads are always full-vector. The entity_episodes index uses prefix bloom filters for fast prefix scans. The audit log uses FIFO compaction to auto-prune old entries.
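A sketch of that per-family tuning, again assuming the rust-rocksdb crate; the prefix length and the exact values are illustrative, not our production config:

use rocksdb::{BlockBasedOptions, ColumnFamilyDescriptor, DBCompactionStyle, Options, SliceTransform};

fn tuned_cf_descriptors() -> Vec<ColumnFamilyDescriptor> {
    // embeddings: reads are always full-vector, so use 64KB blocks.
    let mut embed_block = BlockBasedOptions::default();
    embed_block.set_block_size(64 * 1024);
    let mut embed_opts = Options::default();
    embed_opts.set_block_based_table_factory(&embed_block);

    // entity_episodes: keys are "{entity_uuid}:{episode_uuid}", so a fixed-length
    // prefix extractor plus memtable prefix blooms keeps prefix seeks cheap.
    let mut index_opts = Options::default();
    index_opts.set_prefix_extractor(SliceTransform::create_fixed_prefix(37)); // 36-char UUID + ':'
    index_opts.set_memtable_prefix_bloom_ratio(0.1);

    // audit: FIFO compaction drops the oldest SST files once the size cap is hit.
    let mut audit_opts = Options::default();
    audit_opts.set_compaction_style(DBCompactionStyle::Fifo);

    vec![
        ColumnFamilyDescriptor::new("embeddings", embed_opts),
        ColumnFamilyDescriptor::new("entity_episodes", index_opts),
        ColumnFamilyDescriptor::new("audit", audit_opts),
    ]
}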
MessagePack Over JSON
We serialize memory records with MessagePack instead of JSON.
For backward compatibility, we have a 4-level deserialization fallback: MessagePack → JSON (legacy) → bincode (historical) → raw bytes. Migrations happen lazily — records are upgraded when they're next written.
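The fallback chain itself is just a cascade of attempted decodes. A minimal sketch, assuming rmp-serde, serde_json, and bincode as the codec crates and a simplified record type; the real decoder may differ in detail:

use serde::{Deserialize, Serialize};

// Simplified stand-in for the real record type.
#[derive(Serialize, Deserialize)]
struct MemoryRecord {
    id: String,
    content: String,
}

enum Decoded {
    Record(MemoryRecord),
    Raw(Vec<u8>), // level 4: keep the bytes, decide later
}

fn decode(bytes: &[u8]) -> Decoded {
    if let Ok(rec) = rmp_serde::from_slice::<MemoryRecord>(bytes) {
        return Decoded::Record(rec); // level 1: current MessagePack format
    }
    if let Ok(rec) = serde_json::from_slice::<MemoryRecord>(bytes) {
        return Decoded::Record(rec); // level 2: legacy JSON
    }
    if let Ok(rec) = bincode::deserialize::<MemoryRecord>(bytes) {
        return Decoded::Record(rec); // level 3: historical bincode
    }
    Decoded::Raw(bytes.to_vec())
}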
Write-Ahead Logging
Every memory write goes through RocksDB's WAL before it's acknowledged. This means a power failure can't corrupt your memory database. On edge devices where power is unreliable (robots, IoT), this is non-negotiable.
We default to async writes (<1ms latency) for normal operations and sync writes (2-10ms) for critical paths like backup. The async mode doesn't skip the WAL — it just doesn't wait for the OS to flush to disk. In practice, you lose at most the last few milliseconds of writes on a hard crash.
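The split is just two WriteOptions profiles. A rough sketch with the rust-rocksdb crate; the keys are made up:

use rocksdb::{DB, WriteOptions};

fn write_paths(db: &DB) -> Result<(), rocksdb::Error> {
    // Normal path: the WAL is still written, we just don't fsync before acknowledging.
    let mut async_opts = WriteOptions::default();
    async_opts.set_sync(false);

    // Critical path (e.g. backup markers): fsync the WAL before returning.
    let mut sync_opts = WriteOptions::default();
    sync_opts.set_sync(true);

    db.put_opt(b"memory:123", b"...", &async_opts)?;
    db.put_opt(b"backup:marker", b"...", &sync_opts)?;
    Ok(())
}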
Prefix Iterators for Graph Traversal
The knowledge graph's entity-episode index is keyed as `{entity_uuid}:{episode_uuid}`. To find all episodes for an entity, we use RocksDB's prefix iterator:
// All episodes for an entity share the "{entity_uuid}:" key prefix,
// so a single seek lands at the start of that entity's range.
let prefix = format!("{entity_uuid}:");
let iter = db.prefix_iterator(prefix.as_bytes());
This is a seek + sequential scan, hitting only the relevant key range. With prefix bloom filters enabled, the seek is O(1) amortized. Compare this to a SQL query that would scan an index and then do random page reads for each row.
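Put together, a traversal looks roughly like the sketch below, scoped to the entity_episodes column family from the list above; the key parsing is illustrative and assumes the rust-rocksdb crate:

fn episodes_for_entity(db: &rocksdb::DB, entity_uuid: &str) -> Result<Vec<String>, rocksdb::Error> {
    let cf = db.cf_handle("entity_episodes").expect("column family exists");
    let prefix = format!("{entity_uuid}:");

    let mut episodes = Vec::new();
    for item in db.prefix_iterator_cf(cf, prefix.as_bytes()) {
        let (key, _value) = item?;
        // The iterator can run past the requested prefix once its range ends,
        // so check the prefix explicitly before trusting the key.
        if !key.starts_with(prefix.as_bytes()) {
            break;
        }
        // Everything after "{entity_uuid}:" is the episode UUID.
        episodes.push(String::from_utf8_lossy(&key[prefix.len()..]).into_owned());
    }
    Ok(episodes)
}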