
Why We Chose Rust for AI Infrastructure (And When You Shouldn't)

Every engineering decision is a tradeoff. Choosing Rust for shodh-memory was one of the most consequential decisions we made — and one of the most debated.

Here's the honest version.

The Case For Rust

Memory Safety Without Garbage Collection

AI memory systems are long-running processes that accumulate state over months. In a GC language, you eventually hit the GC pause lottery — a 200ms stop-the-world pause right when your agent needs sub-millisecond recall.

Rust's ownership model eliminates this entirely. No GC. No pauses. Deterministic memory behavior. When you're running on a Raspberry Pi with 1GB of RAM, this isn't academic; it's the difference between working and being OOM-killed.
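To make "deterministic" concrete, here's a toy sketch (hypothetical names, not shodh-memory code): every allocation has a visible owner, and memory is released the moment that owner goes out of scope, not whenever a collector gets around to it.

```rust
// Hypothetical sketch, not shodh-memory's actual types.
fn handle_recall() {
    // ~4 MB of embedding scratch space, allocated here...
    let scratch: Vec<f32> = vec![0.0; 1 << 20];
    let checksum: f32 = scratch.iter().sum();
    println!("scratch checksum: {checksum}");
} // ...and freed here, the instant `scratch` goes out of scope.

fn main() {
    handle_recall();
}
```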

Cross-Compilation to Everything

shodh-memory ships as a single binary for Linux, macOS, Windows, and ARM64. One `cargo build --target` command per platform. No runtime dependencies, no virtual environments, no "works on my machine."

```
$ ls -lh target/release/shodh-memory
-rwxr-xr-x 1 user user 28M Feb 1 12:00 shodh-memory
```

One binary. 28MB. Everything included.

Try shipping a Python AI system as a single binary to ARM. We'll wait.

Zero-Cost Abstractions

Our Vamana graph search uses generic iterators over neighbor lists. In Python, this would involve heap allocations per iteration. In Rust, the compiler monomorphizes generics and inlines iterator chains — the abstraction compiles away to raw pointer arithmetic.

This matters when you're doing graph traversal over 33,000 edges. The difference between 48 seconds and 800 milliseconds wasn't algorithmic; it came from eliminating per-operation overhead.
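The pattern looks roughly like this (a minimal sketch under assumed names; the real traversal is more involved):

```rust
/// Minimal sketch, not the actual Vamana traversal. `dist` is
/// monomorphized per call site, and the iterator chain inlines to a
/// plain loop over the slice: no per-iteration heap allocation,
/// no dynamic dispatch.
fn closest_neighbor(neighbors: &[u32], dist: impl Fn(u32) -> f32) -> Option<u32> {
    neighbors
        .iter()
        .copied()
        .min_by(|&a, &b| dist(a).total_cmp(&dist(b)))
}

fn main() {
    let neighbors = [3u32, 7, 42];
    // Toy distance function; the real one compares embedding vectors.
    println!("{:?}", closest_neighbor(&neighbors, |id| (id as f32 - 10.0).abs()));
}
```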

Fearless Concurrency

The memory system handles concurrent reads from multiple MCP clients while consolidation runs in the background. In most languages, this means careful mutex management and race condition debugging.

Rust's type system makes data races a compile error. Not a runtime bug. Not a "sometimes crashes in production." A compile error. The time we saved debugging concurrency issues paid for the Rust learning curve twice over.
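Here's the shape of it, condensed (illustrative types, not our real ones): concurrent readers and a background writer share state through `Arc<RwLock<_>>`. Remove the lock and mutate from two threads, and the program doesn't race; it doesn't compile.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

fn main() {
    // Shared state: readers and the consolidation thread both hold an Arc.
    let store = Arc::new(RwLock::new(Vec::<String>::new()));

    // Background "consolidation" writer.
    let writer = {
        let store = Arc::clone(&store);
        thread::spawn(move || {
            store.write().unwrap().push("consolidated memory".into());
        })
    };

    // Concurrent readers, as if from multiple MCP clients.
    let readers: Vec<_> = (0..4)
        .map(|_| {
            let store = Arc::clone(&store);
            thread::spawn(move || store.read().unwrap().len())
        })
        .collect();

    writer.join().unwrap();
    for reader in readers {
        println!("reader saw {} entries", reader.join().unwrap());
    }
}
```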

The Case Against Rust

Iteration Speed

Compile times matter. A clean build of shodh-memory takes 3-4 minutes. Incremental builds are faster (15-30 seconds), but compared to Python's "just run it," there's friction.

When we were prototyping the decay model, we'd tweak a constant and wait 20 seconds to see the result. In Python, that's instant. For research-phase work, this friction adds up.

Ecosystem Gaps

ONNX Runtime bindings for Rust (the `ort` crate) work, but they're still at the release-candidate stage; we depend on `ort 2.0.0-rc.11`. The Python ONNX ecosystem is years more mature.

Same for ML utilities. Need to tokenize text? In Python, `from transformers import AutoTokenizer`. In Rust, you're pulling in `tokenizers` and handling the raw API yourself.
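For contrast, here's roughly what the Rust side looks like with the `tokenizers` crate (a sketch from memory; treat the exact calls as an assumption and check the crate docs):

```rust
use tokenizers::Tokenizer;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load a tokenizer definition exported from the Python ecosystem
    // (e.g. a Hugging Face model's tokenizer.json).
    let tokenizer = Tokenizer::from_file("tokenizer.json")?;
    let encoding = tokenizer.encode("hello world", false)?;
    println!("{:?}", encoding.get_ids());
    Ok(())
}
```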

When You Shouldn't Use Rust

Prototyping and research: Use Python. Iterate fast, validate ideas, then port.
Web APIs that don't need edge deployment: Go or TypeScript will ship faster.
If your team doesn't know Rust: The learning curve is real. Budget 2-3 months to reach full productivity.

Our Verdict

For a memory system that needs to run on edge devices, handle concurrent access, and operate for months without restarts — Rust was the right choice. The binary size, performance, and reliability benefits are structural advantages that compound over time.

But we still write the MCP server in TypeScript and the prototype scripts in Python. Use the right tool for the job.