Local-first memory for coding agents in 2026

Three of the five serious agent memory tools send your code context to a cloud. The two that keep everything on your disk are the ones worth running in 2026, and they don't overlap, so a solo dev can run both.

I write code alone. I don't want transcripts of my half-finished product sitting in someone else's Postgres. That makes the memory layer for my coding agents a harder call than most write-ups admit, because the obvious tools all assume you're fine shipping your corpus off the machine.

Here's how the landscape sorts in 2026, and what I run now.

What "local-first" means here

The model still runs in a datacenter. Claude, GPT, Gemini: all remote. So "local-first" isn't about the LLM. It's about the layer around it.

Your memory layer stores three things that matter: the corpus of past sessions, the extracted claims or facts, and the index used to recall them. Local-first means those three live on your disk, in your filesystem, readable with your own tools. The agent still phones home for inference. Your history doesn't.
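Concretely, all three artifacts can live in a single SQLite file you can open with any client. A minimal sketch of the idea, using stdlib Python; the schema and the substring search are invented for illustration and don't match any particular tool's format:

```python
import sqlite3

# One local database holds corpus + claims. A real tool would open a file
# like ~/.munin/memory.db; :memory: just keeps this sketch self-contained.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE sessions (id INTEGER PRIMARY KEY, transcript TEXT);
    CREATE TABLE claims   (id INTEGER PRIMARY KEY, fact TEXT);
""")

# Corpus: the raw session text.
db.execute("INSERT INTO sessions (transcript) VALUES (?)",
           ("User: don't use em dashes. Agent: noted.",))
# Extracted claim: the durable fact distilled from the session.
db.execute("INSERT INTO claims (fact) VALUES (?)", ("style: no em dashes",))
db.commit()

# Recall: a plain substring match stands in for FTS or vector search.
hits = db.execute("SELECT fact FROM claims WHERE fact LIKE ?",
                  ("%em dashes%",)).fetchall()
print(hits[0][0])  # style: no em dashes
```

The point isn't the schema; it's that every row here is greppable, deletable, and backed up by whatever already backs up your disk.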

That distinction is the whole argument. If you're okay with a vendor holding a searchable archive of every coding session you've run, the cloud options are fine. If that sentence made you flinch, keep reading.

Why the cloud tools still dominate

I want to be fair here. Zep Cloud, mem0 hosted, and the Letta API won on real merits.

Deployment is one command. You paste an API key, you get memory. No daemon to babysit, no SQLite file to back up, no Rust binary to update. For a team, the cloud versions give you shared memory across devices, a web UI for inspecting what the agent remembers, role-based access, and in Zep's case SOC 2 and HIPAA posture you can hand to a procurement team.

If you're a startup with five engineers and a compliance officer, the cloud tools are the right call. They solve problems I don't have.

What you get by going local

For a solo dev, the math flips. Going local gives you five things the SaaS versions can't:

  • Ownership. The memory file is yours. Grep it, delete it, git it, zip it.
  • No account. No signup, no billing, no deprovisioning when a vendor gets acquired.
  • No usage caps. You're not rate-limited on how much you remember.
  • Air-gap capable. If your code is under NDA, or you're on a plane, or the vendor is having a bad API day, memory still works.
  • No vendor dependency. The tool vanishing doesn't delete your history.

That last point is the one I think about most. I've watched enough dev tools get acquired and shut down to stop trusting that a hosted memory service will exist in three years.

The local options today

There are four tools in the conversation. Two are genuinely local. Two are sold as local but lean on a cloud API by default.

Genuinely local

claude-mem is a Bun daemon that runs on port 37777, with SQLite for storage, Chroma for vector search, and FTS5 for keyword lookup. AGPL-3.0. It hooks into Claude Code and the Gemini CLI, captures transcripts, and serves them back as context on your next session. Everything stays on disk. Scope is narrow by design: it's for the two CLIs it names, and it does that well.
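The FTS5 half of that hybrid is easy to picture. Here's a toy version of keyword recall over stored transcripts; the table name and schema are invented for illustration, and claude-mem's actual tables will differ:

```python
import sqlite3

db = sqlite3.connect(":memory:")  # claude-mem persists this to a file on disk

# FTS5 virtual table: SQLite builds and maintains the keyword index itself.
db.execute("CREATE VIRTUAL TABLE transcripts USING fts5(body)")
db.executemany("INSERT INTO transcripts (body) VALUES (?)", [
    ("Refactored the auth middleware to drop the legacy session check.",),
    ("Fixed the flaky websocket reconnect test.",),
])

# MATCH runs a ranked full-text query over that index.
row = db.execute(
    "SELECT body FROM transcripts WHERE transcripts MATCH ? ORDER BY rank",
    ("websocket",)).fetchone()
print(row[0])
```

Pairing an index like this with Chroma for semantic similarity is a sensible split: FTS5 catches exact identifiers and error strings, vectors catch "that thing about reconnect logic."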

Munin (this site) is a Rust binary that reads ~/.claude/, ~/.codex/, and the other agent session directories directly. Four commands: resume, nudge, promote, friction. The pipeline is observation → claim → rule: repeated corrections across agents get surfaced as friction you can actually fix. It's pre-customer, so expect rough edges, but the architecture is the thing: your data never leaves ~/.

Local in theory, cloud in practice

mem0 OSS installs from pip or npm and stores locally, which sounds great until you read the defaults: fact extraction calls OpenAI. So every memory write hits an external API even when the database is on your laptop. You can swap the extractor for a local model, but the out-of-the-box experience is not what the README implies.
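If you do want mem0 OSS fully offline, the extractor and embedder are swappable in its config. A rough sketch of the shape, assuming an Ollama server running locally; the keys follow mem0's documented config layout but have shifted between releases, so verify against the version you install:

```python
# Hypothetical config sketch, not a drop-in file: check mem0's docs
# for the exact keys and model names your installed version accepts.
local_config = {
    "llm": {
        "provider": "ollama",                  # fact extraction stays on-box
        "config": {"model": "llama3.1:8b"},
    },
    "embedder": {
        "provider": "ollama",                  # embeddings stay on-box too
        "config": {"model": "nomic-embed-text"},
    },
    # The vector store already defaults to a local backend.
}

# Then, roughly:
#   from mem0 import Memory
#   memory = Memory.from_config(local_config)
```

With both the llm and embedder sections pointed at local providers, memory writes no longer touch an external API; leave either at its default and you're back to the OpenAI-on-every-write behavior.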

Letta (formerly MemGPT) is self-hostable. You'll need Postgres running, plus the Letta server, plus whatever model backend you point it at. It works, but the dependency footprint is heavy for a solo setup. The Letta Code CLI is the interesting piece if you already run Postgres for something else.

The trade-offs you're making

Local-first isn't free. Here's what you give up:

  • No team view. Your memory is yours. If your co-founder wants the same context, they run their own instance and it's a separate corpus.
  • No cross-device sync. Memory on the desktop doesn't follow you to the laptop unless you sync the directory yourself with Syncthing, rsync, or a git repo.
  • No managed backups. If your disk dies and you didn't back up ~/.claude/ or the tool's data directory, the history is gone.
  • You're the ops team. Updates, schema migrations, daemon restarts: your job.
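That ops burden is smaller than it sounds. A cron-able backup of the relevant dot-directories is a few lines of stdlib Python; the paths below are examples, so substitute whatever your tools actually write:

```python
import shutil
import tempfile
from pathlib import Path

def backup_memory(data_dirs, dest_dir):
    """Archive each existing memory directory into dest_dir as a .tar.gz."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archives = []
    for d in map(Path, data_dirs):
        if d.is_dir():  # silently skip tools you don't run
            archives.append(shutil.make_archive(
                str(dest / d.name.lstrip(".")), "gztar", root_dir=d))
    return archives

# Demo with a throwaway dir; in real use, point at ~/.claude/ and friends.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / ".claude"
    src.mkdir()
    (src / "session.jsonl").write_text('{"role": "user"}\n')
    made = backup_memory([src], Path(tmp) / "backups")
    print(len(made))  # 1
```

Point it at ~/.claude/ and your tool's data directory, ship the archives wherever your other backups go, and the "disk dies" scenario stops being a data-loss story.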

For me these are acceptable. I already back up my home directory. I don't have a team. I'd rather pay with ten minutes of setup than with a running subscription and a permanent copy of my work on someone else's server.

What I'd install

Concrete advice, based on where you spend your time:

If you live in Claude Code and that's 90% of your agent work, install claude-mem. It's the tightest fit for that workflow and the setup is a single bun install away. You'll get session continuity in the tool you already use, with zero data leaving your machine.

If you jump between Claude Code and Codex, and you're tired of correcting the same mistake in both tools, keep an eye on Munin. The cross-agent angle is the point: one source of truth for "I already told you not to use em dashes" that works across the CLIs you actually use.

If you want both, run both. They don't fight each other. claude-mem handles in-session recall for Claude Code specifically. Munin handles cross-agent patterns and corrections. Different layers of the same problem.

Pick one tonight. If you're evaluating Munin, join the beta list first and wait for the public crate release. You'll still want usable memory living in a directory you control, not on someone else's server.

Try Munin.

A local Rust binary. Four commands. No SaaS, no account, no telemetry.