Live replay is in beta

Your agents should know what to do next.

Munin reads every past Claude Code and Codex session on your machine, learns what works for you, and produces next-step suggestions tied to your strategy.

Private beta signup

Early access invites go out in small batches. No spam. No sales sequence.

Tested in the agents we actually use

  • Claude Code
  • Codex CLI
Munin Brain

Stop rereading the wrong million tokens.

In a normal session, the agent can burn a 1M-token window rereading stale detours just to recover your goals, current task, and constraints. Often 95% of that context is irrelevant. Munin Brain puts the live ask, project state, and watchouts into one compact surface first.

$ munin brain --format prompt
<session_brain provider="codex" built_at="2026-04-15T02:48:55Z">
<agenda>
  - current ask: remove unsupported agent claims + show Munin Brain
</agenda>
<state>
  - verified: beta signup flow deployed to muninmemory.com
  - finding: live build source is .firecrawl/eag/builds/munin.sitesorted.co
</state>
<project>
  - active ask: tighten product claims to Claude Code + Codex only
  - codebase map: docs.html · features.html · index.html · pricing.html · resources.html · blog
</project>
<user_operating_model>
  - profile: execution-heavy workflow; compact state beats transcript replay
  - friction: front-load active ask + project capsule before wide search
</user_operating_model>
<guidance>
  - start from the current ask
  - avoid raw transcript dumps and stale cross-project noise
</guidance>
</session_brain>
$ munin nudge
<next_nudge>
priority:        P0
scope:           billing-retry
action:          Land retry-window fix before Thursday freeze
evidence:
  - 4 past sessions · retry window debate
  - 2 corrections · "window must match refund SLA"
  - 1 open ticket #482
confidence:      0.82
expected_effect: unblocks 3 downstream tickets
first_verify:    pnpm test -- src/billing/retry.spec.ts
</next_nudge>
$ munin recall "refund SLA"
3 matches across 6 months:

 2026-03-18 · feature/refund-window
  "refund SLA = 72h Pro, 24h Standard"
  session: claude-code · confirmed 2×

 2026-02-04 · docs/billing
  "see policy.md §3 for SLA table"
  session: codex-cli

 2026-01-22 · fix/retry-window
  "retry window MUST match refund SLA"
  session: claude-code · promoted to rule 0.96
$ munin friction --agent codex --last 30d
142 sessions analysed · 3 patterns detected

rewrites_bun_to_npm        prevent?
reorders_imports           prevent?
misses_idempotency_lock    promote to rule?

$ munin promote "use bun, not npm"
 rule active for codex
 precedent: 7 corrections
 confidence: 0.91

Default session replay

  • The agent keeps rereading giant windows just to reconstruct your goals, current task, and constraints.
  • A 1M-token context still includes retries, dead branches, stale prompts, and garbage from old turns.
  • Most of that history is not relevant to what you need right now.

Munin Brain

  • Surface the active ask, project capsule, operating model, and watchouts first.
  • Give the agent the compact state it needs before it fans out into the wider corpus.
  • Less rereading, less garbage, faster alignment on the work that matters.
Four pillars

Memory, actions, learning, organisation.

Most AI memory products store notes your agent may or may not read. Munin reads your whole session corpus, learns what works for you, and produces ranked actions backed by your evidence.

Full session corpus ingestion

Munin reads every prior Claude Code and Codex session across all projects, all time. It journals, deduplicates, and reconciles every command, correction, redirect, and verification into one shared working history.

Munin builds the memory from what already happened.

Proactive next-step nudges

Import your strategy document: goals, KPIs, initiatives, constraints. Munin compiles it into a deterministic kernel and joins it with your evidence corpus. Each nudge carries the task, why it matters now, supporting evidence, confidence, and expected effect.

The agent stops asking and starts proposing.
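As a minimal sketch, a nudge can be pictured as a small structured record carrying exactly the fields described above. The interface below is illustrative only; the field names are assumptions, not Munin's real schema.

```typescript
// Hypothetical shape of a Munin nudge (field names invented for the sketch).
interface Nudge {
  priority: "P0" | "P1" | "P2";
  scope: string;
  action: string;
  evidence: string[];
  confidence: number;      // 0..1
  expectedEffect: string;
  firstVerify: string;     // command to run before trusting the nudge
}

const nudge: Nudge = {
  priority: "P0",
  scope: "billing-retry",
  action: "Land retry-window fix before Thursday freeze",
  evidence: [
    "4 past sessions · retry window debate",
    '2 corrections · "window must match refund SLA"',
    "1 open ticket #482",
  ],
  confidence: 0.82,
  expectedEffect: "unblocks 3 downstream tickets",
  firstVerify: "pnpm test -- src/billing/retry.spec.ts",
};

console.log(`${nudge.priority} · ${nudge.scope} · ${nudge.confidence}`);
```

The point of the shape is that every proposal arrives with its own verification command, so the agent can check before it acts.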

Memory that promotes and forgets

Action candidates earn their way to rules. Munin ranks every candidate by precedent count, success count, and failure count across all sessions. Candidates promote to assertions and assertions promote to rules, observe-only by default. Stale memory ages out. Re-confirmed memory strengthens.

Freshness, stability, and confidence scores on every claim.
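The promotion ladder can be sketched as a simple gate: a claim climbs only when its evidence clears a threshold. The thresholds below are invented for illustration, not Munin's real tuning.

```typescript
// Illustrative promotion gate: candidate → assertion → rule.
// Threshold values are assumptions for the sketch, not Munin's real constants.
type Stage = "candidate" | "assertion" | "rule";

interface Claim {
  stage: Stage;
  precedents: number;   // sessions supporting the claim
  freshness: number;    // 0..1, decays with age
  confidence: number;   // 0..1
}

function promote(c: Claim): Stage {
  if (c.stage === "candidate" && c.precedents >= 3 && c.confidence >= 0.7) {
    return "assertion";
  }
  if (
    c.stage === "assertion" &&
    c.precedents >= 6 &&
    c.freshness >= 0.8 &&
    c.confidence >= 0.9
  ) {
    return "rule";
  }
  return c.stage; // no threshold met: stay observe-only
}

const postgresClaim: Claim = {
  stage: "assertion",
  precedents: 7,
  freshness: 0.94,
  confidence: 0.96,
};
console.log(promote(postgresClaim)); // "rule"
```

A claim that never accumulates precedent simply never promotes, which is what keeps the rule set small.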

Per-project workspace organisation

Messy repo, messy agent. Munin maintains a per-project kernel: open loops, live claims, checkpoints, current recommendation, active risks. Every session re-entry arrives with a first question to answer and a first verification to run.

The agent hits the right target because the target is explicit.

Inside the memory kernel

Every rule carries its evidence.

Candidates become assertions, assertions become rules. Nothing promotes without a precedent count, a freshness score, and a confidence number.

Rules

  • integration-tests · Use real Postgres, not mocks (Rule)
    evidence: 7 sessions · 3 corrections · 0 redirects · freshness 0.94 · stability 0.91 · confidence 0.96
  • billing-retry · Land retry-window fix before Thursday freeze (Nudge)
    evidence: 4 sessions · 2 corrections · 1 ticket · freshness 0.82 · confidence 0.82
  • stripe-webhook · Signature name invented 3×. Prevention active (Hook)
    evidence: 3 redirects across 2 months · freshness 0.88 · stability 0.77

Evidence

  • integration-tests · 7 sessions agreeing: "real Postgres beats mocks"
    first seen 2025-10-14 · last confirmed 2h ago · corrections 3
  • billing-retry · 4 sessions debating retry window vs. refund SLA
    open ticket #482 · precedent count 4 · confidence delta +0.18
  • stripe-webhook · Signature parity re-invented 3× across 2 months
    redirects 3 · time cost ~46 min · prevention hook active

Nudges

  • P0 · billing-retry · Land retry-window fix before Thursday freeze (Active)
    due Thu 16 Apr · unblocks 3 tickets · confidence 0.82
  • P1 · integration-tests · Promote assertion to rule. Evidence threshold met (Queued)
    precedent 7 sessions · stability 0.91 · auto-promote ready
  • P2 · docs/policy · Update policy.md §3 with new SLA table (Follow-up)
    source: nudge #1 · confidence 0.71 · ~10 min
Every claim, evidenced

A live timeline of what your agents know.

Every live claim carries freshness, stability, and confidence. Stale memory ages out. Re-confirmed memory strengthens. You never wonder if a rule is still true.

Live claims · sitesorted · 12 claims

  • integration-tests · Use real Postgres, not mocks · freshness 0.94 · stability 0.91 · rule
  • billing-retry · Land retry-window fix before Thursday freeze · confidence 0.82 · 4 sessions · nudge
  • stripe-webhook · Signature header name invented 3×. Prevention hook active · freshness 0.88 · 3 redirects · hook
  • codex-style · Enforce repo prettier config on import rewrites · confidence 0.77 · Codex only · directive
  • bun-runtime · Prefer bun over node for dev scripts · freshness 0.31 ↓ · last seen 42d ago · aging out
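Aging out can be modelled as exponential decay of freshness with time since last confirmation. The half-life below is an assumption chosen for the sketch; Munin's real decay curve is not documented here.

```typescript
// Illustrative freshness decay: exponential in days since last confirmation.
// The 21-day half-life is an assumption, not Munin's real constant.
const HALF_LIFE_DAYS = 21;

function freshness(daysSinceConfirmed: number): number {
  return Math.pow(0.5, daysSinceConfirmed / HALF_LIFE_DAYS);
}

// Re-confirmation resets the clock, so an actively used rule stays near 1.0,
// while a claim unconfirmed for 42 days has halved twice under this sketch.
console.log(freshness(0).toFixed(2));   // "1.00"
console.log(freshness(42).toFixed(2));  // "0.25"
```

Whatever the exact curve, the design point stands: a score that only decays between confirmations makes "is this rule still true?" a number rather than a guess.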
From memory to action

Memories become things to do.

Every promoted claim generates at least one of five action types.

  1. Observation

    You've been pushing on billing retry reliability for two weeks. Thursday is the merge freeze. Strategy says Q2 priority is "reduce failed-checkout spillover."

    Action: next strategic task

    Land the retry-window fix before Thursday. It's the last open loop tied to your highest-priority metric. Evidence: 4 sessions, 2 corrections, 1 ticket. Confidence 0.82.

  2. Observation

    Three times across two months, the agent invented a Stripe webhook signature header that doesn't exist. Each time you corrected it by reading the real payload.

    Action: prevention hook

    Add a pre-edit hook that blocks any Stripe webhook handler write until stripe listen payload is cited in the diff. Silent on everything else.

  3. Observation

    Every time you ship a migration you run the same five verification steps. You've redirected the agent through them twelve times.

    Action: reusable skill

    Promote "pre-migration verification" to a skill. Five steps, ordered, with the exact commands. Invokable by name; auto-suggested when a migration file is staged.

  4. Observation

    Seven sessions agree: in this repo, integration tests must hit a real Postgres, not a mock. You've corrected mock-first approaches every time.

    Action: guidance rule

    Promote to a rule: "integration tests use real Postgres." Attached evidence, freshness 0.94. Loaded into context on any test-file edit.

  5. Observation

    Codex keeps rewriting your TypeScript in the house style it prefers. Claude already follows your preferences. Two agents, one repo, different outputs.

    Action: per-agent directive

    Tell Codex specifically: "follow repo prettier config, do not restructure imports." Claude stays clean because it does not need the extra directive.
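A prevention hook like the one in action 2 could, as a hedged sketch, be a pre-edit check that blocks a Stripe-handler diff unless it cites a captured payload. The function names and the matching scheme below are invented for illustration.

```typescript
// Hypothetical pre-edit prevention hook (names invented for the sketch):
// block a write to a Stripe webhook handler unless the diff cites evidence
// from a real `stripe listen` payload capture. Silent on every other path.
interface PendingEdit {
  path: string;
  diff: string;
}

function allowEdit(edit: PendingEdit): { allowed: boolean; reason: string } {
  const touchesStripeHandler = /webhook.*stripe|stripe.*webhook/i.test(edit.path);
  if (!touchesStripeHandler) {
    return { allowed: true, reason: "not a Stripe webhook path; hook is silent" };
  }
  const citesPayload = edit.diff.includes("stripe listen");
  return citesPayload
    ? { allowed: true, reason: "diff cites a captured payload" }
    : { allowed: false, reason: "cite a `stripe listen` payload before editing the handler" };
}

console.log(allowEdit({ path: "src/webhook/stripe.ts", diff: "+ invented-header" }).allowed); // false
```

The hook fires only on the path pattern it was promoted for, which is what keeps prevention from becoming noise.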

Proof-gated continuity

No silent regressions. No trust-me-bro memory.

Resume and handoff briefs stay active while the latest replay/eval against a held-out corpus verifies. If proof drops, Munin falls back to a deterministic packet path.

  • 99.5% token reduction on vitest run · 12k sessions compressed
  • 90% token reduction on cargo test · 4k sessions compressed
  • 87% token reduction on next build · 8k sessions compressed

Token reductions are a side effect, not the headline. The compressed-read surface exists so the agent can read more and decide better. Not so you save dollars on a model bill.

Replay-verified

Every resume carries a signed proof packet.

Resume briefs stay active while the latest replay against a held-out corpus verifies. If proof drops, Munin falls back to a deterministic path.

$ munin prove --last-resume
[13:42:04] Replaying 12 live claims against held-out corpus...
[13:42:06] Reconciled 847 prior sessions across 2 active agents.
[13:42:08] Freshness drift: <0.02   Stability drift: <0.01
[13:42:09] Proof packet: ✓ 12/12 claims verified
[13:42:09] Signed: 2026-04-14T13:42:09Z

resume brief accepted — all continuity checks passed.
Two tested surfaces

Tell Claude and Codex only what they need.

Friction reports identify repeated corrections, redirect statistics, and behaviour-change recommendations per agent. Munin keeps shared context in one place while still letting Claude and Codex receive different guidance when they need it.

Claude Code

Full continuity brief on session start. Prevention hooks for things you've corrected more than twice. Skills promoted from repeated workflows.

Codex CLI

House-style directives where Codex diverges from your repo. Silence where it doesn't. Hooks re-expressed in the format Codex honours.

Munin Brain

Munin Brain exposes the active ask, project capsule, user operating model, and guidance hints before the agent starts rereading the wider corpus.

One surface, two tested agents

The same rules reach every file in Claude and Codex.

Open a file and Munin loads only the rules that apply to it. Claude and Codex both read from the same kernel, while Codex-specific directives stay Codex-specific.

sitesorted
  • src/billing/
      • retry.ts
  • src/webhook/
      • stripe.ts
  • .munin/
      • rules.yaml
      • journal/
src/billing/retry.ts
// retry window must match refund SLA (rule · 0.96 conf)
export async function scheduleRetry(evt: RefundEvent) {
  const window = retryWindowFor(evt.reason);
  await idempotencyLock(evt.id);
  return queue.push(evt, { delay: window });
}
Rules active here
  • Idempotency lock first: every retry path must hold the ID lock before enqueue. Precedent: 6 sessions.
  • Retry window = refund SLA: Thursday freeze · confidence 0.82 · nudge active.
  • Codex only, follow prettier: house-style directive. Claude stays clean.
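Loading "only the rules that apply" can be sketched as scope matching against the open file's path. The rule names follow the examples on this page, but the prefix-matching scheme and the per-agent field are assumptions made for the sketch.

```typescript
// Illustrative per-file rule loading: each rule declares a path scope, and
// opening a file loads only the rules whose scope matches. Prefix matching
// and the optional per-agent tag are assumptions, not Munin's real format.
interface ScopedRule {
  name: string;
  scope: string;       // path prefix the rule applies to ("" = repo-wide)
  agent?: "codex";     // present only on per-agent directives
}

const rules: ScopedRule[] = [
  { name: "idempotency-lock-first", scope: "src/billing/" },
  { name: "retry-window-equals-refund-sla", scope: "src/billing/" },
  { name: "cite-stripe-listen-payload", scope: "src/webhook/" },
  { name: "follow-repo-prettier", scope: "", agent: "codex" },
];

function rulesFor(path: string, agent: "claude" | "codex"): string[] {
  return rules
    .filter((r) => path.startsWith(r.scope))
    .filter((r) => !r.agent || r.agent === agent)
    .map((r) => r.name);
}

console.log(rulesFor("src/billing/retry.ts", "codex"));
// ["idempotency-lock-first", "retry-window-equals-refund-sla", "follow-repo-prettier"]
```

The same call for Claude drops the prettier directive, which is how one kernel serves two agents without leaking Codex-only guidance.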
Beta

Join the beta before the public release.

Munin is not publicly available yet. Leave your email and we’ll send early access invites, rollout notes, and the first stable build when it’s ready.

We’ll only use this for Munin beta access and rollout updates.

Private list · Email updates only · Access invites in batches
FAQ

The usual questions.

Is Munin a SaaS?

No. It's a local Rust binary that reads your sessions on disk and writes memory to your machine.

Does it send my code anywhere?

No. There's no outbound network call. Local-only, no account, no telemetry.

How is this different from Claude Code's built-in memory?

Claude's built-in memory is notes you write. Munin builds the memory itself from your session corpus and produces ranked actions tied to your strategy.

What stops the memory from going stale?

Every promoted claim carries freshness, stability, and confidence. Replay/eval proof-gates the cutover. If evidence drops, the claim falls back.

How does Munin suggest next steps?

You import a strategy document. Munin joins it with your evidence and produces bounded nudges with evidence, confidence, and expected effect.

Does it work with Claude and Codex at once?

Yes. That is the point: one memory surface across the Claude Code and Codex sessions you're already running.

What happens if Munin crashes mid-session?

The last good brief is in .munin/journal as plain text. Open it, paste it into your agent, keep going. Munin is a local Rust binary. No server, no recovery flow.

Private beta

Get the first invite.

Join the beta list and we’ll email you when Munin is ready for early access.