Live replay is in beta

Your agents should know what to do next.

Munin reads every past Claude Code and Codex session on your machine, learns what works for you, and produces next-step suggestions tied to your strategy.

Private beta signup

Early access invites go out in small batches. No spam. No sales sequence.

Tested in the agents we actually use

  • Claude Code
  • Codex CLI
Munin Brain

Stop rereading the wrong million tokens.

In a normal session, the agent can burn a 1M-token window rereading stale detours just to recover your goals, current task, and constraints. Often 95% of that context is irrelevant. Munin Brain puts the live ask, project state, and watchouts into one compact surface first.

$ munin brain --format prompt
<session_brain provider="codex" built_at="2026-04-15T02:48:55Z">
<agenda>
  - current ask: remove unsupported agent claims + show Munin Brain
</agenda>
<state>
  - verified: beta signup flow deployed to muninmemory.com
  - finding: live build source is .firecrawl/eag/builds/munin.sitesorted.co
</state>
<project>
  - active ask: tighten product claims to Claude Code + Codex only
  - codebase map: docs.html · features.html · index.html · pricing.html · resources.html · blog
</project>
<user_operating_model>
  - profile: execution-heavy workflow; compact state beats transcript replay
  - friction: front-load active ask + project capsule before wide search
</user_operating_model>
<guidance>
  - start from the current ask
  - avoid raw transcript dumps and stale cross-project noise
</guidance>
</session_brain>
$ munin nudge
<next_nudge>
priority:        P0
scope:           billing-retry
action:          Land retry-window fix before Thursday freeze
evidence:
  - 4 past sessions · retry window debate
  - 2 corrections · "window must match refund SLA"
  - 1 open ticket #482
confidence:      0.82
expected_effect: unblocks 3 downstream tickets
first_verify:    pnpm test -- src/billing/retry.spec.ts
</next_nudge>
$ munin recall "refund SLA"
3 matches across 6 months:

 2026-03-18 · feature/refund-window
  "refund SLA = 72h Pro, 24h Standard"
  session: claude-code · confirmed 2×

 2026-02-04 · docs/billing
  "see policy.md §3 for SLA table"
  session: codex-cli

 2026-01-22 · fix/retry-window
  "retry window MUST match refund SLA"
  session: claude-code · promoted to rule 0.96
$ munin friction --agent codex --last 30d
142 sessions analysed · 3 patterns detected

rewrites_bun_to_npm        prevent?
reorders_imports           prevent?
misses_idempotency_lock    promote to rule?

$ munin promote "use bun, not npm"
 rule active for codex
 precedent: 7 corrections
 confidence: 0.91

Default session replay

  • The agent keeps rereading giant windows just to reconstruct your goals, current task, and constraints.
  • A 1M-token context still includes retries, dead branches, stale prompts, and garbage from old turns.
  • Most of that history is not relevant to what you need right now.

Munin Brain

  • Surface the active ask, project capsule, operating model, and watchouts first.
  • Give the agent the compact state it needs before it fans out into the wider corpus.
  • Less rereading, less garbage, faster alignment on the work that matters.
Four pillars

Memory, actions, learning, organisation.

Most AI memory products store notes your agent may or may not read. Munin reads your whole session corpus, learns what works for you, and produces ranked actions backed by your evidence.

Full session corpus ingestion

Munin reads every prior Claude Code and Codex session across all projects, all time. It journals, deduplicates, and reconciles every command, correction, redirect, and verification into one shared working history.

Munin builds the memory from what already happened.

Proactive next-step nudges

Import your strategy document: goals, KPIs, initiatives, constraints. Munin compiles it into a deterministic kernel and joins it with your evidence corpus. Each nudge carries the task, why it matters now, supporting evidence, confidence, and expected effect.

The agent stops asking and starts proposing.
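As a minimal sketch, a nudge can be pictured as a small structured record carrying exactly the fields described above. The interface below is illustrative only; the field names are assumptions, not Munin's real schema.

```typescript
// Hypothetical shape of a Munin nudge (field names invented for the sketch).
interface Nudge {
  priority: "P0" | "P1" | "P2";
  scope: string;
  action: string;
  evidence: string[];
  confidence: number;      // 0..1
  expectedEffect: string;
  firstVerify: string;     // command to run before trusting the nudge
}

const nudge: Nudge = {
  priority: "P0",
  scope: "billing-retry",
  action: "Land retry-window fix before Thursday freeze",
  evidence: [
    "4 past sessions · retry window debate",
    '2 corrections · "window must match refund SLA"',
    "1 open ticket #482",
  ],
  confidence: 0.82,
  expectedEffect: "unblocks 3 downstream tickets",
  firstVerify: "pnpm test -- src/billing/retry.spec.ts",
};

console.log(`${nudge.priority} · ${nudge.scope} · ${nudge.confidence}`);
```

The point of the shape is that every proposal arrives with its own verification command, so the agent can check before it acts.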

Memory that promotes and forgets

Action candidates earn their way to rules. Munin ranks every candidate by precedent count, success count, and failure count across all sessions. Candidates promote to assertions and assertions promote to rules, observe-only by default. Stale memory ages out. Re-confirmed memory strengthens.

Freshness, stability, and confidence scores on every claim.
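The promotion ladder can be sketched as a simple gate: a claim climbs only when its evidence clears a threshold. The thresholds below are invented for illustration, not Munin's real tuning.

```typescript
// Illustrative promotion gate: candidate → assertion → rule.
// Threshold values are assumptions for the sketch, not Munin's real constants.
type Stage = "candidate" | "assertion" | "rule";

interface Claim {
  stage: Stage;
  precedents: number;   // sessions supporting the claim
  freshness: number;    // 0..1, decays with age
  confidence: number;   // 0..1
}

function promote(c: Claim): Stage {
  if (c.stage === "candidate" && c.precedents >= 3 && c.confidence >= 0.7) {
    return "assertion";
  }
  if (
    c.stage === "assertion" &&
    c.precedents >= 6 &&
    c.freshness >= 0.8 &&
    c.confidence >= 0.9
  ) {
    return "rule";
  }
  return c.stage; // no threshold met: stay observe-only
}

const postgresClaim: Claim = {
  stage: "assertion",
  precedents: 7,
  freshness: 0.94,
  confidence: 0.96,
};
console.log(promote(postgresClaim)); // "rule"
```

A claim that never accumulates precedent simply never promotes, which is what keeps the rule set small.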

Per-project workspace organisation

Messy repo, messy agent. Munin maintains a per-project kernel: open loops, live claims, checkpoints, current recommendation, active risks. Every session re-entry arrives with a first question to answer and a first verification to run.

The agent hits the right target because the target is explicit.

Inside the memory kernel

Every rule carries its evidence.

Candidates become assertions, assertions become rules. Nothing promotes without a precedent count, a freshness score, and a confidence number.

Rules

  • integration-tests · Use real Postgres, not mocks (Rule)
    evidence: 7 sessions · 3 corrections · 0 redirects · freshness 0.94 · stability 0.91 · confidence 0.96
  • billing-retry · Land retry-window fix before Thursday freeze (Nudge)
    evidence: 4 sessions · 2 corrections · 1 ticket · freshness 0.82 · confidence 0.82
  • stripe-webhook · Signature name invented 3×. Prevention active (Hook)
    evidence: 3 redirects across 2 months · freshness 0.88 · stability 0.77

Evidence

  • integration-tests · 7 sessions agreeing: "real Postgres beats mocks"
    first seen 2025-10-14 · last confirmed 2h ago · corrections 3
  • billing-retry · 4 sessions debating retry window vs. refund SLA
    open ticket #482 · precedent count 4 · confidence delta +0.18
  • stripe-webhook · Signature parity re-invented 3× across 2 months
    redirects 3 · time cost ~46 min · prevention hook active

Nudges

  • P0 · billing-retry · Land retry-window fix before Thursday freeze (Active)
    due Thu 16 Apr · unblocks 3 tickets · confidence 0.82
  • P1 · integration-tests · Promote assertion to rule. Evidence threshold met (Queued)
    precedent 7 sessions · stability 0.91 · auto-promote ready
  • P2 · docs/policy · Update policy.md §3 with new SLA table (Follow-up)
    source: nudge #1 · confidence 0.71 · ~10 min
Every claim, evidenced

A live timeline of what your agents know.

Every live claim carries freshness, stability, and confidence. Stale memory ages out. Re-confirmed memory strengthens. You never wonder if a rule is still true.

Live claims · sitesorted · 12 claims

  • integration-tests · Use real Postgres, not mocks · freshness 0.94 · stability 0.91 · rule
  • billing-retry · Land retry-window fix before Thursday freeze · confidence 0.82 · 4 sessions · nudge
  • stripe-webhook · Signature header name invented 3×. Prevention hook active · freshness 0.88 · 3 redirects · hook
  • codex-style · Enforce repo prettier config on import rewrites · confidence 0.77 · Codex only · directive
  • bun-runtime · Prefer bun over node for dev scripts · freshness 0.31 ↓ · last seen 42d ago · aging out
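Aging out can be modelled as exponential decay of freshness with time since last confirmation. The half-life below is an assumption chosen for the sketch; Munin's real decay curve is not documented here.

```typescript
// Illustrative freshness decay: exponential in days since last confirmation.
// The 21-day half-life is an assumption, not Munin's real constant.
const HALF_LIFE_DAYS = 21;

function freshness(daysSinceConfirmed: number): number {
  return Math.pow(0.5, daysSinceConfirmed / HALF_LIFE_DAYS);
}

// Re-confirmation resets the clock, so an actively used rule stays near 1.0,
// while a claim unconfirmed for 42 days has halved twice under this sketch.
console.log(freshness(0).toFixed(2));   // "1.00"
console.log(freshness(42).toFixed(2));  // "0.25"
```

Whatever the exact curve, the design point stands: a score that only decays between confirmations makes "is this rule still true?" a number rather than a guess.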
From memory to action

Memories become things to do.

Every promoted claim generates at least one of five action types.

  1. Observation

    You've been pushing on billing retry reliability for two weeks. Thursday is the merge freeze. Strategy says Q2 priority is "reduce failed-checkout spillover."

    Action: next strategic task

    Land the retry-window fix before Thursday. It's the last open loop tied to your highest-priority metric. Evidence: 4 sessions, 2 corrections, 1 ticket. Confidence 0.82.

  2. Observation

    Three times across two months, the agent invented a Stripe webhook signature header that doesn't exist. Each time you corrected it by reading the real payload.

    Action: prevention hook

    Add a pre-edit hook that blocks any Stripe webhook handler write until stripe listen payload is cited in the diff. Silent on everything else.

  3. Observation

    Every time you ship a migration you run the same five verification steps. You've redirected the agent through them twelve times.

    Action: reusable skill

    Promote "pre-migration verification" to a skill. Five steps, ordered, with the exact commands. Invokable by name; auto-suggested when a migration file is staged.

  4. Observation

    Seven sessions agree: in this repo, integration tests must hit a real Postgres, not a mock. You've corrected mock-first approaches every time.

    Action: guidance rule

    Promote to a rule: "integration tests use real Postgres." Attached evidence, freshness 0.94. Loaded into context on any test-file edit.

  5. Observation

    Codex keeps rewriting your TypeScript in the house style it prefers. Claude already follows your preferences. Two agents, one repo, different outputs.

    Action: per-agent directive

    Tell Codex specifically: "follow repo prettier config, do not restructure imports." Claude stays clean because it does not need the extra directive.
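A prevention hook like the one in action 2 could, as a hedged sketch, be a pre-edit check that blocks a Stripe-handler diff unless it cites a captured payload. The function names and the matching scheme below are invented for illustration.

```typescript
// Hypothetical pre-edit prevention hook (names invented for the sketch):
// block a write to a Stripe webhook handler unless the diff cites evidence
// from a real `stripe listen` payload capture. Silent on every other path.
interface PendingEdit {
  path: string;
  diff: string;
}

function allowEdit(edit: PendingEdit): { allowed: boolean; reason: string } {
  const touchesStripeHandler = /webhook.*stripe|stripe.*webhook/i.test(edit.path);
  if (!touchesStripeHandler) {
    return { allowed: true, reason: "not a Stripe webhook path; hook is silent" };
  }
  const citesPayload = edit.diff.includes("stripe listen");
  return citesPayload
    ? { allowed: true, reason: "diff cites a captured payload" }
    : { allowed: false, reason: "cite a `stripe listen` payload before editing the handler" };
}

console.log(allowEdit({ path: "src/webhook/stripe.ts", diff: "+ invented-header" }).allowed); // false
```

The hook fires only on the path pattern it was promoted for, which is what keeps prevention from becoming noise.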

Proof-gated continuity

No silent regressions. No trust-me-bro memory.

Resume and handoff briefs stay active while the latest replay/eval against a held-out corpus verifies. If proof drops, Munin falls back to a deterministic packet path.

  • 99.5% token reduction on vitest run · 12k sessions compressed
  • 90% token reduction on cargo test · 4k sessions compressed
  • 87% token reduction on next build · 8k sessions compressed

Token reductions are a side effect, not the headline. The compressed-read surface exists so the agent can read more and decide better. Not so you save dollars on a model bill.

Replay-verified

Every resume carries a signed proof packet.

Resume briefs stay active while the latest replay against a held-out corpus verifies. If proof drops, Munin falls back to a deterministic path.

$ munin prove --last-resume
[13:42:04] Replaying 12 live claims against held-out corpus...
[13:42:06] Reconciled 847 prior sessions across 2 active agents.
[13:42:08] Freshness drift: <0.02   Stability drift: <0.01
[13:42:09] Proof packet: ✓ 12/12 claims verified
[13:42:09] Signed: 2026-04-14T13:42:09Z

resume brief accepted — all continuity checks passed.
Two tested surfaces

Tell Claude and Codex only what they need.

Friction reports identify repeated corrections, redirect statistics, and behaviour-change recommendations per agent. Munin keeps shared context in one place while still letting Claude and Codex receive different guidance when they need it.

Claude Code

Full continuity brief on session start. Prevention hooks for things you've corrected more than twice. Skills promoted from repeated workflows.

Codex CLI

House-style directives where Codex diverges from your repo. Silence where it doesn't. Hooks re-expressed in the format Codex honours.

Munin Brain

Munin Brain exposes the active ask, project capsule, user operating model, and guidance hints before the agent starts rereading the wider corpus.

One surface, two tested agents

The same rules reach every file in Claude and Codex.

Open a file and Munin loads only the rules that apply to it. Claude and Codex both read from the same kernel, while Codex-specific directives stay Codex-specific.

sitesorted
  • src/billing/
      • retry.ts
  • src/webhook/
      • stripe.ts
  • .munin/
      • rules.yaml
      • journal/
src/billing/retry.ts
// retry window must match refund SLA (rule · 0.96 conf)
export async function scheduleRetry(evt: RefundEvent) {
  const window = retryWindowFor(evt.reason);
  await idempotencyLock(evt.id);
  return queue.push(evt, { delay: window });
}
Rules active here
  • Idempotency lock first: every retry path must hold the ID lock before enqueue. Precedent: 6 sessions.
  • Retry window = refund SLA: Thursday freeze · confidence 0.82 · nudge active.
  • Codex only, follow prettier: house-style directive. Claude stays clean.
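Loading "only the rules that apply" can be sketched as scope matching against the open file's path. The rule names follow the examples on this page, but the prefix-matching scheme and the per-agent field are assumptions made for the sketch.

```typescript
// Illustrative per-file rule loading: each rule declares a path scope, and
// opening a file loads only the rules whose scope matches. Prefix matching
// and the optional per-agent tag are assumptions, not Munin's real format.
interface ScopedRule {
  name: string;
  scope: string;       // path prefix the rule applies to ("" = repo-wide)
  agent?: "codex";     // present only on per-agent directives
}

const rules: ScopedRule[] = [
  { name: "idempotency-lock-first", scope: "src/billing/" },
  { name: "retry-window-equals-refund-sla", scope: "src/billing/" },
  { name: "cite-stripe-listen-payload", scope: "src/webhook/" },
  { name: "follow-repo-prettier", scope: "", agent: "codex" },
];

function rulesFor(path: string, agent: "claude" | "codex"): string[] {
  return rules
    .filter((r) => path.startsWith(r.scope))
    .filter((r) => !r.agent || r.agent === agent)
    .map((r) => r.name);
}

console.log(rulesFor("src/billing/retry.ts", "codex"));
// ["idempotency-lock-first", "retry-window-equals-refund-sla", "follow-repo-prettier"]
```

The same call for Claude drops the prettier directive, which is how one kernel serves two agents without leaking Codex-only guidance.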
Beta

Join the beta before the public release.

Munin is not publicly available yet. Leave your email and we’ll send early access invites, rollout notes, and the first stable build when it’s ready.

We’ll only use this for Munin beta access and rollout updates.

Private list · Email updates only · Access invites in batches
FAQ

The usual questions.

Is Munin a SaaS?

No. It's a local Rust binary that reads your sessions on disk and writes memory to your machine.

Does it send my code anywhere?

No. There's no outbound network call. Local-only, no account, no telemetry.

How is this different from Claude Code's built-in memory?

Claude's built-in memory is notes you write. Munin builds the memory itself from your session corpus and produces ranked actions tied to your strategy.

What stops the memory from going stale?

Every promoted claim carries freshness, stability, and confidence. Replay/eval proof-gates the cutover. If evidence drops, the claim falls back.

How does Munin suggest next steps?

You import a strategy document. Munin joins it with your evidence and produces bounded nudges with evidence, confidence, and expected effect.

Does it work with Claude and Codex at once?

Yes. That is the point: one memory surface across the Claude Code and Codex sessions you're already running.

What happens if Munin crashes mid-session?

The last good brief is in .munin/journal as plain text. Open it, paste it into your agent, keep going. Munin is a local Rust binary. No server, no recovery flow.

Private beta

Get the first invite.

Join the beta list and we’ll email you when Munin is ready for early access.