New: morning proactivity checks

Pick up where your coding agent lost the thread.

A short handoff brief plus fixes to suggest first.

Munin turns your local Claude Code and Codex sessions into useful context, then runs morning checks so the next agent starts with the right fix.

Private beta signup

Early access invites go out in small batches. No spam. No sales sequence.

Works for Claude Code Codex CLI

The fast version

Useful before the agent types a word.

Munin starts the next session with the context a coding agent actually needs: the goal, decisions, dead ends, and next move.

See the full flow on Features ->
ARRIVE ORIENTED

Start with the handoff brief.

The goal, the solid facts, and the next move are already on screen.

CORRECTIONS LEARNED

Corrections stick.

If you keep fixing the same mistake, Munin saves that lesson so you do not type it again next week.

CHECKED FIRST

Bad briefs get blocked.

Munin checks the handoff against prior sessions. If the check fails, it backs off instead of guessing.

One picture

How Munin works in one pass.

Past sessions go in. A short handoff brief and proactive fix queue come out. The check at the bottom decides whether to trust it.

Orb diagram. Three context-loss moments feed a shared session journal. The journal feeds a circular memory core. The core produces a handoff brief, a fix suggestion, and a proactivity queue. A replay check below the core decides whether the handoff is safe to use.

Left side: where context breaks. Middle: the shared project memory. Right side: what the agent gets back before it waits.

See the full flow ->

Munin proactivity

Agents that are proactive. They suggest fixes first.

The morning runner checks strategy, continuity, proof freshness, and repeated friction before your first prompt. When there is a useful fix, Munin prepares the intervention instead of waiting for the agent to drift.

Scheduled

Starts before you do.

The schedule-install command adds the morning run on Windows, macOS, or Linux. New installs default to auto-spawn.

Munin Strategy

Finds work you did not spell out.

Munin Strategy joins your goals, KPIs, open loops, and session evidence, then suggests the task the agent should raise even when you never explicitly assigned it.

Learning

Records what happened.

Complete, defer, suppress, or fail the job. Tomorrow's run learns from the result, not just the suggestion.

$ munin proactivity status --scope sitesorted-business
schedule_local: 08:00
auto_spawn:     true
pending_jobs:   1
first_fix:      refresh replay proof before cutover

$ munin proactivity approve --job-id morning-2026-04-20
ok launched queued agent intervention
Three moments

Where you lose the thread. Where Munin picks it up.

These are the three moments where losing context hurts most.

You took a break.

You come back later and the agent has lost the thread.

Munin: it brings back the goal, what you already ruled out, and the next question.

You switched branch.

You check another branch, then come back. Now the agent is talking about the wrong files.

Munin: it keeps the handoff tied to the repo and branch you are actually in.

Your debug got interrupted.

You get pulled away mid-debug. The next day the agent treats dead ideas like they still matter.

Munin: it keeps the live ideas, drops stale ones, and carries forward what still looks true.

Why it works

Not chat search. Startup context your agent can use.

Most memory tools search old chats. Munin starts the next session with the useful context already selected.

Retrieval-first memory

  • Stores old chats and searches them later.
  • The agent still has to ask the right question.
  • Old advice and current advice look the same.
  • As history grows, the search gets noisier.

Munin: a checked startup brief

  • Starts with a short handoff brief, not a search box.
  • Keeps the fixes that repeat.
  • Drops advice that no longer holds up.
  • Checks the summary before trusting it.
What Munin does

Four outcomes, not four mechanisms.

Four things matter here: it remembers the work, keeps the useful parts, suggests fixes first, and stays tied to the repo you are in.

It remembers the work.

Munin reads your past Claude Code and Codex sessions and pulls out the parts that keep coming up.

Built from the work you already did.

It suggests fixes first.

If you give Munin priorities, it can point at the fix that matters most right now and explain why.

Less "what next?", more useful first moves.

It keeps what still holds up.

Rules get stronger when they keep being right. Old or shaky ones fade instead of hanging around forever.

You can see what looks solid and what does not.

It stays scoped.

The advice changes with the repo and branch, so the agent is less likely to drag old context into the wrong place.

The right project gets the right memory.

Munin Brain

Your agent reads the wrong tokens.

A lot of agent context is junk: retries, dead ends, old prompts. Munin puts the useful part first.

brain next step recall mistakes
$ munin brain --format prompt
<session_brain provider="codex" built_at="2026-04-15T02:48:55Z">
<agenda>
  - current ask: tighten agent-support wording + show Munin Brain
</agenda>
<state>
  - verified: beta signup flow deployed to muninmemory.com
  - finding: live build source is .firecrawl/eag/builds/munin.sitesorted.co
</state>
<project>
  - active ask: make product wording match Claude Code + Codex only
  - codebase map: docs.html  -  features.html  -  index.html  -  pricing.html  -  resources.html  -  blog
</project>
<user_operating_model>
  - profile: execution-heavy workflow; compact state beats transcript replay
  - watchout: front-load active ask + project capsule before wide search
</user_operating_model>
<guidance>
  - start from the current ask
  - avoid raw transcript dumps and stale cross-project noise
</guidance>
</session_brain>
$ munin nudge
<next_nudge>
priority:        P0
scope:           billing-retry
action:          Land retry-window fix before Thursday freeze
evidence:
  - 4 past sessions  -  retry window debate
  - 2 corrections  -  "window must match refund SLA"
  - 1 open ticket #482
confidence:      0.82
expected_effect: unblocks 3 downstream tickets
first_verify:    pnpm test -- src/billing/retry.spec.ts
</next_nudge>
$ munin recall "refund SLA"
3 matches across 6 months:

> 2026-03-18  -  feature/refund-window
  "refund SLA = 72h Pro, 24h Standard"
  session: claude-code  -  confirmed 2x

> 2026-02-04  -  docs/billing
  "see policy.md section 3 for SLA table"
  session: codex-cli

> 2026-01-22  -  fix/retry-window
  "retry window MUST match refund SLA"
  session: claude-code  -  rule candidate 0.96
$ munin friction --agent codex --last 30d
142 sessions analysed  -  3 patterns detected

rewrites_bun_to_npm      7x  prevent?
reorders_imports         5x  prevent?
misses_idempotency_lock  3x  record candidate?

$ munin promote "use bun, not npm"
ok observe-only rule candidate recorded
ok enforcement not active yet
ok review in friction/proof surfaces
What changes next

The short version beats replaying everything.

Raw replay makes the agent re-read stale retries and dead ends. Munin Brain starts with the goal, the project, and the watchouts that still matter.

Default session replay

  • The agent rereads too much just to get back on track.
  • Old retries and dead ends still take up space.
  • A lot of that history is not useful right now.

Munin Brain

  • Show the goal, project, and watchouts first.
  • Give the agent the handoff brief before the long history.
  • Less junk in context. Faster recovery.
Inside project memory

Every rule carries its evidence.

A rule needs receipts. It does not go live because it sounded good once.

Rules Evidence Next steps
integration-testsUse real Postgres, not mocks
Rule
evidence: 7 sessions - 3 corrections - 0 redirects freshness 0.94 stability 0.91 confidence 0.96
billing-retryLand retry-window fix before Thursday freeze
Next step
evidence: 4 sessions - 2 corrections - 1 ticket freshness 0.82 confidence 0.82
stripe-webhookSignature name invented 3x. Guardrail active
Hook
evidence: 3 redirects across 2 months freshness 0.88 stability 0.77
integration-tests7 sessions agreeing: "real Postgres beats mocks"
Evidence
first seen 2025-10-14 last confirmed 2h ago corrections 3
billing-retry4 sessions debating retry window vs. refund SLA
Evidence
open ticket #482 precedent count 4 confidence delta +0.18
stripe-webhookSignature parity re-invented 3x across 2 months
Evidence
redirects 3 time cost ~46 min guardrail active
P0 - billing-retryLand retry-window fix before Thursday freeze
Active
due Thu 16 Apr unblocks 3 tickets confidence 0.82
P1 - integration-testsPromote assertion to rule. Evidence threshold met
Queued
precedent 7 sessions stability 0.91 manual review ready
P2 - docs/policyUpdate policy.md section 3 with new SLA table
Follow-up
source next step #1 confidence 0.71 ~10 min
Every active memory, evidenced

A live timeline
of what your agents know.

Each active memory shows how fresh it is, how steady it is, and how much evidence it has.

Active memories - sitesorted 12 items
  • integration-tests Use real Postgres, not mocks 0.94 freshness0.91 stabilityrule
  • billing-retry Land retry-window fix before Thursday freeze 0.82 confidence4 sessionsnext step
  • stripe-webhook Signature header name invented 3x. Guardrail active 0.88 freshness3 redirectshook
  • codex-style Enforce repo prettier config on import rewrites 0.77 confidenceCodex onlydirective
  • bun-runtime Prefer bun over node for dev scripts 0.31 freshness downlast seen 42d agoaging out
From memory to action

Memories become things to do.

The point is not memory by itself. The point is what it changes next.

  1. Observation

    You've been pushing on billing retry reliability for two weeks. Thursday is the merge freeze. Strategy says Q2 priority is "reduce failed-checkout spillover."

    Action: next strategic task

    Land the retry-window fix before Thursday. It's the last open loop tied to your highest-priority metric. Evidence: 4 sessions, 2 corrections, 1 ticket. Confidence 0.82.

  2. Observation

    Three times across two months, the agent invented a Stripe webhook signature header that doesn't exist. Each time you corrected it by reading the real payload.

    Action: guardrail hook

    Add a pre-edit guardrail that blocks any Stripe webhook handler write until stripe listen payload is cited in the diff. Silent on everything else.

  3. Observation

    Every time you ship a migration you run the same five verification steps. You've redirected the agent through them twelve times.

    Action: reusable skill

    Promote "pre-migration verification" to a skill. Five steps, ordered, with the exact commands. Invokable by name; auto-suggested when a migration file is staged.

  4. Observation

    Seven sessions agree: in this repo, integration tests must hit a real Postgres, not a mock. You've corrected mock-first approaches every time.

    Action: guidance rule

    Promote to a rule: "integration tests use real Postgres." Attached evidence, freshness 0.94. Loaded into context on any test-file edit.

  5. Observation

    Codex keeps rewriting your TypeScript in the house style it prefers. Claude already follows your preferences. Two agents, one repo, different outputs.

    Action: per-agent directive

    Tell Codex specifically: "follow repo prettier config, do not restructure imports." Claude stays clean because it does not need the extra directive.

Measured, not promised

Startup briefs are proof-gated.

Munin reports whether the Memory OS read path has the required replay proof. When proof is missing, the status is blocked rather than silently trusted.

1.0000
replay check score
latest public replay check - 2026-04-13
blocked
until required proof rows exist
strict promotion gate - no silent trust
0
silent failures accepted
If the replay check drops, Munin falls back instead of guessing.
$ munin prove --last-resume Proof status
strict_gate_enabled: true
required_splits: proposed-kernel / test-private, adversarial-private
eligible: false
missing: independent verified replay rows

blocked is safer than pretending the brief is proven.

A fuller benchmark is coming before launch. The numbers on the receipts page are there so you can see what changed and when.

One project memory. Every file. Both agents.

Scoped to the file in front of you.

Open a file and Munin loads the rules that fit that repo and that part of the codebase.

sitesorted
  • src/billing/
  • retry.ts
  • src/webhook/
  • stripe.ts
  • .munin/
  • rules.yaml
  • journal/
src/billing/retry.ts
// retry window must match refund SLA (rule  -  0.96 conf)
export async function scheduleRetry(evt: RefundEvent) {
  const window = retryWindowFor(evt.reason);
  await idempotencyLock(evt.id);
  return queue.push(evt, { delay: window });
}
Rules active here
  • Idempotency lock firstEvery retry path must hold the ID lock before enqueue. Precedent: 6 sessions.
  • Retry window = refund SLAThursday freeze - confidence 0.82 - next step active.
  • Codex only: follow prettierHouse-style directive. Claude stays clean.
Beta

Join the beta before the public release.

Munin is still in beta. Join the list and we'll email you when invites open up.

We'll only use this for Munin beta access and rollout updates.

Private list Email updates only Access invites in batches
FAQ

The usual questions.

Is Munin a SaaS?

No. It's a local Rust binary that reads your agent sessions on disk and writes project memory to your machine.

Does it send my code anywhere?

No. There's no outbound network call. Local-only, no account, no telemetry.

How is this different from Claude Code's built-in memory?

Claude's built-in memory is notes you save yourself. Munin reads your local session history and builds the startup brief for you.

What stops the memory from going stale?

Each rule keeps a freshness score and an evidence trail. If the replay check gets weaker, Munin stops trusting it.

How does Munin suggest next steps?

You can give Munin priorities. It uses them to suggest the next task and explain why it picked it.

Does it work with Claude and Codex at once?

Yes. That is the point: one shared startup context across the Claude Code and Codex sessions you're already running.

What happens if Munin crashes mid-session?

The last good brief is saved as plain text. Open it, paste it into your agent, and keep going.

Private beta

Get the first invite.

Join the beta list and we'll email you when Munin is ready for early access.