Claude Code has two background processes running alongside every conversation that most users never see.
The first runs at the end of each query: a forked sub-agent scans the transcript and writes durable
facts to disk. The second runs across sessions: a consolidation agent wakes up after enough time and
sessions have accumulated, reads the memory files, and rewrites them into something tighter.
Together they form a two-layer memory lifecycle — extraction and consolidation — implemented in
services/extractMemories/ and services/autoDream/.
services/extractMemories/extractMemories.ts ·
services/extractMemories/prompts.ts ·
services/autoDream/autoDream.ts ·
services/autoDream/config.ts ·
services/autoDream/consolidationLock.ts ·
services/autoDream/consolidationPrompt.ts ·
memdir/memdir.ts ·
memdir/memoryTypes.ts ·
memdir/paths.ts ·
memdir/memoryScan.ts ·
memdir/findRelevantMemories.ts ·
memdir/memoryAge.ts
Extract Memories
After each final response, a forked agent reviews new messages and writes topic files to disk.
Auto Dream
After 24 h + 5 sessions, a consolidation agent merges, prunes, and re-indexes the memory directory.
Relevant Recall
A Sonnet side-query selects up to 5 topic files that match the current query and injects them.
All persistent memory lives under a path computed by getAutoMemPath()
in memdir/paths.ts. The resolution order is:
// memdir/paths.ts: resolution order (first defined wins)
1. CLAUDE_COWORK_MEMORY_PATH_OVERRIDE                          // env var, Cowork space-scoped mount
2. getSettingsForSource('localSettings').autoMemoryDirectory   // settings.json
3. `~/.claude/projects/<sanitized-git-root>/memory/`           // default
The path is keyed on the canonical git root — all worktrees of the same repo share
one memory directory. The function is memoized on getProjectRoot() because render-path
callers fire on every tool-use re-render and the underlying logic (four getSettingsForSource
calls each involving realpathSync) is not free.
projectSettings (committed .claude/settings.json) is intentionally excluded from
the autoMemoryDirectory override. A malicious repo could otherwise silently redirect
Claude's writes to ~/.ssh.
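The resolution order and the projectSettings exclusion can be sketched as a pure function. This is a minimal illustration, not the code in memdir/paths.ts: the `PathInputs` shape, the `resolveAutoMemPath` name, and especially the `sanitize` slug rule are assumptions made for the example.

```typescript
// Hypothetical sketch of the resolution order; not the real memdir/paths.ts.
interface PathInputs {
  env: Record<string, string | undefined>;
  localSettings: { autoMemoryDirectory?: string };
  // Present only to show that it is never consulted for the override.
  projectSettings: { autoMemoryDirectory?: string };
  homeDir: string;
  gitRoot: string;
}

// Assumed slug rule: replace every non-alphanumeric character with "-".
function sanitize(gitRoot: string): string {
  return gitRoot.replace(/[^A-Za-z0-9]/g, "-");
}

function resolveAutoMemPath(i: PathInputs): string {
  // 1. Environment override wins.
  const override = i.env["CLAUDE_COWORK_MEMORY_PATH_OVERRIDE"];
  if (override) return override;

  // 2. Local (uncommitted) settings. projectSettings is deliberately skipped.
  const local = i.localSettings.autoMemoryDirectory;
  if (local) return local;

  // 3. Default path keyed on the git root.
  return `${i.homeDir}/.claude/projects/${sanitize(i.gitRoot)}/memory/`;
}
```

Because step 2 reads only `localSettings`, a committed settings file that points at `~/.ssh` has no effect on the result in this sketch.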
Directory structure
~/.claude/projects/<slug>/memory/
  MEMORY.md               # index, always injected into system prompt
  user_role.md            # topic files, one memory per file
  feedback_testing.md
  project_auth.md
  .consolidate-lock       # mtime == lastConsolidatedAt
  logs/                   # KAIROS/assistant-mode only: append-only daily logs
    2026/03/2026-03-31.md
MEMORY.md is the index. It is always loaded into the system prompt
(truncated at 200 lines / 25 KB). Each line is a short pointer:
- [Title](file.md) — one-line hook. The actual content lives in topic files.
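As a rough illustration of the index format, the sketch below parses one pointer line and applies the 200-line cap. The names `parseIndexLine` and `capIndexLines` are hypothetical, and the byte cap is left out to keep the example small.

```typescript
interface IndexEntry {
  title: string;
  file: string;
  hook: string;
}

// Matches "- [Title](file.md) <dash> one-line hook"; \u2014 is the dash
// character used as the separator in the index format.
const INDEX_LINE = /^- \[([^\]]+)\]\(([^)]+)\)\s+\u2014\s+(.+)$/;

function parseIndexLine(line: string): IndexEntry | null {
  const m = INDEX_LINE.exec(line.trim());
  return m ? { title: m[1], file: m[2], hook: m[3] } : null;
}

// Line cap only; the 25 KB byte cap is not modeled in this sketch.
function capIndexLines(content: string, maxLines = 200): string {
  return content.split("\n").slice(0, maxLines).join("\n");
}
```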
Memories are constrained to a closed taxonomy defined in memdir/memoryTypes.ts.
The taxonomy exists to prevent content that is derivable from the current project state
from polluting the store — code patterns, architecture, and git history are explicitly excluded.
User Profile
Role, goals, knowledge level, communication preferences. Helps Claude tailor future responses to this specific person.
Behavioral Guidance
Corrections AND confirmations. Stores the rule + Why: + How to apply: so edge cases can be judged, not blindly followed.
Project Context
Ongoing work, deadlines, incidents. Not derivable from code or git. Decays fast — the "why" line tells future-you if the memory is still load-bearing.
External Pointers
Where to find information in external systems: Linear projects, Grafana dashboards, Slack channels.
Each topic file uses YAML frontmatter:
---
name: feedback_no_db_mocks
description: Integration tests must use a real DB — mocks masked a migration failure
type: feedback
---
# Don't mock the database in integration tests
**Why:** Prior incident where mock/prod divergence let a broken migration slip to production.
**How to apply:** Any test that exercises a DB query path must use a real (test) database.
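A topic file in this layout can be split into metadata and body with a few lines of string handling. The sketch below is an assumption-laden illustration (the `parseTopicFile` name is hypothetical, and it handles only flat `key: value` pairs rather than full YAML).

```typescript
interface TopicFile {
  meta: Record<string, string>;
  body: string;
}

function parseTopicFile(text: string): TopicFile | null {
  const lines = text.split(/\r?\n/);
  if (lines.length === 0 || lines[0].trim() !== "---") return null;

  // Find the closing "---" delimiter.
  let end = -1;
  for (let i = 1; i < lines.length; i++) {
    if (lines[i].trim() === "---") {
      end = i;
      break;
    }
  }
  if (end === -1) return null;

  // Flat "key: value" pairs only; nested YAML is out of scope here.
  const meta: Record<string, string> = {};
  for (const line of lines.slice(1, end)) {
    const colon = line.indexOf(":");
    if (colon === -1) continue;
    meta[line.slice(0, colon).trim()] = line.slice(colon + 1).trim();
  }
  return { meta, body: lines.slice(end + 1).join("\n").trim() };
}
```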
CLAUDE.md. These exclusions apply even when the user explicitly asks. The rationale:
code is always live-readable; memories of it go stale and generate false authority.
At the end of every query loop (when the model produces a final response with no tool calls),
handleStopHooks fires executeExtractMemories() from
services/extractMemories/extractMemories.ts. It uses the forked agent pattern
— a perfect copy of the main conversation that shares the parent's prompt cache.
Closure-scoped state
All mutable state lives inside initExtractMemories() — not at module level.
This is the same pattern used by confidenceRating.ts. Tests call
initExtractMemories() in beforeEach to get a fresh closure.
The key state variables are:
let lastMemoryMessageUuid: string | undefined   // cursor: only new messages since last run
let inProgress: boolean                         // overlap guard
let pendingContext: ...                         // stash for trailing run
let turnsSinceLastExtraction: number            // throttle counter
Overlap coalescing
If an extraction is already running when a new turn completes, the new context is stashed in
pendingContext. When the running extraction finishes, it fires one trailing run using
the stashed context — so no extraction is lost, but runs never overlap. Only the latest
stashed context matters (it has the most messages), so repeated coalescing overwrites the stash.
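The coalescing behavior can be sketched with closure-scoped state and a completion callback. This is an illustration under assumptions: `initCoalescer`, `StartRun`, and the callback-based completion are invented for the example, and the real code is presumably promise-based.

```typescript
type Done = () => void;
type StartRun<C> = (context: C, done: Done) => void;

// All mutable state lives in the closure, so each call returns a fresh instance.
function initCoalescer<C>(startRun: StartRun<C>): (context: C) => void {
  let inProgress = false;
  let hasPending = false;
  let pendingContext: C | undefined;

  function run(context: C): void {
    inProgress = true;
    startRun(context, () => {
      inProgress = false;
      if (hasPending) {
        const next = pendingContext as C;
        hasPending = false;
        pendingContext = undefined;
        run(next); // exactly one trailing run per busy period
      }
    });
  }

  return function request(context: C): void {
    if (inProgress) {
      // Overwrite the stash: only the latest context matters.
      pendingContext = context;
      hasPending = true;
      return;
    }
    run(context);
  };
}
```

With three requests arriving while the first run is still active, only the first and the last context are ever processed.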
Mutual exclusion with the main agent
The main agent's system prompt always includes full memory-save instructions.
When the main agent writes to a memory path itself (detected via
hasMemoryWritesSince()), the forked extraction is skipped entirely —
advancing the cursor past that range so the two paths never double-write the same turn.
// extractMemories.ts: mutual exclusion check
if (hasMemoryWritesSince(messages, lastMemoryMessageUuid)) {
  // advance cursor, log event, return early
  logEvent('tengu_extract_memories_skipped_direct_write', { message_count })
  return
}
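One plausible shape for the detection step is a scan over messages after the cursor, looking for write-tool calls under the memory directory. The message types, the `detectMemoryWrites` name, the tool names "Edit" and "Write", and the explicit `memoryDir` parameter are all assumptions for this sketch; the real hasMemoryWritesSince() takes only the messages and the cursor.

```typescript
interface ToolUseBlock {
  type: "tool_use";
  name: string;
  input: { file_path?: string };
}
interface TextBlock {
  type: "text";
  text: string;
}
interface Message {
  uuid: string;
  role: "user" | "assistant";
  content: Array<ToolUseBlock | TextBlock>;
}

const WRITE_TOOLS = new Set(["Edit", "Write"]); // assumed tool names

function detectMemoryWrites(
  messages: Message[],
  sinceUuid: string | undefined,
  memoryDir: string,
): boolean {
  // Start after the cursor; with no cursor (or an unknown one) scan everything.
  const cursor =
    sinceUuid === undefined ? -1 : messages.findIndex((m) => m.uuid === sinceUuid);
  for (const msg of messages.slice(cursor + 1)) {
    if (msg.role !== "assistant") continue;
    for (const block of msg.content) {
      if (
        block.type === "tool_use" &&
        WRITE_TOOLS.has(block.name) &&
        (block.input.file_path ?? "").startsWith(memoryDir)
      ) {
        return true;
      }
    }
  }
  return false;
}
```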
The extraction flow
Tool permissions for the forked agent
createAutoMemCanUseTool() returns a permission function shared by both
extractMemories and autoDream. It enforces a strict allow-list:
// extractMemories.ts: createAutoMemCanUseTool()
// Shorthand excerpt: GREP, GLOB, EDIT, and WRITE abbreviate tool-name comparisons.
if (tool.name === FILE_READ_TOOL_NAME || GREP || GLOB) return allow()
if (tool.name === BASH_TOOL_NAME && tool.isReadOnly(parsed.data)) return allow()
if ((EDIT || WRITE) && isAutoMemPath(input.file_path)) return allow()
return denyAutoMemTool(tool, reason) // logs + fires analytics event
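Spelled out in full, an allow-list of this kind might look like the sketch below. The tool names, the `Decision` shape, `makeCanUseTool`, and the simplified `isInsideMemoryDir` check (prefix match plus rejection of ".." segments) are assumptions for illustration, not the real implementation.

```typescript
type Decision = { behavior: "allow" } | { behavior: "deny"; reason: string };

interface ToolCall {
  name: string;
  input: { file_path?: string; command?: string };
}

const READ_TOOLS = new Set(["Read", "Grep", "Glob"]); // assumed tool names
const WRITE_TOOLS = new Set(["Edit", "Write"]);

// Simplified containment check: no ".." segments, and the path must sit under the root.
function isInsideMemoryDir(filePath: string, memoryDir: string): boolean {
  const root = memoryDir.endsWith("/") ? memoryDir : memoryDir + "/";
  const hasTraversal = filePath.split("/").includes("..");
  return !hasTraversal && filePath.startsWith(root);
}

function makeCanUseTool(
  memoryDir: string,
  isReadOnlyBash: (command: string) => boolean,
): (call: ToolCall) => Decision {
  return (call) => {
    // Reads are always allowed.
    if (READ_TOOLS.has(call.name)) return { behavior: "allow" };
    // Shell is allowed only for commands classified as read-only.
    if (call.name === "Bash" && isReadOnlyBash(call.input.command ?? "")) {
      return { behavior: "allow" };
    }
    // Writes are allowed only inside the memory directory.
    if (WRITE_TOOLS.has(call.name) && isInsideMemoryDir(call.input.file_path ?? "", memoryDir)) {
      return { behavior: "allow" };
    }
    return { behavior: "deny", reason: `${call.name} is not permitted for memory agents` };
  };
}
```

Everything not explicitly allowed falls through to the final deny, which is the property that matters for a background agent.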
The extraction prompt
The prompt sent to the forked agent (from prompts.ts) has a deliberate
efficiency instruction that maps directly to the turn budget:
maxTurns: 5 is sufficient for well-behaved runs.
The prompt also pre-injects the memory directory manifest (from scanMemoryFiles())
so the agent does not spend a turn on ls. The manifest format:
// memoryScan.ts: formatMemoryManifest()
// Output: one line per file
"- [feedback] feedback_no_db_mocks.md (2026-03-30T12:00:00Z): Integration tests must use a real DB"
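A formatter producing lines in that shape could look roughly like this. The `MemoryHeader` fields and function names are assumptions; only the output format is taken from the example line above.

```typescript
interface MemoryHeader {
  type: string;
  filename: string;
  mtimeMs: number;
  description: string;
}

function formatManifestLine(h: MemoryHeader): string {
  // toISOString() includes milliseconds; strip them to match the example format.
  const ts = new Date(h.mtimeMs).toISOString().replace(/\.\d{3}Z$/, "Z");
  return `- [${h.type}] ${h.filename} (${ts}): ${h.description}`;
}

function formatManifest(headers: MemoryHeader[]): string {
  return headers.map(formatManifestLine).join("\n");
}
```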
The dream system runs the same forked agent pattern but on a much longer cadence.
It fires at the end of a query loop (via executeAutoDream()) only when three
gates pass in order, cheapest first:
Time Gate
Hours since lastConsolidatedAt ≥ minHours (default: 24). Cost: one stat() call per turn.
Session Gate
Transcript files with mtime > lastConsolidatedAt ≥ minSessions (default: 5). Cost: one directory scan, throttled to once per 10 minutes.
Lock Gate
No other process mid-consolidation. Implemented via a PID file whose mtime is the timestamp.
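The cheapest-first ordering can be sketched as a short-circuiting check in which the expensive steps are passed in as callbacks, so they only run when the earlier gates pass. The `GateInputs` shape and `checkDreamGates` name are hypothetical.

```typescript
interface GateConfig {
  minHours: number;
  minSessions: number;
}

interface GateInputs {
  nowMs: number;
  lastConsolidatedAtMs: number;
  // Directory scan; only invoked if the time gate passes.
  countSessionsSince: (sinceMs: number) => number;
  // Lock acquisition; only invoked if both earlier gates pass.
  tryAcquireLock: () => boolean;
}

// Returns the name of the gate that blocked, or "run" if all three passed.
type GateResult = "time" | "sessions" | "lock" | "run";

const HOUR_MS = 60 * 60 * 1000;

function checkDreamGates(cfg: GateConfig, i: GateInputs): GateResult {
  const hours = (i.nowMs - i.lastConsolidatedAtMs) / HOUR_MS;
  if (hours < cfg.minHours) return "time";
  if (i.countSessionsSince(i.lastConsolidatedAtMs) < cfg.minSessions) return "sessions";
  if (!i.tryAcquireLock()) return "lock";
  return "run";
}
```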
The lock file design
The lock file is .consolidate-lock inside the memory directory. Its design is elegant:
the body is the holder's PID; the mtime of the file IS lastConsolidatedAt.
This means reading "when did we last consolidate?" costs exactly one stat().
// consolidationLock.ts: acquire sequence
const [s, raw] = await Promise.all([stat(path), readFile(path, 'utf8')])
// If holder PID is still running AND lock isn't stale (>1h) → bail
if (isProcessRunning(holderPid)) return null
// Otherwise reclaim: write our PID, verify we won the race
await writeFile(path, String(process.pid))
const verify = await readFile(path, 'utf8')
if (parseInt(verify) !== process.pid) return null // lost the race
On failure, rollbackConsolidationLock(priorMtime) rewinds the mtime via
utimes(), restoring the pre-acquire timestamp. On crash, the dead PID is
detectable and the next process reclaims the lock.
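The bail-or-reclaim decision described in the excerpt's comment can be isolated as a pure function, which makes the staleness rule easy to test without touching the filesystem. This is a sketch under assumptions: `LockState`, `decideLock`, and the exact one-hour constant are illustrative, based only on the ">1h" note in the comment.

```typescript
// Staleness window, taken from the ">1h" note in the excerpt's comment.
const STALE_AFTER_MS = 60 * 60 * 1000;

interface LockState {
  holderPid: number | null; // parsed from the file body; null if empty or invalid
  mtimeMs: number;          // the file's mtime
}

type LockDecision = "bail" | "reclaim";

function decideLock(
  lock: LockState,
  nowMs: number,
  isProcessRunning: (pid: number) => boolean,
): LockDecision {
  if (lock.holderPid === null) return "reclaim";
  const stale = nowMs - lock.mtimeMs > STALE_AFTER_MS;
  // Bail only when a live holder acquired the lock recently.
  return isProcessRunning(lock.holderPid) && !stale ? "bail" : "reclaim";
}
```

A crashed holder shows up as a PID that is no longer running, and a hung holder as a lock older than the window; both cases fall through to "reclaim".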
The consolidation prompt — four phases
The dream prompt (from consolidationPrompt.ts) divides work into four phases.
The agent runs as a forked agent with the same createAutoMemCanUseTool() permissions:
Orient
ls the memory dir, read MEMORY.md, skim topic files to avoid creating duplicates.
Gather recent signal
Check daily logs (KAIROS mode), grep transcripts narrowly for specific context. Transcripts are large JSONL — never read whole files.
Consolidate
Write or update topic files. Merge near-duplicates. Convert relative dates ("yesterday") to absolute dates so memories remain interpretable.
Prune and index
Update MEMORY.md — keep it under 200 lines / 25 KB. Remove stale pointers. Demote verbose index lines into topic files.
Dream progress tracking
Unlike the per-turn extractor, the dream agent registers a DreamTask in the app state,
allowing the UI to show live progress. makeDreamProgressWatcher() streams each assistant
message from the fork, extracts text blocks for display, and collects file paths from Edit/Write
tool calls for the completion summary.
// autoDream.ts: progress watcher
function makeDreamProgressWatcher(taskId, setAppState) {
  return msg => {
    for (const block of msg.message.content) {
      if (block.type === 'text') text += block.text
      if (block.type === 'tool_use') toolUseCount++
      if (EDIT || WRITE) touchedPaths.push(input.file_path)
    }
    addDreamTurn(taskId, { text, toolUseCount }, touchedPaths, setAppState)
  }
}
At query time, Claude Code does not inject all memory files into context — that would balloon
the prompt with stale or irrelevant facts. Instead, findRelevantMemories() in
memdir/findRelevantMemories.ts uses a Sonnet side-query to select up to
five topic files relevant to the current query.
The two-step recall pipeline
// findRelevantMemories.ts
const memories = (await scanMemoryFiles(memoryDir, signal))
  .filter(m => !alreadySurfaced.has(m.filePath))
const selectedFilenames = await selectRelevantMemories(
  query, memories, signal, recentTools
)
// sideQuery → Sonnet, max_tokens: 256, JSON schema output
// Returns: { selected_memories: string[] }
The selector system prompt has a notable exception: if the model is actively using a tool
(e.g. mcp__X__spawn is in recentTools), that tool's reference documentation
memory is suppressed — the conversation already contains working usage, and surfacing docs is noise.
Warnings and gotchas about those tools are still surfaced.
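Since the selector returns model-generated JSON, a caller would plausibly validate it before reading any files. The sketch below shows one defensive way to do that; `pickSelected` is a hypothetical helper, and whether the real code validates in this manner is not stated in the source.

```typescript
const MAX_SELECTED = 5;

// Parses a { selected_memories: string[] } payload and keeps only known,
// unique filenames, capped at MAX_SELECTED.
function pickSelected(rawJson: string, knownFilenames: string[]): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawJson);
  } catch {
    return []; // malformed output selects nothing
  }
  const list = (parsed as { selected_memories?: unknown } | null)?.selected_memories;
  if (!Array.isArray(list)) return [];

  const known = new Set(knownFilenames);
  const out: string[] = [];
  for (const item of list) {
    if (typeof item === "string" && known.has(item) && !out.includes(item)) {
      out.push(item);
    }
    if (out.length === MAX_SELECTED) break;
  }
  return out;
}
```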
Staleness signals
memdir/memoryAge.ts computes human-readable age strings and injects a staleness
caveat when a memory file is more than one day old:
// memoryAge.ts: freshnessText for memories > 1 day old
"This memory is 47 days old. Memories are point-in-time observations, not live state — claims about code behavior or file:line citations may be outdated. Verify against current code before asserting as fact."
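The age computation behind such a caveat is simple arithmetic on the file's mtime. The sketch below assumes whole-day flooring and treats "more than one day old" as an age of two or more whole days; both choices, the function names, and the abridged caveat wording are assumptions rather than details confirmed by the source.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Whole days elapsed since the file was last modified (never negative).
function memoryAgeDays(mtimeMs: number, nowMs: number): number {
  return Math.max(0, Math.floor((nowMs - mtimeMs) / DAY_MS));
}

// Returns an empty string for fresh memories, otherwise a staleness caveat.
function freshnessText(mtimeMs: number, nowMs: number): string {
  const days = memoryAgeDays(mtimeMs, nowMs);
  if (days <= 1) return "";
  return (
    `This memory is ${days} days old. ` +
    "Memories are point-in-time observations, not live state. " +
    "Verify against current code before asserting as fact."
  );
}
```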
When the TEAMMEM feature flag is enabled, the memory system gains a second directory
alongside the private one. The combined mode uses buildExtractCombinedPrompt()
which adds <scope> tags to each memory type block. The routing guidance is
baked into each type definition rather than placed in a separate routing section.
Scope rules per type
- user — always private. Personal profile should never be shared.
- feedback — default private; team only when the guidance is a project-wide convention (testing policy, build invariant), not a personal style preference.
- project: private or team, with a strong bias toward team. Context behind shared work belongs in the shared store.
- reference — usually team. External system pointers are useful to all collaborators.
Sensitive data (API keys, credentials) must never be saved to team memories — the combined extraction prompt includes an explicit prohibition.
In long-lived assistant sessions (feature flag KAIROS), the memory model shifts from
a live index to an append-only daily log. The agent writes to
logs/YYYY/MM/YYYY-MM-DD.md as it works, rather than maintaining MEMORY.md directly.
A separate nightly /dream skill distills the logs into topic files and updates
MEMORY.md.
The log path pattern is stored in the prompt without today's literal date — the prompt is cached
by systemPromptSection('memory', ...) and must not be invalidated on midnight rollover.
The model derives the current date from a date_change attachment that is appended
when the date rolls over at midnight, not from the prompt itself.
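The split between a static pattern in the cached prompt and a concrete path derived at write time can be illustrated as follows. The `dailyLogPath` helper is hypothetical, and the use of UTC date fields is an assumption made so the example is deterministic; the real system may use local time.

```typescript
// Static pattern: safe to embed in a cached prompt because it never changes.
const LOG_PATH_PATTERN = "logs/YYYY/MM/YYYY-MM-DD.md";

// Concrete path for a given date, computed outside the cached prompt.
function dailyLogPath(date: Date): string {
  const yyyy = String(date.getUTCFullYear());
  const mm = String(date.getUTCMonth() + 1).padStart(2, "0");
  const dd = String(date.getUTCDate()).padStart(2, "0");
  return `logs/${yyyy}/${mm}/${yyyy}-${mm}-${dd}.md`;
}
```

Because only `dailyLogPath` depends on the date, a midnight rollover changes the file being written without changing any prompt text.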
The forked agent pattern is chosen precisely because it shares the parent's prompt cache. The forked agent gets the same system prompt and message history prefix as the main conversation, so those tokens are already cached — the fork only pays for new tokens past the cache boundary.
The extraction code logs cache metrics on every run:
// extractMemories.ts: cache hit logging
const hitPct = ((cache_read_tokens / totalInput) * 100).toFixed(1)
logForDebugging(`[extractMemories] cache: read=${cache_read} create=${cache_create} (${hitPct}% hit)`)
The tool list for the forked agent must match the main agent's tool list for cache sharing to work —
the tools are part of the cache key. This is why createAutoMemCanUseTool() uses runtime
permission denial rather than providing a different tool list to the fork.
Key Takeaways
- Extraction runs at the end of every query turn as a forked sub-agent. The fork shares the prompt cache, so it's cheap. The cursor (lastMemoryMessageUuid) ensures only new messages are considered each run.
- The lock file's mtime is the lastConsolidatedAt timestamp, so there is no separate metadata store. Reading when we last consolidated costs exactly one stat().
- The four-type taxonomy (user, feedback, project, reference) exists to keep memory stores free of content that is derivable from live code. Memories should capture context that cannot be re-derived.
- Recall is selective: a Sonnet side-query with a 256-token budget picks up to 5 relevant topic files per query, suppressing reference docs for tools currently in active use.
- The "Before recommending from memory" section header in the system prompt outperformed abstract variants in evals. Position and framing at the decision point matter for compliance.
- Team memory scopes are enforced via prompt guidance, not filesystem permissions. The prompt's <scope> tags are the routing mechanism.
Knowledge Check
- createAutoMemCanUseTool() returns a function that denies disallowed operations at runtime.
- The .consolidate-lock file stores the last consolidation timestamp. Where exactly? The mtime of .consolidate-lock IS lastConsolidatedAt. The body just holds the PID. Reading the last consolidation time costs exactly one stat() call with no parsing required.
- hasMemoryWritesSince() scans assistant messages for Edit/Write tool calls targeting an auto-memory path. If found, the fork is skipped and the cursor advances, so the main agent and background agent are mutually exclusive per turn. This prevents the two paths from double-writing the same turn.
- When recentTools contains a tool the model is actively exercising, its reference/API documentation is not surfaced, because the live conversation already shows working usage. Warnings and gotchas about those tools are still included.