Claude Code ships three distinct memory subsystems, each solving a different scope of persistence. They share a common file format but differ in lifetime, audience, and sync behavior.
Auto Memory
Persistent facts about you — user profile, feedback, project context, external references. Survives across all future sessions.
Session Memory
In-session notes updated in the background as context grows. Powers context compaction so work survives past the context window.
Team Memory
Shared memories synced to a server API, scoped to a GitHub repo. Every org member reading the same repo sees the same team facts.
[Diagram: the main agent feeds all three layers — auto memory via a background agent, session memory via an extractSessionMemory post-sampling hook, and team memory via writes to team/ plus a watcher push.]
Auto memory is the primary persistent store. It lives at ~/.claude/projects/<sanitized-git-root>/memory/, is enabled by default, and uses a two-level structure: a master index file called MEMORY.md and individual topic files.
MEMORY.md — The Index
The index is loaded into every conversation's system prompt. It is capped at 200 lines / 25,000 bytes; content beyond the cap is truncated and a warning is emitted. The index is never where content lives — it's a pointer list:
# MEMORY.md (index file — no frontmatter)
- [User Role](user_role.md) — Senior engineer, Go expert, new to React frontend
- [Feedback — No mock DB](feedback_no_mock_db.md) — Always hit real DB in tests
- [Auth Rewrite Context](project_auth_rewrite.md) — Compliance-driven, not tech debt
- [Linear Project](reference_linear.md) — Pipeline bugs tracked in "INGEST"
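A minimal sketch of how that cap might be enforced — the 200-line / 25,000-byte limits come from the text above, but the function name and truncation strategy are assumptions:

```typescript
// Assumed limits from the text; enforcement strategy is a sketch.
const MAX_INDEX_LINES = 200
const MAX_INDEX_BYTES = 25_000

function truncateIndex(index: string): string {
  // First enforce the line cap, then trim trailing lines
  // until the byte budget is also satisfied.
  let lines = index.split('\n').slice(0, MAX_INDEX_LINES)
  let out = lines.join('\n')
  while (Buffer.byteLength(out, 'utf8') > MAX_INDEX_BYTES) {
    lines = lines.slice(0, -1)
    out = lines.join('\n')
  }
  return out
}
```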
Topic Files — The Memory Itself
Each memory lives in its own Markdown file with YAML frontmatter declaring four required fields:
---
name: Feedback — No Mock Database
description: Integration tests must hit a real database, never mocks
type: feedback
---
Don't mock the database in tests.
**Why:** We got burned last quarter when mocked tests passed but the prod
migration failed. Mock/prod divergence masked a broken migration.
**How to apply:** Any test touching the data layer must use a real DB
instance. This is a project-wide testing policy, not a personal preference.
The `description` field is not cosmetic. It is the text the selector model sees when deciding which files to load for relevance. Write it as a precise one-liner that would match the right user queries.
Memory Types — A Closed Taxonomy
The code enforces exactly four types: user, feedback, project, and reference. Type is validated at parse time — unknown values degrade gracefully (the file still loads, but the type field is undefined).
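A sketch of the graceful-degradation parse described above — the four type names come from the taxonomy; the helper name is an assumption:

```typescript
// Hypothetical sketch, not the actual source.
const MEMORY_TYPES = ['user', 'feedback', 'project', 'reference'] as const
type MemoryType = (typeof MEMORY_TYPES)[number]

// Unknown values degrade gracefully: the file still loads,
// but `type` comes back undefined instead of throwing.
function parseMemoryType(raw: unknown): MemoryType | undefined {
  return MEMORY_TYPES.includes(raw as MemoryType)
    ? (raw as MemoryType)
    : undefined
}
```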
What NOT to Save
The source explicitly excludes entire categories — even when the user asks:
// From memoryTypes.ts — WHAT_NOT_TO_SAVE_SECTION
// Code patterns, conventions, architecture, file paths → read the project
// Git history, recent changes, who-changed-what → `git log` is authoritative
// Debugging solutions or fix recipes → fix is in the code, commit msg has context
// Anything already in CLAUDE.md files
// Ephemeral task details: in-progress work, current conversation context
//
// "These exclusions apply EVEN when the user explicitly asks you to save."
// If they ask to save a PR list → ask what was *surprising* about it.
Deep dive — Path resolution and security
The memory path is resolved through a layered priority chain:

1. `CLAUDE_COWORK_MEMORY_PATH_OVERRIDE` env var — used by Cowork/SDK to redirect to a space-scoped mount
2. `autoMemoryDirectory` in `settings.json` — supports `~/` expansion, but only from trusted sources (policy/local/user settings). Project settings (`.claude/settings.json` committed to the repo) are intentionally excluded to prevent a malicious repo from setting `autoMemoryDirectory: "~/.ssh"`
3. Default: `<memoryBase>/projects/<sanitized-git-root>/memory/`
Worktrees of the same git repo share one memory directory because the path resolution uses findCanonicalGitRoot() — the main repo's root, not the worktree path.
export const getAutoMemPath = memoize((): string => {
const override = getAutoMemPathOverride() ?? getAutoMemPathSetting()
if (override) return override
const projectsDir = join(getMemoryBaseDir(), 'projects')
return join(projectsDir, sanitizePath(getAutoMemBase()), 'memory') + sep
}, () => getProjectRoot())
Memories don't accumulate during conversation — they are extracted after each complete query loop. Two distinct agents handle this: the main agent, which can save memories directly during the conversation, and a background extraction agent that runs once the loop completes.
Mutual Exclusion with the Main Agent
The main agent has full save instructions in its system prompt at all times. When it writes memory files itself, the extraction agent skips that range entirely — it detects this via hasMemoryWritesSince(), which scans assistant messages for Write/Edit tool calls targeting the memory path:
function hasMemoryWritesSince(
messages: Message[],
sinceUuid: string | undefined,
): boolean {
for (const message of messages) {
// ... scan assistant message content blocks
const filePath = getWrittenFilePath(block)
if (filePath !== undefined && isAutoMemPath(filePath)) {
return true
}
}
return false
}
Relevance Recall — The Selector Model
When new messages arrive mid-session, Claude doesn't load all memory files. Instead a lightweight Sonnet call acts as a selector — it reads the frontmatter manifest and picks up to 5 relevant files:
// findRelevantMemories.ts — SELECT_MEMORIES_SYSTEM_PROMPT (excerpt)
"Return a list of filenames for the memories that will clearly be useful
to Claude Code as it processes the user's query (up to 5). Only include
memories that you are certain will be helpful based on their name and
description. Be selective and discerning."
// If the model is actively using a tool, its reference docs are skipped
// (the conversation already contains working usage — adding docs is noise)
// BUT warnings, gotchas, and known issues ARE still selected.
Each manifest entry the selector sees is formatted as `[type] filename (ISO-timestamp): description`. This is why description quality matters so much — it's the only signal the selector has.
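As a sketch, producing one manifest line in that format might look like this (the interface and helper names are assumptions, not the actual source):

```typescript
// Hypothetical shape of one manifest entry.
interface MemoryEntry {
  type: string        // one of: user, feedback, project, reference
  filename: string
  mtimeMs: number     // file mtime, used for the ISO timestamp
  description: string // the frontmatter description — the selector's only signal
}

function manifestLine(e: MemoryEntry): string {
  const ts = new Date(e.mtimeMs).toISOString()
  return `[${e.type}] ${e.filename} (${ts}): ${e.description}`
}
```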
Deep dive — Staleness detection and the drift caveat
Every memory file carries an mtime. When a relevant memory is surfaced to the model, a freshness note is computed:
export function memoryFreshnessText(mtimeMs: number): string {
const d = memoryAgeDays(mtimeMs)
if (d <= 1) return ''
return (
`This memory is ${d} days old. ` +
'Memories are point-in-time observations, not live state — ' +
'claims about code behavior or file:line citations may be outdated. ' +
'Verify against current code before asserting as fact.'
)
}
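`memoryAgeDays` is referenced above but not shown; a plausible sketch, assuming whole-day flooring from the file's mtime:

```typescript
// Assumed implementation — the real helper may differ.
export function memoryAgeDays(mtimeMs: number, nowMs: number = Date.now()): number {
  const MS_PER_DAY = 24 * 60 * 60 * 1000
  // Floor so a 36-hour-old memory reads as "1 day old".
  return Math.floor((nowMs - mtimeMs) / MS_PER_DAY)
}
```

Note that with `d <= 1` returning an empty string, the freshness warning only kicks in once a memory is at least two full days old.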
The model is also instructed in the system prompt under a section titled "Before recommending from memory" — a section header that deliberately avoids abstract names like "Trusting what you recall" because eval data showed abstract headers cause the instructions to be ignored (0/3 vs 3/3 compliance in A/B tests).
Session memory solves a different problem from auto memory: it is ephemeral, intra-session state that keeps long conversations coherent past the context window. It integrates with auto-compaction.
The Template
Each session memory file follows a fixed section structure. Custom templates can be placed at ~/.claude/session-memory/config/template.md. The default sections:
# Session Title ← distinctive 5-10 word title, info-dense
# Current State ← active work, pending tasks, immediate next steps
# Task specification ← what the user asked to build + design decisions
# Files and Functions ← important files + why they're relevant
# Workflow ← bash commands, order, how to read output
# Errors & Corrections ← errors encountered + what failed + what to avoid
# Codebase and System Documentation
# Learnings ← what worked, what didn't
# Key results ← exact outputs the user requested (tables, answers)
# Worklog ← terse step-by-step of what was attempted
Extraction Triggers
Session memory extraction fires from a post-sampling hook. It is throttled by two independent thresholds that must both be met:
// sessionMemoryUtils.ts — DEFAULT_SESSION_MEMORY_CONFIG
{
minimumMessageTokensToInit: 10_000, // init threshold
minimumTokensBetweenUpdate: 5_000, // growth since last extraction
toolCallsBetweenUpdates: 3, // AND tool call count
}
// OR: if no tool calls in the last turn AND token threshold is met
// → extract at natural conversation breaks even without tool activity
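The throttle logic above can be sketched as follows — a hedged reading of the config, with state-tracking names that are assumptions (the init threshold is handled separately and omitted here):

```typescript
// Hypothetical state the hook would track between extractions.
interface ExtractState {
  tokensSinceLastExtract: number
  toolCallsSinceLastExtract: number
  lastTurnHadToolCalls: boolean
}

function shouldExtract(s: ExtractState): boolean {
  const tokensOk = s.tokensSinceLastExtract >= 5_000 // minimumTokensBetweenUpdate
  // Normal path: token growth AND tool-call count both met.
  if (tokensOk && s.toolCallsSinceLastExtract >= 3) return true
  // Natural break: no tool calls last turn, token threshold still met.
  if (tokensOk && !s.lastTurnHadToolCalls) return true
  return false
}
```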
When `autoCompactEnabled` is true, the session notes file is injected into the compact message at context-window boundaries, giving the post-compact conversation full context about what was being worked on.
Deep dive — Token budget and section size enforcement
The session memory file is capped at 12,000 tokens total, with each section limited to 2,000 tokens. When a section exceeds the limit, the extraction agent receives an explicit warning:
// If over budget, the update prompt includes:
"CRITICAL: The session memory file is currently ~N tokens, which exceeds
the maximum of 12000 tokens. You MUST condense the file to fit within this
budget. Aggressively shorten oversized sections by removing less important
details, merging related items, and summarizing older entries. Prioritize
keeping 'Current State' and 'Errors & Corrections' accurate and detailed."
For compaction inserts, a hard truncation is applied as a safety valve before the notes enter the compact message — it cuts at a line boundary and appends [... section truncated for length ...].
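A sketch of that safety valve — the marker text comes from the source, while the function name and character budget are assumptions:

```typescript
const TRUNCATION_MARKER = '[... section truncated for length ...]'

// Hypothetical helper: cut at a line boundary, never mid-line.
function truncateAtLineBoundary(notes: string, maxChars: number): string {
  if (notes.length <= maxChars) return notes
  // Find the last newline that still fits inside the budget.
  const cut = notes.lastIndexOf('\n', maxChars)
  const head = cut > 0 ? notes.slice(0, cut) : notes.slice(0, maxChars)
  return head + '\n' + TRUNCATION_MARKER
}
```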
Custom update prompts can be placed at ~/.claude/session-memory/config/prompt.md using {{currentNotes}} and {{notesPath}} as substitution variables.
Team memory is a server-synced subdirectory at …/memory/team/. It is gated behind the TEAMMEM build flag and requires OAuth. Every file in the team directory maps to a key in a flat key-value store on Anthropic's servers, scoped per GitHub repo.
Sync Semantics
// From index.ts — the core sync contract:
// GET /api/claude_code/team_memory?repo={owner/repo}
// PUT /api/claude_code/team_memory?repo={owner/repo}
//
// Sync rules:
// - Pull: server wins per-key (local files overwritten)
// - Push: delta upload — only keys whose sha256 hash differs from
// cached serverChecksums are uploaded
// - File deletions do NOT propagate. Deleting a local file won't
// remove it from the server; the next pull restores it locally.
// - PUT body is batched at 200KB max; larger sets split into
// sequential PUTs (server upsert-merge semantics make this safe)
The File Watcher
A session-level file watcher monitors the team memory directory. When Claude writes a team memory file, a 2-second debounced push fires automatically. The watcher uses native fs.watch({ recursive: true }) rather than chokidar to avoid holding hundreds of file descriptors.
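A minimal debounce sketch for that 2-second window — the real watcher wires this to `fs.watch`; this standalone helper is an assumption:

```typescript
// Trailing-edge debounce: each call resets the timer, so a burst of
// file events collapses into one invocation after `ms` of quiet.
function debounce(fn: () => void, ms: number): () => void {
  let timer: ReturnType<typeof setTimeout> | undefined
  return () => {
    if (timer !== undefined) clearTimeout(timer)
    timer = setTimeout(fn, ms)
  }
}
```

Wired up, a burst of memory-file writes becomes a single push two seconds after the last write, e.g. `fs.watch(teamDir, { recursive: true }, debounce(pushTeamMemory, 2_000))` (where `pushTeamMemory` is hypothetical).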
Secret Scanning — Client-Side Guard
Before any push, all content is scanned for 35+ secret patterns derived from the gitleaks ruleset. Detection blocks the push — secrets never reach the server:
// secretScanner.ts — sample rules (high-confidence patterns only)
{ id: 'anthropic-api-key', source: `\\b(sk-ant-api03-[a-zA-Z0-9_\\-]{93}AA)...` },
{ id: 'github-pat', source: 'ghp_[0-9a-zA-Z]{36}' },
{ id: 'aws-access-token', source: '\\b((?:A3T[A-Z0-9]|AKIA|ASIA)...)' },
{ id: 'stripe-access-token', source: '\\b((?:sk|rk)_(?:test|live|prod)_...)' },
// 35 rules total: cloud providers, AI APIs, VCS tokens, payment, crypto
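How such rules might be applied before a push — a sketch; the rule list here is trimmed to one pattern quoted above, and `findSecrets` is an assumed name:

```typescript
interface SecretRule {
  id: string
  source: string // regex source, gitleaks-style
}

// Illustrative single rule from the excerpt above; the real set has 35.
const RULES: SecretRule[] = [
  { id: 'github-pat', source: 'ghp_[0-9a-zA-Z]{36}' },
]

// Returns the ids of every matching rule; any match blocks the push.
function findSecrets(content: string): string[] {
  return RULES.filter((r) => new RegExp(r.source).test(content)).map((r) => r.id)
}
```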
Writes into the team directory are validated in two steps: (1) `path.resolve()` eliminates `..` segments, (2) `realpath()` on the deepest existing ancestor catches symlink escapes. Even dangling symlinks inside the team dir are detected and rejected before any write.
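A sketch of that two-step containment check, under the assumptions that the function name is invented and the real code handles more edge cases:

```typescript
import { resolve, dirname, sep } from 'node:path'
import { existsSync, realpathSync } from 'node:fs'

// Hypothetical helper: is `candidate` really inside `teamDir`?
export function isInsideTeamDir(candidate: string, teamDir: string): boolean {
  // Step 1: path.resolve() eliminates `..` segments.
  const resolved = resolve(candidate)
  const root = resolve(teamDir)
  if (resolved !== root && !resolved.startsWith(root + sep)) return false
  // Step 2: realpath() the deepest existing ancestor, so a symlink
  // inside the team dir (even a dangling one) can't redirect the write.
  let probe = resolved
  while (!existsSync(probe)) probe = dirname(probe)
  const realProbe = realpathSync(probe)
  const realRoot = realpathSync(root)
  return realProbe === realRoot || realProbe.startsWith(realRoot + sep)
}
```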
Deep dive — Private vs team scope routing
When team memory is active, the system prompt describes two directories and the type taxonomy gains <scope> tags guiding where each type goes:
- user — always private (personal role/preferences should never be shared)
- feedback — private by default; team only for project-wide conventions (testing policies, build invariants — not personal style)
- project — strongly bias toward team (most project context is shared knowledge)
- reference — usually team (external system pointers apply to everyone)
The extraction agent also receives a team-specific prohibition: "You MUST avoid saving sensitive data within shared team memories. For example, never save API keys or user credentials." This is redundant with the secret scanner but creates a behavioral defense layer independent of the pattern-match layer.
Conflict resolution uses ETag versioning and a ?view=hashes probe endpoint — the client can refresh per-key checksums without downloading full entry bodies, keeping conflict resolution cheap.
When running as a long-lived assistant session (feature flag KAIROS), Claude switches from the two-step write-then-index pattern to an append-only daily log:
// buildAssistantDailyLogPrompt() — different memory behavior
// Writes to: <autoMemPath>/logs/YYYY/MM/YYYY-MM-DD.md
// Append-only, timestamped bullets
// MEMORY.md is read-only (maintained by a separate nightly /dream skill)
// What to log:
// - User corrections and preferences
// - Facts about the user, role, or goals
// - Project context not derivable from code
// - Pointers to external systems
// - Anything explicitly asked to remember
//
// The nightly /dream skill distills logs → topic files + MEMORY.md
This matters architecturally: KAIROS mode does not compose with TEAMMEM. Appending to a personal log is fundamentally incompatible with a shared index that both sides read and write. The code gates them mutually exclusively.
The memory system is deeply gated. Understanding the kill switches is important for enterprise deployments and testing:
- `CLAUDE_CODE_DISABLE_AUTO_MEMORY=1` — environment-variable kill switch
- `CLAUDE_CODE_SIMPLE=1` — environment-variable kill switch
- `autoMemoryEnabled: false` — set via `localSettings` or `userSettings` in `settings.json`. Supports project-level opt-out. NOT available in `projectSettings` (committed to the repo) for security.
- `tengu_passport_quail` — internal feature gate
Key Takeaways
- Memory is a three-layer system: Auto (persistent user facts), Session (in-session notes for compaction), and Team (server-synced per-repo knowledge).
- The `MEMORY.md` index is always in context; topic files are loaded on-demand by a Sonnet selector model that reads only the frontmatter descriptions — write descriptions that match user queries.
- Auto memory uses a closed four-type taxonomy: user, feedback, project, reference. Type drives scope routing (private vs team) and the model's behavior instructions.
- The extraction agent runs after each query loop, never during. When the main agent writes memories itself, the extractor detects it and skips, preventing duplicates.
- Feedback memories should capture both corrections AND confirmations. Only saving corrections causes the model to drift away from validated approaches over time.
- Team memory uses delta push with SHA-256 checksums. File deletions don't propagate — the server is append-only from the client's perspective.
- A 35-rule secret scanner runs client-side before any team memory push. Secrets never reach the server regardless of what the model writes.
- Memory records are point-in-time snapshots. The model is instructed to verify file paths and function names against current code before recommending from memory.
Check Your Understanding
1. A file exists at `~/.claude/projects/myapp/memory/feedback_testing.md`. What does it do?
2. A saved memory references `processAuthToken()`, a function you renamed three months ago. Claude recalls this memory and is about to suggest using it. What should it do first?