markdown.engineering
Lesson 22

Teams & Swarms

How Claude Code spins up multi-agent teams, routes messages through mailboxes, syncs permissions across workers, and tears everything down cleanly when the job is done.

TeamCreateTool · tmux / iTerm2 / in-process · Mailbox Messaging · Permission Sync · Team File Structure

What Is a Swarm?

A swarm is a named team of Claude agents that share a config file, a task list, and a file-based mailbox. One agent is the team lead — it creates the team, spawns teammates, assigns tasks, and gracefully shuts everything down. Every other agent is a teammate.

The feature is opt-in via the CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 environment variable and is surfaced through two first-class tools: TeamCreate and TeamDelete.

User prompt
    │
    ▼
Team Lead ──┬─── TeamCreate → ~/.claude/teams/my-team/config.json
            │                 ~/.claude/tasks/my-team/
            │
            ├─── spawn Agent(researcher) ──► researcher pane / in-process
            ├─── spawn Agent(tester)     ──► tester pane / in-process
            │
            │        ← mailbox messages →
            │
            └─── TeamDelete → clean up dirs & panes

Creating a Team

When the lead calls TeamCreate, the tool performs these steps synchronously before returning:

  1. Uniqueness check. The tool calls readTeamFile(team_name). If a team with that name already exists, it substitutes a fresh random name from generateWordSlug(), so names never collide.
  2. Write team config. A TeamFile JSON is written to ~/.claude/teams/<sanitized-name>/config.json. It seeds the members array with the lead's own entry (name: "team-lead").
  3. Create task list. The tool calls resetTaskList() and ensureTasksDir() so task numbering starts fresh at 1 for this team. It also calls setLeaderTeamName() so subsequent task reads go to the right directory.
  4. Update AppState. The tool sets appState.teamContext with the team name, file path, lead agent ID, and an initial teammates map. This is what the UI reads to show the team status badge.
  5. Register for session cleanup. The tool adds the team to an in-memory Set via registerTeamForSessionCleanup(). If the session exits without an explicit TeamDelete, the shutdown hook cleans up orphaned directories automatically.

Source: TeamCreateTool.ts — the call() method
async call(input, context) {
  const { team_name } = input
  const { setAppState, getAppState } = context
  const appState = getAppState()

  // Guard: one team per leader at a time
  const existingTeam = appState.teamContext?.teamName
  if (existingTeam) {
    throw new Error(
      `Already leading team "${existingTeam}". Use TeamDelete first.`
    )
  }

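  // Deduplicate name. leadAgentType, leadModel, taskListId, and teamFilePath
  // are used below, but their derivation is not shown in this excerpt.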
  const finalTeamName = generateUniqueTeamName(team_name)
  const leadAgentId   = formatAgentId(TEAM_LEAD_NAME, finalTeamName)

  const teamFile: TeamFile = {
    name: finalTeamName,
    createdAt: Date.now(),
    leadAgentId,
    leadSessionId: getSessionId(),
    members: [{
      agentId:      leadAgentId,
      name:         TEAM_LEAD_NAME,     // "team-lead"
      agentType:    leadAgentType,
      model:        leadModel,
      joinedAt:     Date.now(),
      tmuxPaneId:   '',
      cwd:          getCwd(),
      subscriptions: [],
    }],
  }

  await writeTeamFileAsync(finalTeamName, teamFile)
  registerTeamForSessionCleanup(finalTeamName)

  await resetTaskList(taskListId)
  await ensureTasksDir(taskListId)
  setLeaderTeamName(sanitizeName(finalTeamName))

  setAppState(prev => ({
    ...prev,
    teamContext: {
      teamName:    finalTeamName,
      teamFilePath,
      leadAgentId,
      teammates: {
        [leadAgentId]: {
          name:         TEAM_LEAD_NAME,
          color:        assignTeammateColor(leadAgentId),
          spawnedAt:    Date.now(),
          ...
        }
      }
    }
  }))
}
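
The name deduplication in step 1 can be sketched as a pure function. This is a minimal sketch, not the tool's code: pickUniqueTeamName, the exists predicate, and the makeSlug callback are hypothetical stand-ins for readTeamFile() and generateWordSlug(), and retrying until an unused slug is found is an assumption.

```typescript
// Hypothetical sketch: keep the requested name if it is free,
// otherwise fall back to a randomly generated slug.
function pickUniqueTeamName(
  requested: string,
  exists: (candidate: string) => boolean, // stand-in for "readTeamFile() found a file"
  makeSlug: () => string,                 // stand-in for generateWordSlug()
  maxAttempts = 10,                       // assumption: bounded retries
): string {
  if (!exists(requested)) return requested;
  for (let i = 0; i < maxAttempts; i++) {
    const candidate = makeSlug();
    if (!exists(candidate)) return candidate;
  }
  throw new Error(`No free team name after ${maxAttempts} attempts`);
}

// Usage with in-memory stand-ins for the on-disk team directories.
const takenNames = ["my-team", "brave-otter"];
const slugPool = ["brave-otter", "calm-heron"];
let slugIndex = 0;
const chosenName = pickUniqueTeamName(
  "my-team",
  candidate => takenNames.indexOf(candidate) !== -1,
  () => slugPool[slugIndex++],
);
console.log(chosenName); // calm-heron
```

Injecting the existence check and the slug generator keeps the sketch testable without touching ~/.claude/.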

On-Disk Layout

The swarm writes everything under ~/.claude/. Nothing is stored system-wide or in the project repo.

~/.claude/
├── teams/
│   └── my-team/                 ← sanitized team name
│       ├── config.json          ← TeamFile (members, permissions…)
│       └── permissions/
│           ├── pending/         ← perm requests from workers
│           └── resolved/        ← approved / rejected responses
└── tasks/
    └── my-team/                 ← task list for this swarm
        ├── 0001.json
        ├── 0002.json
        └── ...
TeamFile schema — every field explained
type TeamFile = {
  name:            string           // canonical team name
  description?:    string
  createdAt:       number           // epoch ms
  leadAgentId:     string           // "team-lead@my-team"
  leadSessionId?:  string           // session UUID of the leader
  hiddenPaneIds?:  string[]         // pane IDs moved to hidden window
  teamAllowedPaths?: TeamAllowedPath[]  // shared "always allow" edit rules
  members: Array<{
    agentId:        string   // "researcher@my-team"
    name:           string   // "researcher" — used for SendMessage
    agentType?:     string
    model?:         string
    prompt?:        string
    color?:         string
    planModeRequired?: boolean
    joinedAt:       number
    tmuxPaneId:     string   // "%3" tmux pane or iTerm session UUID
    cwd:            string
    worktreePath?:  string   // git worktree if isolated
    sessionId?:     string   // Claude session UUID
    subscriptions:  string[]
    backendType?:   'tmux' | 'iterm2' | 'in-process'
    isActive?:      boolean  // false = idle, undefined/true = running
    mode?:          PermissionMode
  }>
}
Agent ID format

Agent IDs always follow the pattern agentName@teamName, e.g. researcher@my-team. The team lead's ID is deterministically team-lead@my-team. Teammates always address each other by name (not ID) when calling SendMessage.
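
The ID convention is small enough to sketch directly. formatAgentId appears in the TeamCreateTool excerpt, but its body here is a guess that simply follows the agentName@teamName pattern; parseAgentId is a hypothetical helper added for illustration.

```typescript
const TEAM_LEAD_NAME = "team-lead";

// Assumed body: follows the documented agentName@teamName pattern.
function formatAgentId(agentName: string, teamName: string): string {
  return `${agentName}@${teamName}`;
}

// Hypothetical inverse: split an agent ID back into its two parts.
function parseAgentId(agentId: string): { agentName: string; teamName: string } {
  const at = agentId.indexOf("@");
  if (at <= 0 || at === agentId.length - 1) {
    throw new Error(`Not an agent ID: ${agentId}`);
  }
  return { agentName: agentId.slice(0, at), teamName: agentId.slice(at + 1) };
}

const leadId = formatAgentId(TEAM_LEAD_NAME, "my-team");
console.log(leadId);                                        // team-lead@my-team
console.log(parseAgentId("researcher@my-team").agentName);  // researcher
```

The parsed agentName is the value a teammate would pass to SendMessage, since messages are addressed by name rather than by full ID.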

Three Ways to Run a Teammate

When a teammate is spawned, the backend registry automatically selects the best execution engine. The priority order is fixed and runs once per session.

| Backend | When Selected | Pane Visibility | Hide/Show | Kill method |
| --- | --- | --- | --- | --- |
| tmux | Inside a tmux session (highest priority) OR tmux available as fallback | Split panes in the same window; external claude-swarm session if not inside tmux | Yes (break-pane to hidden session) | kill-pane -t <paneId> |
| iterm2 | In iTerm2 with it2 CLI available, and user hasn't chosen tmux preference | Native vertical split panes inside current window | No (not supported) | it2 session close -f -s <id> |
| in-process | Non-interactive (-p flag), explicit --teammate-mode in-process, or no pane backend available | Rendered inline by InProcessTeammateTask component | N/A | Abort via AbortController |
Backend detection algorithm — registry.ts detectAndGetBackend()
// Priority order (first match wins, cached for the session)

// 1. Running INSIDE tmux? Always use tmux.
if (await isInsideTmux()) {
  return { backend: createTmuxBackend(), isNative: true }
}

// 2. In iTerm2 with it2 CLI? Use iTerm2 (unless user prefers tmux).
if (isInITerm2()) {
  if (!getPreferTmuxOverIterm2()) {
    const it2Available = await isIt2CliAvailable()
    if (it2Available) {
      return { backend: createITermBackend(), isNative: true }
    }
  }
  // Fallback to tmux as external session if tmux is installed
  if (await isTmuxAvailable()) {
    return { backend: createTmuxBackend(), isNative: false, needsIt2Setup: true }
  }
}

// 3. Standalone terminal with tmux installed
if (await isTmuxAvailable()) {
  return { backend: createTmuxBackend(), isNative: false }
}

// 4. Nothing available → throw with platform-specific install instructions
throw new Error(getTmuxInstallInstructions())

Detection uses environment variables captured at module load time (TMUX, TMUX_PANE, TERM_PROGRAM, and ITERM_SESSION_ID), so the shell or later code overwriting them has no effect.

The it2 availability check intentionally runs it2 session list (not it2 --version) because --version exits 0 even when the iTerm2 Python API is disabled, which would cause pane splits to fail silently later.
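
The priority order can also be read as a pure decision over a snapshot of facts about the terminal. The sketch below mirrors the branches of the excerpt above; TerminalFacts and chooseBackend are hypothetical names, and the real function performs the checks lazily and asynchronously rather than taking them as input.

```typescript
type TerminalFacts = {
  insideTmux: boolean;
  inITerm2: boolean;
  preferTmux: boolean;    // user prefers tmux over iTerm2
  it2Available: boolean;  // it2 CLI works
  tmuxAvailable: boolean; // tmux is installed
};

type BackendChoice = {
  backend: "tmux" | "iterm2";
  isNative: boolean;
  needsIt2Setup?: boolean;
};

// First match wins, in the same order as the detection excerpt.
function chooseBackend(f: TerminalFacts): BackendChoice {
  if (f.insideTmux) return { backend: "tmux", isNative: true };
  if (f.inITerm2) {
    if (!f.preferTmux && f.it2Available) return { backend: "iterm2", isNative: true };
    if (f.tmuxAvailable) return { backend: "tmux", isNative: false, needsIt2Setup: true };
  }
  if (f.tmuxAvailable) return { backend: "tmux", isNative: false };
  throw new Error("No pane backend available");
}

// Inside tmux wins even when the terminal emulator is iTerm2.
const insideBoth = chooseBackend({
  insideTmux: true, inITerm2: true, preferTmux: false,
  it2Available: true, tmuxAvailable: true,
});
console.log(insideBoth.backend); // tmux
```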

tmux backend — inside vs. outside tmux layout strategies

Inside tmux (leader is in a pane): the first teammate splits the leader's pane horizontally with -l 70% so the leader keeps 30% of the window. Additional teammates split from existing teammate panes alternating vertical/horizontal, then select-layout main-vertical rebalances the right side.

Outside tmux (standalone terminal): a separate tmux server is created on a PID-scoped socket (claude-swarm-<pid>). A claude-swarm session with a swarm-view window is created. Teammates are laid out with a tiled layout. The user can attach to this session with tmux -L claude-swarm-<pid> attach.

// Internal tmux socket name prevents conflicts between multiple Claude instances
export function getSwarmSocketName(): string {
  return `claude-swarm-${process.pid}`
}

Pane creation is serialized via a promise-based mutex (acquirePaneCreationLock()) to prevent race conditions when multiple teammates are spawned in parallel. After creating each pane, the backend sleeps 200 ms to allow the shell (zsh/bash, oh-my-zsh, etc.) to finish initializing before sending the Claude command.
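
A promise-based mutex of the kind described above can be built by chaining each caller onto the previous caller's release. This is a minimal sketch of the general technique, not the code behind acquirePaneCreationLock(); createPaneLock and sleep are hypothetical names.

```typescript
// Returns a function that runs async jobs strictly one at a time,
// in the order they were submitted.
function createPaneLock() {
  let tail: Promise<void> = Promise.resolve();

  return async function withLock<T>(job: () => Promise<T>): Promise<T> {
    let release!: () => void;
    const gate = new Promise<void>(resolve => { release = resolve; });
    const previous = tail;
    tail = previous.then(() => gate); // later callers wait for our gate
    await previous;                   // wait for earlier callers to finish
    try {
      return await job();
    } finally {
      release();                      // always unblock the next caller
    }
  };
}

function sleep(ms: number): Promise<void> {
  return new Promise(resolve => setTimeout(resolve, ms));
}
```

A caller would wrap each pane creation, for example withLock(async () => { /* create pane */ await sleep(200) }), so the shell-initialization delay of one pane never overlaps the creation of the next.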

iTerm2 backend — at-fault dead-session recovery

The iTerm2 backend tracks pane session UUIDs in a module-level array. When spawning additional teammates, it targets the last-known session UUID via it2 session split -s <uuid>.

If a user closes a pane manually (Cmd+W), the next spawn will fail with a non-zero exit code. The backend confirms the session is truly dead by running it2 session list and checking for the UUID. If it's missing, it prunes the stale ID from the array and retries. This loop is bounded — each iteration shrinks the array by one, so it terminates at worst in N+1 iterations.

while (true) {
  const splitResult = await runIt2(splitArgs)
  if (splitResult.code !== 0 && targetedTeammateId) {
    const listResult = await runIt2(['session', 'list'])
    if (!listResult.stdout.includes(targetedTeammateId)) {
      teammateSessionIds.splice(idx, 1)   // prune dead session
      continue                               // retry with next-to-last
    }
    throw new Error(...)  // systemic failure — surface it
  }
  break
}
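
The excerpt leaves out how the target is chosen and why the loop is bounded. The sketch below fills that in under stated assumptions: splitWithRecovery is a hypothetical name, the it2 calls are replaced by injected synchronous callbacks, and an empty session list is assumed to mean "split from the leader's own session".

```typescript
type SplitResult = { code: number };

function splitWithRecovery(
  sessionIds: string[],                                  // known teammate sessions, oldest first
  trySplit: (target: string | undefined) => SplitResult, // stand-in for `it2 session split`
  listLive: () => string[],                              // stand-in for `it2 session list`
): { target: string | undefined; attempts: number } {
  let attempts = 0;
  while (true) {
    attempts++;
    const target: string | undefined = sessionIds[sessionIds.length - 1];
    const result = trySplit(target);
    if (result.code === 0) return { target, attempts };
    if (target !== undefined && listLive().indexOf(target) === -1) {
      sessionIds.pop(); // prune the dead session, then retry with the previous one
      continue;
    }
    throw new Error(`split failed with exit code ${result.code}`); // not a dead-session problem
  }
}

// Two of three panes were closed by hand; only s1 is still alive.
const knownSessions = ["s1", "s2", "s3"];
const liveSessions = ["s1"];
const outcome = splitWithRecovery(
  knownSessions,
  t => ({ code: t !== undefined && liveSessions.indexOf(t) !== -1 ? 0 : 1 }),
  () => liveSessions,
);
console.log(outcome.target, outcome.attempts); // s1 3
```

Each failed iteration either removes one entry or throws, which is why N stale IDs cost at most N+1 attempts.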
In-process backend — same Node.js process, isolated context

In-process teammates run inside the leader's Node.js process. They share the API client and MCP connections but get fully isolated identity context via AsyncLocalStorage — each teammate has its own agent name, team name, color, and AbortController.

Spawn flow:

  1. spawnInProcessTeammate() creates a TeammateContext and an independent AbortController (not linked to the leader's), registers an InProcessTeammateTaskState in AppState.tasks, and returns.
  2. InProcessBackend.spawn() then calls startInProcessTeammate() as a fire-and-forget — the agent loop runs on the event loop in the background.
  3. Kill uses abortController.abort() rather than killing a process, and the task status transitions to 'killed'.
// Teammates get an independent abort controller — interrupting the leader
// does NOT abort active teammates.
const abortController = createAbortController()

// Strip parent messages — teammates start with an empty conversation
toolUseContext: { ...this.context, messages: [] }
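
The identity isolation can be illustrated with Node's AsyncLocalStorage, which the text names as the mechanism. The sketch below is an assumption-heavy illustration: teammateStore, runAsTeammate, and currentAgentLabel are hypothetical names, and the context carries only the four fields mentioned above.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

type TeammateContextSketch = {
  agentName: string;
  teamName: string;
  color: string;
  abortController: AbortController; // independent per teammate
};

const teammateStore = new AsyncLocalStorage<TeammateContextSketch>();

// Everything called (or awaited) inside fn sees ctx as its identity.
function runAsTeammate<T>(ctx: TeammateContextSketch, fn: () => T): T {
  return teammateStore.run(ctx, fn);
}

function currentAgentLabel(): string {
  const ctx = teammateStore.getStore();
  return ctx ? `${ctx.agentName}@${ctx.teamName}` : "(no teammate context)";
}

const researcherCtx: TeammateContextSketch = {
  agentName: "researcher", teamName: "my-team", color: "blue",
  abortController: new AbortController(),
};
const testerCtx: TeammateContextSketch = {
  agentName: "tester", teamName: "my-team", color: "green",
  abortController: new AbortController(),
};

console.log(runAsTeammate(researcherCtx, currentAgentLabel)); // researcher@my-team
console.log(runAsTeammate(testerCtx, currentAgentLabel));     // tester@my-team
console.log(currentAgentLabel());                             // (no teammate context)
```

Because each context owns its AbortController, aborting one teammate leaves the others and the leader untouched.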

What Gets Sent to the Pane

For pane-backed teammates (tmux/iTerm2), the backend uses buildInheritedCliFlags() and buildInheritedEnvVars() to construct the shell command that starts a new Claude instance inside the pane.

CLI flags and env vars inherited by teammates
// Flags always forwarded if applicable:
'--dangerously-skip-permissions'  // if bypassPermissions mode active
'--permission-mode acceptEdits'   // if acceptEdits mode active
'--model <model>'                 // if set via CLI
'--settings <path>'               // if set via CLI
'--plugin-dir <dir>'              // for each inline plugin
'--teammate-mode <mode>'          // so nested spawns use same mode
'--chrome / --no-chrome'          // if set explicitly

// Env vars always set:
CLAUDECODE=1
CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
CLAUDE_CODE_AGENT_COLOR=<color>   // assigned color for pane border

// Env vars forwarded if present in parent process:
CLAUDE_CODE_USE_BEDROCK  CLAUDE_CODE_USE_VERTEX  ANTHROPIC_BASE_URL
CLAUDE_CONFIG_DIR  CLAUDE_CODE_REMOTE  HTTPS_PROXY  ...

Plan mode is special: if planModeRequired=true, bypass-permissions flags are not propagated even if the leader has them set. Safety trumps convenience.
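
The plan-mode rule is easiest to see as a small function over the leader's state. This is a hedged sketch, not buildInheritedCliFlags(): sketchInheritedFlags and LeaderStateSketch are hypothetical, only a subset of the flags listed above is modeled, and the source states the suppression only for the bypass flag, so acceptEdits is passed through unchanged here.

```typescript
type LeaderStateSketch = {
  permissionMode: "default" | "acceptEdits" | "bypassPermissions" | "plan";
  model?: string;        // set via CLI
  teammateMode?: string; // forwarded so nested spawns use the same mode
};

function sketchInheritedFlags(leader: LeaderStateSketch, planModeRequired: boolean): string[] {
  const flags: string[] = [];
  // Plan mode wins: never hand bypass permissions to a plan-mode teammate.
  if (leader.permissionMode === "bypassPermissions" && !planModeRequired) {
    flags.push("--dangerously-skip-permissions");
  }
  if (leader.permissionMode === "acceptEdits") {
    flags.push("--permission-mode", "acceptEdits");
  }
  if (leader.model) flags.push("--model", leader.model);
  if (leader.teammateMode) flags.push("--teammate-mode", leader.teammateMode);
  return flags;
}

const bypassLeader: LeaderStateSketch = { permissionMode: "bypassPermissions" };
console.log(sketchInheritedFlags(bypassLeader, false)); // [ '--dangerously-skip-permissions' ]
console.log(sketchInheritedFlags(bypassLeader, true));  // []
```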

How Teammates Communicate

Every agent — regardless of backend — communicates through a file-based mailbox. Messages are JSON files written to ~/.claude/teams/<team>/inbox/<agent-name>/. The leader polls these inboxes and delivers messages as new conversation turns.

researcher
    │
    │  writeToMailbox("team-lead", {
    │    from: "researcher",
    │    text: "Finished analysis. See tasks/0003.",
    │    timestamp: "2025-..."
    │  }, teamName)
    │
    ▼
~/.claude/teams/my-team/inbox/team-lead/<uuid>.json
    │
    │  (leader inbox poller reads file, delivers as user message)
    │
    ▼
Team Lead receives message as a new conversation turn
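
A file-based mailbox of this shape can be sketched with Node's synchronous fs API. The function names, the teamsRoot parameter, and the sort-by-timestamp read order are assumptions for illustration; the sketch writes to a throwaway temp directory rather than ~/.claude/teams/.

```typescript
import { mkdirSync, mkdtempSync, readdirSync, readFileSync, writeFileSync } from "node:fs";
import { randomUUID } from "node:crypto";
import { tmpdir } from "node:os";
import { join } from "node:path";

type MailboxMessageSketch = { from: string; text: string; timestamp: string };

function inboxDir(teamsRoot: string, teamName: string, agentName: string): string {
  return join(teamsRoot, teamName, "inbox", agentName);
}

// One JSON file per message, named by a random UUID.
function writeToMailboxSketch(
  teamsRoot: string, recipient: string, message: MailboxMessageSketch, teamName: string,
): string {
  const dir = inboxDir(teamsRoot, teamName, recipient);
  mkdirSync(dir, { recursive: true });
  const file = join(dir, `${randomUUID()}.json`);
  writeFileSync(file, JSON.stringify(message));
  return file;
}

// Read every message in an existing inbox, oldest first (assumed ordering).
function readInboxSketch(teamsRoot: string, teamName: string, agentName: string): MailboxMessageSketch[] {
  const dir = inboxDir(teamsRoot, teamName, agentName);
  return readdirSync(dir)
    .filter(f => f.endsWith(".json"))
    .map(f => JSON.parse(readFileSync(join(dir, f), "utf8")) as MailboxMessageSketch)
    .sort((a, b) => a.timestamp.localeCompare(b.timestamp));
}

const demoRoot = mkdtempSync(join(tmpdir(), "teams-"));
writeToMailboxSketch(demoRoot, "team-lead", {
  from: "researcher",
  text: "Finished analysis. See tasks/0003.",
  timestamp: new Date().toISOString(),
}, "my-team");
const leadInbox = readInboxSketch(demoRoot, "my-team", "team-lead");
console.log(leadInbox.length, leadInbox[0].from); // 1 researcher
```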
Automatic delivery

The lead never manually "checks" its inbox. The inbox poller wakes on file-system events and delivers queued messages as synthetic user turns. If the leader is mid-turn, messages queue and are delivered when the current turn ends.
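
The queue-until-turn-ends behavior can be sketched as a tiny state machine. InboxQueueSketch is hypothetical, and it simplifies the real flow: here delivery is a plain callback, whereas in the described system a delivered message becomes a new conversation turn.

```typescript
class InboxQueueSketch {
  private pending: string[] = [];
  private midTurn = false;
  private deliver: (text: string) => void;

  constructor(deliver: (text: string) => void) {
    this.deliver = deliver;
  }

  beginTurn(): void {
    this.midTurn = true;
  }

  // Called by the poller for each new mailbox file.
  onMessage(text: string): void {
    if (this.midTurn) this.pending.push(text); // hold until the turn ends
    else this.deliver(text);
  }

  // Flush everything that arrived during the turn, in arrival order.
  endTurn(): void {
    this.midTurn = false;
    for (const text of this.pending.splice(0)) this.deliver(text);
  }
}

const deliveredTurns: string[] = [];
const leadQueue = new InboxQueueSketch(text => deliveredTurns.push(text));
leadQueue.beginTurn();
leadQueue.onMessage("researcher: done with 0003");
leadQueue.onMessage("tester: 2 failures");
console.log(deliveredTurns.length); // 0
leadQueue.endTurn();
console.log(deliveredTurns.length); // 2
```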

Message types that flow through the mailbox
| Type | Direction | Purpose |
| --- | --- | --- |
| plain text | any → any | Task updates, questions, results |
| shutdown_request | lead → teammate | Graceful shutdown signal |
| idle notification | teammate → lead | System-generated after every turn end |
| permission_request | worker → lead | Worker needs UI approval for a tool |
| permission_response | lead → worker | Approval/denial of tool use |
| mode_set_request | lead → teammate | Change teammate's permission mode |
| sandbox_permission_request | worker → lead | Worker needs network access approval |
No structured JSON in plain messages

Teammates should never send raw JSON objects like {"type":"idle",...} as user messages. Only the system sends structured message types. Plain agent-to-agent communication is always natural language.
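
One way a receiver could tell system messages from natural-language ones is to check for a known type field. The sketch below is an assumption, not the project's parser: classifyMailboxText is hypothetical, the wire format of each structured message is not documented here, and the idle notification is left out because its exact type string is not given in the table.

```typescript
const STRUCTURED_TYPES = [
  "shutdown_request",
  "permission_request",
  "permission_response",
  "mode_set_request",
  "sandbox_permission_request",
] as const;

type StructuredType = (typeof STRUCTURED_TYPES)[number];

// Returns the structured type if text is JSON with a known "type", else "plain".
function classifyMailboxText(text: string): StructuredType | "plain" {
  try {
    const parsed: unknown = JSON.parse(text);
    if (typeof parsed === "object" && parsed !== null && "type" in parsed) {
      const t = (parsed as { type: unknown }).type;
      if (typeof t === "string" && (STRUCTURED_TYPES as readonly string[]).indexOf(t) !== -1) {
        return t as StructuredType;
      }
    }
  } catch {
    // Not JSON at all: ordinary natural-language message.
  }
  return "plain";
}

console.log(classifyMailboxText('{"type":"shutdown_request"}'));        // shutdown_request
console.log(classifyMailboxText("Finished analysis. See tasks/0003.")); // plain
```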

Worker → Leader Permission Escalation

When a worker agent attempts a tool use it lacks permission for, it can escalate to the team lead via the mailbox rather than silently failing. The leader sees the request in its UI (the standard ToolUseConfirm dialog) and approves or denies it.

worker needs to run Edit("src/api.ts", ...)
  │
  ├── createPermissionRequest({ toolName, input, description, ... })
  │
  ├── sendPermissionRequestViaMailbox(request)
  │     └─► writes to inbox/team-lead/<uuid>.json
  │
  │   [leader's UI shows ToolUseConfirm dialog]
  │   [user approves or denies]
  │
  ├── sendPermissionResponseViaMailbox(workerName, resolution, requestId)
  │     └─► writes to inbox/<workerName>/<uuid>.json
  │
  └── pollForResponse(requestId)   ← worker polls until resolved
        returns { decision: "approved" | "denied", updatedInput?, permissionUpdates? }
Permission request schema and locking
type SwarmPermissionRequest = {
  id:                  string   // "perm-<ts>-<random>"
  workerId:            string   // "researcher@my-team"
  workerName:          string
  workerColor?:        string
  teamName:            string
  toolName:            string   // e.g. "Edit"
  toolUseId:           string
  description:         string   // human-readable summary
  input:               Record<string, unknown>
  permissionSuggestions: unknown[]
  status:              'pending' | 'approved' | 'rejected'
  resolvedBy?:         'worker' | 'leader'
  resolvedAt?:         number
  feedback?:           string
  updatedInput?:       Record<string, unknown>
  permissionUpdates?:  unknown[]
  createdAt:           number
}

Write operations on the pending directory use a .lock file with proper-lockfile semantics so concurrent workers don't corrupt each other's requests.
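
The request lifecycle implied by the schema (created as pending, resolved exactly once) can be sketched with two pure helpers. Both function names are hypothetical, the type is a trimmed subset of SwarmPermissionRequest, and the base-36 encoding of the random ID suffix is an assumption; only the "perm-<ts>-<random>" shape comes from the schema comment.

```typescript
type PermissionRequestSketch = {
  id: string;
  workerName: string;
  toolName: string;
  status: "pending" | "approved" | "rejected";
  resolvedBy?: "worker" | "leader";
  resolvedAt?: number;
  feedback?: string;
  createdAt: number;
};

// Shape "perm-<ts>-<random>"; the suffix encoding is assumed.
function newRequestId(now: number, rand: () => number = Math.random): string {
  return `perm-${now}-${rand().toString(36).slice(2, 8)}`;
}

// Returns a resolved copy; refuses to resolve the same request twice.
function resolveRequest(
  req: PermissionRequestSketch,
  decision: "approved" | "rejected",
  resolvedBy: "worker" | "leader",
  now: number,
  feedback?: string,
): PermissionRequestSketch {
  if (req.status !== "pending") {
    throw new Error(`Request ${req.id} is already ${req.status}`);
  }
  return { ...req, status: decision, resolvedBy, resolvedAt: now, feedback };
}

const pendingEdit: PermissionRequestSketch = {
  id: newRequestId(1700000000000, () => 0.5),
  workerName: "researcher",
  toolName: "Edit",
  status: "pending",
  createdAt: 1700000000000,
};
const approvedEdit = resolveRequest(pendingEdit, "approved", "leader", 1700000005000);
console.log(approvedEdit.status, approvedEdit.resolvedBy); // approved leader
```

Returning a copy instead of mutating mirrors the move from pending/ to resolved/: the pending record and the resolved record are distinct artifacts.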

Leader permission bridge (in-process only)

For in-process teammates, permission prompts surface directly in the leader's terminal UI rather than going through the file system. The bridge module (leaderPermissionBridge.ts) is a module-level registry that stores the React setter functions the REPL registers at startup:

// In leaderPermissionBridge.ts — no React dependency
let registeredSetter: SetToolUseConfirmQueueFn | null = null

export function registerLeaderToolUseConfirmQueue(
  setter: SetToolUseConfirmQueueFn,
): void {
  registeredSetter = setter
}
export function getLeaderToolUseConfirmQueue(): SetToolUseConfirmQueueFn | null {
  return registeredSetter
}

The in-process runner calls getLeaderToolUseConfirmQueue() when it needs to present a confirmation dialog, bypassing the file-based permission system entirely.

Team Workflow End-to-End

Phase 1 — Setup

  • Lead calls TeamCreate (creates config.json + tasks dir)
  • Lead creates tasks with TaskCreate
  • Lead spawns teammates with Agent tool (tmux/iTerm2/in-process)
  • Teammates join team, write their entry to config.json

Phase 2 — Parallel Work

  • Teammates check TaskList, claim unassigned tasks
  • Teammates work, send progress messages via SendMessage
  • After each turn, teammate sends idle notification automatically
  • Lead assigns new tasks or sends follow-up work

Phase 3 — Shutdown

  • Lead sends {"type":"shutdown_request"} to each teammate's mailbox
  • Teammate approves, exits — sets isActive: false
  • Lead waits until no active non-lead members remain
  • Lead calls TeamDelete — cleans dirs, kills panes, clears AppState

Phase 4 — Cleanup Guard

  • If session ends without TeamDelete (SIGINT, crash), the shutdown hook calls cleanupSessionTeams()
  • Panes of pane-backed teammates are killed first (so zombie processes don't linger in tmux)
  • Team and task directories are then removed
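
The cleanup guard boils down to a registry plus an ordered teardown. The sketch below assumes the shape of that registry: createSessionCleanup and its injected killPanes/removeDirs callbacks are hypothetical stand-ins for registerTeamForSessionCleanup(), unregisterTeamForSessionCleanup(), and cleanupSessionTeams().

```typescript
type CleanupDeps = {
  killPanes: (team: string) => void;  // stand-in for killing tmux/iTerm2 panes
  removeDirs: (team: string) => void; // stand-in for removing teams/ and tasks/ dirs
};

function createSessionCleanup(deps: CleanupDeps) {
  const registered = new Set<string>();
  return {
    register(team: string): void { registered.add(team); },
    unregister(team: string): void { registered.delete(team); }, // TeamDelete already cleaned up
    cleanupAll(): void {
      registered.forEach(team => {
        deps.killPanes(team);  // panes first, so no process keeps running
        deps.removeDirs(team); // then the on-disk state
      });
      registered.clear();
    },
  };
}

const teardownLog: string[] = [];
const sessionCleanup = createSessionCleanup({
  killPanes: team => teardownLog.push(`kill:${team}`),
  removeDirs: team => teardownLog.push(`rm:${team}`),
});
sessionCleanup.register("alpha");
sessionCleanup.register("beta");
sessionCleanup.unregister("alpha"); // explicit TeamDelete happened for alpha
sessionCleanup.cleanupAll();        // session exits with beta still registered
console.log(teardownLog.join(", ")); // kill:beta, rm:beta
```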

Cleaning Up a Team

TeamDelete takes no input; the team name comes from appState.teamContext.teamName. It refuses to run if any non-lead member still has isActive !== false, forcing the lead to shut down teammates gracefully first.

Source: TeamDeleteTool.ts — the call() method
async call(_input, context) {
  const { setAppState, getAppState } = context
  const teamName = getAppState().teamContext?.teamName

  if (teamName) {
    const teamFile = readTeamFile(teamName)
    const nonLeadMembers = teamFile.members
      .filter(m => m.name !== TEAM_LEAD_NAME)
    const activeMembers = nonLeadMembers
      .filter(m => m.isActive !== false)   // idle = false, running = true/undefined

    if (activeMembers.length > 0) {
      const memberNames = activeMembers.map(m => m.name).join(', ')
      return { success: false, message: `Cannot cleanup: ${memberNames} still active` }
    }

    await cleanupTeamDirectories(teamName)  // teams/ + tasks/ + git worktrees
    unregisterTeamForSessionCleanup(teamName)
    clearTeammateColors()
    clearLeaderTeamName()
  }

  setAppState(prev => ({
    ...prev,
    teamContext: undefined,
    inbox: { messages: [] },   // flush queued messages
  }))
}

The cleanupTeamDirectories() function first reads the config to collect worktreePath entries, then destroys each git worktree with git worktree remove --force (falling back to rm -rf if that fails), and finally removes the team's own directories under ~/.claude/teams/ and ~/.claude/tasks/.
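
The isActive check that gates TeamDelete is tri-state, which is easy to get wrong: only an explicit false counts as idle. A minimal sketch of that rule, with hypothetical names and a trimmed member type:

```typescript
type MemberSketch = { name: string; isActive?: boolean };

// Names of teammates that still block TeamDelete.
// undefined and true both mean "running"; only false means idle.
function activeNonLeadMembers(members: MemberSketch[]): string[] {
  return members
    .filter(m => m.name !== "team-lead")
    .filter(m => m.isActive !== false)
    .map(m => m.name);
}

const roster: MemberSketch[] = [
  { name: "team-lead" },                   // the lead never blocks deletion
  { name: "researcher", isActive: false }, // idle
  { name: "tester" },                      // flag missing, so treated as running
];
console.log(activeNonLeadMembers(roster)); // [ 'tester' ]
```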

TeamStatus & TeamsDialog

TeamStatus (footer badge)

Reads appState.teamContext.teammates and counts non-lead members. Renders a pill in the footer showing N teammates. When selected, pressing Enter opens TeamsDialog. Returns null when no teammates are active (zero teammates = no rendering cost).

TeamsDialog (interactive panel)

Two-level navigation: list view shows all teammates with their status, mode, and active/idle state; detail view lets the lead cycle the teammate's permission mode (default → acceptEdits → bypassPermissions → plan), kill the teammate, or jump to its tmux/iTerm2 pane. Refreshes on a 1-second interval to pick up mode changes written to config.json by teammates.

How TeamsDialog cycles permission modes

In list view, pressing the cycle-mode key calls cycleAllTeammateModes() which uses setMultipleMemberModes() for an atomic multi-write, preventing the TOCTOU issue of sequential single writes.

In detail view, it cycles only the selected teammate via setMemberMode(), then sends a mode_set_request message to the teammate's mailbox so the running process is notified without a file-watch poll delay.

// Atomic update of all member modes in one config.json write
setMultipleMemberModes(teamName, [
  { memberName: 'researcher', mode: 'acceptEdits' },
  { memberName: 'tester',     mode: 'acceptEdits' },
])
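
The cycle order and the single-write update can both be sketched as pure functions. nextMode and applyMemberModes are hypothetical names; wrapping from plan back to default and treating a missing mode as default are assumptions, and the real setMultipleMemberModes() additionally reads and writes config.json.

```typescript
type PermissionModeSketch = "default" | "acceptEdits" | "bypassPermissions" | "plan";

const MODE_CYCLE: PermissionModeSketch[] = ["default", "acceptEdits", "bypassPermissions", "plan"];

// Advance one step in the documented order, wrapping at the end (assumed).
function nextMode(mode: PermissionModeSketch | undefined): PermissionModeSketch {
  const i = MODE_CYCLE.indexOf(mode ?? "default");
  return MODE_CYCLE[(i + 1) % MODE_CYCLE.length];
}

type ModeMember = { name: string; mode?: PermissionModeSketch };
type ModeUpdate = { memberName: string; mode: PermissionModeSketch };

// Apply every update to an in-memory copy so the caller can persist
// the result with one write instead of one write per member.
function applyMemberModes(members: ModeMember[], updates: ModeUpdate[]): ModeMember[] {
  return members.map(m => {
    const update = updates.filter(u => u.memberName === m.name).pop(); // last update wins
    return update ? { ...m, mode: update.mode } : m;
  });
}

const beforeCycle: ModeMember[] = [
  { name: "team-lead" },
  { name: "researcher", mode: "default" },
  { name: "tester", mode: "plan" },
];
const afterCycle = applyMemberModes(
  beforeCycle,
  beforeCycle
    .filter(m => m.name !== "team-lead")
    .map(m => ({ memberName: m.name, mode: nextMode(m.mode) })),
);
console.log(afterCycle.map(m => `${m.name}:${m.mode ?? "-"}`).join(" "));
// team-lead:- researcher:acceptEdits tester:default
```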

Key Takeaways

  1. Team = TaskList. TeamCreate creates a matching task directory under ~/.claude/tasks/, and sanitizeName(teamName) is the shared key used by both lead and teammates to find tasks.
  2. Three backends, one interface. The TeammateExecutor interface abstracts tmux, iTerm2, and in-process. The registry detects and caches the right backend once per session; callers never check process.env.TMUX directly.
  3. Detection is captured at module load. TMUX, TMUX_PANE, and ITERM_SESSION_ID are read at import time into module-level constants. This is intentional — the shell later overwrites TMUX for its own socket, and the system must not confuse that with "user is inside tmux."
  4. All messaging is file-based. Even in-process teammates use the mailbox for SendMessage. The only exception is the leader permission bridge, which uses a module-level setter to push permission dialogs into the React UI without file I/O.
  5. Idle is not dead. Teammates write isActive: false to config.json after every turn. The lead should not treat an idle teammate as failed or shut it down just for being idle; idle teammates wake normally when a new message arrives in their mailbox.
  6. TeamDelete blocks on active members. It checks isActive !== false to decide if a member is still running. Always shut teammates down (shutdown_request → approve → isActive=false) before calling TeamDelete.
  7. Session cleanup is a safety net. registerTeamForSessionCleanup() ensures that if the session ends without an explicit TeamDelete, the shutdown hook still cleans up orphaned panes and directories.

Quiz

1. Which backend is selected when Claude is launched from inside an existing tmux session, even if iTerm2 is the terminal emulator?

a) iTerm2 backend, because it is the native terminal emulator
b) tmux backend — being inside tmux is the highest-priority condition
c) in-process backend, since iTerm2 has no tmux socket
d) An error is thrown asking the user to choose

2. A teammate has finished its assigned tasks and called TaskUpdate to mark them complete. The lead sees the member listed in the team config with isActive: false. What does this mean?

a) The teammate process has crashed and should be re-spawned
b) The teammate is idle — its current turn ended; it will wake when a new message arrives
c) The teammate has shut down permanently and cannot receive more messages
d) TeamDelete can now be safely called without sending a shutdown request

3. Where does TeamCreate store the team's configuration file?

a) ~/.claude/sessions/<team-name>/config.json
b) ~/.claude/agents/<team-name>.json
c) ~/.claude/teams/<sanitized-name>/config.json
d) <project-cwd>/.claude/team.json

4. Why does isIt2CliAvailable() run it2 session list instead of it2 --version?

a) --version is not a valid flag for the it2 CLI
b) --version exits 0 even when the iTerm2 Python API is disabled, so a later session split would fail silently
c) session list is faster because it uses a cached result
d) The it2 CLI requires session list to authenticate before any other command

5. When the team lead session exits via Ctrl-C (SIGINT) without calling TeamDelete, what happens to orphaned teammate panes and directories?

a) They are left on disk and in tmux; the user must clean them up manually
b) Only the config.json is deleted; panes remain open
c) cleanupSessionTeams() kills pane-backed teammate processes first, then removes team and task directories
d) TeamDelete is called automatically on the next Claude session start

6. A teammate running in tmux mode has planModeRequired: true. The lead session has bypassPermissions active. What happens when the teammate's spawn command is built?

a) --dangerously-skip-permissions is included because the lead has bypass active
b) --dangerously-skip-permissions is not included — plan mode takes precedence over bypass permissions
c) Both --dangerously-skip-permissions and --plan-mode-required are included
d) An error is thrown because plan mode and bypass permissions cannot coexist