
Context Window Management

Using Claude itself to summarize conversation history and extend sessions beyond the token limit.

Context Window Management — Architecture Diagram
flowchart TD
    A[After Each Turn] --> B{Token Usage > 80%?}
    B -->|No| C[Continue]
    B -->|Yes| D[pre_compact hook]
    D --> E[Call Claude: Summarize]
    E --> F[Summary replaces history]
    F --> G[post_compact hook]
    G --> H[Continue with fresh context]


Deep Dive

The context window is finite. Without compaction, a long coding session would hit the limit and fail. Claude Code solves this by using Claude itself as a compressor — calling the API to summarize the conversation history.
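The trigger-and-summarize loop from the diagram can be sketched in a few lines. This is an illustrative model only — the interfaces, threshold constant, and token estimate below are assumptions, not Claude Code's actual internals, and the summarizer is injected rather than a real API call:

```typescript
// Hypothetical sketch of the compaction flow (names invented for illustration).

interface Turn { role: 'user' | 'assistant'; content: string }

const CONTEXT_LIMIT = 200_000   // assumed token budget
const COMPACT_THRESHOLD = 0.8   // compact when usage exceeds 80%, per the diagram

// Crude token estimate (~4 chars per token); the real code counts tokens exactly.
const estimateTokens = (turns: Turn[]): number =>
  Math.ceil(turns.reduce((n, t) => n + t.content.length, 0) / 4)

const shouldCompact = (turns: Turn[]): boolean =>
  estimateTokens(turns) > CONTEXT_LIMIT * COMPACT_THRESHOLD

// Replace the entire history with a single summary turn produced by the model.
async function compact(
  turns: Turn[],
  summarize: (history: Turn[]) => Promise<string>,
): Promise<Turn[]> {
  const summary = await summarize(turns)
  return [{ role: 'user', content: `Conversation summary:\n${summary}` }]
}
```

Injecting `summarize` keeps the flow testable without a live API; in production it would wrap a Claude API call.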

🔑 Key Insight

Meta-pattern: Claude Code uses Claude to manage its own context. The compaction prompt is tuned to preserve tool results, decisions, and rationale — the things most important for continuing a task.
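A compaction prompt tuned this way might look like the following. The wording is paraphrased for illustration — it is not the exact prompt Claude Code ships — and `buildCompactionRequest` and its `max_tokens` value are invented placeholders:

```typescript
// Illustrative compaction prompt: the point is the explicit recall targets
// (tool results, decisions, rationale), not the exact wording.
const COMPACTION_PROMPT = `Summarize this coding session so work can continue seamlessly.
Preserve, with specifics:
- Every file read or edited, and the final state of each change
- Tool results that remain relevant (test output, errors, search hits)
- Decisions made and the rationale behind them
- The current task and the immediate next steps
Omit pleasantries and superseded attempts.`

// Hypothetical request builder; max_tokens stands in for COMPACT_MAX_OUTPUT_TOKENS.
const buildCompactionRequest = (transcript: string) => ({
  max_tokens: 4096,
  messages: [
    { role: 'user' as const, content: `${COMPACTION_PROMPT}\n\n${transcript}` },
  ],
})
```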

💡 Tip

MicroCompact is more surgical: it summarizes only the most recent tool call cluster, not the whole history. Use it when you want to trim recently completed work without disrupting the broader conversation context.
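A minimal sketch of that surgical trim, under the assumption that MicroCompact replaces only the trailing run of tool results with a stub (the `Entry` shape and function name here are illustrative, not the real implementation):

```typescript
// Hypothetical MicroCompact sketch: collapse only the most recent cluster of
// tool-result entries into one stub, leaving earlier context untouched.
interface Entry { kind: 'text' | 'tool_result'; content: string }

function microCompact(history: Entry[], stub: string): Entry[] {
  // Walk back over the trailing tool-result cluster.
  let i = history.length
  while (i > 0 && history[i - 1].kind === 'tool_result') i--
  if (i === history.length) return history // no trailing tool results: nothing to trim
  return [...history.slice(0, i), { kind: 'text', content: stub }]
}
```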

KEY TAKEAWAYS
  • Using an LLM to compress LLM context is uniquely powerful
  • Compaction prompts must be tuned for recall of decisions and tool results
  • Pre/post hooks let users save full history before compaction
  • MicroCompact enables surgical trimming without full-session disruption
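The hook choreography around compaction can be sketched as follows. The hook names match the diagram (`pre_compact`, `post_compact`), but the registry and wiring here are invented for illustration:

```typescript
// Sketch of the pre/post hook firing around compaction.
type Hook = (history: string[]) => void

const hooks: Record<'pre_compact' | 'post_compact', Hook[]> = {
  pre_compact: [],
  post_compact: [],
}

function runCompaction(
  history: string[],
  summarize: (h: string[]) => string,
): string[] {
  // pre_compact fires with the full history, so a user hook can archive it.
  hooks.pre_compact.forEach(h => h(history))
  const compacted = [summarize(history)]
  // post_compact fires with the replacement summary.
  hooks.post_compact.forEach(h => h(compacted))
  return compacted
}
```

A `pre_compact` hook that writes `history` to disk is how a user would save the full transcript before it is replaced.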

Source Code

The compact function showing the summarization call and hook firing.

import { feature } from 'bun:bundle'
import type { UUID } from 'crypto'
import uniqBy from 'lodash-es/uniqBy.js'

/* eslint-disable @typescript-eslint/no-require-imports */
const sessionTranscriptModule = feature('KAIROS')
  ? (require('../sessionTranscript/sessionTranscript.js') as typeof import('../sessionTranscript/sessionTranscript.js'))
  : null

import { APIUserAbortError } from '@anthropic-ai/sdk'
import { markPostCompaction } from 'src/bootstrap/state.js'
import { getInvokedSkillsForAgent } from '../../bootstrap/state.js'
import type { QuerySource } from '../../constants/querySource.js'
import type { CanUseToolFn } from '../../hooks/useCanUseTool.js'
import type { Tool, ToolUseContext } from '../../Tool.js'
import type { LocalAgentTaskState } from '../../tasks/LocalAgentTask/LocalAgentTask.js'
import { FileReadTool } from '../../tools/FileReadTool/FileReadTool.js'
import {
  FILE_READ_TOOL_NAME,
  FILE_UNCHANGED_STUB,
} from '../../tools/FileReadTool/prompt.js'
import { ToolSearchTool } from '../../tools/ToolSearchTool/ToolSearchTool.js'
import type { AgentId } from '../../types/ids.js'
import type {
  AssistantMessage,
  AttachmentMessage,
  HookResultMessage,
  Message,
  PartialCompactDirection,
  SystemCompactBoundaryMessage,
  SystemMessage,
  UserMessage,
} from '../../types/message.js'
import {
  createAttachmentMessage,
  generateFileAttachment,
  getAgentListingDeltaAttachment,
  getDeferredToolsDeltaAttachment,
  getMcpInstructionsDeltaAttachment,
} from '../../utils/attachments.js'
import { getMemoryPath } from '../../utils/config.js'
import { COMPACT_MAX_OUTPUT_TOKENS } from '../../utils/context.js'
import {
  analyzeContext,
  tokenStatsToStatsigMetrics,
} from '../../utils/contextAnalysis.js'
import { logForDebugging } from '../../utils/debug.js'
import { hasExactErrorMessage } from '../../utils/errors.js'
import { cacheToObject } from '../../utils/fileStateCache.js'
import {
// … (excerpt ends here; the remainder of the file is truncated)