diff --git a/devlog/2026-06-13_smart-compression-prompts/WORKLOG.md b/devlog/2026-06-13_smart-compression-prompts/WORKLOG.md
index 0ac869c..ac96bba 100644
--- a/devlog/2026-06-13_smart-compression-prompts/WORKLOG.md
+++ b/devlog/2026-06-13_smart-compression-prompts/WORKLOG.md
@@ -22,7 +22,27 @@
 - Added `resolveThresholdPercent()` helper to parse `number | "NN%"` config values
 - Both inject.ts and utils.ts now call the same shared function
 
+### 4. Iteration 2: Anti-re-compression + tool-output-first guidance
+**Trigger**: Observed agent repeatedly compressing already-compressed block summaries (300 tokens each) with negligible effect. Context stayed at ~53% despite 5+ compressions.
+
+**Changes to `lib/prompts/system.ts`**:
+- Added principle: "Target the largest UNCOMPRESSED content first"
+- Expanded WHAT TO COMPRESS FIRST with recoverable high-token content types
+- New section: DO NOT RE-COMPRESS (low value, diminishing returns)
+
+### 5. Iteration 3: Dual Oracle review fixes
+**Trigger**: Dual Oracle review found 1 CRITICAL + 7 MAJOR issues.
+
+**Fixes applied**:
+- **CRITICAL**: Agent results bullet — removed "compress immediately" (contradicted pressure-based philosophy), explained protected tools auto-preservation behavior, made decompress primary recovery path (not re-invoke)
+- **MAJOR**: DO NOT RE-COMPRESS — added aging warning exception (nudge.ts tells model to re-summarize aging blocks; system prompt must not contradict)
+- **MAJOR**: Merged 4 redundant bullets (terminal/build/git/publish) into one "Verbose command output" bullet
+- **MAJOR**: "Content needed in next 2-3 turns" → "Content whose immediate use is complete" (models can't predict future)
+- **MAJOR**: Build/test output — keep failure messages + file/line refs, not just verdict (contradicted compress-range.ts EXHAUSTIVE requirement)
+- **MINOR**: Added missing scenarios — resolved discussion threads, pending tool calls guard
+- **MINOR**: Removed specific agent names (Oracle, explore, librarian) — use generic phrasing
+
 ## Verification
 - TypeScript: clean
-- Build: success (301.02 KB)
+- Build: success
 - Tests: 386/386 pass
diff --git a/lib/prompts/system.ts b/lib/prompts/system.ts
index 385d942..b32bb32 100644
--- a/lib/prompts/system.ts
+++ b/lib/prompts/system.ts
@@ -12,6 +12,8 @@ Compression replaces raw conversation content with dense summaries. When used co
 
 The key principle: compress based on context pressure, not habit. When context is ample, compress rarely or not at all. When context is tight, compress aggressively but selectively. The runtime context usage indicator tells you the current pressure level.
 
+Target the largest UNCOMPRESSED content first. Savings scale with original size — compressing a 5000-token tool output frees far more than re-shrinking an already-summarized 300-token block.
+
 CONTEXT PRESSURE LEVELS
 
 - Ample: Context is well below the threshold. Do NOT compress unless there is obvious waste (huge terminal dumps, duplicated content). Focus entirely on your task.
@@ -20,11 +22,21 @@ CONTEXT PRESSURE LEVELS
 
 WHAT TO COMPRESS FIRST (high value, low risk)
 
-- Verbose terminal/bash command output (build logs, test output, directory listings)
-- Exploration that led nowhere (failed approaches, dead-end searches)
-- Redundant tool results (reading the same file multiple times, repeated status checks)
-- Intermediate steps of completed multi-step tasks
-- Large file contents that have already been used and are no longer needed
+- Agent/subagent review and consultation results: Prime compression targets when context pressure rises — the surrounding reasoning and tool-call chatter is typically the largest block of uncompressed content. Note: if the agent tool is in your protected list, its output is auto-preserved in the summary, so the savings come from the surrounding conversation, not the agent output itself. Compress once you have fully consumed the results (all recommended actions applied or recorded in files). Recover via \`decompress\` while the block is still active. Re-invoking the agent is a last resort — it is a fresh run, not a cache hit.
+- Verbose command output (build/test runs, git diff/log/status, publish logs, directory listings): Once you have read the result, compress. Keep only the verdict — pass/fail status, commit hash, version number, or count. For failures, keep the specific error messages and file/line references needed to act on them. The full output is reproducible by re-running the command.
+- Exploration that led nowhere (failed approaches, dead-end searches): Compress to a one-line note about what was tried and why it failed.
+- Redundant tool results (reading the same file multiple times, repeated status checks, exhausted search results): Keep only the most recent result.
+- Intermediate steps of completed multi-step tasks: Once the task is done, compress the process. Keep only the final outcome.
+- Resolved discussion threads (clarification rounds, negotiated requirements, design debate that reached a decision): Once a conclusion is recorded, compress the back-and-forth. Keep the decision and its rationale.
+- Large file contents that have already been used and are no longer needed: Compress to a summary of key functions, types, or patterns.
+
+DO NOT RE-COMPRESS (low value, diminishing returns)
+
+- Already-compressed block summaries: Re-compressing a summary into a shorter summary saves negligible tokens. If a block needs better detail, use \`decompress\` to restore it, then compress the original content properly. Exception: if a block-aging warning flags specific block IDs as facing GC truncation, re-summarize exactly those flagged blocks into a fresh range — this preserves detail that GC would otherwise destroy.
+- Short messages (1-3 sentences): The compression overhead (block metadata, summary structure) may exceed the tokens saved.
+- Content whose immediate use is complete — the task it supported is done and no open todo/plan references it. If still in active use, let it stay.
+- User instructions and requirements: These must remain visible until the task is complete.
+- Tool calls that are still pending or in-progress: Wait until the result is returned and consumed.
 
 WHAT TO COMPRESS CAREFULLY (high risk - verify before compressing)