25 changes: 25 additions & 0 deletions devlog/2026-06-13_smart-compression-prompts/REQ.md
@@ -0,0 +1,25 @@
# REQ: Smart Compression Prompts

## Background

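ACP's default 45%/55% thresholds only control when the nudge fires, but the system prompt and the injected context usage text encourage compression unconditionally. Specific problems:

1. **Context usage text**: every turn injects "use the compress tool proactively to manage context quality", whether context is at 10% or 55%, so the model compresses frequently even at low usage
2. **System prompt**: "Evaluate conversation signal-to-noise REGULARLY" over-encourages compression and has no awareness of how much context headroom remains
3. **No compression priority guidance**: the model does not know what to compress first and what to compress with care
4. **No recovery breadcrumb guidance**: no recoverable breadcrumbs are generated after compression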

## Acceptance Criteria

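1. The injected context usage text is tiered by usage (ample/moderate/tight); when context is ample it says to "compress little or not at all"
2. The system prompt includes compression priority guidance:
- Compress first: large bash output, useless logs, redundant tool results, dead-end exploration
- Compress with care: temporary secrets, file paths, key method signatures, user preferences, error messages
3. The system prompt requires confirming that important content is already stored externally (file, issue, devlog, etc.) before compressing it
4. The system prompt requires generating recovery breadcrumbs after compression (for example, a note-to-self style summary)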
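
The tiering in criterion 1 can be sketched as follows. This is a minimal illustration, assuming the 45%/55% defaults from the Background section; `PressureLevel` and `classifyPressure` are hypothetical names and are not taken from the codebase.

```typescript
// Hypothetical names for illustration only; not identifiers from the plugin.
type PressureLevel = "ample" | "moderate" | "high"

// Classify context usage (a percentage of the model limit) into a tier.
// Assumes minPct <= maxPct; the defaults mirror the 45%/55% thresholds.
function classifyPressure(usedPct: number, minPct = 45, maxPct = 55): PressureLevel {
    if (usedPct < minPct) return "ample" // compress little or not at all
    if (usedPct < maxPct) return "moderate" // compress completed sections
    return "high" // compress aggressively but selectively
}
```

Under these assumptions a reading exactly on a threshold falls into the higher tier: 45 maps to `moderate` and 55 maps to `high`.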

## Proposed Approach

- Modify `lib/prompts/system.ts`: rewrite the compression philosophy
- Modify `lib/messages/inject/inject.ts` + `lib/messages/inject/utils.ts`: tiered context usage text
- Prompt-only change; no logic or type changes
28 changes: 28 additions & 0 deletions devlog/2026-06-13_smart-compression-prompts/WORKLOG.md
@@ -0,0 +1,28 @@
# WORKLOG: Smart Compression Prompts

## Changes

### 1. `lib/prompts/system.ts` — Rewrote compression philosophy
- Replaced "Manage context continuously" with "primary goal is completing the task"
- Added CONTEXT PRESSURE LEVELS (qualitative: Ample/Moderate/High — no hardcoded percentages)
- Added WHAT TO COMPRESS FIRST (bash output, dead-end exploration, redundant tool results)
- Added WHAT TO COMPRESS CAREFULLY (secrets/keys, file paths, function signatures, errors, user preferences)
- Added BEFORE COMPRESSING IMPORTANT CONTENT (verify persisted externally)
- Added AFTER COMPRESSING (generate recovery breadcrumbs + mention decompress for recovery)

### 2. `lib/messages/inject/inject.ts` L182-213 — Tiered context usage injection
- Old: Always "use the compress tool proactively to manage context quality" regardless of usage
- New: Shared `buildContextUsageGuidance()` from utils.ts, config-driven thresholds
- Below minContextLimit (default 45%): "Context is ample — focus on your task"
- Between min/max (45-55%): "Context is moderate — compress completed sections"
- Above maxContextLimit (55%): "Context is high — compress aggressively but selectively"
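
Each tier message is appended to a shared usage header. The sketch below reproduces that header formatting in isolation so the numbers are easy to check. `usageHeader` is a hypothetical wrapper name used only here; `formatK` follows the helper shown in the diff.

```typescript
// Format a token count: values of 1000 or more become "N.NK", smaller ones stay as-is.
const formatK = (n: number): string => (n >= 1000 ? `${(n / 1000).toFixed(1)}K` : String(n))

// Hypothetical wrapper: builds only the header, not the tier-specific guidance.
// maxPct is assumed to be already resolved to a percentage.
function usageHeader(currentTokens: number, modelContextLimit: number, maxPct: number): string {
    const percentage = ((currentTokens / modelContextLimit) * 100).toFixed(1)
    return (
        `Context usage: ${formatK(currentTokens)} / ${formatK(modelContextLimit)} tokens ` +
        `(${percentage}%). ACP threshold: ${maxPct.toFixed(0)}%.`
    )
}

console.log(usageHeader(50_000, 200_000, 55))
// prints "Context usage: 50.0K / 200.0K tokens (25.0%). ACP threshold: 55%."
```

With the default thresholds, 25.0% sits below 45%, so this header would be followed by the "Context is ample" message.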

### 3. `lib/messages/inject/utils.ts` L360-410 — Shared tiered logic
- Exported `buildContextUsageGuidance()` replaces old private `buildContextUsageInfo()`
- Added `resolveThresholdPercent()` helper to parse `number | "NN%"` config values
- Both inject.ts and utils.ts now call the same shared function
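
The two accepted threshold forms behave differently: a bare number is read as a token count and needs the model limit to become a percentage, while an `"NN%"` string is already a percentage. A standalone sketch of that resolution, using the hypothetical name `toPercent` instead of the private helper so it can be run on its own:

```typescript
// A threshold is either an absolute token count or a percentage string such as "45%".
type Threshold = number | `${number}%` | undefined

// Hypothetical stand-in for the helper described above.
function toPercent(threshold: Threshold, modelContextLimit: number | undefined): number | undefined {
    if (threshold === undefined) return undefined
    if (typeof threshold === "number") {
        // Token counts cannot be converted without a known, non-zero limit.
        if (!modelContextLimit) return undefined
        return (threshold / modelContextLimit) * 100
    }
    // parseFloat stops at the trailing "%", so "45%" parses to 45.
    const parsed = parseFloat(threshold)
    return Number.isNaN(parsed) ? undefined : parsed
}

console.log(toPercent("45%", 200_000)) // 45
console.log(toPercent(50_000, 200_000)) // 25
console.log(toPercent(50_000, undefined)) // undefined
```

When the result is `undefined`, the caller in the diff falls back to 45 for the minimum and 55 for the maximum.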

## Verification
- TypeScript: clean
- Build: success (301.02 KB)
- Tests: 386/386 pass
13 changes: 5 additions & 8 deletions lib/messages/inject/inject.ts
@@ -23,6 +23,7 @@ import {
 import {
     addAnchor,
     applyAnchoredNudges,
+    buildContextUsageGuidance,
     countMessagesAfterIndex,
     findLastNonIgnoredMessage,
     getIterationNudgeThreshold,
@@ -163,7 +164,7 @@ export const injectCompressNudges = (

     applyAnchoredNudges(state, config, messages, prompts, compressionPriorities, currentTokens, modelContextLimit, suffixMessage)

-    injectContextUsage(suffixMessage, currentTokens, modelContextLimit)
+    injectContextUsage(suffixMessage, config, currentTokens, modelContextLimit)

     if (config.compress.mode !== "message") {
         const blockGuidance = buildCompressedBlockGuidance(state, config.gc, { currentTokens, modelContextLimit })
@@ -181,17 +182,13 @@ export const injectCompressNudges = (

 function injectContextUsage(
     target: WithParts | null,
+    config: PluginConfig,
     currentTokens?: number,
     modelContextLimit?: number,
 ): void {
     if (!target) return
-    if (currentTokens === undefined || modelContextLimit === undefined || modelContextLimit === 0) {
-        return
-    }
-
-    const percentage = ((currentTokens / modelContextLimit) * 100).toFixed(1)
-    const formatK = (n: number) => (n >= 1000 ? `${(n / 1000).toFixed(1)}K` : String(n))
-    const usageTag = `\n\nContext usage: ${formatK(currentTokens)} / ${formatK(modelContextLimit)} tokens (${percentage}%). ACP (Active Context Pruning) threshold: 55%. You ARE the ACP agent — use the compress tool proactively to manage context quality.`
+    const usageTag = buildContextUsageGuidance(config, currentTokens, modelContextLimit)
+    if (!usageTag) return

     for (const part of target.parts) {
         if (part.type === "text") {
49 changes: 45 additions & 4 deletions lib/messages/inject/utils.ts
@@ -357,13 +357,54 @@ function applyMessageModeAnchoredNudge(
     }
 }

-function buildContextUsageInfo(currentTokens?: number, modelContextLimit?: number): string {
+/**
+ * Resolve a config threshold (number | "NN%") to a percentage value.
+ */
+function resolveThresholdPercent(
+    threshold: number | `${number}%` | undefined,
+    modelContextLimit: number | undefined,
+): number | undefined {
+    if (threshold === undefined) return undefined
+    if (typeof threshold === "number") {
+        if (!modelContextLimit) return undefined
+        return (threshold / modelContextLimit) * 100
+    }
+    const parsed = parseFloat(threshold)
+    return isNaN(parsed) ? undefined : parsed
+}
+
+/**
+ * Build tiered context usage guidance based on actual config thresholds.
+ * Shared by inject.ts (suffix message) and utils.ts (anchored nudges).
+ */
+export function buildContextUsageGuidance(
+    config: PluginConfig,
+    currentTokens?: number,
+    modelContextLimit?: number,
+): string {
     if (currentTokens === undefined || modelContextLimit === undefined || modelContextLimit === 0) {
         return ""
     }
-    const percentage = ((currentTokens / modelContextLimit) * 100).toFixed(1)
+
+    const pct = (currentTokens / modelContextLimit) * 100
+    const percentage = pct.toFixed(1)
     const formatK = (n: number) => (n >= 1000 ? `${(n / 1000).toFixed(1)}K` : String(n))
-    return `\n\nContext usage: ${formatK(currentTokens)} / ${formatK(modelContextLimit)} tokens (${percentage}%). ACP (Active Context Pruning) threshold: 55%. You ARE the ACP agent — use the compress tool proactively to manage context quality.`
+
+    const minPct = resolveThresholdPercent(config.compress.minContextLimit, modelContextLimit) ?? 45
+    const maxPct = resolveThresholdPercent(config.compress.maxContextLimit, modelContextLimit) ?? 55
+
+    const base = `Context usage: ${formatK(currentTokens)} / ${formatK(modelContextLimit)} tokens (${percentage}%). ACP threshold: ${maxPct.toFixed(0)}%.`
+
+    let guidance: string
+    if (pct < minPct) {
+        guidance = " Context is ample — focus on your task. Only compress obvious waste (large terminal outputs, duplicated content)."
+    } else if (pct < maxPct) {
+        guidance = " Context is moderate — compress completed sections and high-token waste. Preserve key details."
+    } else {
+        guidance = " Context is high — compress aggressively but selectively. Preserve only what is essential."
+    }
+
+    return `\n\n${base}${guidance}`
 }

 export function applyAnchoredNudges(
@@ -376,7 +417,7 @@ export function applyAnchoredNudges(
     modelContextLimit?: number,
     suffixMessage?: WithParts | null,
 ): void {
-    const contextUsageInfo = buildContextUsageInfo(currentTokens, modelContextLimit)
+    const contextUsageInfo = buildContextUsageGuidance(config, currentTokens, modelContextLimit)
     const contextLimitNudgeWithUsage = prompts.contextLimitNudge + contextUsageInfo
     const turnNudgeAnchors = collectTurnNudgeAnchors(state, config, messages)

59 changes: 42 additions & 17 deletions lib/prompts/system.ts
@@ -1,33 +1,58 @@
 export const SYSTEM = `
-You operate in a context-constrained environment. Manage context continuously to avoid buildup and preserve retrieval quality. Efficient context management is paramount for your agentic performance.

+You operate in a context-constrained environment. Context management helps preserve retrieval quality, but your primary goal is completing the task at hand. Do not let context management distract from the actual work.

 The tools you have for context management are \`compress\` and \`decompress\`. \`compress\` replaces older conversation content with technical summaries you produce. \`decompress\` restores previously compressed content when you need exact details.

 \`<dcp-message-id>\` and \`<dcp-system-reminder>\` tags are environment-injected metadata. Do not output them.

-THE PHILOSOPHY OF COMPRESS
-\`compress\` transforms conversation content into dense, high-fidelity summaries. This is not cleanup - it is crystallization. Your summary becomes the authoritative record of what transpired.
+COMPRESSION PHILOSOPHY

+Compression replaces raw conversation content with dense summaries. When used correctly, it keeps your context sharp and focused. When used carelessly, it destroys information you need.

+The key principle: compress based on context pressure, not habit. When context is ample, compress rarely or not at all. When context is tight, compress aggressively but selectively. The runtime context usage indicator tells you the current pressure level.

+CONTEXT PRESSURE LEVELS

+- Ample: Context is well below the threshold. Do NOT compress unless there is obvious waste (huge terminal dumps, duplicated content). Focus entirely on your task.
+- Moderate: Context is approaching the threshold. Compress completed sections proactively. Prioritize high-token waste over minor cleanup.
+- High: Context has exceeded the threshold. Compress aggressively. Every compression should free meaningful tokens. Preserve only what is essential for the current task.

+WHAT TO COMPRESS FIRST (high value, low risk)

+- Verbose terminal/bash command output (build logs, test output, directory listings)
+- Exploration that led nowhere (failed approaches, dead-end searches)
+- Redundant tool results (reading the same file multiple times, repeated status checks)
+- Intermediate steps of completed multi-step tasks
+- Large file contents that have already been used and are no longer needed

-Think of compression as phase transitions: raw exploration becomes refined understanding. The original context served its purpose; your summary now carries that understanding forward.
+WHAT TO COMPRESS CAREFULLY (high risk - verify before compressing)

-COMPRESS WHEN
+- Temporary secrets/keys/tokens needed later: Do NOT compress unless recorded elsewhere
+- File paths and directory structures: Keep in summary - losing these wastes tokens rediscovering them
+- Key function/method signatures and APIs: Summarize with exact names and signatures
+- Critical error messages and stack traces: Keep the error type and key detail in summary
+- User preferences and requirements: These must survive compression intact
+- Architectural decisions and rationale: Summarize the decision, not just the conclusion

-A section is genuinely closed and the raw conversation has served its purpose:
+BEFORE COMPRESSING IMPORTANT CONTENT

-- Research concluded and findings are clear
-- Implementation finished and verified
-- Exploration exhausted and patterns understood
-- Dead-end noise can be discarded without waiting for a whole chapter to close
+Verify the information is persisted in one of:
+- A file you have written or edited
+- An issue, PR, or devlog entry
+- The compression summary itself (include the critical bits explicitly)

-DO NOT COMPRESS IF
+If it is not persisted anywhere, either persist it first or include it explicitly in your compression summary.

-- Raw context is still relevant and needed for edits or precise references
-- The target content is still actively in progress
-- You may need exact code, error messages, or file contents in the immediate next steps
+AFTER COMPRESSING

-Before compressing, ask: _"Is this section closed enough to become summary-only right now?"_
+Generate recovery breadcrumbs in your summary so future-you can reconstruct the context:
+- Reference specific files by path
+- Include key variable names, function signatures, or configuration values
+- Note what was decided and why, not just what was done
+- Example: "Implemented auth check in src/middleware.ts using validateToken() from auth.ts - user table is users not user"

-Evaluate conversation signal-to-noise REGULARLY. Use \`compress\` deliberately with quality-first summaries. Prioritize stale content intelligently to maintain a high-signal context window that supports your agency.
+If you later realize you need the original details from a compressed block, use \`decompress\` to restore them. You can decompress, read the content, then re-compress if needed.

-It is of your responsibility to keep a sharp, high-quality context window for optimal performance.
+Use \`compress\` and \`decompress\` deliberately with quality-first summaries. Prioritize stale content intelligently to maintain a high-signal context window.
 `