Add KV-backed agent context checkpoints #252

Open
FabioMalpezzi wants to merge 8 commits into antirez:main from FabioMalpezzi:fix/agent-kv-context-hardening

FabioMalpezzi commented May 25, 2026

This adds an explicit context tool for ds4-agent checkpoints and restores, backed by persisted KV state.

The scope is intentionally narrow: let the agent save a known-good model state, restore it later, and report what happened in a form that can be tested.

Included:

  • agent_context actions: checkpoint, restore, list, show, and drop.
  • Restore guardrails: side-effect epoch checks, active bash-job checks, metadata/token mismatch handling, and expected/actual restore metrics.
  • More robust checkpoint metadata parsing and explicit tool-action enums.
  • A compaction canary e2e test that fails closed if the model returns an unusable summary.
  • A KV benefit benchmark that measures avoided prefill tokens after restore.
  • An adaptive self-improvement e2e test. If the native Git tool is available, the prompt asks the agent to use it for status/diff inspection. If not, the test falls back to the existing bash path with git status --short and git diff.
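
The action enum and restore guardrails listed above could be sketched roughly as follows. This is an illustration only, not the PR's code: `ContextAction`, `Checkpoint`, `SessionState`, `parse_action`, and `check_restore` are hypothetical names, and the sketch assumes a failed guardrail refuses the restore, which the description does not state.

```python
from dataclasses import dataclass
from enum import Enum


class ContextAction(Enum):
    """Hypothetical explicit enum for the five agent_context actions."""
    CHECKPOINT = "checkpoint"
    RESTORE = "restore"
    LIST = "list"
    SHOW = "show"
    DROP = "drop"


def parse_action(raw: str) -> ContextAction:
    """Reject unknown action strings instead of guessing."""
    try:
        return ContextAction(raw.strip().lower())
    except ValueError:
        raise ValueError(f"unknown agent_context action: {raw!r}") from None


@dataclass(frozen=True)
class Checkpoint:
    name: str
    token_count: int        # tokens covered by the saved KV state
    side_effect_epoch: int  # side-effect counter value at save time (assumed)


@dataclass(frozen=True)
class SessionState:
    side_effect_epoch: int  # assumed counter, bumped on each side effect
    active_bash_jobs: int   # background bash jobs still running


def check_restore(cp: Checkpoint, session: SessionState,
                  kv_token_count: int) -> list[str]:
    """Return the reasons a restore should be refused; empty means allowed."""
    reasons = []
    if session.side_effect_epoch != cp.side_effect_epoch:
        reasons.append("side-effect epoch changed since checkpoint")
    if session.active_bash_jobs > 0:
        reasons.append("bash jobs still active")
    if kv_token_count != cp.token_count:
        # Report expected vs. actual so the outcome can be asserted in tests.
        reasons.append(
            f"token mismatch: expected {cp.token_count}, actual {kv_token_count}"
        )
    return reasons
```

Returning a list of reasons rather than a single boolean keeps the result testable, in line with the stated goal of reporting what happened.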

The self-improvement e2e is deliberately a controlled temporary-repository test. It proves the loop shape: inspect, edit, test, diff, checkpoint, restore, retest, and record evidence. It does not claim that DS4 found and optimized DS4 itself. The docs call out the stronger follow-up: a DS4-on-DS4 optimization loop that selects a small measurable improvement, implements it, runs the relevant benchmark/e2e check, and records whether the metric improved.

This PR is independent from #250. It does not add or link the native Git tool implementation. If #250 lands first, the adaptive e2e can exercise native Git inspection; otherwise it keeps working with the current bash-based path.
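
A minimal sketch of the adaptive choice between the two inspection paths. The tool name `git`, the `available_tools` set, and the prompt wording are assumptions made for illustration; only the two bash commands come from the description above.

```python
# Bash commands named in the PR description for the fallback path.
BASH_FALLBACK = ("git status --short", "git diff")


def inspection_prompt(available_tools: set[str], git_tool: str = "git") -> str:
    """Pick the inspection instruction based on which tools are present."""
    if git_tool in available_tools:
        # Native path: only reachable if the separate Git tool is available.
        return f"Use the {git_tool} tool to inspect status and diff."
    cmds = " and ".join(f"`{c}`" for c in BASH_FALLBACK)
    return f"Use bash to run {cmds}."


print(inspection_prompt({"bash"}))
print(inspection_prompt({"bash", "git"}))
```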

Tests run after rebasing on current origin/main:

  • make test
  • make test-agent-context-compact-canary
  • make test-kv-cache-benefit
  • make test-agent-context-self-improvement

KV benefit sample from this run:

  • full prefill tokens: 2905
  • restored prefill tokens: 18
  • saved prefill tokens: 2887
  • quality guard: logits equivalence
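
The sample numbers are related by a simple subtraction, sketched below. The `logits_equivalent` helper is an assumption about what a logits-equivalence guard might look like (an element-wise comparison within a tolerance); the description does not say how the benchmark actually performs the check.

```python
def prefill_savings(full_tokens: int, restored_tokens: int) -> tuple[int, float]:
    """Return (saved tokens, fraction of the full prefill avoided)."""
    saved = full_tokens - restored_tokens
    return saved, saved / full_tokens


def logits_equivalent(a: list[float], b: list[float], tol: float = 1e-4) -> bool:
    """Assumed guard: same length and every element within tol."""
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))


saved, fraction = prefill_savings(2905, 18)
print(saved)              # 2887
print(f"{fraction:.1%}")  # 99.4%
```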

FabioMalpezzi force-pushed the fix/agent-kv-context-hardening branch from 8cc6264 to c23c702 on May 25, 2026 16:39