
Enhance digital twin functionality with behavioral spec extraction #1

Merged: danielbentes merged 7 commits into main from twin-improvements on May 15, 2026

@danielbentes danielbentes commented May 14, 2026

Summary

Behavioral Twin v1 turns the digital-twin output from a profile/memory dump into a compact, evidence-backed operational agent contract, then hardens the pipeline around untrusted corpus data and local artifact generation.

  • Adds a dedicated behavioral analysis/twin-spec.json extraction phase.
  • Renders ~/.claude/agents/twin.md primarily from that spec instead of hardcoded defaults or raw memory dumps.
  • Emits generated CLAUDE rule files for preferences, workflows, verification, and recovery.
  • Tightens corpus signals so full user messages beat duplicate truncated cache rows, while unmatched cache rows remain available as evidence.
  • Enforces the twin-spec schema before extraction succeeds or synthesis treats the spec as complete.
  • Adds deterministic eval coverage for Daniel-like operational behavior without live LLM calls.
  • Hardens generated HTML, prompt boundaries, optional network paths, PR mining, and symlink input handling.
  • Bumps the plugin and marketplace metadata to 0.3.0.

Key Changes

Behavioral twin spec

  • Added skills/digital-twin/scripts/extract-twin-spec.py.
  • Added references/twin-spec-schema.json and references/prompts/twin-spec-extraction.md.
  • The spec captures identity facts, operating model, decision policy, delegation policy, workflow policy, verification policy, recovery policy, voice policy, project routing, ranked never/always rules, examples, and evidence citations.
  • Added dependency-free schema validation in skills/digital-twin/scripts/twin_spec_validation.py.
  • Invalid nested specs now fail extraction and cause synthesis to emit the explicit degraded twin instead of a false "complete" agent.
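
The nested validation described above can be sketched without third-party packages. This is a minimal illustration, not the contents of twin_spec_validation.py: the `SCHEMA` dict is a hypothetical, heavily reduced stand-in for references/twin-spec-schema.json, and `validate` is an assumed helper name.

```python
from typing import Any

# Hypothetical, reduced stand-in for the real twin-spec schema.
SCHEMA: dict[str, Any] = {
    "type": "object",
    "required": ["identity_facts", "verification_policy"],
    "properties": {
        "identity_facts": {"type": "array", "items": {"type": "string"}},
        "verification_policy": {
            "type": "object",
            "required": ["rules"],
            "properties": {
                "rules": {"type": "array", "items": {"type": "string"}},
            },
        },
    },
}

_TYPES = {"object": dict, "array": list, "string": str}


def validate(value: Any, schema: dict[str, Any], path: str = "$") -> list[str]:
    """Return human-readable errors; an empty list means the value is valid."""
    expected = _TYPES[schema["type"]]
    if not isinstance(value, expected):
        return [f"{path}: expected {schema['type']}, got {type(value).__name__}"]
    errors: list[str] = []
    if expected is dict:
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: missing required key")
        # Recurse into nested objects so malformed inner sections are caught.
        for key, sub in schema.get("properties", {}).items():
            if key in value:
                errors.extend(validate(value[key], sub, f"{path}.{key}"))
    elif expected is list and "items" in schema:
        for i, item in enumerate(value):
            errors.extend(validate(item, schema["items"], f"{path}[{i}]"))
    return errors
```

A caller in this sketch would treat any non-empty error list as a failed extraction and fall back to the degraded twin rather than rendering the spec as complete.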

Corrected pipeline order

  • Moved deep-source inventory generation before the LLM deep-read/spec path in the documented init flow.
  • Phase 4 now generates memory, plan, assistant-turn, and optional PR-comment evidence.
  • Phase 5 runs qualitative deep-read agents using those deep-source outputs.
  • Phase 5.5 extracts profile insights.
  • Phase 5.6 extracts the behavioral twin spec from reports, insights, stats, and deep-source inventories.
  • The README manual flow now treats extract-twin-spec.py as the default step for replacement-agent output. It is skipped only for profile-only fallback runs where a degraded twin.md is acceptable.

Compact twin agent generation

  • Reworked synthesize.py so twin.md is rendered from analysis/twin-spec.json.
  • Replaced the old subagent template with a compact operational contract using model: inherit and explicit tool frontmatter.
  • If twin-spec.json is missing or invalid, synthesis still writes profile artifacts but emits an explicit incomplete-spec warning in twin.md.
  • Normalizes model-supplied bullet and number prefixes before rendering numbered lists.
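
The prefix normalization in the last bullet might look roughly like the following. It is a sketch under assumptions: `strip_prefix` and `render_numbered` are hypothetical names, and the set of prefixes handled here ("-", "*", "•", "1.", "2)") is a guess at what a model typically prepends, not the list synthesize.py uses.

```python
import re

# Leading bullets ("-", "*", "•") or numbering ("1.", "2)") followed by whitespace.
_PREFIX = re.compile(r"^\s*(?:[-*\u2022]+|\d+[.)])\s+")


def strip_prefix(item: str) -> str:
    """Remove one leading bullet or number prefix, if present."""
    return _PREFIX.sub("", item, count=1).strip()


def render_numbered(items: list[str]) -> str:
    """Render items as a clean 1..n numbered list, skipping empty entries."""
    cleaned = [text for text in (strip_prefix(i) for i in items) if text]
    return "\n".join(f"{n}. {text}" for n, text in enumerate(cleaned, start=1))
```

Without this step, a model-supplied item such as "1. Run tests" would render as "1. 1. Run tests". The pattern requires whitespace after the prefix, so an item starting with "3.5 GB" is left untouched.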

Generated CLAUDE rules

  • Synthesis now writes:
    • rules/preferences.md
    • rules/workflows.md
    • rules/verification.md
    • rules/recovery.md
  • CLAUDE-md-patch.md is now a short install guide that imports those rules instead of pasting a long defaults block.

Corpus signal quality

  • extract-corpus.py now drops a last-prompt cache row only when it is an exact match or clear truncation-prefix duplicate of a full user/human message in the same session.
  • Unmatched last-prompt rows are preserved as behavior evidence instead of being dropped at session scope.
  • Corpus records include source_type, is_auto_wake, and is_human_typed.
  • quantitative.py computes human-typed metrics separately and filters false slash commands like /api, /users, /tmp, and /month.
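
The last-prompt rule above (drop a cache row only on an exact match or a clear truncation-prefix match against a full message in the same session) can be sketched as follows. The `source_type` field comes from the PR description, but its values ("user", "last-prompt"), the `text` key, the ellipsis handling, and the `min_prefix` threshold are all assumptions made for illustration.

```python
def is_duplicate_of_full(cache_text: str, full_messages: list[str], min_prefix: int = 20) -> bool:
    """True when a cached last-prompt row repeats a full message from the same session."""
    probe = cache_text.strip()
    # Assumption: truncated rows may carry a trailing ellipsis marker.
    for marker in ("...", "\u2026"):
        if probe.endswith(marker):
            probe = probe[: -len(marker)].rstrip()
    if not probe:
        return False
    for full in full_messages:
        full = full.strip()
        if probe == full:
            return True
        # Require a minimum length so short rows like "ok" never count as a prefix match.
        if len(probe) >= min_prefix and full.startswith(probe):
            return True
    return False


def filter_session(rows: list[dict]) -> list[dict]:
    """Keep full messages and every last-prompt row that is not a duplicate."""
    full = [r["text"] for r in rows if r["source_type"] == "user"]
    return [
        r
        for r in rows
        if r["source_type"] != "last-prompt" or not is_duplicate_of_full(r["text"], full)
    ]
```

The minimum-length guard is one possible reading of "clear" truncation: it trades a few retained duplicates for not discarding short, genuinely distinct prompts.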

Security hardening

  • Sanitizes LLM-provided interaction_style.narrative_html before rendering PROFILE.html.
  • Escapes scalar placeholders in the HTML template while preserving trusted generated HTML/SVG fragments.
  • Removes external Google Fonts from generated/sample PROFILE.html; the report is self-contained with local system fonts.
  • Escapes SVG chart text and aria labels.
  • Treats reports, corpus excerpts, memory bodies, paths, and quotes as untrusted evidence in every LLM extraction/deep-read prompt.
  • Makes Anthropic SDK/API-key fallback opt-in via extract-insights.py --allow-sdk-fallback.
  • Removes WebFetch from the default generated twin subagent tool list.
  • Hardens pr-comment-mining.sh argument parsing, Python path handling, UTF-8 file I/O, and GitHub repo/PR validation.
  • Skips symlinked or out-of-source session/memory files during corpus extraction, assistant-turn mining, memory inventory, and pushback detection.
  • Consolidates path-safety checks into a shared helper and makes the profile HTML raw-fragment allowlist explicit.
  • Corrects privacy docs to distinguish local-only phases from Claude/GitHub network calls.
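
For the symlink and out-of-source checks, a shared path-safety helper could take roughly this shape. `is_safe_source_file` is a hypothetical name and the exact policy in the PR may differ; the sketch assumes Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path


def is_safe_source_file(candidate: Path, source_root: Path) -> bool:
    """Accept only regular, non-symlinked files that resolve inside source_root."""
    if candidate.is_symlink():
        return False
    try:
        resolved = candidate.resolve(strict=True)
        root = source_root.resolve(strict=True)
    except (OSError, RuntimeError):
        # Missing files or symlink loops are treated as unsafe.
        return False
    # Resolving first also catches files reached through a symlinked parent directory.
    return resolved.is_file() and resolved.is_relative_to(root)
```

Each reader (corpus extraction, assistant-turn mining, memory inventory, pushback detection) would call the helper before opening a file and skip anything it rejects.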

Evaluation

  • Added scripts/evaluate-twin.py as a deterministic, no-network eval harness.
  • Added held-out fixture cases for approval, pushback/recovery, delegation, unknown-project routing, and planning behavior.
  • Added regression tests for spec extraction, nested schema validation, compact agent rendering, degraded mode, corpus signal filtering, symlink input hardening, HTML/SVG escaping, SDK fallback behavior, and eval scoring.
  • Added GitHub Actions CI for Python compile checks, Ruff, mypy, shell syntax, and pytest.
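
A deterministic, no-network scorer in the spirit of evaluate-twin.py might be sketched like this. The metric names `twin_win_rate` and `pushback_trigger_hit_rate` appear in the review's verification run; everything else (the `Case` shape, the phrase-matching rule, the `PUSHBACK_MARKERS` list) is hypothetical and is not the fixture format of heldout_cases.json.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    # Hypothetical fixture shape, for illustration only.
    twin_response: str
    baseline_response: str
    expected_phrases: tuple[str, ...]
    expects_pushback: bool = False


PUSHBACK_MARKERS = ("stop", "revert", "not what i asked")  # assumed trigger phrases


def phrase_hits(response: str, phrases: tuple[str, ...]) -> int:
    """Count expected phrases that occur in the response (case-insensitive)."""
    lowered = response.lower()
    return sum(1 for p in phrases if p.lower() in lowered)


def score(cases: list[Case]) -> dict[str, float]:
    """Compute both rates from static fixtures; no model or network call is made."""
    wins = sum(
        phrase_hits(c.twin_response, c.expected_phrases)
        > phrase_hits(c.baseline_response, c.expected_phrases)
        for c in cases
    )
    pushback = [c for c in cases if c.expects_pushback]
    hits = sum(
        any(m in c.twin_response.lower() for m in PUSHBACK_MARKERS) for c in pushback
    )
    return {
        "twin_win_rate": wins / len(cases) if cases else 0.0,
        "pushback_trigger_hit_rate": hits / len(pushback) if pushback else 0.0,
    }
```

Because the inputs are fixed strings and the scoring is plain substring matching, repeated runs give identical numbers, which is what makes a harness like this usable in CI.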

Release Validation

Local checks:

  • pytest -> 28 passed
  • ruff check . -> passed
  • mypy skills/digital-twin/scripts tests -> passed
  • python3 -m compileall -q skills/digital-twin/scripts skills/digital-twin/references/visualization tests -> passed
  • bash -n skills/digital-twin/scripts/pr-comment-mining.sh -> passed
  • git diff --check -> passed

Real-data temp validation, without overwriting the installed agent:

  • Used /private/tmp/digital-twin-release-20260514-223157.
  • Generated and schema-validated a real analysis/twin-spec.json.
  • Spec contains 5 identity facts, 10 never rules, 10 always rules, 10 project routes, and approved/delegation/plan/recovery examples.
  • Synthesized temp PROFILE.md, PROFILE.html, twin.md, CLAUDE-md-patch.md, generated rule files, and insight cards.
  • Verified generated artifacts contain no placeholders, no _TBD_, no See PROFILE.md, no raw memory dump, no incomplete-spec warning, no external font URLs, no WebFetch, and no obvious script/event-handler HTML patterns.
  • Parsed PROFILE.html with Python HTMLParser successfully.
  • Verified twin.md includes the expected operational sections: decision, delegation, verification, and recovery policies.
  • Verified PROFILE.md reports the real corpus counts: 9,678 prompts, 1,140 sessions, 39 projects, 144 memory files, 27 plans, and 3,550 convergence pairs.
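
The artifact checks listed above lend themselves to a small scan over the generated files. The sketch below is illustrative only: the pattern names and regular expressions are assumptions (in particular the `{{ name }}` placeholder syntax and the font hostnames), and a heuristic scan like this flags obvious patterns rather than proving the HTML is safe.

```python
import re

# Assumed patterns for the markers the validation list says must be absent.
FORBIDDEN = {
    "template placeholder": re.compile(r"\{\{\s*\w+\s*\}\}"),
    "tbd marker": re.compile(r"_TBD_"),
    "profile pointer": re.compile(r"See PROFILE\.md"),
    "external font": re.compile(r"fonts\.(?:googleapis|gstatic)\.com"),
    "webfetch tool": re.compile(r"\bWebFetch\b"),
    "script tag": re.compile(r"<\s*script\b", re.IGNORECASE),
    "event handler": re.compile(r"\son\w+\s*=", re.IGNORECASE),
}


def find_violations(text: str) -> list[str]:
    """Return the names of all forbidden patterns found in a generated artifact."""
    return [name for name, pattern in FORBIDDEN.items() if pattern.search(text)]
```

Running `find_violations` over twin.md, PROFILE.html, and the generated rule files and requiring an empty result for each would reproduce the spirit of the manual check.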

Notes

  • This PR updates the pipeline and tests. It does not overwrite the currently installed ~/.claude/agents/twin.md until the user runs /digital-twin:init or /digital-twin:update.
  • Optional GitHub PR mining remains skipped when gh is unauthenticated locally; the rest of the corpus/profile/twin pipeline still completes.

- Updated README.md to reflect the addition of Phase 4.6, which extracts a compact behavioral `twin-spec.json` for the operational contract of the sub-agent.
- Revised init.md and update.md to include the new Phase 4.6 in the pipeline, ensuring the behavioral spec is generated and utilized.
- Introduced new generated rule files in the output, enhancing the CLAUDE.md patch with user-level rules for preferences, workflows, verification, and recovery.
- Improved the extraction scripts to prioritize full user messages over truncated cache rows, ensuring higher quality data for analysis.
- Updated methodology documentation to clarify the purpose and output of the new behavioral twin spec extraction phase.

This commit enhances the digital twin's ability to accurately reflect user behavior and preferences, improving the overall functionality and user experience.

@danielbentes danielbentes left a comment

Review findings:

  1. P1 commands/init.md:102-111 runs extract-twin-spec before Phase 5, but extract-twin-spec.py:68-99 consumes plan, convergence, and memory inventory. First-run specs therefore omit the exact deep-source evidence the twin is supposed to encode.

  2. P2 extract-corpus.py:204-215 drops every last-prompt row in a session if any full user/human row exists. The stated behavior is to prefer full user messages only when they represent the same turn. Count-only validation on the local corpus showed 12,503 mixed-session last-prompt rows, with 9,161 not matching any full user text, so this can silently discard real behavior evidence.

  3. P2 extract-twin-spec.py:174-202 defines only shallow validation while the schema at references/twin-spec-schema.json is much stricter. A malformed nested spec still exits 0 and gets rendered as complete, which can produce incomplete policy sections without triggering degraded mode.

Verification run: py_compile passed, pytest -q passed with 18 tests, and evaluate-twin.py on heldout_cases.json reported twin_win_rate=1.0 and pushback_trigger_hit_rate=1.0.

@danielbentes danielbentes merged commit 8e2d7b0 into main May 15, 2026
1 check passed
@danielbentes danielbentes deleted the twin-improvements branch May 15, 2026 07:05