feat(skills): workflow-audit — mine CC chat corpus for self-optimization patterns#5

Open
jamestexas wants to merge 1 commit into main from feat/workflow-audit-skill

@jamestexas
Owner

Summary

New skill workflow-audit. Sibling to existing self-audit — distinct surface.

| Skill | Reviews | Triggered when |
| --- | --- | --- |
| self-audit | PR code | Before tagging a human reviewer |
| workflow-audit (new) | How I work with agents | Maintenance ritual, or when friction is felt |

Designed to be dispatched as a sub-agent so the caller's context stays clean and the analysis is reproducible across machines (including the user's work machine).

What it does

Mines the Claude Code chat corpus at ~/.claude/projects/*/*.jsonl and outputs a single ranked markdown report (~/workflow-audit-<YYYYMMDD>.md by default) with 10–15 findings, each carrying a concrete one-line intervention.
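
A minimal sketch of the corpus discovery step and the default report name, assuming only what is stated above (the `~/.claude/projects/*/*.jsonl` glob and the `workflow-audit-<YYYYMMDD>.md` pattern). The function names are hypothetical and the JSONL schema is deliberately not parsed here, since this PR does not describe it.

```python
from datetime import date
from pathlib import Path


def find_transcripts(root: Path) -> list[Path]:
    """Return every *.jsonl file exactly one directory below `root`, sorted."""
    return sorted(root.glob("*/*.jsonl"))


def default_report_path(today: date, home: Path) -> Path:
    """Build <home>/workflow-audit-<YYYYMMDD>.md for the given date."""
    return home / f"workflow-audit-{today:%Y%m%d}.md"
```

On a real machine this would be called as `find_transcripts(Path.home() / ".claude" / "projects")`. Treating one file as one session is an assumption of this sketch.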

Three load-bearing rules

  1. Anti-temporal-bias — bucket by ISO week, exclude the 2 most recent weeks, rank by bucket_coverage × cost, drop findings appearing in <3 buckets.
  2. PII scrub — deterministic mapping applied to ALL output. Mapping never leaves the machine. Same audit runs on a work machine without leaking either work or personal specifics.
  3. Don't fabricate — 50-session minimum, real quotes only, contradictions surfaced.
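
Rule 1 can be sketched as below. This is an illustration under stated assumptions, not the skill's actual scoring code: `bucket_coverage` is taken to be the fraction of retained ISO weeks in which a finding appears, "most recent" is taken relative to the weeks present in the corpus, and `cost` is a caller-supplied number per finding. All names are hypothetical.

```python
from collections import defaultdict
from datetime import date


def iso_week(d: date) -> tuple[int, int]:
    """Return (ISO year, ISO week number) for a date."""
    iso = d.isocalendar()
    return (iso[0], iso[1])


def rank_findings(occurrences, costs, skip_recent=2, min_buckets=3):
    """Rank findings by bucket_coverage * cost.

    occurrences: iterable of (finding_id, date) pairs.
    costs: dict mapping finding_id -> cost (assumed unit-less weight).
    Returns (ranked, dropped): ranked is a list of (finding_id, score),
    highest score first; dropped lists findings seen in too few buckets.
    """
    occurrences = list(occurrences)
    all_weeks = sorted({iso_week(d) for _, d in occurrences})
    # Exclude the most recent weeks, which are dense and biased.
    kept_weeks = set(all_weeks[:-skip_recent]) if skip_recent else set(all_weeks)

    weeks_by_finding = defaultdict(set)
    for finding_id, d in occurrences:
        weeks = weeks_by_finding[finding_id]  # registers the finding even if no week is kept
        week = iso_week(d)
        if week in kept_weeks:
            weeks.add(week)  # a set, so repeats within a week count once

    ranked, dropped = [], []
    for finding_id, weeks in weeks_by_finding.items():
        if len(weeks) < min_buckets:
            dropped.append(finding_id)
            continue
        coverage = len(weeks) / len(kept_weeks)
        ranked.append((finding_id, coverage * costs[finding_id]))

    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked, sorted(dropped)
```

Because each week contributes at most once per finding, a burst of ten occurrences in a single recent week cannot outrank a pattern that recurs steadily across older weeks.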

Tools used

  • mache ingest claude-chats — consolidates transcripts → sqlite
  • chat-embed index/query (LLO crate, fastembed/MiniLM) — semantic clustering of repeated phrasings
  • Python stdlib only for bucketing + scoring

If mache/chat-embed aren't installed, the skill clones their repos from agentic-research and builds them.

Eight pattern shapes detected

Repeated corrections · tool-use thrash · pre-action over-asking · post-action under-asking on high-impact ops · dropped threads · restated constraints · long-cycle recurrence · phrasing-shift markers.

Output skeleton

Top findings (ranked) → dropped findings (transparency) → corpus health stats → Ready-to-apply (3 hand-picked) — the section that converts the audit into action without reading all 15.

Test plan

  • Invoke via /workflow-audit (or dispatch as sub-agent) on personal machine — produces a report
  • Same skill, work machine, with WORKFLOW_AUDIT_SKIP_PATHS env set — work projects excluded
  • Sanity: report contains zero raw repo/user/email/path names
  • Findings ranked by bucket_coverage × cost, not raw count
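
The sanity check on raw names could be automated along these lines. This is a sketch only: `find_leaks` is a hypothetical helper, and it assumes the raw names are available locally (for example, the keys of the scrub mapping) and that a case-insensitive substring match is strict enough.

```python
def find_leaks(report_text: str, raw_names: list[str]) -> list[str]:
    """Return the raw names that still appear in the report, sorted.

    Matching is case-insensitive substring search; empty names are ignored.
    """
    lowered = report_text.lower()
    return sorted(name for name in raw_names if name and name.lower() in lowered)
```

An empty result corresponds to "zero raw repo/user/email/path names"; any non-empty result should fail the check.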

🤖 Generated with Claude Code

…ion patterns

New skill, sibling to self-audit. Different surface:
- self-audit reviews PR code before a human reviewer.
- workflow-audit reviews the user's interaction patterns with agents,
  from the Claude Code chat corpus.

Designed to be dispatched as a sub-agent so the caller's context stays
clean. Output is a single ranked markdown report at $OUTPUT_PATH (default
~/workflow-audit-<date>.md).

Three load-bearing rules:

1. Anti-temporal-bias — bucket by ISO week, exclude the 2 most recent
   weeks, rank by bucket_coverage × cost (not raw count), drop findings
   appearing in <3 buckets. The user dispatches many agents; recent
   weeks are dense and biased.
2. PII scrub — deterministic mapping table built on first encounter,
   applied to ALL output. Mapping never leaves the machine. Enables
   running the same audit on a work machine without leaking either
   work or personal-project specifics into the report.
3. Don't fabricate — minimum corpus threshold (50 sessions), example
   snippets must be real (scrubbed) quotes, contradictions surfaced
   rather than collapsed.
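
A minimal sketch of rule 2's "mapping table built on first encounter", under stated assumptions: the placeholder format (`<KIND_n>`), the `Scrubber` class, and the email regex are all hypothetical, and the mapping is held in memory only so it never leaves the machine.

```python
import re

# Deliberately simple email pattern; an assumption of this sketch.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class Scrubber:
    """Deterministic raw-value -> placeholder mapping, built on first encounter."""

    def __init__(self) -> None:
        self.mapping: dict[str, str] = {}
        self._counters: dict[str, int] = {}

    def alias(self, raw: str, kind: str) -> str:
        """Return the placeholder for `raw`, creating one if it is new."""
        if raw not in self.mapping:
            n = self._counters.get(kind, 0) + 1
            self._counters[kind] = n
            self.mapping[raw] = f"<{kind}_{n}>"
        return self.mapping[raw]

    def scrub(self, text: str) -> str:
        """Replace emails and every previously registered raw value."""
        text = EMAIL_RE.sub(lambda m: self.alias(m.group(0), "EMAIL"), text)
        # Longest first, so a short name never clobbers part of a longer one.
        for raw in sorted(self.mapping, key=len, reverse=True):
            text = text.replace(raw, self.mapping[raw])
        return text
```

Repo, user, and path names would be registered with `alias(name, "REPO")` and so on before scrubbing; the same raw value always maps to the same placeholder within a run.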

Tools used:
- mache ingest claude-chats to consolidate transcripts to sqlite db
- chat-embed index/query (LLO crate, fastembed/MiniLM) for semantic
  clustering of repeated user phrasings
- Python stdlib for bucketing + scoring

Eight pattern shapes detected: repeated corrections, tool-use thrash,
pre-action over-asking, post-action under-asking, dropped threads,
restated constraints, long-cycle recurrence, phrasing-shift markers.

Output skeleton includes a "Ready-to-apply (3 hand-picked)" section so
the audit converts to action without requiring the user to read all 15
findings.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>