v0.0.28: per-phase + per-item granular cache reuse #2

Merged
heggria merged 5 commits into main from feat/per-item-map-caching on Jun 27, 2026

heggria (Owner) commented Jun 27, 2026

Summary

Granular-reuse release. v0.0.27 proved the incremental-recompute cost win; this release makes that win far larger and easier to opt into — invalidation drops from whole-flow to per-phase and per-item, with a single flag to flip a whole flow into cross-run reuse.

Changes

  • Per-phase structural sub-fingerprint (v3:phasefp) — editing one phase invalidates only it and its transitive dependents; an independent sibling keeps its cache hit. Emits a 4-tier read ladder (v3 write → v2/bare/legacy read-only) so the upgrade is additive (no miss-storm). Fail-open to whole-flow under contextSharing, shareContext, join:"any", or sub-flow inner phases.
  • Per-item map caching — when one of N items changes between runs, only that item re-executes (N−1 cache hits). Per-item keys omit the structural fingerprint (which hashes the whole over source) so changing one item no longer moves every key at once. Whole-map fast path and all soundness fallbacks preserved.
  • incremental flag — flow-level + invocation-level. Defaults every phase to scope:"cross-run" without per-phase annotation. Invocation arg wins over the flow field; per-phase cache and cross-run-blocked types (gate/approval/loop/tournament) still take precedence; default stays safe run-only.
  • Reuse reporting — end-of-run report and /tf recompute now show reused-vs-executed counts and a per-phase "Why" trace (▲ rerun / ✂ cutoff / ✓ reused / ✗ failed with ← causedBy). Dollar figures only for within-run reuse; cross-run hits counted without an invented saving.
  • phaseFingerprint strips more policy fields (cache, retry, concurrency, final) — none changes a phase's output, so a no-op config tweak no longer falsely invalidates.

Tests

  • 804 → 846 (+42) across 46 test files.
  • New: cache-phasefp, cache-peritem, incremental-flag, reuse-summary.
  • npm run typecheck clean, npm test 846/846 green.

Release

  • Version bumped to 0.0.28, CHANGELOG entry added, README counts refreshed, taskflow skill docs updated, tag v0.0.28 pushed.

heggria added 5 commits June 25, 2026 19:30
Replace the whole-flow v2:flowdef cache-key tier with a per-phase
structural sub-fingerprint so editing phase B invalidates only B and its
transitive dependents — independent sibling phase A keeps its cache hit.

phaseFingerprint(def, phaseId) (extensions/flowir/phasefp.ts) hashes the
phase plus its transitive dependsOn ∪ from closure, reusing the vendored
canonicalJson + hashCanonical (byte-identical to overstory's contract).
Only the policy field cache is stripped; every other Phase field is hashed.

Soundness fallback: phaseFingerprint returns undefined (→ caller folds the
whole-flow flowDefHash, preserving pre-M6 behavior) when per-phase
invalidation cannot be statically guaranteed — contextSharing at flow level,
any shareContext phase in the closure, or any flow phase in the closure.
Sub-flow inner phases always use this fallback.
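The closure walk and the soundness fallback described above can be sketched as follows. This is a minimal illustration, not the code in extensions/flowir/phasefp.ts: the `Phase` and `FlowDef` shapes, the `type: "flow"` marker for sub-flow phases, the fail-open on a dangling reference, and the stand-in `canonicalJson` and `hashCanonical` helpers are all assumptions made to keep the example self-contained.

```typescript
// Hypothetical shapes; the real Phase and FlowDef types carry more fields.
interface Phase {
  id: string;
  type?: string; // "flow" marks a sub-flow phase (assumed encoding)
  task?: string;
  dependsOn?: string[];
  from?: string[];
  shareContext?: boolean;
  cache?: unknown; // policy field, stripped before hashing
}

interface FlowDef {
  contextSharing?: boolean;
  phases: Phase[];
}

// Stand-in for the vendored canonicalJson: JSON with sorted object keys.
function canonicalJson(value: unknown): string {
  if (value === undefined) return "null";
  if (Array.isArray(value)) {
    return "[" + value.map((v) => canonicalJson(v)).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const keys = Object.keys(obj).filter((k) => obj[k] !== undefined).sort();
    return "{" + keys.map((k) => JSON.stringify(k) + ":" + canonicalJson(obj[k])).join(",") + "}";
  }
  return JSON.stringify(value);
}

// Stand-in for hashCanonical: a 32-bit FNV-1a-style hash, chosen only to keep
// the sketch dependency-free. It is not collision-resistant.
function hashCanonical(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16);
}

// Returns undefined when per-phase invalidation cannot be guaranteed, so the
// caller can fall back to the whole-flow hash.
function phaseFingerprint(def: FlowDef, phaseId: string): string | undefined {
  if (def.contextSharing) return undefined; // flow-level gate
  const byId = new Map<string, Phase>();
  for (const p of def.phases) byId.set(p.id, p);

  // Walk the transitive dependsOn ∪ from closure, starting at the phase itself.
  const closure = new Set<string>();
  const stack = [phaseId];
  while (stack.length > 0) {
    const id = stack.pop() as string;
    if (closure.has(id)) continue;
    const phase = byId.get(id);
    if (phase === undefined) return undefined; // dangling ref: fail open (assumption)
    if (phase.shareContext || phase.type === "flow") return undefined;
    closure.add(id);
    stack.push(...(phase.dependsOn ?? []), ...(phase.from ?? []));
  }

  // Hash the closure in a stable order, with the cache policy field removed.
  const hashed = Array.from(closure).sort().map((id) => {
    const copy: Phase = { ...(byId.get(id) as Phase) };
    delete copy.cache;
    return copy;
  });
  return hashCanonical(canonicalJson(hashed));
}
```

Under these assumptions, editing phase B changes the fingerprint of B and of anything that depends on it, while an independent phase A hashes to the same value as before.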

cacheKeys now produces a 4-tier ladder: key (v3:phasefp, write) → v2Key
(v2:flowdef, read-only) → bareKey (bare flowdef, read-only) → legacyKey
(no flowdef, read-only). On lookup, cachedPhase consults all four tiers,
falling through to the next on a miss; recordCache writes only key. This
makes the M6 upgrade additive: unchanged flows see no miss-storm.
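A rough sketch of the ladder, to show why the upgrade is additive. Only the tier names (key, v2Key, bareKey, legacyKey) and the function names come from the commit message; the key string layouts, the `Map`-backed store, and the function signatures are hypothetical.

```typescript
interface CacheKeys {
  key: string;       // v3:phasefp, the only tier written
  v2Key: string;     // v2:flowdef, read-only
  bareKey: string;   // bare flowdef, read-only
  legacyKey: string; // no flowdef, read-only
}

// Key layouts are illustrative. When phaseFp is undefined, the v3 key folds
// the whole-flow hash instead, matching the soundness fallback.
function cacheKeys(base: string, flowDefHash: string, phaseFp?: string): CacheKeys {
  return {
    key: `v3:phasefp:${phaseFp ?? flowDefHash}:${base}`,
    v2Key: `v2:flowdef:${flowDefHash}:${base}`,
    bareKey: `${flowDefHash}:${base}`,
    legacyKey: base,
  };
}

// Reads try the newest tier first and fall through to older tiers on a miss.
function cachedPhase(store: Map<string, string>, keys: CacheKeys): string | undefined {
  for (const k of [keys.key, keys.v2Key, keys.bareKey, keys.legacyKey]) {
    const hit = store.get(k);
    if (hit !== undefined) return hit;
  }
  return undefined;
}

// Writes go to the v3 tier only, so older tiers are never extended.
function recordCache(store: Map<string, string>, keys: CacheKeys, output: string): void {
  store.set(keys.key, output);
}
```

An entry written by an older version under the v2 key still produces a hit after the upgrade, and the next successful run re-records it under the v3 key.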

phaseFingerprints computed once per run in runTaskflowLayers alongside
flowDefHash, plumbed through RunState + PhaseCacheCtx. Fail-open: any
per-phase error degrades that phase to the whole-flow hash.

Tests: test/cache-phasefp.test.ts (11 tests — soundness gate, determinism,
precise-diff win, transitive propagation, v2 fallback, cross-flow
isolation, shareContext fallback). Updated cache-migration.test.ts
(distinct 4-tier keys; structural-change test now scoped to p's closure)
and runtime.test.ts resume tests to the v3 key shape.

Add per-item cross-run memoization to the map phase so that when one of N
items changes between runs, only that item re-executes (N-1 cache hits) —
while preserving the existing whole-map fast path and all soundness fallbacks.

Mechanism:
- runFanout accepts an optional perItem hook. Before spawning a subagent for
  an item, it consults cachedPhase with a per-item key; a hit returns a
  0-token synthesized RunResult (stopReason "cache-hit") that flows through
  mergePhaseState as a normal successful item. Successful fresh items are
  recorded per-item for future runs.
- Per-item keys fold [phase.id, it.agent, model, it.task] + the existing
  v3:phasefp/flowName/fingerprint/thinking/tools/preRead tail. Folding
  it.agent (arbiter fix) prevents a stale cross-agent hit when only
  phase.agent changes.
- Whole-map lookup stays first (fast path); per-item engages only on a
  whole-map miss. A trailing whole-map record keeps the fast path warm.
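The lookup order above can be sketched as below. This is a simplified, synchronous illustration: the real runFanout spawns subagents, and the `ok` flag, the `tokens` field, the `Map`-backed stores, and the `runMap` wrapper are assumptions introduced for the example. Only the "cache-hit" stop reason and the whole-map-first ordering come from the commit message.

```typescript
interface Item { agent: string; task: string }

interface RunResult {
  output: string;
  ok: boolean;        // hypothetical success flag
  stopReason: string;
  tokens: number;     // simplified stand-in for the usage record
}

interface PerItemHook {
  store: Map<string, string>;
  keyFor(item: Item): string;
}

function cacheHit(output: string): RunResult {
  return { output, ok: true, stopReason: "cache-hit", tokens: 0 };
}

// Per-item path: consult the cache before executing, record successes after.
function runFanout(
  items: Item[],
  execute: (item: Item) => RunResult,
  perItem?: PerItemHook,
): RunResult[] {
  return items.map((item) => {
    if (perItem) {
      const cached = perItem.store.get(perItem.keyFor(item));
      if (cached !== undefined) return cacheHit(cached);
    }
    const result = execute(item);
    if (perItem && result.ok) perItem.store.set(perItem.keyFor(item), result.output);
    return result;
  });
}

// Whole-map lookup first; per-item engages only on a whole-map miss. The
// trailing whole-map record keeps the fast path warm for the next run.
function runMap(
  wholeKey: string,
  items: Item[],
  execute: (item: Item) => RunResult,
  wholeStore: Map<string, string[]>,
  perItem?: PerItemHook,
): RunResult[] {
  const whole = wholeStore.get(wholeKey);
  if (whole !== undefined) return whole.map((output) => cacheHit(output));
  const results = runFanout(items, execute, perItem);
  if (results.every((r) => r.ok)) wholeStore.set(wholeKey, results.map((r) => r.output));
  return results;
}
```

With three items on the first run and one of them changed on the second, the sketch executes three items and then one, which mirrors the N−1 cache hits described in the summary.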

Soundness gates (per-item disabled -> whole-map only):
- cross-run scope required (run-only/"off" have no persistent store)
- shareContext / flow-wide contextSharing disabled (items may read sibling
  blackboard writes outside declared deps)
- inside a runtime-generated sub-flow (def: frame — untrusted)
- undefined phaseFingerprint is NOT a blocker (cacheKeys falls back to
  flowDefHash, which is stable for a fixed def)
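Read as a predicate, the gates above might look like the following. The `GateInput` shape and the `perItemEnabled` name are hypothetical; the conditions themselves are taken from the list.

```typescript
type CacheScope = "off" | "run-only" | "cross-run";

// Hypothetical bundle of the facts the gates inspect.
interface GateInput {
  scope: CacheScope;
  phaseShareContext: boolean;
  flowContextSharing: boolean;
  insideDefFrame: boolean; // runtime-generated sub-flow
  phaseFingerprint: string | undefined;
}

// Returns true when per-item caching may engage; false means whole-map only.
function perItemEnabled(g: GateInput): boolean {
  if (g.scope !== "cross-run") return false; // no persistent store otherwise
  if (g.phaseShareContext || g.flowContextSharing) return false;
  if (g.insideDefFrame) return false;
  // g.phaseFingerprint is deliberately not checked: an undefined fingerprint
  // falls back to flowDefHash, which is stable for a fixed definition.
  return true;
}
```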

Correctness:
- merged output labels are positionally aligned with over ([k/N] using
  results.length), budget-skipped items filtered to null; cache-hit items
  keep their positional slot
- cached items contribute emptyUsage -> partial-hit cost == re-executed item only
- failed and budget-skipped items are never recorded per-item
- fail-open: any cache read/write error degrades to executing the item
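One possible reading of the alignment and cost rules, as a small sketch. The `ItemResult` shape, the choice to represent a budget-skipped item as `null`, and the exact label text are assumptions; the commit message only states that labels use `[k/N]` with N taken from `results.length` and that cached items contribute empty usage.

```typescript
// Hypothetical per-item result; tokens is 0 for a cache hit (empty usage).
interface ItemResult {
  output: string;
  tokens: number;
}

// Labels keep each item's original position, so a skipped slot (null) leaves
// a gap in k rather than shifting later items down.
function mergeLabeled(results: Array<ItemResult | null>): string {
  const n = results.length;
  const lines: string[] = [];
  results.forEach((r, i) => {
    if (r !== null) lines.push(`[${i + 1}/${n}] ${r.output}`);
  });
  return lines.join("\n");
}

// Cost of a partial hit is the sum over freshly executed items only, because
// cached items carry zero usage and skipped items carry none.
function totalTokens(results: Array<ItemResult | null>): number {
  return results.reduce((sum, r) => sum + (r === null ? 0 : r.tokens), 0);
}
```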

Backward-compat: pre-existing whole-map entries (any tier) still hit via
cachedPhase's 4-tier read-only fallback; the whole-map key format is unchanged.

Tests: new test/cache-peritem.test.ts (11 tests) covering the Test Matrix —
partial reuse, positional alignment, duplicate sharing, shareContext/def-frame
fallbacks, whole-map fast path, revert, usage/subProgress, failed/skipped
non-caching, and the agent-invalidation arbiter fix.

Per-item cross-run cache keys for the `map` phase folded both `phaseFp`
and `flowDefHash` (via the whole-phase `cc`). Both fingerprints hash the
`over` array source, so when a literal or data-derived `over` changed ONE
item between runs, EVERY per-item key moved at once — defeating partial
reuse (all N items re-executed instead of just the changed one).

Fix: build a per-item `ccPerItem` with BOTH `phaseFp` and `flowDefHash`
set to `undefined`, and use it only for per-item key construction. A
single item's output is fully specified by it.task (template + item/as
value + upstream-output refs + args) + it.agent + model +
thinking/tools/preRead + the world-state fingerprint; `over` only
determines WHICH items exist, not WHAT any item computes. `flowName` is
retained for cross-flow collision prevention.
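The effect of the fix can be reproduced with a toy key builder. Everything here is illustrative: the `CacheCtx` fields, the `itemKey` layout, and the use of `JSON.stringify(over)` as a stand-in for a structural hash of the `over` source are assumptions, not the project's actual key format.

```typescript
interface Item { agent: string; task: string }

// Hypothetical cache context; the real cc carries more fields.
interface CacheCtx {
  flowName: string;
  model: string;
  fingerprint: string; // world-state fingerprint
  phaseFp: string | undefined;
  flowDefHash: string | undefined;
}

function itemKey(cc: CacheCtx, phaseId: string, it: Item): string {
  return JSON.stringify([
    phaseId, it.agent, cc.model, it.task,
    cc.phaseFp ?? null, cc.flowDefHash ?? null, cc.flowName, cc.fingerprint,
  ]);
}

// The fix: drop both structural fingerprints for per-item keys only.
function toPerItemCtx(cc: CacheCtx): CacheCtx {
  return { ...cc, phaseFp: undefined, flowDefHash: undefined };
}

// Builds one key per item, with or without the structural fingerprints.
function keysFor(over: string[], perItem: boolean): string[] {
  const structural = JSON.stringify(over); // stands in for a hash of the over source
  const cc: CacheCtx = {
    flowName: "demo", model: "model-x", fingerprint: "world-1",
    phaseFp: structural, flowDefHash: structural,
  };
  const used = perItem ? toPerItemCtx(cc) : cc;
  return over.map((value) =>
    itemKey(used, "summarize", { agent: "writer", task: `Summarize ${value}` }));
}

// Counts how many keys of the second run already existed in the first run.
function sharedKeys(before: string[], after: string[]): number {
  return after.filter((k) => before.indexOf(k) !== -1).length;
}
```

Changing one of three items leaves no key in common when the structural fingerprints are folded in, and two keys in common once they are omitted, which is the 3-to-6 versus 3-to-4 counter difference the bug-reproduction test checks.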

The whole-map key keeps the FULL cc (phaseFp + flowDefHash) so its fast
path and any pre-existing whole-map entries are unchanged (backward
compat). The perItem object now carries its own cc so the lookup and
record paths in runFanout use the per-item variant consistently.

Soundness is preserved: task-template, agent, model, as (via resolved
it.task), upstream-output, and world-state changes all still invalidate
the correct items. shareContext / def-frame / failed / budget-skipped
fallbacks are unchanged.

Tests: add a bug-reproduction test (literal over, change 1 of N items)
that FAILS before the fix (counter 3 to 6) and PASSES after (3 to 4),
plus literal-over soundness variants (task/agent/upstream change imply
full re-exec) and whole-map fast-path + partial-hit + failed-item
de-masking variants. Update the budget-skipped test key reconstruction
to use the per-item cc (fingerprints omitted). Fix the e2e incremental
suite map section (add output: json so the merged-output assertion holds).

Add a flow-level and invocation-level `incremental` flag that defaults every
phase to cross-run caching (scope:"cross-run"), so re-running a flow reuses
unchanged phases without annotating each phase. The invocation arg wins over
the flow field; per-phase cache settings and the cross-run-blocked types
(gate/approval/loop/tournament) still take precedence; default stays run-only.
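The precedence rules can be condensed into one resolver, sketched below. The `resolveScope` name, the `cache: { scope }` shape, and the choice to downgrade a blocked phase type to run-only are assumptions; the ordering (invocation arg over flow field, per-phase cache over both, blocked types never cross-run, run-only by default) follows the description above.

```typescript
type Scope = "off" | "run-only" | "cross-run";

// Phase types that may never be cached across runs.
const CROSS_RUN_BLOCKED = ["gate", "approval", "loop", "tournament"];

// Hypothetical minimal phase shape.
interface PhaseLike {
  type: string;
  cache?: { scope?: Scope };
}

function resolveScope(
  phase: PhaseLike,
  flowIncremental?: boolean,
  invocationIncremental?: boolean,
): Scope {
  // The invocation arg wins over the flow field; absent both, stay safe.
  const incremental = invocationIncremental ?? flowIncremental ?? false;
  // An explicit per-phase cache setting beats the incremental default.
  const requested: Scope = phase.cache?.scope ?? (incremental ? "cross-run" : "run-only");
  // Blocked types are downgraded rather than rejected (assumed behavior).
  if (requested === "cross-run" && CROSS_RUN_BLOCKED.indexOf(phase.type) !== -1) {
    return "run-only";
  }
  return requested;
}
```

For example, `resolveScope({ type: "agent" }, true)` yields "cross-run" under these assumptions, while passing `false` as the invocation argument for the same flow yields "run-only". The "agent" phase type is a placeholder name.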

Surface the effect: the end-of-run cache report and /tf recompute now show
reused-vs-executed counts plus a per-phase "Why" trace (rerun/cutoff/reused/
failed with causedBy). Dollar figures are reported only for within-run reuse;
cross-run hits are counted without inventing a saving.
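A small sketch of how such a trace might be rendered. The four outcomes, their symbols, and the `←` cause arrow come from the PR summary; the `TraceEntry` shape, the line layout, and the per-outcome counting are assumptions, and the sketch deliberately does not decide how a cutoff maps onto the reused-vs-executed totals.

```typescript
type Outcome = "rerun" | "cutoff" | "reused" | "failed";

// Hypothetical trace record for one phase.
interface TraceEntry {
  phaseId: string;
  outcome: Outcome;
  causedBy?: string; // upstream phase that triggered this outcome
}

const SYMBOL: Record<Outcome, string> = {
  rerun: "▲",
  cutoff: "✂",
  reused: "✓",
  failed: "✗",
};

// One line per phase, e.g. "▲ parse rerun ← fetch".
function renderWhy(trace: TraceEntry[]): string[] {
  return trace.map((e) => {
    const cause = e.causedBy ? ` ← ${e.causedBy}` : "";
    return `${SYMBOL[e.outcome]} ${e.phaseId} ${e.outcome}${cause}`;
  });
}

// Tally of outcomes, from which a report can derive its headline counts.
function countOutcomes(trace: TraceEntry[]): Record<Outcome, number> {
  const counts: Record<Outcome, number> = { rerun: 0, cutoff: 0, reused: 0, failed: 0 };
  for (const e of trace) counts[e.outcome] += 1;
  return counts;
}
```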

Also strip retry/concurrency/final from phaseFingerprint (none changes a
phase's output, so a no-op config tweak no longer falsely invalidates), and
fall back to whole-flow invalidation for join:"any" phases (they may read
refs outside their declared dependsOn).
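To illustrate why stripping policy fields avoids false invalidation, here is a minimal sketch. The field list matches the one named in this PR (cache, retry, concurrency, final); the `stripPolicy` and `needsWholeFlowFallback` helpers and the use of a plain record for a phase are hypothetical.

```typescript
// Fields that configure how a phase runs, not what it outputs.
const POLICY_FIELDS = ["cache", "retry", "concurrency", "final"];

// Returns a copy without policy fields, with keys in sorted order so the
// result serializes deterministically.
function stripPolicy(phase: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const k of Object.keys(phase).sort()) {
    if (POLICY_FIELDS.indexOf(k) === -1) out[k] = phase[k];
  }
  return out;
}

// join:"any" phases may read refs outside their declared dependsOn, so the
// per-phase fingerprint is not trusted for them.
function needsWholeFlowFallback(phase: Record<string, unknown>): boolean {
  return phase["join"] === "any";
}
```

Two definitions of the same phase that differ only in `retry` serialize identically after `stripPolicy`, so a fingerprint computed over the stripped form does not move.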

Tests: add incremental-flag and reuse-summary suites; extend cache-phasefp
and recompute coverage.

Bump to 0.0.28. Document the granular-reuse release: per-phase structural
sub-fingerprint (v3:phasefp), per-item map caching, the incremental flag, and
reuse reporting. Refresh README test counts (804 -> 846 across 46 files) and
add per-item map caching to the headline. Document the incremental flag and
its precedence in the taskflow skill.
heggria merged commit 412ac24 into main on Jun 27, 2026
2 checks passed
heggria deleted the feat/per-item-map-caching branch on June 27, 2026 06:46