Releases: manav8498/processfork

v1.0.15

10 May 03:51

Changelog

All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog,
and the project adheres to Semantic Versioning.

[1.0.15] — 2026-05-09

Closes the one production caveat from the v1.0.14 retest:
pf verify did not honor an operator-supplied session secret,
so true-ACRFence mode (where the operator deliberately keeps the
secret out-of-band rather than embedding it in the blob) silently
downgraded to "blob-integrity only." Blob hashes still verified;
the HMAC chain did not. v1.0.15 plumbs the secret through.

pf verify --session-secret-hex <HEX> (also honors PF_SESSION_SECRET)

  • New CLI flag accepts the operator's secret; the clap env =
    attribute means PF_SESSION_SECRET=<hex> pf verify works the
    same way operators already pass the secret to pf snapshot.
  • Secret precedence (highest → lowest): operator secret >
    embedded header.session_secret_hex > none. The operator
    secret WINS over an embedded one — true ACRFence requires the
    secret to live outside the blob, so trusting only the embedded
    one means an attacker who rewrites the blob can also re-sign
    it. Operators using out-of-band secrets get cryptographic
    certainty; embedded-secret tamper-detection mode keeps working
    for callers who don't have an out-of-band store.
  • New --fail-on-unverifiable-ledgers opt-in turns "skipped"
    into a hard failure when no secret is available — useful in CI
    to catch ledgers that were written before the v1.0.7 chain
    wiring.
  • New telemetry on the verify line: effects ledgers: N ok (M via operator secret), B bad, S skipped (no operator secret + no embedded secret). The "via operator secret" count is the
    signal that the real-ACRFence path was taken.
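
The precedence rule above can be sketched in a few lines of Python. This is an illustration only, not the pf verify implementation: the helper names resolve_verify_secret and ledger_outcome are hypothetical, and the sketch assumes both secrets arrive as hex strings.

```python
from typing import Optional, Tuple


def resolve_verify_secret(
    operator_hex: Optional[str],
    embedded_hex: Optional[str],
) -> Tuple[Optional[bytes], str]:
    """Pick the HMAC secret for ledger verification (hypothetical helper).

    Precedence follows the release notes: operator > embedded > none.
    """
    if operator_hex:
        # An out-of-band secret always wins over the one stored in the blob.
        return bytes.fromhex(operator_hex), "operator"
    if embedded_hex:
        return bytes.fromhex(embedded_hex), "embedded"
    return None, "none"


def ledger_outcome(source: str, fail_on_unverifiable: bool) -> str:
    """Map the secret source to what the verifier does with the ledger."""
    if source == "none":
        # Mirrors the --fail-on-unverifiable-ledgers opt-in described above.
        return "fail" if fail_on_unverifiable else "skipped"
    return "verify"
```

Returning the source alongside the secret is what would let a caller report the "via operator secret" count separately from embedded-secret verifications.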

Behavior change matrix

| Mode | v1.0.14 | v1.0.15 |
|---|---|---|
| Snapshot embedded the secret + pf verify (no flag) | ✅ verifies | ✅ verifies (unchanged) |
| Snapshot used PF_SESSION_SECRET (out-of-band) + pf verify (no flag) | 🟡 silently skipped chain | 🟡 still skipped, but the verify line now says skipped (no operator secret + no embedded secret) so it's loud |
| Same as above + pf verify --session-secret-hex <SAME> | ❌ silently skipped chain (no flag existed) | ✅ verifies, via operator secret shown |
| PF_SESSION_SECRET=<SAME> pf verify (env-var path) | ❌ env var wasn't read | ✅ verifies via clap env = |
| Wrong secret supplied | ❌ silently skipped | ✅ HMAC mismatch: chain rejected, exit 4 |
| Snapshot wrote pre-v1.0.7 ledger (no chain) + pf verify --fail-on-unverifiable-ledgers | (flag didn't exist) | ✅ exit 4 on the unverifiable ledger |

Tests

  • New integration test
    verify_accepts_operator_supplied_session_secret_for_true_acrfence
    covers all six rows of the matrix above end-to-end via
    assert_cmd. The bug reproducer is row 3 (out-of-band
    secret + verify with same secret); v1.0.14 silently said
    "skipped", v1.0.15 says "1 ok (1 via operator secret)".

Note: the OpenAI-key warning

The auditor's report included "the OpenAI API key you pasted is
exposed; rotate it before production use." The maintainer did not
paste an OpenAI key in this session (the conversation was searched
end-to-end). Rotate any key you're worried about regardless;
ProcessFork's secret-shaped env scrub (OPENAI_API_KEY, *_TOKEN,
*_SECRET, etc.) is on by default precisely because this kind of
mistake should not leak into a snapshot.

Versions

  • processfork (Rust + Python wheel): 1.0.14 → 1.0.15
  • All 8 internal pf-* crate version pins: → 1.0.15
  • npm @processfork/sdk: 1.0.14 → 1.0.15
  • processfork-criu: 1.0.14 → 1.0.15

Verification

  • cargo fmt --check, cargo clippy --workspace --all-targets -- -D warnings, cargo deny check: clean.
  • cargo test --workspace: 217 passed (was 216; +1).
  • All earlier audit-round fixes still stand.

v1.0.14

08 May 21:55

[1.0.14] — 2026-05-08

Closes the three "left as-is" limitations from v1.0.13. None of
them was a bug; all three were "we genuinely cannot do this from
the maintainer's host" or "the safe default is too strict."
v1.0.14 makes each one materially better without weakening any
prior security or honesty posture.

Limitation 1: examples/06 + examples/07 were exit-2 stubs

The local PF_HAS_GPU=1 vLLM/SGLang examples printed "use the
Modal lane" and exited 2. The Modal lane is still the bit-exact
validation path, but the examples themselves are now genuinely
runnable on every host.

  • Mock-mode round-trip on every CI host. bash examples/06-vllm-bit-exact/run.sh (and examples/07-...) now
    drive the adapter's build_endpoints() API end-to-end with
    synthetic K/V pages, asserting byte-identical round-trip across
    snapshot → checkout. No GPU, no vLLM/SGLang import required —
    only the processfork-vllm / processfork-sglang adapter
    package itself (which ships pure-Python).
  • Three modes, decided at runtime:
    • No adapter installed → clean skip with install
      instructions.
    • Adapter installed, no GPU / no vLLM → mock-mode
      round-trip (the new default useful path).
    • PF_HAS_GPU=1 + adapter + vLLM/SGLang importable → same
      flow, plus a footer pointer to the Modal lane for
      bit-exact validation.
  • Confirmed locally on macOS arm64: both examples round-trip
    3 synthetic K/V pages byte-identically end-to-end.
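
A rough sketch of the three-way runtime decision, for readers who want to mirror it in their own example runners. The function names and mode labels are hypothetical, and the module names are passed in as parameters because the notes do not state the adapters' import names.

```python
import importlib.util
import os
from typing import Mapping


def pick_mode(adapter_installed: bool, engine_importable: bool, has_gpu: bool) -> str:
    """Return which of the three example modes applies (labels are illustrative)."""
    if not adapter_installed:
        return "skip"  # clean skip with install instructions
    if has_gpu and engine_importable:
        return "round-trip+modal-pointer"  # same flow plus a Modal-lane footer
    return "mock-round-trip"  # synthetic K/V pages, no GPU needed


def detect_mode(
    adapter_module: str,
    engine_module: str,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Probe the host without importing anything heavy."""
    return pick_mode(
        adapter_installed=importlib.util.find_spec(adapter_module) is not None,
        engine_importable=importlib.util.find_spec(engine_module) is not None,
        has_gpu=env.get("PF_HAS_GPU") == "1",
    )
```

Using find_spec rather than a real import keeps the probe cheap, which matters when the engine package pulls in CUDA libraries at import time.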

Limitation 2: CRIU capture is Linux+CRIU only

processfork-criu and pf snapshot --criu-pid remain Linux-only
by definition (CRIU is a kernel-assisted snapshot system;
macOS/Windows have no equivalent). v1.0.14 ships a portable
respawn path alongside CRIU, so non-Linux operators get
something better than procs.unsupported.v1.

  • New pf snapshot --respawn-pid <PID> captures a
    procs.respawn.v1 blob: argv, cwd, env (Linux only — macOS
    needs root to read other-process environ; documented), exe
    path, parent PID, and the paths backing open file descriptors
    (/proc/<pid>/fd/* on Linux; lsof -p <pid> on macOS).
    Captured cross-platform: macOS arm64, Linux, Windows
    (best-effort; Windows currently emits the kind blob with empty
    argv/cwd until a Win32 implementation lands).
  • Respawn ≠ CRIU. Documented explicitly: respawn captures
    enough configuration to RE-INVOKE the process from scratch
    (think deployment metadata + state files); it does NOT capture
    register state, heap, in-flight syscalls, anonymous memory, or
    signal masks. Operators who need that fidelity stay on
    --criu-pid (Linux only).
  • --criu-pid and --respawn-pid are mutually exclusive — pick
    the right tool for the job. CLI errors out clearly if both are
    passed.
  • Regression test
    (snapshot_respawn_pid_emits_respawn_v1_blob) snapshots the
    test process's own PID on macOS and asserts the v1 marker, the
    argv non-emptiness, and the captured_on == host_os tag.

Limitation 3: absolute symlinks captured but rejected on restore

The v1.0.3 "Zip Slip" CVE fix (PF-SA-2026-001) refused absolute
symlinks at restore time as a hard error. The auditor flagged
this as awkward — captured trees often contain legitimate
absolute symlinks (e.g. /var/log/agent). The CVE protection is
about not WRITING through the symlink; whether to CREATE the
symlink is a separate decision.

  • Default behavior changed from hard-error to skip-with-warn.
    pf checkout now skips absolute symlinks with an
    eprintln!("warning: skipped absolute symlink ...") and
    continues restoring the rest of the tree. This matches what
    tar/rsync do and is a strict safety improvement (operator
    sees what was skipped; the rest of the restore still
    succeeds).
  • New pf checkout --allow-absolute-symlinks opt-in flag
    restores them verbatim. Operator explicitly acknowledges that
    anything later reading through the symlink may escape the
    sandbox.
  • The CVE protection is unchanged: relative symlinks that escape
    the staging root are still HARD-REFUSED (the depth-counter
    check in check_symlink_target); absolute paths (vs.
    targets) in the FS tree itself are still HARD-REFUSED via
    safe_join. The only thing that changed is what happens when
    the operator points a symlink AT an absolute target.
  • New library API: pf_world::RestoreOptions { allow_absolute_symlinks: bool }, restore_tree_with_options(...). Existing
    callers of restore_tree(...) get the new safe default
    automatically.
  • Two regression tests pin the behavior:
    absolute_symlink_skipped_by_default_with_rest_restored
    (skipped by default, regular files still land);
    allow_absolute_symlinks_restores_them_verbatim (opt-in works,
    link target round-trips byte-identically).
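
The three outcomes described above (restore, skip with a warning, hard refuse) can be sketched as a small classifier. This is a simplified model under stated assumptions, not the check_symlink_target code: classify_symlink is a hypothetical name, paths are treated as POSIX-style and relative to the staging root, and the depth counter here only tracks ".." components.

```python
import posixpath


def classify_symlink(link_rel_path: str, target: str, allow_absolute: bool = False) -> str:
    """Return 'restore', 'skip', or 'refuse' for one captured symlink."""
    if posixpath.isabs(target):
        # Absolute targets: skipped by default, restored verbatim on opt-in.
        return "restore" if allow_absolute else "skip"

    # Depth counter: start at the depth of the directory holding the link,
    # then walk the relative target one component at a time.
    depth = len([p for p in posixpath.dirname(link_rel_path).split("/") if p])
    for part in target.split("/"):
        if part in ("", "."):
            continue
        if part == "..":
            depth -= 1
            if depth < 0:
                return "refuse"  # the target climbs above the staging root
        else:
            depth += 1
    return "restore"
```

Note that the opt-in flag only affects the absolute-target branch; a relative target that escapes the root is refused regardless.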

Versions

  • processfork (Rust + Python wheel): 1.0.13 → 1.0.14
  • All 8 internal pf-* crate version pins: → 1.0.14
  • npm @processfork/sdk: 1.0.13 → 1.0.14
  • processfork-criu: 1.0.13 → 1.0.14

Verification

  • cargo fmt --check, cargo clippy --workspace --all-targets -- -D warnings, cargo deny check: clean.
  • cargo test --workspace: 216 passed (was 211; +5).
  • pytest across pf-py + pf-claude-code + pf-criu + pf-vllm +
    pf-sglang: 42 passed, 4 skipped (CRIU Linux + GPU paths).
  • node --test crates/pf-ts/test/smoke.mjs: 8/8.
  • bash examples/06-vllm-bit-exact/run.sh and
    bash examples/07-sglang-prefix-share/run.sh: both round-trip
    3 synthetic K/V pages byte-identically on macOS without GPU.

What's still not in scope

  • vLLM V1 engine bit-exact KV restore. vLLM-side fix; V0 +
    enforce_eager=True workaround documented in v1.0.12.
  • Generic CLI model+cache layer auto-discovery. "Walk a
    directory and produce a valid LoRA diff" is the source of most
    "I restored my agent and it half-worked" reports; the loud
    warning + adapter-populated path stays the answer.
  • Live PF_HAS_GPU=1 self-contained vLLM/SGLang test. The
    examples now do real adapter round-trip on every host; the
    bit-exact KV validation against actual vLLM still runs on
    Modal (scripts/gpu-validate-modal.py).

v1.0.13

08 May 07:29

[1.0.13] — 2026-05-08

Closes the two confirmed bugs and the one Python SDK lineage gap
the v1.0.12 retest flagged. Independent matrix had been 10 PASS /
2 ISSUE / 3 LIMITATION; v1.0.13 turns the two ISSUEs into PASS and
makes the Python SDK lineage limitation a non-issue. The other
limitations (live GPU host, CRIU Linux-only, absolute-symlink
restore safety) are scope/environment notes, not bugs.

Issue 1: false merge conflicts from generated test artifacts

Reproduced with pytest's __pycache__/ and .pytest_cache/
landing in the captured tree on otherwise-disjoint branches. The
v1.0.12 CLI had no ignore mechanism beyond hardcoded defaults
(target/, node_modules/, .git/objects/, .pfcid).

  • New default-extra ignore set in WalkFsCapture::new:
    __pycache__, .pytest_cache, .mypy_cache, .ruff_cache,
    .tox, .coverage, .venv, .DS_Store, *.pyc, *.pyo.
    Conservative — every entry is a cache by definition, never a
    "maybe I want this" path.
  • New --ignore <PAT> CLI flag (repeatable). Plain entries
    (__pycache__, node_modules) match path components like
    before; glob entries (anything containing *, ?, or [, e.g.
    *.pyc, *.log, **/build/**) match the relative path via
    the new globset dep.
  • New --ignore-from <PATH> CLI flag — reads gitignore-style
    rules from a file. Default: try <fs_root>/.pfignore then
    <fs_root>/.gitignore; pass --ignore-from /dev/null to opt
    out. Comments (#) and blank lines skipped; trailing /
    stripped; gitignore negation (!keep.pyc) is logged and
    skipped (full negation semantics will arrive once an operator
    hits the use case).
  • New --no-default-ignores to opt out of the default-extra
    set (rare; CI auditing the cache shape, registry mirroring).
    CVE-relevant defaults (.git/objects, target, node_modules,
    .pfcid) are kept regardless.
  • New WalkFsCapture::new_without_default_ignores(root) and
    .ignore_from(path) API on the underlying library.
  • Regression coverage: 4 new unit tests in pf-world covering
    default-extra-ignores, glob *.pyc matching, opt-out, and
    .pfignore file parsing; 2 new CLI integration tests in
    cli_smoke.rs covering the snapshot end-to-end.
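
To make the plain-versus-glob distinction concrete, here is a minimal Python sketch of the matching and file-parsing rules listed above. It is an approximation: the function names are hypothetical, and Python's fnmatch stands in for the globset crate, so edge cases such as ** handling may differ from the real matcher.

```python
import fnmatch
from pathlib import PurePosixPath
from typing import Iterable, List

GLOB_CHARS = set("*?[")


def parse_ignore_lines(lines: Iterable[str]) -> List[str]:
    """Parse gitignore-style lines the way the notes describe."""
    patterns = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are skipped
        if line.startswith("!"):
            continue  # negation is not honored (the release logs and skips it)
        patterns.append(line.rstrip("/"))  # trailing slash is stripped
    return patterns


def is_ignored(rel_path: str, patterns: Iterable[str]) -> bool:
    """Plain entries match a path component; glob entries match the relative path."""
    parts = PurePosixPath(rel_path).parts
    for pat in patterns:
        if GLOB_CHARS & set(pat):
            if fnmatch.fnmatchcase(rel_path, pat):
                return True
        elif pat in parts:
            return True
    return False
```

For example, is_ignored("pkg/__pycache__/m.pyc", ["__pycache__"]) matches on the component, while is_ignored("pkg/mod.pyc", ["*.pyc"]) matches through the glob branch.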

Issue 2: pf gc --retain-recent N left dangling log entries

Reproduced: pf log listed CIDs after GC, but pf checkout on
those CIDs failed because the layer blobs were gone. Root cause:
pf gc deleted unreachable blobs from blobs/sha256/<shard>/<hex>
but never the per-manifest marker files at
store_root/images/<cid>.json, which is what pf log walks via
store.iter_manifests(). The result was a referential-integrity
hole: index says "this CID exists", CAS says "I have no idea".

  • Fix: GC now tracks the set of evicted manifest CIDs and,
    after the blob sweep, deletes their images/<cid>.json
    markers. The output line counts both: deleted N unreachable blobs (B bytes) and M stale image markers.
  • Regression test (gc_retain_recent_prunes_image_markers):
    snapshot 3 manifests → pf gc --retain-recent 1 → assert
    pf log no longer lists the 2 evicted CIDs → assert
    pf checkout on an evicted CID fails AND pf checkout on
    the kept CID succeeds.
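
The referential-integrity fix can be modelled with an in-memory store. The sketch below is only an illustration of the idea (sweep blobs, then drop markers for evicted manifests); the data shapes and the function name are assumptions, not the pf gc internals.

```python
from typing import Dict, List, Set, Tuple


def gc_retain_recent(
    manifests: List[Tuple[str, Set[str]]],  # (cid, blob digests), oldest first
    blobs: Dict[str, bytes],                # digest -> bytes, mutated in place
    image_markers: Set[str],                # CIDs that have an images/<cid>.json
    retain: int,
) -> Tuple[int, int]:
    """Keep the newest `retain` manifests; return (blobs deleted, markers deleted)."""
    kept = manifests[-retain:] if retain > 0 else []
    evicted = manifests[: len(manifests) - len(kept)]

    # Blob sweep: anything not referenced by a kept manifest is unreachable.
    reachable = set().union(*(digests for _, digests in kept))
    dead = [digest for digest in blobs if digest not in reachable]
    for digest in dead:
        del blobs[digest]

    # The step the fix adds: drop markers for evicted manifests as well, so
    # the index no longer lists CIDs whose layers are gone.
    stale = {cid for cid, _ in evicted} & image_markers
    image_markers.difference_update(stale)
    return len(dead), len(stale)
```

Blobs shared with a kept manifest survive the sweep, which is why reachability is computed from the kept set rather than by deleting everything the evicted manifests reference.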

Limitation 3: Python SDK didn't expose parent lineage

processfork.snapshot_filesystem hardcoded parents: vec![] so
SDK-only forks couldn't be 3-way-merged (no LCA was discoverable).
Operators had to route through the CLI's --parent flag.

  • New parents: Sequence[str] | None = None kwarg on
    snapshot_filesystem. Bad CIDs surface as ValueError, not
    silent malformed manifests.
  • Regression coverage: test_snapshot_parents_field_lands_in_manifest
    pins manifest.parents round-trip; test_merge_two_forks_clean
    upgraded from "asserts 'no common ancestor' RuntimeError" to
    "asserts the merge succeeds clean with cid_x as ancestor";
    test_snapshot_rejects_bad_parent_cid covers the error path.
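
Why an empty parents list blocks a 3-way merge is easiest to see with a small ancestor search. The sketch below assumes manifests are available as a plain mapping from CID to parent CIDs; the function names are hypothetical and this is not the merge code ProcessFork ships.

```python
from collections import deque
from typing import Dict, List, Optional, Set


def ancestors(cid: str, parents: Dict[str, List[str]]) -> Set[str]:
    """All CIDs reachable from `cid` via parent links, including `cid` itself."""
    seen: Set[str] = set()
    queue = deque([cid])
    while queue:
        cur = queue.popleft()
        if cur in seen:
            continue
        seen.add(cur)
        queue.extend(parents.get(cur, []))
    return seen


def common_ancestor(a: str, b: str, parents: Dict[str, List[str]]) -> Optional[str]:
    """Nearest ancestor of `a` (in BFS order) that is also an ancestor of `b`."""
    b_ancestors = ancestors(b, parents)
    seen: Set[str] = set()
    queue = deque([a])
    while queue:
        cur = queue.popleft()
        if cur in seen:
            continue
        seen.add(cur)
        if cur in b_ancestors:
            return cur
        queue.extend(parents.get(cur, []))
    return None
```

With parents recorded ({"a": ["x"], "b": ["x"]}) the search finds "x"; with every parents list empty, as the SDK used to write them, it returns None and no merge base exists.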

Versions

  • processfork (Rust + Python wheel): 1.0.12 → 1.0.13
  • All 8 internal pf-* crate version pins: → 1.0.13
  • npm @processfork/sdk: 1.0.12 → 1.0.13
  • processfork-criu: 1.0.12 → 1.0.13 (matches the CLI's
    --criu-pid baseline)

Verification

  • cargo fmt --check, cargo clippy --workspace --all-targets -- -D warnings, cargo deny check: clean.
  • cargo test --workspace: 211 passed (was 204; +7 from
    the new ignore + GC + Python-lineage integration coverage).
  • pytest crates/pf-py/python/tests/ adapters/pf-claude-code/tests/ adapters/pf-criu/tests/: 27 passed, 2 skipped (CRIU
    Linux-only paths still gate-skip on macOS as documented).
  • node --test crates/pf-ts/test/smoke.mjs: 8/8.

Still not in scope (auditor's "limitations", left as-is)

  • Live PF_HAS_GPU=1 vLLM/SGLang test — Modal lane is the
    validation; documented in v1.0.11.
  • CRIU is Linux-only by definition; macOS CI exercises Layer 1
    + the non-Linux skip paths.
  • Absolute symlinks captured but rejected on restore for
    sandbox-escape safety. This is the v1.0.3 "Zip Slip"
    hardening (PF-SA-2026-001); changing it would re-open the
    CVE. Operators who need absolute-symlink restore should
    resolve the link target post-checkout in their own code.

v1.0.12

08 May 03:32

[1.0.12] — 2026-05-07

Closes the four "not-yet-production-ready" items v1.0.11 made
explicit. Two of them are runtime features (conflict-merge UI,
CRIU subprocess capture); two are honesty/UX (loud warnings on
empty engine layers, V1-engine workaround documented). One — V1
bit-exact KV restore — remains a vLLM-side change beyond
ProcessFork's reach and is now documented with the V0 +
enforce_eager=True workaround.

pf merge-resolve / pf merge-finalize — interactive conflict resolution

  • New top-level commands. Replace the v1.1-deferred placeholder
    with a real round-trip:
    1. pf merge A B → if conflicts, exits 3 with the
      resolve+finalize hint pointing at the merged-CID.
    2. pf merge-resolve <merged-cid> --workdir <dir> extracts
      the merged FS into <dir> (which must NOT pre-exist),
      scans for Git-style markers, and prints the conflict
      file list.
    3. Operator hand-edits.
    4. pf merge-finalize <merged-cid> --workdir <dir> re-walks
      the resolved tree, builds a single-parent image whose
      parent is <merged-cid>, returns the finalized CID.
  • pf merge-finalize refuses if any file in <dir> still
    contains conflict markers (exit code 3); pass --force to
    finalize as-is (for tree fixtures with legitimate <<<<<<<
    content).
  • Scan covers all three Git marker variants (<<<<<<<,
    =======, >>>>>>>); skips symlinks and binary files (NUL
    byte heuristic).
  • Round-trip regression test (merge_resolve_finalize_round_trip)
    exercises: snapshot-X-A-B-with-parent → conflicting merge →
    resolve workdir → finalize-without-resolution-fails →
    hand-resolve → finalize succeeds → finalized image's
    manifest.parents == [merged-cid] → checkout shows resolved
    content with no markers. --force path tested separately.
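
The marker scan that gates pf merge-finalize can be approximated as follows. This is a sketch under assumptions: the function name is hypothetical, markers are only recognised at the start of a line, and whole files are read into memory, none of which the notes confirm for the real scanner.

```python
import os
from typing import List

MARKERS = (b"<<<<<<<", b"=======", b">>>>>>>")


def files_with_conflict_markers(workdir: str) -> List[str]:
    """List files under `workdir` that still contain Git-style conflict markers."""
    hits = []
    for dirpath, _dirs, files in os.walk(workdir):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # symlinks are skipped
            with open(path, "rb") as fh:
                data = fh.read()
            if b"\x00" in data:
                continue  # NUL byte heuristic: treat the file as binary
            if any(line.startswith(MARKERS) for line in data.splitlines()):
                hits.append(os.path.relpath(path, workdir))
    return sorted(hits)
```

A finalize step built on this would refuse when the returned list is non-empty unless the caller passes the equivalent of --force.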

processfork-criu adapter + pf snapshot --criu-pid <PID>

  • New Python package processfork-criu (Linux-only at runtime)
    promotes the world layer's procs blob from
    procs.unsupported.v1 to procs.criu.v1. The bundle is a
    header-line JSON dict + raw tarball of CRIU's images-dir
    output, ready for pf verify to round-trip.
  • New CLI flag pf snapshot --criu-pid <PID> shells out to
    python3 -m processfork_criu (via inline script) to perform
    the dump. On macOS / Windows / non-criu Linux hosts the
    command exits with a clear "CRIU unavailable: …" message and
    the snapshot fails fast — no silent half-state.
  • Python API: processfork_criu.dump_pid(pid, leave_running=True, tcp_established=False) returns a CriuBundle whose
    serialize() is the on-disk format; restore_bundle(bundle, target_dir=...) returns the new PID after criu restore.
  • Test layering reflects the honesty caveat (same as the Modal
    vLLM lane: code is committed, validation lives where the
    kernel lives):
    • Layer 1 — runs on every host (8 tests): version match,
      v1 marker constants, header+tarball envelope round-trips,
      deserialize rejects wrong-kind / missing-newline,
      is_available() False on macOS, dump_pid /
      restore_bundle raise clean RuntimeError on macOS.
    • Layer 2 — Linux only, no criu needed (1 test, skips on
      macOS): is_available() reflects whether criu is on $PATH.
    • Layer 3 — Linux + criu binary (1 test, skips otherwise):
      end-to-end: spawn a heartbeat-writing Python child,
      criu dump it, SIGKILL the original PID, criu restore,
      assert the restored PID writes new heartbeats. This is
      the operator-runs-it validation; the maintainer's macOS
      CI has not run it. The README carries the caveat.
  • Rust-side test (snapshot_criu_pid_fails_cleanly_on_non_linux)
    confirms pf snapshot --criu-pid 1 on macOS exits non-zero
    with stderr mentioning CRIU/python3 (no panic, no silent empty
    procs blob).
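
The "header-line JSON dict + raw tarball" envelope is simple enough to sketch. The code below is an illustration, not the processfork_criu serializer: the function names are hypothetical, and storing the marker under a "kind" key is an assumption borrowed from the effects-ledger header shown elsewhere in these notes.

```python
import json
from typing import Tuple

KIND = "procs.criu.v1"


def serialize_bundle(header: dict, tarball: bytes) -> bytes:
    """One JSON header line, a newline, then the raw tarball bytes."""
    head = dict(header, kind=KIND)
    # json.dumps without indent never emits a raw newline, so the first
    # b"\n" in the blob always terminates the header.
    return json.dumps(head, sort_keys=True).encode("utf-8") + b"\n" + tarball


def deserialize_bundle(blob: bytes) -> Tuple[dict, bytes]:
    """Split the envelope and reject the two malformed shapes the tests cover."""
    head_bytes, sep, tarball = blob.partition(b"\n")
    if not sep:
        raise ValueError("missing header newline")
    header = json.loads(head_bytes)
    if header.get("kind") != KIND:
        raise ValueError(f"wrong kind: {header.get('kind')!r}")
    return header, tarball
```

Splitting on the first newline only is what lets the tarball contain arbitrary bytes, including further newlines, without escaping.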

Loud warning when generic CLI snapshot writes empty engine layers

  • pf snapshot now emits a multi-line stderr warning explaining
    that the model + cache layers are empty and that engine state
    requires the vLLM/SGLang adapter to populate. World (FS+env),
    trace, and effects ARE captured.
  • New --allow-empty-engine-layers flag suppresses the warning
    for CI/automation that has internalized the boundary.
  • The empty model + cache envelopes now carry a "note": "generic-cli-empty: populated by adapters, not by walking a directory" field so downstream tooling can detect them.

V1-engine bit-exact workaround documented

  • adapters/pf-vllm/README.md gets a new "Bit-exact replay: V0
    vs V1 engine" section with the V0 + enforce_eager=True
    workaround for callers who need byte-identical regenerated
    output. Calls out the throughput cost (1.3–1.8× slower
    without CUDA graphs), V0's feature-frozen status upstream,
    and when V1 + output-equivalent is acceptable (snapshot
    before destructive change vs. RL rollout reproducibility).
  • README's bit-exact metric row links to this section.

Versions

  • processfork (Rust + Python wheel): 1.0.11 → 1.0.12
  • All 8 internal pf-* crate version pins: → 1.0.12
  • npm @processfork/sdk: 1.0.11 → 1.0.12
  • New processfork-criu Python package: 1.0.12

Verification

  • cargo fmt --check, cargo clippy --workspace --all-targets -- -D warnings, cargo deny check: clean.
  • cargo test --workspace: 204 passed (was 199; +5 from the
    merge-resolve / merge-finalize / criu-pid integration tests).
  • pytest crates/pf-py/python/tests/ adapters/pf-claude-code/tests/ adapters/pf-criu/tests/: 25 passed, 2 skipped (CRIU
    Linux-only paths).
  • node --test crates/pf-ts/test/smoke.mjs: 8/8.
  • pf snapshot --criu-pid 1 on macOS: exits non-zero with
    "CRIU is Linux-only" — no panic.

What's still not in scope

  • vLLM V1 engine bit-exact KV restore. Documented workaround
    (V0 + enforce_eager); upstream V1 deterministic batch
    scheduling is the actual fix and lives in vllm/.
  • Generic CLI model/cache layer auto-discovery. The "walk a
    directory and produce a valid LoRA diff" approach is the
    source of most "I restored my agent and it half-worked"
    reports; we keep the empty-envelope-with-loud-warning path
    instead.

v1.0.11

08 May 02:44

[1.0.11] — 2026-05-07

Documentation honesty pass. The v1.0.10 retest confirmed 12/12 of the
real-world matrix (FS snapshots, env redaction, HMAC ledger tamper
detection, 12 forks at 1.004× storage, clean+conflict merges, file://
registry, GC, symlink hardening, quiesce/resume, large binaries) but
flagged that the README's "vLLM/SGLang ✅ ships now / bit-exact KV"
framing and the example/test stubs labelled "v1.0.1 deferred
deliverable" did not match what actually shipped.

This release does not change runtime behavior. All earlier audit
fixes still stand. What changes:

README: adapter status table now distinguishes mock vs. live

  • Claude Code / LangGraph / OpenInterpreter / AutoGen / CrewAI keep
    ✅ — they snapshot/restore the FS + env + trace + effects layers
    and the auditor's matrix exercised them end-to-end.
  • vLLM / SGLang downgraded from ✅ to "🟡 mock ships v1.0 · live =
    Modal lane". The mock K/V page round-trip ships and is
    regression-tested; the bit-exact validation runs on Modal A10G via
    scripts/gpu-validate-modal.py, not from your local box.

README: 5-layer table now marks adapter-populated layers

  • World annotated: FS + env ship; the procs blob writes a
    procs.unsupported.v1 placeholder unless a CRIU/zombie-restart
    adapter is added (a v1.1 deliverable). Restored sessions do not
    bring back live PIDs; they bring back the FS+env+trace+effects
    state that lets a fresh worker continue.
  • Model and Cache annotated 🟡: format + math ship and run
    on the Modal lane, but the generic CLI snapshot path emits
    empty envelopes because these layers are populated by adapters
    (vLLM/SGLang/etc.), not by walking a directory.

README: bit-exact KV claim split V0 vs V1

  • v1.0.10 had one row claiming "Bit-exact KV-cache replay ✅ verified".
    The Modal JSONs say something more specific:
    • 2026-05-06-modal-a10g.json (V0 engine, TinyLlama-1.1B):
      bit_exact: true, 38 619 KV pages, byte-identical regen text.
    • 2026-05-06-modal-a10g-vllm-v1.json (V1 engine, collective_rpc):
      bit_exact: false; first-80-chars of regen output match across
      snapshot/restore (output-equivalent, not bit-exact).
  • README now has both rows, each linking to the source-of-truth JSON.
    Treat live V1 KV restore as "lossy semantic restore" today.

README: new "What does and doesn't ship in v1.0.x" subsection

  • Production-credible today (auditor's 12/12 matrix): pf snapshot/
    checkout for FS sandboxes; default secret-shaped env redaction
    (CLI + Python SDK + TS SDK); HMAC-chained effects ledger end-to-end
    with pf verify tamper detection; fork & merge incl. conflict
    marker materialization; file:// + OCI + S3 + HF registry transport;
    5 first-party adapters; vLLM/SGLang mock-mode K/V page persistence.
  • Not yet production-ready, made explicit: in-flight subprocess
    capture (CRIU adapter is v1.1); local PF_HAS_GPU=1 self-contained
    vLLM/SGLang test (it was always Modal-lane validation, the
    examples/06+07 + cache_bit_exact_vllm.rs were skeletons mislabelled
    "v1.0.1 deferred"); V1-engine bit-exact KV restore
    (output-equivalent only); conflict-merge resolution UI (markers ship,
    interactive pf merge --resolve is v1.1); generic CLI model+cache
    layer capture (adapter-populated only).

Skeleton/stub messages updated

  • examples/06-vllm-bit-exact/run.sh and
    examples/07-sglang-prefix-share/run.sh: removed the misleading
    "v1.0.1 deferred deliverable" pointer; both now point at
    modal run scripts/gpu-validate-modal.py and the JSONs in
    benchmarks/gpu-validation/, which is the actual validation path.
  • crates/pf-cache/tests/cache_bit_exact_vllm.rs: previously
    panic!("PF_HAS_GPU=1 set but pf-vllm adapter not yet wired").
    Now skips cleanly under any value of PF_HAS_GPU and points at
    the Modal lane + tests/cache_round_trip.rs (the on-host proxy
    that DOES exercise the cache code path everywhere).

Versions

  • processfork (Rust + Python wheel): 1.0.10 → 1.0.11
  • All 8 internal pf-* crate version pins: → 1.0.11
  • npm @processfork/sdk: 1.0.10 → 1.0.11

Why this matters

The runtime behavior in v1.0.10 was correct and the auditor's matrix
agreed. The README and a handful of stub messages were overselling.
Documentation that can't be matched against cargo test,
benchmarks/gpu-validation/*.json, or the example runners is the
same kind of trust hole as a code bug — operators who read the README
were going to spend a day chasing a "ships now" GPU validation that
the Modal lane already ran for them. v1.0.11 makes the boundary
match the reality.

v1.0.10

08 May 02:04

[1.0.10] — 2026-05-07

Closes the two TypeScript-SDK gaps the v1.0.9 retest flagged. v1.0.7
hardened the CLI's snapshot path; v1.0.9 propagated the fix to the
Python SDK; v1.0.10 propagates it to the TypeScript SDK. The CLI,
Python SDK, and TypeScript SDK now all go through the same scrub
regex and the same HMAC-chained pf_effects::Ledger code path —
parity across all three surfaces.

Security: TS SDK env capture is no longer unsafe-by-default

  • snapshotFilesystem(store, kind, root, env, messages, opts?) now
    applies the same default scrub regex the CLI and Python SDK use
    ((?i)(?:^|_)(token|secret|password|passwd|pwd|api_?key|apikey|auth|bearer)(?:_|$)); env keys matching it are stored as
    "<redacted>". JS callers that did
    snapshotFilesystem(..., { OPENAI_API_KEY: "...", PWD: root })
    were storing the raw API key bytes in world.env — the auditor
    reproduced the leak with two separate stores.
  • New opts.defaultScrubEnv: boolean = true knob; pass false to
    opt out (rare; CI debugging at most).
  • New opts.scrubEnv: string[] for additional regex patterns,
    mirroring the CLI's --scrub-env flag.
  • Smoke-test fix: the prior test/smoke.mjs passed new Map([...])
    for the env arg; napi-rs serializes JS Map instances to {}
    (only plain objects deserialize to Rust BTreeMap), so the test
    silently received empty env and never exercised the leak path.
    Switched to plain objects (the typed signature's documented
    shape) and added 3 regression tests:
    • default env scrub redacts secret-shaped names — proves
      OPENAI_API_KEY/GITHUB_TOKEN/DATABASE_PASSWORD/MY_API_KEY
      redacted, AND that the secret bytes don't appear anywhere in
      the serialized env blob.
    • defaultScrubEnv = false opts out — proves the opt-out path.
    • effects ledger is HMAC-chained — see below.

ACRFence: TS SDK ledger is HMAC-chained for real

  • Prior versions of the TS SDK ALWAYS wrote
    {"kind":"effects.ledger.v1","entries":0}\n to effects.ledger
    regardless of caller intent — TS integrations had no ACRFence
    protection at all, even when they had a real tool-call list.
  • New opts.effects: EffectEntry[] parameter; entries are routed
    through pf_effects::ledger::Ledger::append (per-entry
    session_hmac = HMAC(secret, prev_hash || this_hash)) and the
    blob comes out byte-compatible with the CLI/Python output —
    same header marker, same session_secret_hex embedding, same
    verification_mode = "tamper-detection".
  • pf verify validates SDK-produced ledgers through the same code
    path it already used for CLI ledgers (no pf verify change
    needed).
  • EffectEntry shape (camelCase TS): toolId, argsHash,
    resultHash, idempotencyKey, sideEffectClass ("pure" |
    "idempotent" | "irreversible" | "network-only"), timestamp
    (RFC-3339; defaults to now). All fields except toolId optional.

New SDK surface: readBlob

  • readBlob(store, digest): Buffer — fetches raw blob bytes by
    digest. Mirrors the Python SDK's processfork.read_blob.
    Adapters that need to inspect individual layer blobs (e.g. the
    smoke tests verifying the redaction wrote correctly to world.env,
    or a future TS LangGraph checkpointer reading the trace blob)
    call this.

Versions

  • processfork (Rust + Python wheel): 1.0.9 → 1.0.10
  • All 8 internal pf-* crate version pins: → 1.0.10
  • npm @processfork/sdk: 1.0.9 → 1.0.10

Why this matters

The v1.0.9 retest passed 13 of 13 real-world cases on the CLI +
Python paths but explicitly flagged the TS SDK as a blocker: a JS
caller using the typed signature exactly as documented was leaking
raw API keys to disk, and the TS effects ledger gave no ACRFence
protection regardless of caller intent. Both gaps are CLI/Python
fixes that hadn't been propagated to TS. They are now propagated,
with regression tests proving the exact attack patterns the auditor
reported.

v1.0.9

06 May 23:43

[1.0.9] — 2026-05-06

Closes the two SDK-side gaps the v1.0.8 retest flagged. v1.0.7
hardened the CLI's snapshot path; the Python SDK was never wired
to the same hardening, so adapters that called
processfork.snapshot_filesystem(..., env=dict(os.environ)) (every
adapter in adapters/) re-opened the same secret-leak the CLI
audit had closed, and the SDK's effects ledger was raw JSONL with
no HMAC chain even though the CLI's was.

Security: SDK env capture is no longer unsafe-by-default

  • processfork.snapshot_filesystem() now applies the same default
    scrub regex the CLI uses ((?i)(?:^|_)(token|secret|password|passwd|pwd|api_?key|apikey|auth|bearer)(?:_|$)); env keys
    matching it are stored as "<redacted>". Operators who genuinely
    need the raw env (rare; CI debugging at most) opt out via
    default_scrub_env=False.
  • New scrub_env: Sequence[str] | None = None parameter for extra
    custom regex patterns, mirroring the CLI's --scrub-env flag.
  • All 5 first-party adapters (Claude Code, LangGraph, OpenInterpreter,
    AutoGen, CrewAI) inherit the safe default automatically — none of
    them ever passed default_scrub_env=False to start with.
  • Regression tests: test_default_scrub_redacts_secret_shaped_env
    asserts that OPENAI_API_KEY, GITHUB_TOKEN, DATABASE_PASSWORD,
    MY_API_KEY are redacted AND that the secret bytes do not appear
    anywhere in the serialized blob; test_default_scrub_can_be_disabled
    asserts the opt-out path still works for operators who need it.
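
The redaction rule above can be sketched in a few lines of Python. This is a minimal illustration that reuses the regex quoted in these notes; the helper name scrub_env and its keyword arguments are hypothetical stand-ins chosen to echo default_scrub_env and scrub_env, not the SDK's actual implementation.

```python
import re

# Default scrub pattern as quoted in the release notes above.
DEFAULT_SCRUB = re.compile(
    r"(?i)(?:^|_)(token|secret|password|passwd|pwd|api_?key|apikey|auth|bearer)(?:_|$)"
)


def scrub_env(env, extra_patterns=(), default_scrub=True):
    """Return a copy of env with secret-shaped keys stored as '<redacted>'.

    Hypothetical helper for illustration only.
    """
    patterns = [DEFAULT_SCRUB] if default_scrub else []
    patterns += [re.compile(p) for p in extra_patterns]
    return {
        key: "<redacted>" if any(p.search(key) for p in patterns) else value
        for key, value in env.items()
    }


env = {
    "OPENAI_API_KEY": "sk-example",
    "GITHUB_TOKEN": "ghp-example",
    "DATABASE_PASSWORD": "hunter2",
    "MY_API_KEY": "abc123",
    "AUTHOR": "alice",  # "AUTH" is not followed by "_" or end of name, so it is kept
    "PATH": "/usr/bin",
}
scrubbed = scrub_env(env)
```

Because the pattern anchors each keyword between underscores or the ends of the name, a key such as AUTHOR passes through untouched while MY_API_KEY is redacted.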

ACRFence: SDK ledger is HMAC-chained for real

  • processfork.snapshot_filesystem(..., effects=[...]) now routes
    every entry through pf_effects::ledger::Ledger::append, computing
    per-entry session_hmac = HMAC(secret, prev_hash || this_hash),
    the same code path the CLI's --effects-from-jsonl was switched
    to in v1.0.7. Prior versions stuffed the entries into a raw JSONL
    blob with no HMAC at all, so tamper / reorder / delete on the
    on-disk blob was undetectable.
  • A per-snapshot session secret is generated by default and embedded
    in the blob header (tamper-detection mode); operators who want full
    ACRFence supply PF_SESSION_SECRET=<hex> and the secret stays out
    of the blob.
  • pf verify already recognizes the embedded-secret format from
    v1.0.7 — SDK-produced blobs and CLI-produced blobs verify through
    the same code path now.
  • Regression test: test_effects_ledger_is_hmac_chained asserts
    the v1 header marker, the embedded session-secret-hex, and that
    every entry has a non-empty session_hmac ≥32 chars (catching
    the prior raw-JSONL session_hmac="" regression).
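
A minimal Python sketch of the chaining formula session_hmac = HMAC(secret, prev_hash || this_hash) follows. Several details are assumptions made for illustration and are not taken from pf_effects: SHA-256 as the hash, canonical JSON as the entry encoding, an all-zero value standing in for the first entry's prev_hash, and the record field name this_hash.

```python
import hashlib
import hmac
import json
import secrets

GENESIS = b"\x00" * 32  # assumed placeholder for "no previous entry"


def entry_hash(entry: dict) -> bytes:
    """Hash an entry's canonical JSON (assumed encoding)."""
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(payload).digest()


def append(ledger: list, secret: bytes, entry: dict) -> None:
    """Append an entry whose HMAC covers the previous hash and its own hash."""
    prev_hash = bytes.fromhex(ledger[-1]["this_hash"]) if ledger else GENESIS
    this_hash = entry_hash(entry)
    tag = hmac.new(secret, prev_hash + this_hash, hashlib.sha256).hexdigest()
    ledger.append({"entry": entry, "this_hash": this_hash.hex(), "session_hmac": tag})


def verify(ledger: list, secret: bytes) -> bool:
    """Recompute every hash and HMAC from the entries; fail on any mismatch."""
    prev_hash = GENESIS
    for record in ledger:
        this_hash = entry_hash(record["entry"])
        expected = hmac.new(secret, prev_hash + this_hash, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, record["session_hmac"]):
            return False
        prev_hash = this_hash
    return True


secret = secrets.token_bytes(32)
ledger = []
append(ledger, secret, {"tool_id": "shell", "args": "ls"})
append(ledger, secret, {"tool_id": "http", "args": "GET /"})
```

In this sketch, editing an entry, swapping two entries, or deleting an entry ahead of another one changes the recomputed prev_hash or this_hash, so verify returns False. Dropping entries from the tail would need an additional check, such as a recorded entry count, which the sketch does not include.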

Versions

  • processfork (Rust + Python wheel): 1.0.8 → 1.0.9
  • All 8 internal pf-* crate version pins: → 1.0.9
  • npm @processfork/sdk: 1.0.8 → 1.0.9

Why this matters

The v1.0.8 audit retest passed 10 of 12 real-world cases but flagged
two genuine production blockers: (1) the SDK still leaked secret-shaped
env vars by default, and (2) SDK effects were raw JSONL not
HMAC-chained. Both are CLI-side fixes that hadn't been propagated
into pf-py. They are now propagated, with regression tests proving
both paths and confirmation that all 5 adapters inherit the safe
defaults.

v1.0.8

06 May 22:42

[1.0.8] — 2026-05-06

Closes the 5th and final finding from the v1.0.6 audit — every
round-5 production-blocker is now resolved end-to-end.

Security: cargo-audit advisory ignores cleared

  • pyo3 0.22 → 0.24 (RUSTSEC-2025-0020, PyString::from_object
    buffer-overflow). The IntoPy::into_py API is deprecated in 0.24;
    pf-py's json↔PyObject converter and the merge report
    constructor were migrated to IntoPyObject::into_pyobject(...)?.into_any().unbind(). Builds clean under cargo clippy --workspace --all-targets -- -D warnings.
  • rustls-webpki 0.101.7 → 0.103.13 (RUSTSEC-2026-0098,
    -0099, -0104). Root cause was the rustls feature on
    aws-config / aws-sdk-s3, which routes through
    aws-smithy-runtime/tls-rustls →
    aws-smithy-http-client/legacy-rustls-ring and pins rustls 0.21.
    Switched to the default-https-client feature, which routes
    through aws-smithy-http-client/rustls-aws-lc (rustls 0.23 +
    aws-lc-rs). cargo tree -i rustls-webpki now lists only 0.103.13
    — no more legacy rustls in the dep tree.
  • deny.toml ignore list dropped from 5 IDs → 1 (only the unrelated
    RUSTSEC-2025-0119 for number_prefix unmaintained-warning
    remains, transitive via indicatif's progress bars). cargo deny check reports advisories ok, bans ok, licenses ok, sources ok.

Versions

  • processfork (Rust + Python wheel): 1.0.7 → 1.0.8
  • All 8 internal pf-* crate version pins: → 1.0.8
  • npm @processfork/sdk was already at 1.0.8 from the prior cycle.

Why this matters

v1.0.7 shipped with a footnote: "round-5 finding #4 tracked for
v1.0.8." That note is gone. cargo deny check advisories now passes
without any RUSTSEC ignores in the AWS / pyo3 chains; the only
remaining ignore is an unmaintained-crate warning on a transitive
progress-bar dependency (number_prefix, via indicatif).

v1.0.7

06 May 22:25

[1.0.7] — 2026-05-06

Closes 4 of 5 production-blocker findings from the v1.0.6 audit
(round 5). The audit's 5th finding (cargo-audit advisories on
pyo3 0.22 + rustls-webpki 0.101) is documented and tracked for
v1.0.8; see "Out of v1.0.7" below.

Security: env capture is no longer unsafe-by-default

  • pf snapshot runs a built-in regex ((?i)token|secret|password|passwd|pwd|api_?key|apikey|auth|bearer) that redacts secret-shaped
    env-var names UNLESS the operator passes --no-default-scrub.
    v1.0.6 captured every env var by default — operators with
    OPENAI_API_KEY / GITHUB_TOKEN / etc. in scope leaked them
    into the .pfimg unless they remembered --scrub-env. 1
    regression test (OPENAI_API_KEY + DATABASE_PASSWORD redacted,
    non-secret var preserved).
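
As a quick illustration of which env-var names the built-in pattern flags, the Python sketch below applies the regex exactly as quoted above. It assumes substring (search) semantics, which the notes do not state explicitly; the helper name secret_shaped is hypothetical.

```python
import re

# Built-in v1.0.7 pattern as quoted above; note that it has no anchors.
V107_SCRUB = re.compile(
    r"(?i)token|secret|password|passwd|pwd|api_?key|apikey|auth|bearer"
)


def secret_shaped(names):
    """Return the names the pattern would flag (hypothetical helper)."""
    return [name for name in names if V107_SCRUB.search(name)]


names = ["OPENAI_API_KEY", "DATABASE_PASSWORD", "GITHUB_TOKEN", "HOME", "AUTHOR"]
flagged = secret_shaped(names)
```

Under that assumption the pattern errs on the side of redaction: any name containing one of the keywords is flagged, including AUTHOR, which merely contains "auth".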

ACRFence: ledger writes are HMAC-chained for real

  • The CLI's --effects-from-jsonl write path (and the snapshot
    internal path) now route every entry through
    pf_effects::ledger::Ledger::append, which computes a per-entry
    session_hmac = HMAC(secret, prev_hash || this_hash). v1.0.6
    wrote raw JSONL with session_hmac = "", so tamper / reorder /
    delete on the on-disk blob was undetectable.
  • A per-snapshot session secret is generated by default and
    embedded in the blob header (tamper-detection mode). Operators
    who want full ACRFence supply PF_SESSION_SECRET=<hex> env var,
    in which case the secret is NOT echoed back into the blob.
  • pf verify now walks every manifest's effects ledger, runs
    Ledger::deserialize + verify(), and fails if the HMAC chain
    is bad. 1 regression test (snapshot 2 entries → tamper one
    entry's tool_id on disk → pf verify fails).

vLLM / SGLang plugins now actually persist

  • _snapshot writes every K/V page byte buffer + the per-snapshot
    manifest into a real ProcessFork store via the new SDK
    processfork.put_blob(). v1.0.6's hash was computed but never
    stored — the returned CID resolved to nothing on disk.
  • _checkout now reads the manifest from the store and replays
    every page via pager.write_page(). v1.0.6 just returned
    {"ok": true} without any work.
  • New SDK surface: processfork.put_blob(store, bytes) -> str.
  • Persistence works in both mock and live modes — the _live()
    gate that used to short-circuit was a usability filter, not a
    correctness one, and made the persistence path untestable
    without a real GPU. 4 new regression tests (vLLM + SGLang ×
    mock-mode + persistence-round-trip + unknown-CID-errors).
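
The persistence round trip described above can be sketched with an in-memory content-addressed store. Everything here is a stand-in for illustration: the BlobStore class, the SHA-256 CID scheme, and the JSON manifest layout are assumptions, and the snapshot / checkout functions only mirror the shape of the plugin behavior, not its code.

```python
import hashlib
import json


class BlobStore:
    """In-memory stand-in for a content-addressed store (hypothetical)."""

    def __init__(self):
        self._blobs = {}

    def put_blob(self, data: bytes) -> str:
        # The CID is derived from the content, so identical bytes share one key.
        cid = hashlib.sha256(data).hexdigest()
        self._blobs[cid] = data
        return cid

    def read_blob(self, cid: str) -> bytes:
        try:
            return self._blobs[cid]
        except KeyError:
            raise KeyError(f"unknown CID: {cid}") from None


def snapshot(store: BlobStore, pages: list) -> str:
    """Persist every page buffer, then a manifest listing the page CIDs in order."""
    page_cids = [store.put_blob(page) for page in pages]
    manifest = json.dumps({"pages": page_cids}).encode()
    return store.put_blob(manifest)


def checkout(store: BlobStore, manifest_cid: str, write_page) -> None:
    """Read the manifest back and replay every page through write_page."""
    manifest = json.loads(store.read_blob(manifest_cid))
    for index, cid in enumerate(manifest["pages"]):
        write_page(index, store.read_blob(cid))


store = BlobStore()
cid = snapshot(store, [b"page-0", b"page-1"])
restored = {}
checkout(store, cid, lambda index, data: restored.__setitem__(index, data))
```

The point of the sketch is the invariant the regression tests check: a CID returned by a snapshot must resolve to stored bytes, and a checkout of an unknown CID must raise rather than report success.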

Versions aligned across surfaces

  • processfork (Rust + Python wheel): 1.0.6 → 1.0.7
  • processfork-vllm: 1.0.2 → 1.0.3 (real persistence)
  • processfork-sglang: 1.0.2 → 1.0.3 (real persistence)
  • @processfork/sdk (npm): 1.0.7 → 1.0.8
  • 8 Rust crates on crates.io: all → 1.0.7

Test count

196 → 199 cargo tests workspace-wide (+1 ledger HMAC tamper, +1
default-scrub, +1 quiesce-failure regression already in v1.0.6).
Plus 4 new vLLM/SGLang persistence regressions in adapters.

Out of v1.0.7 → tracked for v1.0.8

  • cargo audit ignores remain: pyo3 0.22.6 (RUSTSEC-2025-0020,
    buffer-overflow in PyString::from_object we don't call) and
    three rustls-webpki 0.101.7 advisories (transitive via
    aws-sdk-s3 → aws-smithy-http-client → rustls 0.21) are
    still in deny.toml's ignore list. Clearing them needs
    pyo3 → 0.24 (Bound API rewrite, ~30 min mechanical) and
    aws-sdk-s3 ≥1.135 (when it bumps its rustls floor, likely Q3
    2026). Each ignore has a documented scope-of-impact comment;
    none are exploitable in our use cases. v1.0.8 ships the bumps.

v1.0.6

06 May 20:35

[1.0.6] — 2026-05-06

Closes 2 follow-up findings from the v1.0.5 audit (round 4).

Correctness fixes

  • OpenInterpreter result_hash collision (real bug). v1.0.5
    truncated the result string to 8 KiB BEFORE computing the hash,
    so two large outputs that diverged past byte 8192 collided.
    Fixed: run() now serializes the FULL output once, hashes those
    bytes (storing the hash in the ledger entry), and truncates only
    the displayed result field. The truncation suffix advertises
    the dropped byte count. Snapshot path prefers the pre-computed
    result_hash. 1 regression test that constructs two outputs
    sharing the first 9 KiB but diverging in the tail.

  • --resume-cmd not running on quiesce-cmd failure. v1.0.5's
    QuiesceGuard only stashed resume_cmd after a successful
    quiesce_cmd run, so a partial-failure quiesce (mutates app
    state, then fails) left the agent stuck in a half-quiesced state.
    Fixed: construct the guard FIRST (owns resume_cmd), THEN run
    quiesce_cmd — Rust's stack-unwind drop fires resume on the
    error-return path. Updated error message tells the operator
    resume will still run. 1 regression test verifies that a quiesce
    command that touches a file and then exits with code 7 still
    runs resume.
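
The hash-before-truncate fix for the result_hash collision can be sketched as follows. This is an illustrative Python version under stated assumptions: SHA-256 as the hash, an 8 KiB display limit, and a hypothetical record_result helper with its own suffix wording; it is not the adapter's actual code.

```python
import hashlib

MAX_DISPLAY_BYTES = 8 * 1024  # display limit taken from the 8 KiB figure above


def record_result(output: str) -> dict:
    """Hash the full output first, then truncate only the displayed field."""
    data = output.encode("utf-8")
    result_hash = hashlib.sha256(data).hexdigest()  # computed over ALL bytes
    if len(data) > MAX_DISPLAY_BYTES:
        dropped = len(data) - MAX_DISPLAY_BYTES
        # errors="ignore" discards a multi-byte character split by the cut.
        shown = data[:MAX_DISPLAY_BYTES].decode("utf-8", errors="ignore")
        result = f"{shown}... [truncated {dropped} bytes]"
    else:
        result = output
    return {"result": result, "result_hash": result_hash}


# Two outputs sharing the first 9 KiB but diverging in the tail.
prefix = "a" * (9 * 1024)
first = record_result(prefix + "X")
second = record_result(prefix + "Y")
```

Here the two displayed result fields are identical after truncation, yet the result_hash values differ because the hash was taken before any bytes were dropped.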

Versions

  • processfork (Rust + Python wheel): 1.0.5 → 1.0.6
  • processfork-openinterpreter: 1.0.2 → 1.0.3 (hash-before-truncate)
  • @processfork/sdk (npm): 1.0.6 → 1.0.7
  • 8 Rust crates on crates.io: all → 1.0.6

Test count

196 → 197 cargo tests workspace-wide (+1 quiesce-failure regression).
Plus +1 OI prefix-collision regression in adapters.