Releases: manav8498/processfork
v1.0.15
Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog,
and the project adheres to Semantic Versioning.
[1.0.15] — 2026-05-09
Closes the one production caveat from the v1.0.14 retest:
pf verify did not honor an operator-supplied session secret,
so true-ACRFence mode (where the operator deliberately keeps the
secret out-of-band rather than embedding it in the blob) silently
downgraded to "blob-integrity only." Blob hashes still verified;
the HMAC chain did not. v1.0.15 plumbs the secret through.
pf verify --session-secret-hex <HEX> (also honors PF_SESSION_SECRET)
- New CLI flag accepts the operator's secret; the clap env = attribute
  means PF_SESSION_SECRET=<hex> pf verify works unchanged from how
  operators already pass it to pf snapshot.
- Secret precedence (highest → lowest): operator secret > embedded
  header.session_secret_hex > none. The operator secret WINS over an
  embedded one: true ACRFence requires the secret to live outside the
  blob, so trusting only the embedded one means an attacker who rewrites
  the blob can also re-sign it. Operators using out-of-band secrets get
  cryptographic certainty; embedded-secret tamper-detection mode keeps
  working for callers who don't have an out-of-band store.
- New --fail-on-unverifiable-ledgers opt-in turns "skipped" into a hard
  failure when no secret is available, which is useful in CI to catch
  ledgers that were written before the v1.0.7 chain wiring.
- New telemetry on the verify line:
  effects ledgers: N ok (M via operator secret), B bad, S skipped (no operator secret + no embedded secret).
  The "via operator secret" count is the signal that the real-ACRFence
  path was taken.
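The precedence rule above is small enough to sketch in a few lines. This is a minimal Python illustration under stated assumptions, not ProcessFork's Rust implementation: the function name resolve_session_secret and the "operator" / "embedded" / "none" labels are hypothetical, chosen only to mirror the wording of the verify line.

```python
from typing import Optional, Tuple


def resolve_session_secret(
    operator_hex: Optional[str],
    embedded_hex: Optional[str],
) -> Tuple[Optional[bytes], str]:
    """Pick the HMAC key for chain verification and report its source.

    operator_hex: value from --session-secret-hex or PF_SESSION_SECRET.
    embedded_hex: value of header.session_secret_hex in the blob, if any.
    """
    if operator_hex:
        # Out-of-band secret always wins, even if the blob embeds one.
        return bytes.fromhex(operator_hex), "operator"
    if embedded_hex:
        # Tamper-detection mode: the secret travels with the blob.
        return bytes.fromhex(embedded_hex), "embedded"
    # No key at all: the ledger can only be reported as skipped.
    return None, "none"
```

The point of returning the source alongside the key is that a caller can count how many ledgers were verified "via operator secret" separately from those verified with an embedded one.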
Behavior change matrix
| Mode | v1.0.14 | v1.0.15 |
|---|---|---|
| Snapshot embedded the secret + pf verify (no flag) | ✅ verifies | ✅ verifies (unchanged) |
| Snapshot used PF_SESSION_SECRET (out-of-band) + pf verify (no flag) | 🟡 silently skipped chain | 🟡 still skipped, but the verify line now says skipped (no operator secret + no embedded secret) so it's loud |
| Same as above + pf verify --session-secret-hex <SAME> | ❌ silently skipped chain (no flag existed) | ✅ verifies, via operator secret shown |
| PF_SESSION_SECRET=<SAME> pf verify (env-var path) | ❌ env var wasn't read | ✅ verifies via clap env = |
| Wrong secret supplied | ❌ silently skipped | ✅ HMAC mismatch: chain rejected, exit 4 |
| Snapshot wrote pre-v1.0.7 ledger (no chain) + pf verify --fail-on-unverifiable-ledgers | (flag didn't exist) | ✅ exit 4 on the unverifiable ledger |
Tests
- New integration test
verify_accepts_operator_supplied_session_secret_for_true_acrfence
covers all six rows of the matrix above end-to-end via
assert_cmd. The bug reproducer is row 3 (out-of-band
secret + verify with same secret); v1.0.14 silently said
"skipped", v1.0.15 says "1 ok (1 via operator secret)".
Note: the OpenAI-key warning
The auditor's report included "the OpenAI API key you pasted is
exposed; rotate it before production use." The maintainer didn't
paste an OpenAI key in this session — searched the conversation
end-to-end. Rotate any key you're worried about regardless;
ProcessFork's secret-shaped env scrub (OPENAI_API_KEY, *_TOKEN,
*_SECRET, etc.) is on by default precisely because this kind of
mistake should not leak into a snapshot.
Versions
- processfork (Rust + Python wheel): 1.0.14 → 1.0.15
- All 8 internal pf-* crate version pins: → 1.0.15
- npm @processfork/sdk: 1.0.14 → 1.0.15
- processfork-criu: 1.0.14 → 1.0.15
Verification
- cargo fmt --check, cargo clippy --workspace --all-targets -- -D warnings, cargo deny check: clean.
- cargo test --workspace: 217 passed (was 216; +1).
- All earlier audit-round fixes still stand.
v1.0.14
[1.0.14] — 2026-05-08
Closes the three "left as-is" limitations from v1.0.13. None of
them was a bug; all three were "we genuinely cannot do this from
the maintainer's host" or "the safe default is too strict."
v1.0.14 makes each one materially better without weakening any
prior security or honesty posture.
Limitation 1: examples/06 + examples/07 were exit-2 stubs
The local PF_HAS_GPU=1 vLLM/SGLang examples printed "use the
Modal lane" and exited 2. The Modal lane is still the bit-exact
validation path, but the examples themselves are now genuinely
runnable on every host.
- Mock-mode round-trip on every CI host.
  bash examples/06-vllm-bit-exact/run.sh (and examples/07-...) now
  drive the adapter's build_endpoints() API end-to-end with synthetic
  K/V pages, asserting byte-identical round-trip across
  snapshot → checkout. No GPU, no vLLM/SGLang import required; only the
  processfork-vllm / processfork-sglang adapter package itself (which
  ships pure-Python).
- Three modes, decided at runtime:
  - No adapter installed → clean skip with install instructions.
  - Adapter installed, no GPU / no vLLM → mock-mode round-trip (the new
    default useful path).
  - PF_HAS_GPU=1 + adapter + vLLM/SGLang importable → same flow, plus a
    footer pointer to the Modal lane for bit-exact validation.
- Confirmed locally on macOS arm64: both examples round-trip
  3 synthetic K/V pages byte-identically end-to-end.
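The three-way runtime decision described above can be written as a pure function. The examples themselves are bash scripts; the Python below is only a sketch of the decision table, and the names Mode, select_mode, and detect_inputs are hypothetical. The module names passed to detect_inputs (for instance processfork_vllm and vllm) are likewise assumptions about how the packages import.

```python
import importlib.util
import os
from enum import Enum
from typing import Tuple


class Mode(Enum):
    SKIP = "skip"   # no adapter: print install instructions and exit cleanly
    MOCK = "mock"   # adapter only: synthetic K/V page round-trip
    LIVE = "live"   # adapter + GPU + engine: same flow plus Modal pointer


def select_mode(adapter_installed: bool, has_gpu: bool, engine_importable: bool) -> Mode:
    """Map the three runtime facts onto the three documented modes."""
    if not adapter_installed:
        return Mode.SKIP
    if has_gpu and engine_importable:
        return Mode.LIVE
    return Mode.MOCK


def detect_inputs(adapter_module: str, engine_module: str) -> Tuple[bool, bool, bool]:
    """Probe the host without importing anything heavy (module names are assumed)."""
    adapter = importlib.util.find_spec(adapter_module) is not None
    engine = importlib.util.find_spec(engine_module) is not None
    has_gpu = os.environ.get("PF_HAS_GPU") == "1"
    return adapter, has_gpu, engine
```

A caller would do something like select_mode(*detect_inputs("processfork_vllm", "vllm")) and branch on the result.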
Limitation 2: CRIU Linux+CRIU only
processfork-criu and pf snapshot --criu-pid remain Linux-
only by definition (CRIU is a kernel-assisted snapshot system;
macOS/Windows have no equivalent). v1.0.14 ships a portable
respawn path alongside CRIU, so non-Linux operators get
something better than procs.unsupported.v1.
- New pf snapshot --respawn-pid <PID> captures a procs.respawn.v1 blob:
  argv, cwd, env (Linux only; macOS needs root to read other-process
  environ; documented), exe path, parent PID, and the paths backing open
  file descriptors (/proc/<pid>/fd/* on Linux; lsof -p <pid> on macOS).
  Captured cross-platform: macOS arm64, Linux, Windows (best-effort;
  Windows currently emits the kind blob with empty argv/cwd until a
  Win32 implementation lands).
- Respawn ≠ CRIU. Documented explicitly: respawn captures enough
  configuration to RE-INVOKE the process from scratch (think deployment
  metadata + state files); it does NOT capture register state, heap,
  in-flight syscalls, anonymous memory, or signal masks. Operators who
  need that fidelity stay on --criu-pid (Linux only).
- --criu-pid and --respawn-pid are mutually exclusive: pick the right
  tool for the job. CLI errors out clearly if both are passed.
- Regression test (snapshot_respawn_pid_emits_respawn_v1_blob) snapshots
  the test process's own PID on macOS and asserts the v1 marker, the
  argv non-emptiness, and the captured_on == host_os tag.
Limitation 3: absolute symlinks captured but rejected on restore
The v1.0.3 "Zip Slip" CVE fix (PF-SA-2026-001) refused absolute
symlinks at restore time as a hard error. The auditor flagged
this as awkward — captured trees often contain legitimate
absolute symlinks (e.g. /var/log/agent). The CVE protection is
about not WRITING through the symlink; whether to CREATE the
symlink is a separate decision.
- Default behavior changed from hard-error to skip-with-warn.
  pf checkout now skips absolute symlinks with an
  eprintln!("warning: skipped absolute symlink ...") and continues
  restoring the rest of the tree. This matches what tar / rsync do and
  is a strict safety improvement (operator sees what was skipped; the
  rest of the restore still succeeds).
- New pf checkout --allow-absolute-symlinks opt-in flag restores them
  verbatim. Operator explicitly acknowledges that anything later reading
  through the symlink may escape the sandbox.
- The CVE protection is unchanged: relative symlinks that escape the
  staging root are still HARD-REFUSED (the depth-counter check in
  check_symlink_target); absolute paths (vs. targets) in the FS tree
  itself are still HARD-REFUSED via safe_join. The only thing that
  changed is what happens when the operator points a symlink AT an
  absolute target.
- New library API:
  pf_world::RestoreOptions { allow_absolute_symlinks: bool },
  restore_tree_with_options(...). Existing callers of restore_tree(...)
  get the new safe default automatically.
- Two regression tests pin the behavior:
  absolute_symlink_skipped_by_default_with_rest_restored (skipped by
  default, regular files still land);
  allow_absolute_symlinks_restores_them_verbatim (opt-in works, link
  target round-trips byte-identically).
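To make the three outcomes concrete (create, skip with a warning, hard refuse), here is a rough Python sketch of the policy. It is an approximation of the behavior described above, not a port of check_symlink_target or safe_join: the names Decision and classify_symlink are hypothetical, and the depth counter assumes the link's own path is already a clean relative path inside the staging root.

```python
import posixpath
from enum import Enum


class Decision(Enum):
    CREATE = "create"
    SKIP_WITH_WARNING = "skip"
    REFUSE = "refuse"


def classify_symlink(link_rel_path: str, target: str, allow_absolute: bool = False) -> Decision:
    """Decide what restore should do with one captured symlink."""
    if posixpath.isabs(target):
        # Absolute target: skipped by default, restored verbatim on opt-in.
        return Decision.CREATE if allow_absolute else Decision.SKIP_WITH_WARNING

    # Depth counter: start at the directory that holds the link, then walk
    # the target. Dropping below zero means the target leaves the root.
    parent = posixpath.dirname(link_rel_path)
    depth = len([part for part in parent.split("/") if part not in ("", ".")])
    for part in target.split("/"):
        if part in ("", "."):
            continue
        if part == "..":
            depth -= 1
            if depth < 0:
                return Decision.REFUSE
        else:
            depth += 1
    return Decision.CREATE
```

Note that the opt-in only affects the absolute-target branch; a relative target that climbs out of the root is refused regardless of allow_absolute.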
Versions
- processfork (Rust + Python wheel): 1.0.13 → 1.0.14
- All 8 internal pf-* crate version pins: → 1.0.14
- npm @processfork/sdk: 1.0.13 → 1.0.14
- processfork-criu: 1.0.13 → 1.0.14
Verification
- cargo fmt --check, cargo clippy --workspace --all-targets -- -D warnings, cargo deny check: clean.
- cargo test --workspace: 216 passed (was 211; +5).
- pytest across pf-py + pf-claude-code + pf-criu + pf-vllm + pf-sglang:
  42 passed, 4 skipped (CRIU Linux + GPU paths).
- node --test crates/pf-ts/test/smoke.mjs: 8/8.
- bash examples/06-vllm-bit-exact/run.sh and
  bash examples/07-sglang-prefix-share/run.sh: both round-trip
  3 synthetic K/V pages byte-identically on macOS without GPU.
What's still not in scope
- vLLM V1 engine bit-exact KV restore. vLLM-side fix; V0 +
  enforce_eager=True workaround documented in v1.0.12.
- Generic CLI model+cache layer auto-discovery. "Walk a directory and
  produce a valid LoRA diff" is the source of most "I restored my agent
  and it half-worked" reports; the loud warning + adapter-populated path
  stays the answer.
- Live PF_HAS_GPU=1 self-contained vLLM/SGLang test. The examples now do
  real adapter round-trip on every host; the bit-exact KV validation
  against actual vLLM still runs on Modal
  (scripts/gpu-validate-modal.py).
v1.0.13
[1.0.13] — 2026-05-08
Closes the two confirmed bugs and the one Python SDK lineage gap
the v1.0.12 retest flagged. Independent matrix had been 10 PASS /
2 ISSUE / 3 LIMITATION; v1.0.13 turns the two ISSUEs into PASS and
makes the Python SDK lineage limitation a non-issue. The other
limitations (live GPU host, CRIU Linux-only, absolute-symlink
restore safety) are scope/environment notes, not bugs.
Issue 1: false merge conflicts from generated test artifacts
Reproduced with pytest's __pycache__/ and .pytest_cache/
landing in the captured tree on otherwise-disjoint branches. The
v1.0.12 CLI had no ignore mechanism beyond hardcoded defaults
(target/, node_modules/, .git/objects/, .pfcid).
- New default-extra ignore set in WalkFsCapture::new:
  __pycache__, .pytest_cache, .mypy_cache, .ruff_cache, .tox,
  .coverage, .venv, .DS_Store, *.pyc, *.pyo.
  Conservative: every entry is a cache by definition, never a
  "maybe I want this" path.
- New --ignore <PAT> CLI flag (repeatable). Plain entries
  (__pycache__, node_modules) match path components like before; glob
  entries (anything containing *, ?, or [, e.g. *.pyc, *.log,
  **/build/**) match the relative path via the new globset dep.
- New --ignore-from <PATH> CLI flag: reads gitignore-style rules from a
  file. Default: try <fs_root>/.pfignore then <fs_root>/.gitignore;
  pass --ignore-from /dev/null to opt out. Comments (#) and blank lines
  skipped; trailing / stripped; gitignore negation (!keep.pyc) is
  logged-and-skipped (full negation semantics arrive when an operator
  hits the use case).
- New --no-default-ignores to opt out of the default-extra set (rare;
  CI auditing the cache shape, registry mirroring). CVE-relevant
  defaults (.git/objects, target, node_modules, .pfcid) are kept
  regardless.
- New WalkFsCapture::new_without_default_ignores(root) and
  .ignore_from(path) API on the underlying library.
- Regression coverage: 4 new unit tests in pf-world covering
  default-extra-ignores, glob *.pyc matching, opt-out, and .pfignore
  file parsing; 2 new CLI integration tests in cli_smoke.rs covering
  the snapshot end-to-end.
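The two matching modes (plain entries against path components, glob entries against the relative path) and the ignore-file rules can be sketched together. This Python version is an illustration only: parse_ignore_lines and is_ignored are hypothetical names, and fnmatchcase stands in for the globset crate, whose handling of patterns such as **/build/** differs in detail.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List

GLOB_CHARS = set("*?[")


def parse_ignore_lines(lines: Iterable[str]) -> List[str]:
    """Apply the .pfignore rules: skip comments/blanks, strip trailing '/'."""
    patterns: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("!"):
            # Negation is not supported; the real tool logs and skips it.
            continue
        pattern = line.rstrip("/")
        if pattern:
            patterns.append(pattern)
    return patterns


def is_ignored(rel_path: str, patterns: Iterable[str]) -> bool:
    """Return True if a '/'-separated relative path matches any pattern."""
    components = rel_path.split("/")
    for pattern in patterns:
        if GLOB_CHARS & set(pattern):
            # Glob entry: match against the whole relative path.
            if fnmatchcase(rel_path, pattern):
                return True
        elif pattern in components:
            # Plain entry: match any single path component.
            return True
    return False
```

Splitting the decision on "does the pattern contain a glob character" keeps the old component-matching behavior intact for existing plain entries.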
Issue 2: pf gc --retain-recent N left dangling log entries
Reproduced: pf log listed CIDs after GC, but pf checkout on
those CIDs failed because the layer blobs were gone. Root cause:
pf gc deleted unreachable blobs from blobs/sha256/<shard>/<hex>
but never the per-manifest marker files at
store_root/images/<cid>.json, which is what pf log walks via
store.iter_manifests(). The result was a referential-integrity
hole: index says "this CID exists", CAS says "I have no idea".
- Fix: GC now tracks the set of evicted manifest CIDs and, after the
  blob sweep, deletes their images/<cid>.json markers. The output line
  counts both:
  deleted N unreachable blobs (B bytes) and M stale image markers.
- Regression test (gc_retain_recent_prunes_image_markers):
  snapshot 3 manifests → pf gc --retain-recent 1 → assert pf log no
  longer lists the 2 evicted CIDs → assert pf checkout on an evicted
  CID fails AND pf checkout on the kept CID succeeds.
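The second phase of the fix, removing the per-manifest markers once the blob sweep is done, looks roughly like this in Python. It is a sketch under the directory layout quoted above (store_root/images/<cid>.json); the function name prune_image_markers is hypothetical and the real GC is implemented in Rust.

```python
from pathlib import Path
from typing import Iterable


def prune_image_markers(store_root: Path, evicted_cids: Iterable[str]) -> int:
    """Delete images/<cid>.json for every evicted manifest; return the count."""
    removed = 0
    for cid in evicted_cids:
        marker = store_root / "images" / f"{cid}.json"
        if marker.exists():
            marker.unlink()
            removed += 1
    return removed
```

Returning the count is what lets the output line report stale image markers separately from unreachable blobs.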
Limitation 3: Python SDK didn't expose parent lineage
processfork.snapshot_filesystem hardcoded parents: vec![] so
SDK-only forks couldn't be 3-way-merged (no LCA was discoverable).
Operators had to route through the CLI's --parent flag.
- New parents: Sequence[str] | None = None kwarg on
  snapshot_filesystem. Bad CIDs surface as ValueError, not silent
  malformed manifests.
- Regression coverage:
  test_snapshot_parents_field_lands_in_manifest pins manifest.parents
  round-trip; test_merge_two_forks_clean upgraded from "asserts 'no
  common ancestor' RuntimeError" to "asserts the merge succeeds clean
  with cid_x as ancestor"; test_snapshot_rejects_bad_parent_cid covers
  the error path.
Versions
- processfork (Rust + Python wheel): 1.0.12 → 1.0.13
- All 8 internal pf-* crate version pins: → 1.0.13
- npm @processfork/sdk: 1.0.12 → 1.0.13
- processfork-criu: 1.0.12 → 1.0.13 (matches the CLI's --criu-pid
  baseline)
Verification
- cargo fmt --check, cargo clippy --workspace --all-targets -- -D warnings, cargo deny check: clean.
- cargo test --workspace: 211 passed (was 204; +7 from the new
  ignore + GC + Python-lineage integration coverage).
- pytest crates/pf-py/python/tests/ adapters/pf-claude-code/tests/ adapters/pf-criu/tests/:
  27 passed, 2 skipped (CRIU Linux-only paths still gate-skip on macOS
  as documented).
- node --test crates/pf-ts/test/smoke.mjs: 8/8.
Still not in scope (auditor's "limitations", left as-is)
- Live PF_HAS_GPU=1 vLLM/SGLang test: Modal lane is the validation;
  documented in v1.0.11.
- CRIU is Linux-only by definition; macOS CI exercises Layer 1 + the
  non-Linux skip paths.
- Absolute symlinks captured but rejected on restore for sandbox-escape
  safety. This is the v1.0.3 "Zip Slip" hardening (PF-SA-2026-001);
  changing it would re-open the CVE. Operators who need
  absolute-symlink restore should resolve the link target post-checkout
  in their own code.
v1.0.12
[1.0.12] — 2026-05-07
Closes the four "not-yet-production-ready" items v1.0.11 made
explicit. Two of them are runtime features (conflict-merge UI,
CRIU subprocess capture); two are honesty/UX (loud warnings on
empty engine layers, V1-engine workaround documented). One — V1
bit-exact KV restore — remains a vLLM-side change beyond
ProcessFork's reach and is now documented with the V0 +
enforce_eager=True workaround.
pf merge-resolve / pf merge-finalize — interactive conflict resolution
- New top-level commands. Replace the v1.1-deferred placeholder with a
  real round-trip:
  - pf merge A B → if conflicts, exits 3 with the resolve+finalize hint
    pointing at the merged-CID.
  - pf merge-resolve <merged-cid> --workdir <dir> extracts the merged FS
    into <dir> (which must NOT pre-exist), scans for Git-style markers,
    and prints the conflict file list.
  - Operator hand-edits.
  - pf merge-finalize <merged-cid> --workdir <dir> re-walks the resolved
    tree, builds a single-parent image whose parent is <merged-cid>,
    returns the finalized CID.
- pf merge-finalize refuses if any file in <dir> still contains conflict
  markers (exit code 3); pass --force to finalize as-is (for tree
  fixtures with legitimate <<<<<<< content).
- Scan covers all three Git marker variants (<<<<<<<, =======,
  >>>>>>>); skips symlinks and binary files (NUL byte heuristic).
- Round-trip regression test (merge_resolve_finalize_round_trip)
  exercises: snapshot-X-A-B-with-parent → conflicting merge →
  resolve workdir → finalize-without-resolution-fails → hand-resolve →
  finalize succeeds → finalized image's manifest.parents == [merged-cid]
  → checkout shows resolved content with no markers. --force path
  tested separately.
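The marker scan that gates pf merge-finalize can be approximated in a few lines of Python. This is a sketch, not the shipped Rust code: has_conflict_markers and files_with_markers are hypothetical names, and treating a marker as "a line that starts with one of the three sequences" is an assumption about how the real scan anchors its match.

```python
import os
from typing import List

MARKERS = (b"<<<<<<<", b"=======", b">>>>>>>")


def has_conflict_markers(path: str) -> bool:
    """True if a regular text file still contains a Git-style marker line."""
    if os.path.islink(path):
        return False  # symlinks are skipped, never followed
    with open(path, "rb") as handle:
        data = handle.read()
    if b"\x00" in data:
        return False  # NUL byte heuristic: treat as binary and skip
    return any(line.startswith(MARKERS) for line in data.splitlines())


def files_with_markers(workdir: str) -> List[str]:
    """List workdir-relative paths that would block a finalize."""
    hits: List[str] = []
    for dirpath, _dirnames, filenames in os.walk(workdir):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if has_conflict_markers(full):
                hits.append(os.path.relpath(full, workdir))
    return sorted(hits)
```

A finalize step built on this would refuse when files_with_markers returns a non-empty list, unless the operator has asked to force the result.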
processfork-criu adapter + pf snapshot --criu-pid <PID>
- New Python package processfork-criu (Linux-only at runtime) promotes
  the world layer's procs blob from procs.unsupported.v1 to
  procs.criu.v1. The bundle is a header-line JSON dict + raw tarball of
  CRIU's images-dir output, ready for pf verify to round-trip.
- New CLI flag pf snapshot --criu-pid <PID> shells out to
  python3 -m processfork_criu (via inline script) to perform the dump.
  On macOS / Windows / non-criu Linux hosts the command exits with a
  clear "CRIU unavailable: …" message and the snapshot fails fast, with
  no silent half-state.
- Python API:
  processfork_criu.dump_pid(pid, leave_running=True, tcp_established=False)
  returns a CriuBundle whose serialize() is the on-disk format;
  restore_bundle(bundle, target_dir=...) returns the new PID after
  criu restore.
- Test layering reflects the honesty caveat (same as the Modal vLLM
  lane: code is committed, validation lives where the kernel lives):
  - Layer 1: runs on every host (8 tests): version match, v1 marker
    constants, header+tarball envelope round-trips, deserialize rejects
    wrong-kind / missing-newline, is_available() False on macOS,
    dump_pid / restore_bundle raise clean RuntimeError on macOS.
  - Layer 2: Linux only, no criu needed (1 test, skips on macOS):
    is_available() reflects whether criu is on $PATH.
  - Layer 3: Linux + criu binary (1 test, skips otherwise): end-to-end:
    spawn a heartbeat-writing Python child, criu dump it, SIGKILL the
    original PID, criu restore, assert the restored PID writes new
    heartbeats. This is the operator-runs-it validation; the
    maintainer's macOS CI has not run it. README has the caveat.
- Rust-side test (snapshot_criu_pid_fails_cleanly_on_non_linux)
  confirms pf snapshot --criu-pid 1 on macOS exits non-zero with stderr
  mentioning CRIU/python3 (no panic, no silent empty procs blob).
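The "header-line JSON dict + raw tarball" envelope, together with the two rejection cases the Layer 1 tests name (wrong kind, missing newline), can be illustrated as follows. This is a guess at the shape, not the processfork_criu source: the module-level serialize / deserialize functions, the "kind" header key, and the example "pid" field are all assumptions made for the sketch.

```python
import json
from typing import Tuple

KIND = "procs.criu.v1"


def serialize(header: dict, tarball: bytes) -> bytes:
    """One JSON header line, a newline, then the raw tarball bytes."""
    head = dict(header, kind=KIND)
    # json.dumps escapes newlines inside strings, so the header stays on one line.
    return json.dumps(head, sort_keys=True).encode("utf-8") + b"\n" + tarball


def deserialize(blob: bytes) -> Tuple[dict, bytes]:
    """Split at the first newline and validate the header's kind marker."""
    head_line, newline, tarball = blob.partition(b"\n")
    if not newline:
        raise ValueError("missing newline after header line")
    header = json.loads(head_line)
    if not isinstance(header, dict) or header.get("kind") != KIND:
        raise ValueError(f"wrong kind: expected {KIND}")
    return header, tarball
```

Splitting only at the first newline matters because the tarball payload is arbitrary binary and will usually contain newline bytes of its own.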
Loud warning when generic CLI snapshot writes empty engine layers
- pf snapshot now emits a multi-line stderr warning explaining that the
  model + cache layers are empty and that engine state requires the
  vLLM/SGLang adapter to populate. World (FS+env), trace, and effects
  ARE captured.
- New --allow-empty-engine-layers flag suppresses the warning for
  CI/automation that has internalized the boundary.
- The empty model + cache envelopes now carry a
  "note": "generic-cli-empty: populated by adapters, not by walking a directory"
  field so downstream tooling can detect them.
V1-engine bit-exact workaround documented
- adapters/pf-vllm/README.md gets a new "Bit-exact replay: V0 vs V1
  engine" section with the V0 + enforce_eager=True workaround for
  callers who need byte-identical regenerated output. Calls out the
  throughput cost (1.3–1.8× slower without CUDA graphs), V0's
  feature-frozen status upstream, and when V1 + output-equivalent is
  acceptable (snapshot before destructive change vs. RL rollout
  reproducibility).
- README's bit-exact metric row links to this section.
Versions
- processfork (Rust + Python wheel): 1.0.11 → 1.0.12
- All 8 internal pf-* crate version pins: → 1.0.12
- npm @processfork/sdk: 1.0.11 → 1.0.12
- New processfork-criu Python package: 1.0.12
Verification
- cargo fmt --check, cargo clippy --workspace --all-targets -- -D warnings, cargo deny check: clean.
- cargo test --workspace: 204 passed (was 199; +5 from the
  merge-resolve / merge-finalize / criu-pid integration tests).
- pytest crates/pf-py/python/tests/ adapters/pf-claude-code/tests/ adapters/pf-criu/tests/:
  25 passed, 2 skipped (CRIU Linux-only paths).
- node --test crates/pf-ts/test/smoke.mjs: 8/8.
- pf snapshot --criu-pid 1 on macOS: exits non-zero with
  "CRIU is Linux-only"; no panic.
What's still not in scope
- vLLM V1 engine bit-exact KV restore. Documented workaround
  (V0 + enforce_eager); upstream V1 deterministic batch scheduling is
  the actual fix and lives in vllm/.
- Generic CLI model/cache layer auto-discovery. The "walk a directory
  and produce a valid LoRA diff" approach is the source of most
  "I restored my agent and it half-worked" reports; we keep the
  empty-envelope-with-loud-warning path instead.
v1.0.11
[1.0.11] — 2026-05-07
Documentation honesty pass. The v1.0.10 retest confirmed 12/12 of the
real-world matrix (FS snapshots, env redaction, HMAC ledger tamper
detection, 12 forks at 1.004× storage, clean+conflict merges, file://
registry, GC, symlink hardening, quiesce/resume, large binaries) but
flagged that the README's "vLLM/SGLang ✅ ships now / bit-exact KV"
framing and the example/test stubs labelled "v1.0.1 deferred
deliverable" did not match what actually shipped.
This release does not change runtime behavior. All earlier audit
fixes still stand. What changes:
README: adapter status table now distinguishes mock vs. live
- Claude Code / LangGraph / OpenInterpreter / AutoGen / CrewAI keep ✅:
  they snapshot/restore the FS + env + trace + effects layers and the
  auditor's matrix exercised them end-to-end.
- vLLM / SGLang downgraded from ✅ to 🟡 mock ships v1.0 · live = Modal
  lane. The mock K/V page round-trip ships and is regression-tested;
  the bit-exact validation runs on Modal A10G via
  scripts/gpu-validate-modal.py, not from your local box.
README: 5-layer table now marks adapter-populated layers
- World annotated: FS + env ship; the procs blob writes a
  procs.unsupported.v1 placeholder unless a CRIU/zombie-restart adapter
  is added (a v1.1 deliverable). Restored sessions do not bring back
  live PIDs; they bring back the FS+env+trace+effects state that lets a
  fresh worker continue.
- Model and Cache annotated 🟡: format + math ship and run on the Modal
  lane, but the generic CLI snapshot path emits empty envelopes because
  these layers are populated by adapters (vLLM/SGLang/etc.), not by
  walking a directory.
README: bit-exact KV claim split V0 vs V1
- v1.0.10 had one row claiming "Bit-exact KV-cache replay ✅ verified".
  The Modal JSONs say something more specific:
  - 2026-05-06-modal-a10g.json (V0 engine, TinyLlama-1.1B):
    bit_exact: true, 38 619 KV pages, byte-identical regen text.
  - 2026-05-06-modal-a10g-vllm-v1.json (V1 engine, collective_rpc):
    bit_exact: false; first-80-chars of regen output match across
    snapshot/restore (output-equivalent, not bit-exact).
- README now has both rows, each linking to the source-of-truth JSON.
  Treat live V1 KV restore as "lossy semantic restore" today.
README: new "What does and doesn't ship in v1.0.x" subsection
- Production-credible today (auditor's 12/12 matrix): pf snapshot/
  checkout for FS sandboxes; default secret-shaped env redaction
  (CLI + Python SDK + TS SDK); HMAC-chained effects ledger end-to-end
  with pf verify tamper detection; fork & merge incl. conflict marker
  materialization; file:// + OCI + S3 + HF registry transport;
  5 first-party adapters; vLLM/SGLang mock-mode K/V page persistence.
- Not yet production-ready, made explicit: in-flight subprocess capture
  (CRIU adapter is v1.1); local PF_HAS_GPU=1 self-contained vLLM/SGLang
  test (it was always Modal-lane validation, the examples/06+07 +
  cache_bit_exact_vllm.rs were skeletons mislabelled "v1.0.1
  deferred"); V1-engine bit-exact KV restore (output-equivalent only);
  conflict-merge resolution UI (markers ship, interactive
  pf merge --resolve is v1.1); generic CLI model+cache layer capture
  (adapter-populated only).
Skeleton/stub messages updated
- examples/06-vllm-bit-exact/run.sh and
  examples/07-sglang-prefix-share/run.sh: removed the misleading
  "v1.0.1 deferred deliverable" pointer; both now point at
  modal run scripts/gpu-validate-modal.py and the JSONs in
  benchmarks/gpu-validation/, which is the actual validation path.
- crates/pf-cache/tests/cache_bit_exact_vllm.rs: previously
  panic!("PF_HAS_GPU=1 set but pf-vllm adapter not yet wired").
  Now skips cleanly under any value of PF_HAS_GPU and points at the
  Modal lane + tests/cache_round_trip.rs (the on-host proxy that DOES
  exercise the cache code path everywhere).
Versions
- processfork (Rust + Python wheel): 1.0.10 → 1.0.11
- All 8 internal pf-* crate version pins: → 1.0.11
- npm @processfork/sdk: 1.0.10 → 1.0.11
Why this matters
The runtime behavior in v1.0.10 was correct and the auditor's matrix
agreed. The README and a handful of stub messages were overselling.
Documentation that can't be matched against cargo test,
benchmarks/gpu-validation/*.json, or the example runners is the
same kind of trust hole as a code bug — operators who read the README
were going to spend a day chasing a "ships now" GPU validation that
the Modal lane already ran for them. v1.0.11 makes the boundary
match the reality.
v1.0.10
[1.0.10] — 2026-05-07
Closes the two TypeScript-SDK gaps the v1.0.9 retest flagged. v1.0.7
hardened the CLI's snapshot path; v1.0.9 propagated the fix to the
Python SDK; v1.0.10 propagates it to the TypeScript SDK. The CLI,
Python SDK, and TypeScript SDK now all go through the same scrub
regex and the same HMAC-chained pf_effects::Ledger code path —
parity across all three surfaces.
Security: TS SDK env capture is no longer unsafe-by-default
- snapshotFilesystem(store, kind, root, env, messages, opts?) now
  applies the same default scrub regex the CLI and Python SDK use
  ((?i)(?:^|_)(token|secret|password|passwd|pwd|api_?key|apikey|auth|bearer)(?:_|$));
  env keys matching it are stored as "<redacted>". JS callers that did
  snapshotFilesystem(..., { OPENAI_API_KEY: "...", PWD: root }) were
  storing the raw API key bytes in world.env; the auditor reproduced
  the leak with two separate stores.
- New opts.defaultScrubEnv: boolean = true knob; pass false to opt out
  (rare; CI debugging at most).
- New opts.scrubEnv: string[] for additional regex patterns, mirroring
  the CLI's --scrub-env flag.
- Smoke-test fix: the prior test/smoke.mjs passed new Map([...]) for
  the env arg; napi-rs serializes JS Map instances to {} (only plain
  objects deserialize to Rust BTreeMap), so the test silently received
  empty env and never exercised the leak path. Switched to plain
  objects (the typed signature's documented shape) and added
  3 regression tests:
  - default env scrub redacts secret-shaped names: proves
    OPENAI_API_KEY / GITHUB_TOKEN / DATABASE_PASSWORD / MY_API_KEY
    redacted, AND that the secret bytes don't appear anywhere in the
    serialized env blob.
  - defaultScrubEnv = false opts out: proves the opt-out path.
  - effects ledger is HMAC-chained: see below.
ACRFence: TS SDK ledger is HMAC-chained for real
- Prior versions of the TS SDK ALWAYS wrote
  {"kind":"effects.ledger.v1","entries":0}\n to effects.ledger
  regardless of caller intent; TS integrations had no ACRFence
  protection at all, even when they had a real tool-call list.
- New opts.effects: EffectEntry[] parameter; entries are routed through
  pf_effects::ledger::Ledger::append (per-entry
  session_hmac = HMAC(secret, prev_hash || this_hash)) and the blob
  comes out byte-compatible with the CLI/Python output: same header
  marker, same session_secret_hex embedding, same
  verification_mode = "tamper-detection".
- pf verify validates SDK-produced ledgers through the same code path
  it already used for CLI ledgers (no pf verify change needed).
- EffectEntry shape (camelCase TS): toolId, argsHash, resultHash,
  idempotencyKey, sideEffectClass ("pure" | "idempotent" |
  "irreversible" | "network-only"), timestamp (RFC-3339; defaults to
  now). All fields except toolId optional.
New SDK surface: readBlob
- readBlob(store, digest): Buffer fetches raw blob bytes by digest.
  Mirrors the Python SDK's processfork.read_blob. Adapters that need to
  inspect individual layer blobs (e.g. the smoke tests verifying the
  redaction wrote correctly to world.env, or a future TS LangGraph
  checkpointer reading the trace blob) call this.
Versions
- processfork (Rust + Python wheel): 1.0.9 → 1.0.10
- All 8 internal pf-* crate version pins: → 1.0.10
- npm @processfork/sdk: 1.0.9 → 1.0.10
Why this matters
The v1.0.9 retest passed 13 of 13 real-world cases on the CLI +
Python paths but explicitly flagged the TS SDK as a blocker: a JS
caller using the typed signature exactly as documented was leaking
raw API keys to disk, and the TS effects ledger gave no ACRFence
protection regardless of caller intent. Both gaps are CLI/Python
fixes that hadn't been propagated to TS. They are now propagated,
with regression tests proving the exact attack patterns the auditor
reported.
v1.0.9
[1.0.9] — 2026-05-06
Closes the two SDK-side gaps the v1.0.8 retest flagged. v1.0.7
hardened the CLI's snapshot path; the Python SDK was never wired
to the same hardening, so adapters that called
processfork.snapshot_filesystem(..., env=dict(os.environ)) (every
adapter in adapters/) re-opened the same secret-leak the CLI
audit had closed, and the SDK's effects ledger was raw JSONL with
no HMAC chain even though the CLI's was.
Security: SDK env capture is no longer unsafe-by-default
- processfork.snapshot_filesystem() now applies the same default scrub
  regex the CLI uses
  ((?i)(?:^|_)(token|secret|password|passwd|pwd|api_?key|apikey|auth|bearer)(?:_|$));
  env keys matching it are stored as "<redacted>". Operators who
  genuinely need the raw env (rare; CI debugging at most) opt out via
  default_scrub_env=False.
- New scrub_env: Sequence[str] | None = None parameter for extra custom
  regex patterns, mirroring the CLI's --scrub-env flag.
- All 5 first-party adapters (Claude Code, LangGraph, OpenInterpreter,
  AutoGen, CrewAI) inherit the safe default automatically; none of them
  ever passed default_scrub_env=False to start with.
- Regression tests: test_default_scrub_redacts_secret_shaped_env
  asserts that OPENAI_API_KEY, GITHUB_TOKEN, DATABASE_PASSWORD,
  MY_API_KEY are redacted AND that the secret bytes do not appear
  anywhere in the serialized blob; test_default_scrub_can_be_disabled
  asserts the opt-out path still works for operators who need it.
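To see what the default scrub regex does and does not catch, it helps to run it against a few key names. The sketch below reuses the regex quoted above with Python's re module; the function scrub_env and its parameter names are hypothetical stand-ins for illustration and are not the SDK's actual signature.

```python
import re
from typing import Dict, Iterable

# Regex copied from the release note; matched against env KEYS, not values.
DEFAULT_SCRUB = re.compile(
    r"(?i)(?:^|_)(token|secret|password|passwd|pwd|api_?key|apikey|auth|bearer)(?:_|$)"
)


def scrub_env(
    env: Dict[str, str],
    extra_patterns: Iterable[str] = (),
    default_scrub: bool = True,
) -> Dict[str, str]:
    """Return a copy of env with secret-shaped keys replaced by "<redacted>"."""
    patterns = [DEFAULT_SCRUB] if default_scrub else []
    patterns += [re.compile(pattern) for pattern in extra_patterns]
    return {
        key: "<redacted>" if any(p.search(key) for p in patterns) else value
        for key, value in env.items()
    }
```

Because each keyword must be delimited by underscores or the ends of the name, a key such as AUTHOR is left alone while MY_API_KEY is redacted. One side effect worth knowing: a key named exactly PWD also matches the pwd alternative.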
ACRFence: SDK ledger is HMAC-chained for real
- processfork.snapshot_filesystem(..., effects=[...]) now routes every
  entry through pf_effects::ledger::Ledger::append, computing per-entry
  session_hmac = HMAC(secret, prev_hash || this_hash), the same code
  path the CLI's --effects-from-jsonl was switched to in v1.0.7. Prior
  versions stuffed the entries into a raw JSONL blob with no HMAC at
  all, so tamper / reorder / delete on the on-disk blob was
  undetectable.
- A per-snapshot session secret is generated by default and embedded in
  the blob header (tamper-detection mode); operators who want full
  ACRFence supply PF_SESSION_SECRET=<hex> and the secret stays out of
  the blob.
- pf verify already recognizes the embedded-secret format from v1.0.7;
  SDK-produced blobs and CLI-produced blobs verify through the same
  code path now.
- Regression test: test_effects_ledger_is_hmac_chained asserts the v1
  header marker, the embedded session-secret-hex, and that every entry
  has a non-empty session_hmac ≥32 chars (catching the prior raw-JSONL
  session_hmac="" regression).
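The chaining formula session_hmac = HMAC(secret, prev_hash || this_hash) is easier to reason about with a toy model. The Python below is a sketch of the idea, not the pf_effects ledger format: SHA-256 as the hash, sorted-key JSON as the entry encoding, an all-zero genesis prev_hash, and the names append_entry / verify_chain are all assumptions made for this example.

```python
import hashlib
import hmac
import json
from typing import List

GENESIS = b"\x00" * 32  # assumed prev_hash for the first entry


def entry_hash(entry: dict) -> bytes:
    """Hash a canonical (sorted-key) JSON encoding of one entry."""
    return hashlib.sha256(json.dumps(entry, sort_keys=True).encode("utf-8")).digest()


def append_entry(chain: List[dict], secret: bytes, entry: dict) -> None:
    """Append an entry whose MAC binds it to the previous entry's hash."""
    prev_hash = chain[-1]["this_hash"] if chain else GENESIS
    this_hash = entry_hash(entry)
    mac = hmac.new(secret, prev_hash + this_hash, hashlib.sha256).hexdigest()
    chain.append({"entry": entry, "this_hash": this_hash, "session_hmac": mac})


def verify_chain(chain: List[dict], secret: bytes) -> bool:
    """Recompute every hash and MAC from the entries; stored hashes are not trusted."""
    prev_hash = GENESIS
    for link in chain:
        this_hash = entry_hash(link["entry"])
        expected = hmac.new(secret, prev_hash + this_hash, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, link["session_hmac"]):
            return False
        prev_hash = this_hash
    return True
```

Editing, reordering, or deleting an interior entry changes some prev_hash || this_hash input and so fails verification, and a wrong secret fails on the first entry. This toy version does not detect truncation of the tail; catching that would need an entry count or final MAC stored somewhere the attacker cannot rewrite.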
Versions
- processfork (Rust + Python wheel): 1.0.8 → 1.0.9
- All 8 internal pf-* crate version pins: → 1.0.9
- npm @processfork/sdk: 1.0.8 → 1.0.9
Why this matters
The v1.0.8 audit retest passed 10 of 12 real-world cases but flagged
two genuine production blockers: (1) the SDK still leaked secret-shaped
env vars by default, and (2) SDK effects were raw JSONL not
HMAC-chained. Both are CLI-side fixes that hadn't been propagated
into pf-py. They are now propagated, with regression tests proving
both paths and confirmation that all 5 adapters inherit the safe
defaults.
v1.0.8
[1.0.8] — 2026-05-06
Closes the 5th and final finding from the v1.0.6 audit — every
round-5 production-blocker is now resolved end-to-end.
Security: cargo-audit advisory ignores cleared
- pyo3 0.22 → 0.24 (RUSTSEC-2025-0020, PyString::from_object
  buffer-overflow). The IntoPy::into_py API is deprecated in 0.24;
  pf-py's json↔PyObject converter and the merge report
  constructor were migrated to
  IntoPyObject::into_pyobject(...)?.into_any().unbind(). Builds clean
  under cargo clippy --workspace --all-targets -- -D warnings.
- rustls-webpki 0.101.7 → 0.103.13 (RUSTSEC-2026-0098,
  -0099, -0104). Root cause was the rustls feature on
  aws-config / aws-sdk-s3, which routes through
  aws-smithy-runtime/tls-rustls →
  aws-smithy-http-client/legacy-rustls-ring and pins rustls 0.21.
  Switched to the default-https-client feature, which routes
  through aws-smithy-http-client/rustls-aws-lc (rustls 0.23 +
  aws-lc-rs). cargo tree -i rustls-webpki now lists only 0.103.13;
  no legacy rustls remains in the dep tree.
- deny.toml ignore list dropped from 5 IDs → 1 (only the unrelated
  RUSTSEC-2025-0119 for number_prefix unmaintained-warning
  remains, transitive via indicatif's progress bars). cargo deny check
  reports advisories ok, bans ok, licenses ok, sources ok.
Versions
- processfork (Rust + Python wheel): 1.0.7 → 1.0.8
- All 8 internal pf-* crate version pins: → 1.0.8
- npm @processfork/sdk was already at 1.0.8 from the prior cycle.
Why this matters
v1.0.7 shipped with a footnote: "round-5 finding #4 tracked for
v1.0.8." That note is gone. cargo deny check advisories now passes
without any RUSTSEC ignores in the AWS / pyo3 chains; the only
remaining ignore is a stylistic warning on a transitive progress-bar
dependency.
v1.0.7
[1.0.7] — 2026-05-06
Closes 4 of 5 production-blocker findings from the v1.0.6 audit
(round 5). The audit's 5th finding (cargo-audit advisories on pyo3 0.22
and rustls-webpki 0.101) is documented and tracked for v1.0.8; see
"Out of v1.0.7" below.
Security: env capture is no longer unsafe-by-default
- pf snapshot runs a built-in regex
  ((?i)token|secret|password|passwd|pwd|api_?key|apikey|auth|bearer)
  that redacts secret-shaped env-var names UNLESS the operator passes
  --no-default-scrub. v1.0.6 captured every env var by default, so
  operators with OPENAI_API_KEY / GITHUB_TOKEN / etc. in scope leaked
  them into the .pfimg unless they remembered --scrub-env. 1
  regression test (OPENAI_API_KEY + DATABASE_PASSWORD redacted,
  non-secret var preserved).
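
As a rough illustration of the default scrub, the sketch below applies the built-in pattern to env-var names in Python. Only the regex is taken from the entry above; the function name `scrub_env_names`, its keyword arguments, and the `<redacted>` placeholder are hypothetical and are not pf's actual implementation.

```python
import re

# Pattern copied from the changelog entry; everything else is illustrative.
DEFAULT_SCRUB = re.compile(
    r"(?i)token|secret|password|passwd|pwd|api_?key|apikey|auth|bearer"
)


def scrub_env_names(env, extra_patterns=(), default_scrub=True,
                    placeholder="<redacted>"):
    """Return a copy of env with the values of secret-shaped names replaced."""
    patterns = [re.compile(p) for p in extra_patterns]
    if default_scrub:
        patterns.append(DEFAULT_SCRUB)
    return {
        # A name is redacted if any pattern matches anywhere inside it.
        name: placeholder if any(p.search(name) for p in patterns) else value
        for name, value in env.items()
    }


env = {
    "OPENAI_API_KEY": "sk-example",
    "DATABASE_PASSWORD": "hunter2",
    "HOME": "/root",
}
scrubbed = scrub_env_names(env)
```

Because this sketch uses an unanchored search on the variable name, a name such as AUTHOR would also be redacted (it contains `auth`). Erring toward over-redaction is consistent with a safe-by-default posture, with the opt-out left to the operator.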
ACRFence: ledger writes are HMAC-chained for real
- The CLI's --effects-from-jsonl write path (and the snapshot
  internal path) now routes every entry through
  pf_effects::ledger::Ledger::append, which computes a per-entry
  session_hmac = HMAC(secret, prev_hash || this_hash). v1.0.6
  wrote raw JSONL with session_hmac = "", so tamper / reorder /
  delete on the on-disk blob was undetectable.
- A per-snapshot session secret is generated by default and
  embedded in the blob header (tamper-detection mode). Operators
  who want full ACRFence supply the PF_SESSION_SECRET=<hex> env var,
  in which case the secret is NOT echoed back into the blob.
- pf verify now walks every manifest's effects ledger, runs
  Ledger::deserialize + verify(), and fails if the HMAC chain
  is bad. 1 regression test (snapshot 2 entries → tamper one
  entry's tool_id on disk → pf verify fails).
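
The chaining rule above, session_hmac = HMAC(secret, prev_hash || this_hash), can be sketched in Python as follows. This is a minimal model, not the pf_effects implementation: SHA-256 as the hash, canonical JSON as the entry encoding, an all-zero genesis prev_hash, and the record field layout are all assumptions made for illustration.

```python
import hashlib
import hmac
import json

GENESIS = "00" * 32  # assumed prev_hash for the first entry


def _entry_hash(entry):
    # Assumption: entries are hashed as canonical (sorted-key) JSON with SHA-256.
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(payload).hexdigest()


def _tag(secret, prev_hash, this_hash):
    # session_hmac = HMAC(secret, prev_hash || this_hash)
    msg = bytes.fromhex(prev_hash) + bytes.fromhex(this_hash)
    return hmac.new(secret, msg, hashlib.sha256).hexdigest()


def append(ledger, secret, entry):
    """Append an entry, linking it to the previous record's hash."""
    prev_hash = ledger[-1]["this_hash"] if ledger else GENESIS
    this_hash = _entry_hash(entry)
    ledger.append({
        "entry": entry,
        "prev_hash": prev_hash,
        "this_hash": this_hash,
        "session_hmac": _tag(secret, prev_hash, this_hash),
    })


def verify(ledger, secret):
    """Recompute every hash and tag; any edit or reorder breaks the chain."""
    prev_hash = GENESIS
    for record in ledger:
        this_hash = _entry_hash(record["entry"])
        if record["prev_hash"] != prev_hash or record["this_hash"] != this_hash:
            return False
        expected = _tag(secret, prev_hash, this_hash)
        if not hmac.compare_digest(expected, record["session_hmac"]):
            return False
        prev_hash = this_hash
    return True


secret = bytes.fromhex("ab" * 32)
ledger = []
append(ledger, secret, {"tool_id": "shell", "args": "ls"})
append(ledger, secret, {"tool_id": "http", "args": "GET /"})
```

In this model, editing an entry's tool_id changes its recomputed hash, and an attacker who also rewrites the stored hashes still cannot produce a matching session_hmac without the secret.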
vLLM / SGLang plugins now actually persist
- _snapshot writes every K/V page byte buffer + the per-snapshot
  manifest into a real ProcessFork store via the new SDK
  processfork.put_blob(). v1.0.6's hash was computed but never
  stored, so the returned CID resolved to nothing on disk.
- _checkout now reads the manifest from the store and replays
  every page via pager.write_page(). v1.0.6 just returned
  {"ok": true} without doing any work.
- New SDK surface: processfork.put_blob(store, bytes) -> str.
- Persistence works in both mock and live modes. The _live()
  gate that used to short-circuit was a usability filter, not a
  correctness one, and made the persistence path untestable
  without a real GPU. 4 new regression tests (vLLM + SGLang ×
  mock-mode + persistence-round-trip + unknown-CID-errors).
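
The snapshot / checkout round trip described above might look roughly like the following. This is a self-contained model with an in-memory dict standing in for the store: the local `put_blob` only mirrors the shape of processfork.put_blob(store, bytes) -> str, while `get_blob`, `MockPager`, the SHA-256 CID scheme, and the JSON manifest layout are assumptions for illustration.

```python
import hashlib
import json


def put_blob(store, data):
    """Stand-in with the same shape as processfork.put_blob(store, bytes) -> str."""
    cid = hashlib.sha256(data).hexdigest()  # assumed CID scheme
    store[cid] = data
    return cid


def get_blob(store, cid):
    # Hypothetical read helper: unknown CIDs are an error, not a silent no-op.
    if cid not in store:
        raise KeyError(f"unknown CID: {cid}")
    return store[cid]


class MockPager:
    """Records pages written during checkout; no GPU required."""

    def __init__(self):
        self.pages = {}

    def write_page(self, page_id, data):
        self.pages[page_id] = data


def snapshot(store, pages):
    # Persist every page buffer, then persist the manifest that indexes them.
    manifest = {str(pid): put_blob(store, buf) for pid, buf in pages.items()}
    return put_blob(store, json.dumps(manifest, sort_keys=True).encode())


def checkout(store, manifest_cid, pager):
    # Read the manifest back and replay each page through the pager.
    manifest = json.loads(get_blob(store, manifest_cid))
    for page_id, cid in manifest.items():
        pager.write_page(int(page_id), get_blob(store, cid))
    return {"ok": True, "pages": len(manifest)}


store = {}
pages = {0: b"\x01" * 16, 1: b"\x02" * 16}
manifest_cid = snapshot(store, pages)
pager = MockPager()
result = checkout(store, manifest_cid, pager)
```

Running the same path against a mock pager is what makes the round trip testable without live hardware, which is the point of removing the short-circuit gate.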
Versions aligned across surfaces
- processfork (Rust + Python wheel): 1.0.6 → 1.0.7
- processfork-vllm: 1.0.2 → 1.0.3 (real persistence)
- processfork-sglang: 1.0.2 → 1.0.3 (real persistence)
- @processfork/sdk (npm): 1.0.7 → 1.0.8
- 8 Rust crates on crates.io: all → 1.0.7
Test count
196 → 199 cargo tests workspace-wide (+1 ledger HMAC tamper, +1
default-scrub, +1 quiesce-failure regression already in v1.0.6).
Plus 4 new vLLM/SGLang persistence regressions in adapters.
Out of v1.0.7 → tracked for v1.0.8
cargo audit ignores remain: pyo3 0.22.6 (RUSTSEC-2025-0020, a
buffer overflow in PyString::from_object, which we don't call) and
three rustls-webpki 0.101.7 advisories (transitive via
aws-sdk-s3 → aws-smithy-http-client → rustls 0.21) are
still in deny.toml's ignore list. Clearing them needs
pyo3 → 0.24 (Bound API rewrite, ~30 min mechanical) and
aws-sdk-s3 ≥1.135 (when it bumps its rustls floor, likely Q3
2026). Each ignore has a documented scope-of-impact comment;
none are exploitable in our use cases. v1.0.8 ships the bumps.
v1.0.6
[1.0.6] — 2026-05-06
Closes 2 follow-up findings from the v1.0.5 audit (round 4).
Correctness fixes
- OpenInterpreter result_hash collision (real bug). v1.0.5
  truncated the result string to 8 KiB BEFORE computing the hash,
  so two large outputs that diverged past byte 8192 collided.
  Fixed: run() now serializes the FULL output once, hashes those
  bytes (storing the hash in the ledger entry), and truncates only
  the displayed result field. The truncation suffix advertises
  the dropped byte count. Snapshot path prefers the pre-computed
  result_hash. 1 regression test that constructs two outputs
  sharing the first 9 KiB but diverging in the tail.
- --resume-cmd not running on quiesce-cmd failure. v1.0.5's
  QuiesceGuard only stashed resume_cmd after a successful
  quiesce_cmd run, so a partial-failure quiesce (mutates app
  state, then fails) left the agent stuck in a half-quiesced state.
  Fixed: construct the guard FIRST (owns resume_cmd), THEN run
  quiesce_cmd. Rust drops the guard when it goes out of scope, so
  resume fires on the error-return path too. Updated error message
  tells the operator resume will still run. 1 regression test verifies
  that a quiesce command that touches a file and then exits with
  status 7 still triggers resume.
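
The guard-first ordering can be approximated in Python with a context manager whose cleanup runs on every exit path. This is an analogy to the Rust Drop-based QuiesceGuard, not a port of it; `quiesce_guard`, `snapshot_with_quiesce`, and the argv-list command form are hypothetical names and choices for this sketch.

```python
import contextlib
import subprocess


@contextlib.contextmanager
def quiesce_guard(resume_cmd):
    """Own resume_cmd before quiesce runs, so resume fires on every exit path."""
    try:
        yield
    finally:
        # Runs on success AND when the body raises (e.g. quiesce failed).
        subprocess.run(resume_cmd, check=False)


def snapshot_with_quiesce(quiesce_cmd, resume_cmd, take_snapshot):
    # The guard is entered FIRST; only then is the quiesce command run.
    with quiesce_guard(resume_cmd):
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(quiesce_cmd, check=True)
        return take_snapshot()
```

If the guard were created only after a successful quiesce, a quiesce command that mutated state and then failed would skip the cleanup entirely, which is the v1.0.5 behavior described above.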
Versions
- processfork (Rust + Python wheel): 1.0.5 → 1.0.6
- processfork-openinterpreter: 1.0.2 → 1.0.3 (hash-before-truncate)
- @processfork/sdk (npm): 1.0.6 → 1.0.7
- 8 Rust crates on crates.io: all → 1.0.6
Test count
196 → 197 cargo tests workspace-wide (+1 quiesce-failure regression).
Plus +1 OI prefix-collision regression in adapters.
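
The OI prefix-collision case can be reproduced in miniature as follows. The sketch hashes the full serialized output before truncating the displayed copy; the 8 KiB limit and the 9 KiB shared prefix come from the entry above, while the helper name `ledger_fields`, SHA-256, JSON serialization, and the exact suffix wording are assumptions.

```python
import hashlib
import json

MAX_RESULT_BYTES = 8 * 1024  # 8 KiB display limit from the entry above


def ledger_fields(output):
    """Hash the FULL serialized output, then truncate only the displayed copy."""
    full = json.dumps(output).encode("utf-8")
    result_hash = hashlib.sha256(full).hexdigest()  # assumed hash function
    if len(full) > MAX_RESULT_BYTES:
        dropped = len(full) - MAX_RESULT_BYTES
        shown = full[:MAX_RESULT_BYTES].decode("utf-8", errors="ignore")
        # Hypothetical suffix format advertising the dropped byte count.
        result = f"{shown}... [truncated {dropped} bytes]"
    else:
        result = full.decode("utf-8")
    return {"result": result, "result_hash": result_hash}


# Two outputs sharing the first 9 KiB but diverging in the tail.
prefix = "x" * (9 * 1024)
a = ledger_fields(prefix + "tail-A")
b = ledger_fields(prefix + "tail-B")
```

Here the two displayed result strings are identical after truncation, but the hashes differ because they were computed over the untruncated bytes.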