Toolkit for managing data produced by the PhenoCycler Fusion (Akoya Biosciences). Covers the data naming standard, monthly backup to TrueNAS and AWS S3, and capacity-driven TrueNAS cleanup.
Validates that a study directory matches the naming standard before any backup runs. Eight rules:
- R1: study name matches YYYYMMDD_<user>_<project>_<customname> (4 fields) with a valid calendar date
- R2: experiment subdirectory = <study> (4-field form, same as study) or <study>_<suffix> (5-field form, suffix ∈ [A-Za-z0-9-]+; no _)
- R3: character set per field is [A-Za-z0-9-]; experiment name length ≤ 60 chars (Windows MAX_PATH guard)
- R4: study has both metadata (≥ 1 .xpd or .fpr) and data (≥ 1 experiment directory)
- R5: at study level, allow only .xpd, .fpr, and matching experiment directories (nothing else)
- R6: every .xpd has a matching sibling experiment directory of the same name (strict 1:1)
- R7: .fpr filename uses the R3 character set; presence-only (one .fpr may serve multiple experiments, no pairing required)
- R8: each experiment has at least one Scan<N>/ whose contents include both a *.qptiff file and a .temp/ folder
Full rule reference with failing / passing examples:
docs/data-standard.md. Exposed as
validate_study(path) / validate_root(path) and via the validate
CLI group.
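As an illustration of the naming rules, here is a minimal sketch of R1 through R3 only. The helper names `is_valid_study_name` and `is_valid_experiment_name` are hypothetical and are not the toolkit's API; the real `validate_study(path)` covers all eight rules and may be structured differently.

```python
import re
from datetime import datetime

# Hypothetical helpers for illustration; not the toolkit's actual functions.
FIELD = r"[A-Za-z0-9-]+"  # R3 character set, at least one character per field
STUDY_RE = re.compile(rf"(\d{{8}})_({FIELD})_({FIELD})_({FIELD})")


def is_valid_study_name(name: str) -> bool:
    """R1: four underscore-separated fields, the first a real calendar date."""
    match = STUDY_RE.fullmatch(name)
    if match is None:
        return False
    try:
        datetime.strptime(match.group(1), "%Y%m%d")  # rejects e.g. 20240230
    except ValueError:
        return False
    return True


def is_valid_experiment_name(study: str, experiment: str) -> bool:
    """R2 + R3: either <study> itself or <study>_<suffix>, at most 60 chars."""
    if len(experiment) > 60:
        return False
    if experiment == study:
        return True
    prefix = study + "_"
    if not experiment.startswith(prefix):
        return False
    # The suffix is a single field, so it cannot contain an underscore.
    return re.fullmatch(FIELD, experiment[len(prefix):]) is not None
```

The date field is checked twice on purpose: the regex guarantees eight digits, and `strptime` rejects dates that do not exist on the calendar.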
Per-month append-only JSONL ledger of backup runs. The authoritative
copy lives on TrueNAS at manifest_root/YYYY-MM.jsonl (configurable);
a mirror is uploaded to S3 after every append as a disaster-recovery
copy. Each record holds:
- backup_name, run_name, source / TrueNAS / S3 paths
- file_count, total_bytes
- started_at, truenas_verified_at, s3_verified_at, completed_at, source_deleted_at
- status: in_progress, complete, failed_validation, failed_truenas, or failed_s3
Readers fold the file into {backup_name: latest_record} to get
current state. The backup subsystem writes; the cleanup subsystem
reads (for its safety-gate check).
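The fold described above can be sketched in a few lines. This is an illustrative reader, assuming only that each line is one JSON object with a `backup_name` key; the function name `fold_manifest` is hypothetical.

```python
import json
from pathlib import Path


def fold_manifest(path: Path) -> dict[str, dict]:
    """Return {backup_name: latest_record}; later lines overwrite earlier ones."""
    latest: dict[str, dict] = {}
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines in the ledger
            record = json.loads(line)
            latest[record["backup_name"]] = record
    return latest
```

Because the ledger is append-only, "latest" simply means "last line seen for that name", so no timestamp comparison is needed in this sketch.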
Monthly orchestrator that copies eligible idle runs (last modified at
least idle_days days ago) from the Fusion host to TrueNAS and AWS S3.
Each run goes through the pipeline:
- Validate against the data standard (fail fast on any violation)
- Stream-copy to TrueNAS with per-file SHA-256
- Read-back-verify every file on TrueNAS against the hash
- Atomically write a _checksums.sha256 sidecar
- Upload to S3 with server-side ChecksumSHA256 headers
- Append a complete record to the manifest
- Delete the source on the Fusion host
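The stream-copy and read-back-verify steps can be sketched as follows. This is not the orchestrator's code: the function names and the 1 MiB chunk size are assumptions made for the example, and it covers a single file rather than a whole run.

```python
import hashlib
from pathlib import Path

CHUNK = 1024 * 1024  # 1 MiB; an arbitrary choice for this sketch


def copy_with_sha256(src: Path, dst: Path) -> str:
    """Copy src to dst in chunks, hashing the bytes as they stream through."""
    digest = hashlib.sha256()
    dst.parent.mkdir(parents=True, exist_ok=True)
    with src.open("rb") as reader, dst.open("wb") as writer:
        while chunk := reader.read(CHUNK):
            digest.update(chunk)
            writer.write(chunk)
    return digest.hexdigest()


def sha256_of(path: Path) -> str:
    """Hash a file by reading it back from disk."""
    digest = hashlib.sha256()
    with path.open("rb") as reader:
        while chunk := reader.read(CHUNK):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src: Path, dst: Path) -> str:
    """Copy, then re-read the destination and compare against the copy-time hash."""
    expected = copy_with_sha256(src, dst)
    if sha256_of(dst) != expected:
        raise OSError(f"checksum mismatch after copy: {dst}")
    return expected
```

Hashing during the copy avoids a second pass over the source, while the read-back pass checks what actually landed on the destination.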
An optional [hooks] table in config.toml lets each subcommand run
an external command before its own logic kicks off:
[hooks]
backup = ["powershell.exe", "-NoProfile", "-File", "C:/path/to/mount-truenas.ps1"]
cleanup = ["powershell.exe", "-NoProfile", "-File", "C:/path/to/mount-truenas.ps1"]

Currently backup and cleanup honor hooks. Each value is an argv
list. Hook stdio is inherited so its output appears live in
fusion-toolkit's log; non-zero exit aborts the subcommand with exit
code 1 (the actual backup/cleanup never starts). A missing key, an
empty [hooks] table, or no [hooks] table at all is a no-op — the
subcommand runs as before.
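A minimal sketch of that hook contract, assuming the [hooks] table has already been parsed into a plain dict. The function name `run_hook` is hypothetical and the toolkit's internals may differ.

```python
import subprocess


def run_hook(hooks: dict[str, list[str]], subcommand: str) -> None:
    """Run the configured argv for a subcommand; abort with exit code 1 on failure."""
    argv = hooks.get(subcommand)
    if not argv:
        return  # missing key or empty table: nothing to do
    # stdout/stderr are not redirected, so the child inherits this process's stdio.
    result = subprocess.run(argv, check=False)
    if result.returncode != 0:
        raise SystemExit(1)
```

Passing the argv list directly (no shell) means paths with spaces need no extra quoting.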
The original use case is mounting an SMB share via Windows
New-SmbMapping before each scheduled task, since drive-letter
mappings are per-logon-session and can't be inherited from a boot-time
mount task.
While a study is being processed a .fusion-toolkit.lock file is
written into the study directory and removed in a finally clause when
the per-study pipeline exits (success or any failure). If the process
crashes mid-pipeline (SIGKILL, power loss) the lock is left behind. At
the start of the next backup run, source_root is scanned for any
.fusion-toolkit.lock files; if any are present the entire run is
aborted with exit code 1 so an operator can inspect and clean up
before more backups run. The lock holds metadata only (hostname /
pid / started_at / run_name) — the orchestrator does not check
PID liveness or staleness, so manual cleanup is always required after
a crash.
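The lock lifecycle above can be sketched with a context manager plus a pre-run scan. The names `study_lock` and `find_stale_locks` are hypothetical, and the JSON payload layout is an assumption; only the file name and the four metadata fields come from the description.

```python
import json
import os
import socket
from contextlib import contextmanager
from datetime import datetime, timezone
from pathlib import Path

LOCK_NAME = ".fusion-toolkit.lock"


def find_stale_locks(source_root: Path) -> list[Path]:
    """Return every leftover lock under source_root (a non-empty result aborts the run)."""
    return sorted(source_root.rglob(LOCK_NAME))


@contextmanager
def study_lock(study_dir: Path, run_name: str):
    """Write the lock on entry and remove it on exit, whether or not the body failed."""
    lock = study_dir / LOCK_NAME
    payload = {
        "hostname": socket.gethostname(),
        "pid": os.getpid(),
        "started_at": datetime.now(timezone.utc).isoformat(),
        "run_name": run_name,
    }
    lock.write_text(json.dumps(payload), encoding="utf-8")
    try:
        yield lock
    finally:
        lock.unlink(missing_ok=True)
```

A hard kill skips the `finally` block, which is exactly the case the pre-run scan is meant to catch.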
Per-run exceptions are isolated so one failing run does not abort the
monthly batch; the failure is recorded in the manifest and the next
run is attempted. Driven by backup run [--config PATH] [--dry-run].
Also exposes scan_health_check(study, mtime_gap_days=14) as a soft
pre-backup gate (not wired into the orchestrator; the caller decides
when to apply it). Returns HealthWarning items for two independent
checks:
- empty: no experiment has a Scan<N>/ with both *.qptiff and .temp/
- mtime_gap: earliest and latest mtimes in the study span more than mtime_gap_days days (study may still be actively edited)
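The mtime_gap check could look roughly like this. It is a sketch only: the helper names are hypothetical, and it assumes that only regular files (not directories) are considered, which the description does not specify.

```python
from pathlib import Path

SECONDS_PER_DAY = 86400


def mtime_span_days(study: Path) -> float:
    """Days between the oldest and newest file mtime under the study directory."""
    mtimes = [p.stat().st_mtime for p in study.rglob("*") if p.is_file()]
    if not mtimes:
        return 0.0
    return (max(mtimes) - min(mtimes)) / SECONDS_PER_DAY


def has_mtime_gap(study: Path, mtime_gap_days: int = 14) -> bool:
    """True when the study's files span more than mtime_gap_days days."""
    return mtime_span_days(study) > mtime_gap_days
```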
Capacity-driven FIFO eviction on TrueNAS. When free space falls below
free_threshold_pct, the oldest backups are deleted until the
threshold is met. Every delete passes three safety gates:
- Lock: an exclusive .cleanup.lock prevents concurrent cleanup and overlapping backup/cleanup races
- Manifest gate: the candidate's latest manifest record must be Status.COMPLETE; in-progress or failed runs are skipped
- S3 gate: at least one object must exist under the candidate's S3 prefix (defense in depth against partial uploads that passed the manifest gate somehow)
Only TrueNAS data is deleted — the S3 copy is the permanent archive
and never touched. Driven by
cleanup run [--config PATH] [--dry-run].
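The eviction planning can be sketched as a pure function over pre-collected facts. Everything here is illustrative: the `Candidate` dataclass and `plan_evictions` are hypothetical names, "oldest" is assumed to mean earliest completed_at, and the S3 object count is assumed to have been looked up beforehand. The lock gate is omitted because it guards the whole run rather than a single candidate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    size_bytes: int
    completed_at: str  # ISO-8601 text, so string order is chronological order
    status: str  # latest manifest status for this backup
    s3_object_count: int  # objects found under the backup's S3 prefix


def plan_evictions(
    candidates: list[Candidate],
    total_bytes: int,
    free_bytes: int,
    free_threshold_pct: float,
) -> list[str]:
    """Pick the oldest eligible backups until projected free space meets the threshold."""
    target_free = total_bytes * free_threshold_pct / 100
    evict: list[str] = []
    for cand in sorted(candidates, key=lambda c: c.completed_at):
        if free_bytes >= target_free:
            break
        if cand.status != "complete":
            continue  # manifest gate
        if cand.s3_object_count < 1:
            continue  # S3 gate
        evict.append(cand.name)
        free_bytes += cand.size_bytes
    return evict
```

Keeping the plan separate from the deletion makes a `--dry-run` mode straightforward: print the plan and stop.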
Long-running tail of the Fusion application's Fusion.log. Detects
ERROR blocks, enriches each with the nearest sample + cycle
context, and emails them via the notify subsystem:
- Offset + dedup-map persisted under ~/.config/fusion-toolkit/state/ so restarts don't re-email past errors
- Dedup key = first 200 chars of the error content (timestamp stripped); same error suppressed for sent_keep_days (default 30)
- File rotation handled by detecting a shrunk file and resetting the offset to 0
- Unhandled exceptions in the main loop trigger a [Fusion Monitor] CRASH email before exit 1
Driven by monitor tail [--config PATH] [--smtp-env PATH].
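The dedup and rotation logic can be sketched as three small functions. This is an assumption-heavy illustration: the timestamp pattern is a guess at a common log format (the real Fusion.log layout is not shown here), and all function names are hypothetical.

```python
import re

# Assumed leading-timestamp format, e.g. "2024-01-15 10:00:00,123 ".
TIMESTAMP_RE = re.compile(r"^\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}(?:[.,]\d+)?\s*")
SECONDS_PER_DAY = 86400


def dedup_key(error_block: str) -> str:
    """First 200 characters of the error content with the leading timestamp removed."""
    return TIMESTAMP_RE.sub("", error_block, count=1)[:200]


def should_send(key: str, sent: dict[str, float], now: float, sent_keep_days: int = 30) -> bool:
    """Suppress a key seen within the keep window; otherwise record it and send."""
    last = sent.get(key)
    if last is not None and now - last < sent_keep_days * SECONDS_PER_DAY:
        return False
    sent[key] = now
    return True


def next_offset(saved_offset: int, file_size: int) -> int:
    """A file smaller than the saved offset means it was rotated: restart from 0."""
    return 0 if file_size < saved_offset else saved_offset
```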
Gmail SMTP helper used by backup, cleanup, and monitor for
failure / event emails. Credentials follow a boto3-style chain:
environment variables (FUSION_SMTP_USER, FUSION_SMTP_APP_PASSWORD)
first, then a dotenv-style file at
~/.config/fusion-toolkit/smtp.env.
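A sketch of that lookup order, assuming the two values are resolved as a pair (both from the environment, or both from the file). The function names and the simple KEY=VALUE parser are hypothetical and only meant to show the fallback.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Tuple


def parse_env_file(path: Path) -> dict[str, str]:
    """Parse KEY=VALUE lines, ignoring blanks, comments, and surrounding quotes."""
    values: dict[str, str] = {}
    if not path.is_file():
        return values
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_smtp_credentials(
    env_file: Path, environ: Optional[Mapping[str, str]] = None
) -> Optional[Tuple[str, str]]:
    """Environment variables win; the dotenv-style file is the fallback."""
    sources = [os.environ if environ is None else environ, parse_env_file(env_file)]
    for source in sources:
        user = source.get("FUSION_SMTP_USER")
        password = source.get("FUSION_SMTP_APP_PASSWORD")
        if user and password:
            return user, password
    return None
```

Returning `None` instead of raising lets a caller fall back to exit-code-only behavior when no credentials are configured.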
Five events:
| Event | Fires on |
|---|---|
| backup_failure | backup run exit != 0 |
| cleanup_failure | cleanup run exit != 0 |
| fusion_error | monitor tail detects an ERROR line in Fusion.log |
| monitor_crash | monitor tail itself crashes |
| toolkit_error | any fusion-toolkit WARNING/ERROR log (tool-health channel, dedup + cooldown) |
Subscription lists are per-event text files, one email per line,
under ~/.config/fusion-toolkit/recipients/:
~/.config/fusion-toolkit/recipients/
├── backup_failure.txt # ops@lab / manager@lab
├── cleanup_failure.txt
├── fusion_error.txt # oncall + instrument owners
├── monitor_crash.txt
├── toolkit_error.txt # dev / maintainer
└── default.txt # fallback for any event with an empty file
Lab members subscribe by appending their email to the relevant file.
Lines starting with # are comments; blank lines are ignored. No
TOML syntax, no array brackets — just one email per line. Hot-reload
still applies: every alert reads fresh, so changes take effect on the
next email with no restart.
Resolution order for each event:
- recipients/<event>.txt (if any uncommented emails)
- recipients/default.txt
- empty → that event sends nothing
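The resolution order can be sketched as below. The function names are hypothetical; the file format (one email per line, # comments, blank lines ignored) follows the description above.

```python
from pathlib import Path


def read_emails(path: Path) -> list[str]:
    """Return the uncommented, non-blank lines of a subscription file."""
    if not path.is_file():
        return []
    emails = []
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if line and not line.startswith("#"):
            emails.append(line)
    return emails


def resolve_recipients(recipients_dir: Path, event: str) -> list[str]:
    """Event file first, then default.txt; an empty list means nothing is sent."""
    return read_emails(recipients_dir / f"{event}.txt") or read_emails(
        recipients_dir / "default.txt"
    )
```

Reading the files on every call is what makes edits take effect without a restart.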
To inspect / verify:
fusion-toolkit notify list-subscribers # all events + sources
fusion-toolkit notify list-subscribers --event fusion_error
fusion-toolkit notify test --event fusion_error --dry-run
fusion-toolkit notify test --event fusion_error # actually sends a TEST email

The config.toml [notify] section only carries the global switch,
cooldown, and an optional recipients-dir override — no recipients
live in TOML:
[notify]
enabled = true
# Optional: override the recipients directory. Default is the
# `recipients` sibling of this config.toml.
# recipients_dir = "/some/other/path"
toolkit_error_cooldown_seconds = 300

One-shot deployment helper for the Fusion host. Two commands:
- install init [--config-dir PATH] [--force]: populates ~/.config/fusion-toolkit/ with config.toml, smtp.env, and per-event recipients/*.txt subscription list templates. Refuses to clobber an existing config.toml / smtp.env without --force; recipients/*.txt is never overwritten, even with --force, because it carries operator-curated state.
- install tasks [--user USER --password PASS] [--uninstall]: registers (or, with --uninstall, removes) three Windows Scheduled Tasks via schtasks.exe:

  | Task | Command | Schedule |
  |---|---|---|
  | FusionToolkitBackup | backup run | Monthly, day 1 at 02:00 |
  | FusionToolkitCleanup | cleanup run | Monthly, day 20 at 02:00 |
  | FusionToolkitMonitor | monitor tail | On system start |
Argv construction is pure (build_create_argv /
build_delete_argv), so the schtasks contract is unit-tested
without invoking Windows. Unregister is idempotent — missing tasks
are not an error. See Setup on the Fusion host
below for the typical flow.
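To show why pure argv construction is easy to test, here is a sketch of what such builders might look like. The function names, parameter lists, and the command string passed to /TR are hypothetical and do not reflect the real build_create_argv / build_delete_argv signatures; the schtasks.exe switches themselves (/Create, /Delete, /TN, /TR, /SC, /D, /ST, /RU, /RP, /F) are standard.

```python
def sketch_create_argv(
    task_name: str, command: str, schedule: list[str], user: str, password: str
) -> list[str]:
    """Build a schtasks /Create argv without running anything."""
    argv = ["schtasks.exe", "/Create", "/TN", task_name, "/TR", command]
    argv += schedule
    argv += ["/RU", user, "/RP", password, "/F"]  # /F overwrites an existing task
    return argv


def sketch_delete_argv(task_name: str) -> list[str]:
    """Build a schtasks /Delete argv; /F suppresses the confirmation prompt."""
    return ["schtasks.exe", "/Delete", "/TN", task_name, "/F"]


# Schedule fragments matching the task table above.
MONTHLY_DAY_1_0200 = ["/SC", "MONTHLY", "/D", "1", "/ST", "02:00"]
MONTHLY_DAY_20_0200 = ["/SC", "MONTHLY", "/D", "20", "/ST", "02:00"]
ON_SYSTEM_START = ["/SC", "ONSTART"]
```

Since both functions only return lists, a unit test can assert on the exact argv on any platform.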
fusion-toolkit --version
fusion-toolkit [-v | --verbose] <subcommand> ...
fusion-toolkit validate study <path> [-f text|json] [-q]
fusion-toolkit validate root <path> [-f text|json] [-q]
fusion-toolkit manifest show <manifest.jsonl>
fusion-toolkit backup run [--config PATH] [--dry-run] [--smtp-env PATH]
fusion-toolkit cleanup run [--config PATH] [--dry-run] [--smtp-env PATH]
fusion-toolkit monitor tail [--config PATH] [--smtp-env PATH]
fusion-toolkit install init [--config-dir PATH] [--force]
fusion-toolkit install tasks [--user USER --password PASS] [--uninstall]
fusion-toolkit notify list-subscribers [--event EVENT]
fusion-toolkit notify test --event EVENT [--smtp-env PATH] [--dry-run]
| Code | Meaning |
|---|---|
| 0 | Success |
| 1 | Validation failure or orchestrator reported one or more failed runs |
| 2 | Config error (missing or malformed TOML, missing required key, lock held) |
| 3 | Unexpected runtime error (network, permissions, unhandled bug) |
Scheduled-task alerting should treat any non-zero exit code as
actionable. When [notify].enabled = true, backup run / cleanup run / monitor tail also route failures and tool-health warnings to
event-specific recipient lists (see the fusion_toolkit.notify
section above). If notify is disabled or credentials are missing the
commands behave exactly as before (exit codes only).
After uv sync in the cloned repo, the typical flow is:
uv run fusion-toolkit install init # writes config + smtp.env templates
# edit ~/.config/fusion-toolkit/{config.toml,smtp.env}
uv run fusion-toolkit install tasks # prompts for your Windows password

install tasks defaults --user to $env:USERDOMAIN\$env:USERNAME
(the current account) and prompts for the password via stdin so it
never appears in PowerShell history. Pass --user or --password
explicitly to override (e.g. for unattended automation).
Full details (Python + uv install, AWS credential chain, SMB mapping
quirks) are in docs/setup-fusion-host.md
(Chinese version: docs/setup-fusion-host.zh.md).
Browser-based name checker at https://wuwenrui555.github.io/fusion_toolkit/.
Enter a study name and optional experiment name; per-field pass/fail
is shown inline. Maintenance: docs/gh-pages.md.
uv sync
uv run pre-commit install

Run checks locally before pushing:
uv run ruff check src/ tests/
uv run ruff format --check src/ tests/
uv run pyright src/
uv run pytest

TBD