feat(desktop): auto-update the Kanban runtime in the background#440
feat(desktop): auto-update the Kanban runtime in the background#440johnwschoi wants to merge 4 commits into
Conversation
5914ea0 to
4926d80
Compare
27debc7 to
97eafc5
Compare
|
Closing on reviewer feedback: the stated requirement ("the desktop shell ships and runs the latest Kanban runtime") is already satisfied on This PR adds runtime OTA (background npm polling, user-data runtime store, pointer file, rollback subsystem, pacote, restart-to-apply IPC, native dep copying into userData) on top of that already-working baseline. None of that is requested. If we ever decide to support runtime OTA as an explicit feature, we can revisit; until then the packaging-time fix on main is the right shape. Closes #438 will need to stay open or be reframed against the actual gap (if any) — given main already stages the latest runtime, #438 may already be implicitly resolved; flagging for triage. |
|
Reopening: clarified requirement is "installed desktop shell fetches/uses latest runtime without requiring shell reinstall." That's runtime OTA, not packaging-time staging. The bundled-runtime-at-build-time path on The earlier review feedback ("reduce to packaging-only") was based on a narrower reading of the requirement than the user actually has. The current branch is the trimmed version (275 tests, +3377 / -177 LOC, all small functional modules — no over-engineering). Sticking with this shape. |
d84610d to
419e13b
Compare
… version Addresses review on PR #440: 1. runtime-store: readPointer / writePointer now require cliEntry to be the canonical versions/<v>/dist/cli.js path. A tampered current.json pointing outside the runtime-store would otherwise be forwarded to the shim as KANBAN_CLI_OVERRIDE. 2. runtime-auto-update: loadOverride now enforces max(pointer, bundled) as the effective launch version. A pointer at-or-below bundled (user upgrades the shell while userData still pointed at an older staged runtime) is treated as stale and cleared, instead of overriding the newer bundled cli forever. 3. runCheck: currentVersion is now an explicit max(pointer, bundled), not 'pointer if present' — so the version gate against latest compares against the right baseline after a shell upgrade. 4. runtime-store comment: bad-versions is documented as never-pruned (the prior comment claimed self-emptying, which the current code doesn't do). 5. New test/runtime-auto-update.test.ts (15 tests) covers loadOverride stale-pointer clearing, non-canonical pointer rejection, rollback sequence, runCheck max(pointer,bundled) selection, broadcast, error-swallowing, and stop().
…-path enforcement Addresses second round of review on PR #440: 1. Rollback race (most important): RuntimeOrchestrator now captures the override path used at spawn time as 'currentSpawnOverridePath: string | null' (was 'currentSpawnUsedOverride: boolean') and passes it to onCliEntryOverrideFailed(reason, cliEntry). runtime-auto-update onFailed now derives failedVersion from versionFromCliEntry(cliEntry) instead of re-reading the (possibly newer) pointer. If a concurrent stage advanced the pointer between spawn and failure, we mark the *failed* version bad and remove its dir, but leave the newer pointer untouched (it's presumed-good until proven otherwise). Adds a 'race-condition' test in runtime-auto-update.test.ts and a cliEntry-forwarded assertion in runtime-orchestrator.test.ts. 2. Invalid-pointer cleanup: readPointer returning null no longer leaves the bad current.json on disk. New pointerFileExists() helper lets loadOverride() distinguish 'no pointer' from 'invalid pointer' and delete the bad file (corrupt JSON, non-canonical path, non-absolute path). Existing test renamed to 'clears a non-canonical pointer file' and asserts existsSync(pointerFile) === false. New 'corrupt-JSON' test for symmetry. 3. Absolute-path enforcement: readPointer now requires path.isAbsolute(cliEntry) in addition to canonical-path equality. Previously a relative form that happened to resolve to the canonical path from process.cwd() was accepted, making pointer validity cwd-dependent. Test 'normalizes a relative cliEntry' renamed to 'rejects a relative cliEntry even if it would resolve to the canonical path' and flipped to assert null. 4. Self-contained already-staged check: checkAndStageLatestRuntime no longer requires the caller to self-repair pointers first. Uses resolvePointerCliEntry() (which returns null when the pointed cliEntry is missing on disk) so 'already-staged' can't lie about a wiped version dir. 5. New shared helper versionFromCliEntry(cliEntry) in runtime-store — inverse of cliEntryFor() — used by both runtime-update (for the already-staged check) and runtime-auto-update (for the rollback). All 293 desktop tests pass (was 291 — added race-condition test + corrupt-JSON pointer test). Repo-wide tests still pass under pre-commit. TSC clean.
…absolute-path symmetry Addresses round-3 review nits on PR #440: 1. versionFromCliEntry: previously only returned the basename of dirname(dirname(cliEntry)) and did a semver check on it, so paths like /tmp/1.2.3/dist/not-cli.js or /elsewhere/1.2.3/dist/cli.js would still produce '1.2.3'. Now requires path.isAbsolute() and string-equals against cliEntryFor(userData, v) — i.e. the helper actually validates the full canonical shape its docstring claims. Updated signature: versionFromCliEntry(userData, cliEntry). This matters for the rollback path: a stray cliEntry from a non-canonical source would otherwise let the rollback mark a real-but-unrelated version bad and remove its on-disk dir. 2. writePointer: now requires path.isAbsolute(cliEntry) symmetric with readPointer. Previously path.resolve()'d the input and string- compared, so a relative form that resolves to canonical from cwd would silently round-trip as canonical at write time. Pointer validity must not depend on process.cwd() at *either* boundary. 5 new tests cover the tightened helpers: - versionFromCliEntry: rejects relative paths, non-semver <v>, wrong leaf filename, wrong root, wrong parent dir. - writePointer: rejects relative cliEntry symmetric with readPointer. All 298 desktop tests pass (was 293).
|
Split #440 into a reviewable stacked series:
Each PR targets the previous branch, with #464 targeting main, so they can be reviewed/merged in order. |
… version Addresses review on PR #440: 1. runtime-store: readPointer / writePointer now require cliEntry to be the canonical versions/<v>/dist/cli.js path. A tampered current.json pointing outside the runtime-store would otherwise be forwarded to the shim as KANBAN_CLI_OVERRIDE. 2. runtime-auto-update: loadOverride now enforces max(pointer, bundled) as the effective launch version. A pointer at-or-below bundled (user upgrades the shell while userData still pointed at an older staged runtime) is treated as stale and cleared, instead of overriding the newer bundled cli forever. 3. runCheck: currentVersion is now an explicit max(pointer, bundled), not 'pointer if present' — so the version gate against latest compares against the right baseline after a shell upgrade. 4. runtime-store comment: bad-versions is documented as never-pruned (the prior comment claimed self-emptying, which the current code doesn't do). 5. New test/runtime-auto-update.test.ts (15 tests) covers loadOverride stale-pointer clearing, non-canonical pointer rejection, rollback sequence, runCheck max(pointer,bundled) selection, broadcast, error-swallowing, and stop().
…-path enforcement Addresses second round of review on PR #440: 1. Rollback race (most important): RuntimeOrchestrator now captures the override path used at spawn time as 'currentSpawnOverridePath: string | null' (was 'currentSpawnUsedOverride: boolean') and passes it to onCliEntryOverrideFailed(reason, cliEntry). runtime-auto-update onFailed now derives failedVersion from versionFromCliEntry(cliEntry) instead of re-reading the (possibly newer) pointer. If a concurrent stage advanced the pointer between spawn and failure, we mark the *failed* version bad and remove its dir, but leave the newer pointer untouched (it's presumed-good until proven otherwise). Adds a 'race-condition' test in runtime-auto-update.test.ts and a cliEntry-forwarded assertion in runtime-orchestrator.test.ts. 2. Invalid-pointer cleanup: readPointer returning null no longer leaves the bad current.json on disk. New pointerFileExists() helper lets loadOverride() distinguish 'no pointer' from 'invalid pointer' and delete the bad file (corrupt JSON, non-canonical path, non-absolute path). Existing test renamed to 'clears a non-canonical pointer file' and asserts existsSync(pointerFile) === false. New 'corrupt-JSON' test for symmetry. 3. Absolute-path enforcement: readPointer now requires path.isAbsolute(cliEntry) in addition to canonical-path equality. Previously a relative form that happened to resolve to the canonical path from process.cwd() was accepted, making pointer validity cwd-dependent. Test 'normalizes a relative cliEntry' renamed to 'rejects a relative cliEntry even if it would resolve to the canonical path' and flipped to assert null. 4. Self-contained already-staged check: checkAndStageLatestRuntime no longer requires the caller to self-repair pointers first. Uses resolvePointerCliEntry() (which returns null when the pointed cliEntry is missing on disk) so 'already-staged' can't lie about a wiped version dir. 5. New shared helper versionFromCliEntry(cliEntry) in runtime-store — inverse of cliEntryFor() — used by both runtime-update (for the already-staged check) and runtime-auto-update (for the rollback). All 293 desktop tests pass (was 291 — added race-condition test + corrupt-JSON pointer test). Repo-wide tests still pass under pre-commit. TSC clean.
…absolute-path symmetry Addresses round-3 review nits on PR #440: 1. versionFromCliEntry: previously only returned the basename of dirname(dirname(cliEntry)) and did a semver check on it, so paths like /tmp/1.2.3/dist/not-cli.js or /elsewhere/1.2.3/dist/cli.js would still produce '1.2.3'. Now requires path.isAbsolute() and string-equals against cliEntryFor(userData, v) — i.e. the helper actually validates the full canonical shape its docstring claims. Updated signature: versionFromCliEntry(userData, cliEntry). This matters for the rollback path: a stray cliEntry from a non-canonical source would otherwise let the rollback mark a real-but-unrelated version bad and remove its on-disk dir. 2. writePointer: now requires path.isAbsolute(cliEntry) symmetric with readPointer. Previously path.resolve()'d the input and string- compared, so a relative form that resolves to canonical from cwd would silently round-trip as canonical at write time. Pointer validity must not depend on process.cwd() at *either* boundary. 5 new tests cover the tightened helpers: - versionFromCliEntry: rejects relative paths, non-semver <v>, wrong leaf filename, wrong root, wrong parent dir. - writePointer: rejects relative cliEntry symmetric with readPointer. All 298 desktop tests pass (was 293).
2e388a7 to
a2f33a2
Compare
… runtime Adds the foundational primitive for runtime updates: an env var the desktop shim reads to decide which cli.js to execute. The path is validated with fail-loud-on-missing-file semantics rather than silently falling back, so the parent's rollback bookkeeping (added in a later commit) stays in sync with what actually ran — a silent fall-back-to-bundled would let the parent think the staged version 'worked' and never blacklist a broken one. - POSIX + Windows shims read KANBAN_CLI_OVERRIDE. On missing-file, print to stderr and exit non-zero. Both shims still prefer the bundled Electron binary via ELECTRON_RUN_AS_NODE=1 over system node, since GUI-launched apps inherit minimal PATHs. - RuntimeChildManager: typed cliEntryOverride option, forwarded through env to the child. Empty-string coerces to undefined so consumers don't have to guard. No host wires this yet — the orchestrator hooks arrive in the next commit, and the auto-updater that actually writes to the override path arrives later.
…h fallback Wires KANBAN_CLI_OVERRIDE into the runtime orchestrator with two host-supplied callbacks and adds a single-shot fallback to bundled. resolveCliEntryOverride(): called per-spawn (not cached) so a background updater advancing the pointer between user-triggered restarts takes effect on the very next launch. onCliEntryOverrideFailed(reason, cliEntry): fires when a staged runtime's readiness probe fails. The captured cliEntry — not whatever the pointer says NOW — is forwarded so a concurrent stage that advanced the pointer mid-spawn doesn't get blamed for an older spawn's failure. The host uses this to mark the version bad. Same-launch fallback: on initial-spawn failure, retry exactly once with cliEntryOverride=undefined (bundled). If bundled also fails, give up and propagate the original error rather than spinning. This guarantees a healthy bundled cli always boots the shell, even when every staged version on disk is broken. The retry happens at the orchestrator layer (not the host) so the readiness-probe and crash-recovery state machine stays in one place. Tests cover: forwards override to child manager, captured-cliEntry contract on failure, bundled-also-fails no-loop, resolver-throws- treat-as-no-override, fresh resolver call on each spawn.
… primitives Two pure-data modules that together let a host (next commit) keep an alternate runtime checked out under userData. Adds pacote + semver (and their @types) as dependencies. runtime-store.ts — on-disk state. - current.json: { version, cliEntry } pointer. readPointer is strict — version must be valid semver, cliEntry must be an absolute path AND match the canonical <userData>/runtime-store/versions/<v>/dist/cli.js shape. Anything off-shape returns null (treat as no pointer). The strict shape is enforced symmetrically in writePointer so we can't write something we wouldn't read back. - bad-versions.json: append-only blacklist. Versions that failed a readiness probe never re-stage. - versionFromCliEntry: derives version from a captured cliEntry by validating the full canonical layout — not just basename — so an exotic input from a misbehaving caller can't produce a bogus version string the rollback path would then act on. - Atomic writes (tmp + rename, fsync on the tmp before rename) so a crash mid-write can't leave a half-pointer. - cleanupPartials() sweeps <v>.partial/ left by interrupted extracts on every boot, before any new staging can collide. runtime-update.ts — the fetcher. - checkAndStageLatestRuntime: pacote.manifest → engines check → bad-version skip → semver.gt against currentVersion → pacote.extract into <v>.partial/ → copy native deps (node-pty) from app.asar.unpacked with dereference:true (so the staged tree has no symlinks pointing back into a potentially-upgraded bundle) → atomic rename → writePointer. - Returns a typed Outcome union (staged | up-to-date | already-staged | bad-version) so the caller picks the right user-facing message without inspecting strings. - Engines.node guard: skip versions whose package.json declares a Node range incompatible with the bundled Electron's Node — we'd rather stay on bundled than crash on syntax/API the runtime needs. No consumer wires either module yet — the auto-updater that calls checkAndStageLatestRuntime and reads the pointer arrives next.
The activation: turns on background runtime updates and connects the orchestrator hooks to the runtime-store + runtime-update modules. runtime-auto-update.ts — orchestrator-facing wrapper. Returns null in dev (isPackaged: false) so the orchestrator just spawns bundled. - resolveCliEntryOverride: max(pointer, bundled) gate to avoid launching an older staged runtime forever after a shell upgrade. Self-repairs (deletes current.json) when: JSON is corrupt, cliEntry is non-canonical, cliEntry no longer exists on disk, or pointer.version <= bundled. The self-repair prevents stale state from lingering as visible-but-ignored facts. - onCliEntryOverrideFailed: derives the failed version from the CAPTURED cliEntry, marks bad, removes the version dir, clears the pointer. clearPointer + removeVersionDir are gated on markBadVersion succeeding — a transient write failure on the blacklist would otherwise drop the pointer without blacklisting the version, looping runCheck → re-extract → fail forever. With the gate, the user still launches successfully via the orchestrator's same-launch fallback, and we retry markBad on every subsequent boot until it succeeds. - bundledVersion: read from the staged cli/package.json that stage-cli.mjs now embeds at build time. Falls back to app.getVersion() if the file is missing OR its version field isn't valid semver — defense against a corrupt/hand-edited package.json that would otherwise throw on the hot startup path the first time semver.lte/gt hit it. - scheduleChecks: first runCheck after 30s, then every 30min. Single-flight inFlight guard prevents a slow extract from racing the periodic interval. Both timers unref'd. stop() clears them before shutdown so a check can't fire mid-teardown. main.ts: constructs the auto-updater, plumbs both callbacks into the orchestrator, schedules checks after the runtime is ready, and calls autoUpdate.stop() in before-quit so an extract past pacote.extract finishes cleanly (writes pointer); an earlier-stage one gets dropped and its <v>.partial/ swept on next boot. broadcastToAllRenderers uses BrowserWindow.getAllWindows() so transient windows (OAuth popup) also receive update events. preload.ts: exposes onUpdateStaged + onRuntimeRolledBack as detach-returning subscribers. Returning a detach fn (rather than forwarding removeListener) prevents one renderer from removing listeners installed by another. stage-cli.mjs: embeds the repo-root package.json's version into the staged cli/package.json so the desktop shell can read the actual bundled-runtime version at boot — distinct from app.getVersion(), which returns the Electron shell version.
a2f33a2 to
c821d93
Compare
Closes #438.
Requirement
The shell currently freezes its bundled
cli.jsat desktop-build time. An installed shell can't pick up a newer runtime — users have to reinstall the desktop app to get runtime fixes. This PR closes that gap.What this PR does
createRuntimeAutoUpdate(inruntime-auto-update.ts) callspacote.manifest("kanban@latest")30s after first boot and every 30 min thereafter. Single-flight gated.checkAndStageLatestRuntime(inruntime-update.ts) usespacote.extractinto${userData}/runtime-store/versions/<v>.partial/, copies bundlednode-ptyfromapp.asar.unpacked/node_modules/, then atomic-renames to<v>/. Crash-resilient — partial dirs are swept before each extract.${userData}/runtime-store/current.jsonis read on every spawn byloadOverride; the orchestrator spawns the pointed-tocli.jsinstead of the bundled one. Renderer getsruntime:update-stagedIPC for a "Restart to apply" banner.max(pointer, bundled)so a stale older pointer can never override a newer bundled runtime after a shell upgrade. If the pointed-to version fails its startup probe,onCliEntryOverrideFailedmarks the version bad (bad-versions.json), clears the pointer, removes the version dir, and the orchestrator immediately retries the same launch with the bundled cli.latest-mac.yml/latest.ymlhosting, no signed installer pipeline. The shell binary on disk is never modified — only the runtime inuserData/.pacote.manifest()for registry checks,pacote.extract()for tarball staging. No custom HTTP / tarball / integrity code.Code changes
The whole feature is 3 new modules + small wiring in
main.ts. Architecture is intentionally functional — no lifecycle classes, no EventEmitters, no DI containers.New modules (
packages/desktop/src/)runtime-store.ts(170 LOC) — on-disk layout under${userData}/runtime-store/:current.jsonvia*.tmp+rename) with strict validation:semver.valid(version)required,cliEntrymust equal the canonicalversions/<v>/dist/cli.jspath afterpath.resolve()(any drift → pointer rejected atreadPointer, andwritePointerthrows). This is what stops a tamperedcurrent.jsonfrom forwarding an arbitrary on-disk path to the shim asKANBAN_CLI_OVERRIDE.resolvePointerCliEntry— returns the canonical cliEntry only if the file exists on disk (statSync().isFile()), so a wiped version dir can't permanently freeze the version gate.bad-versions.json— version-blocklist that prevents re-staging a version that already failed startup. Entries are not pruned; the registry only publishes monotonically increasing versions andisBadVersion(latest)is the only call site, so old entries are dead weight (a few bytes) but never re-examined.cleanupPartials,removeVersionDir— disk hygiene called on boot and after rollback.runtime-update.ts(99 LOC) — pure async functioncheckAndStageLatestRuntime(opts) → StageOutcome:{ kind: "staged" | "up-to-date" | "already-staged" | "bad-version" }.manifest→ version gate (againstcurrentVersion = max(pointer, bundled)) → bad-version skip → already-staged skip →pacote.extract→ stagenode-ptyfrom bundled → verifydist/cli.jsexists → atomic rename partial → version dir → atomic pointer write. Any failure before the pointer write leaves the existing pointer untouched.runtime-auto-update.ts(198 LOC) — wires the two modules above into the orchestrator'scliEntryOverridecallbacks and the 30s/30min check schedule:loadOverride()enforcesmax(pointer, bundled): a pointer at-or-below bundled (e.g. user upgraded the shell while userData still pointed at an older staged runtime) is treated as stale and cleared. Also clears pointers whosecliEntryno longer exists.onCliEntryOverrideFailed(reason)is the rollback path: mark bad → remove version dir → clear pointer → broadcastruntime:rolled-backto the renderer.runCheck()is single-flight gated, swallows network errors so the timer keeps firing, and broadcastsruntime:update-stagedonly on a successful new staging.nullwhen!isPackagedso dev runs the bundled cli with no auto-update path active.Wiring (
packages/desktop/src/main.ts,runtime-orchestrator.ts)main.ts— callscreateRuntimeAutoUpdate(deps)once, passes itsresolveCliEntryOverride/onCliEntryOverrideFailedcallbacks into the orchestrator, and callsscheduleChecks()after the window is ready. Total wiring: +39 LOC.runtime-orchestrator.ts— added two optional callbacks:resolveCliEntryOverride(called on every spawn) andonCliEntryOverrideFailed(called when the staged cli fails its readiness probe; orchestrator then retries this same launch with the bundled cli, exactly once).File breakdown
packages/desktop/src/runtime-store.tspackages/desktop/src/runtime-update.tscheckAndStageLatestRuntimepackages/desktop/src/runtime-auto-update.tspackages/desktop/src/main.tscreateRuntimeAutoUpdate(...), forwards callbackspackages/desktop/src/runtime-orchestrator.tsresolveCliEntryOverride/onCliEntryOverrideFailedhookspackages/desktop/src/preload.tsruntime:update-staged/runtime:rolled-backIPC bridgepackages/desktop/test/runtime-store.test.tspackages/desktop/test/runtime-update.test.tspackages/desktop/test/runtime-auto-update.test.tsloadOverride(incl. stale-pointer + non-canonical), rollback, runCheck broadcast/error/stoppackages/desktop/test/runtime-orchestrator.test.tspackages/desktop/test/runtime-child-manager.test.tsNote on
node-ptyruntime closurenode-pty/package.jsondeclaresnode-addon-apias a dependency, but it isn't required at runtime — node-addon-api is a header-only C++ helper consumed bybinding.gypduring a from-source compile, and node-pty's prebuilt.nodebinaries (innode-pty/prebuilds/) are self-contained. Empirically verified: anode_modules/containing onlynode-pty/is sufficient forrequire('node-pty').spawn(...)to work. If a future kanban release grows a new external runtime dep beyondnode-pty,runtime-updatewould still extract and rename it, then the new version would fail its startup probe, hit the rollback path, and end up on the bad-versions list — the user lands back on the bundled runtime automatically.Bad-version policy (user-visible behavior)
If
kanban@latestfails startup on a user's shell once, this PR's bad-versions list will skip that exact version on every subsequent update tick. As soon as upstream publishes a newer version, the next check seeslatest > badVersionand stages the new one normally (the old bad-versions entry is harmless dead data, never re-examined). Net effect from the user's perspective: one prompt to restart, then it falls back to bundled and stops nagging.Why not the "just stage at build time" alternative
Reviewer briefly suggested this. It satisfies the install-time slice but freezes the runtime until the next desktop reinstall — exactly what the requirement ("without shell reinstall") rules out. Build-time staging is necessary (and already present on
mainviascripts/stage-cli.mjs) but not sufficient.Out of scope
@latestonly.Tests + checks
runtime-store/runtime-auto-updatetests, 9runtime-updatetests; rest are orchestrator/child-manager wiring)tsc --noEmitcleanModule-level coverage:
runtime-store— pointer round-trip + atomic-write, strict semver + canonical-path validation on bothreadPointer(rejects non-canonical) andwritePointer(throws on non-canonical),resolvePointerCliEntryself-repair contract, bad-versions idempotency / semver-sort / file-corruption defense,cleanupPartials.runtime-update— pacote mocked at module boundary; all 4 outcomes (staged/up-to-date/already-staged/bad-version) + non-semvercurrentVersiondefense, missingnode-pty, corrupt tarball (nodist/cli.js), stale-partial recovery.runtime-auto-update—loadOverride(stale-pointer-clears, file-missing self-repair, non-canonical pointer, returns canonical path on happy path),onCliEntryOverrideFailedrollback sequence,runCheckbroadcast on staged + usesmax(pointer, bundled)ascurrentVersion+ swallows pacote/network errors +stop()cancels timers.runtime-orchestrator—resolveCliEntryOverridewiring,onCliEntryOverrideFailedinvocation with bundled-cli retry, no-double-failure when bundled-arm itself fails,resolveCliEntryOverridethrowing doesn't brick the spawn.