
Plug three cache/setup integrity gaps#39

Merged
jdoss merged 3 commits into master from fix/cache-and-orphan-handling
May 5, 2026

@jdoss jdoss commented May 5, 2026

Three discrete fixes, one PR — they all surfaced from a single recovery on a homelab where a stale-but-reachable in-memory cache, a clobbered cache file, and a quietly orphaned set of HSM-encrypted Podman secrets compounded into Infisical refusing to start.

1. cache: clear in-memory entries when backing file vanishes

Cache.maybe_reload() swallowed FileNotFoundError and returned False, so once a cache.enc was deleted the running psi serve kept handing out the entries it had loaded at startup. Now a vanished file clears the in-memory dict, logs a warning, and the next lookup falls through to the provider.
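
For reviewers who want the shape of the change without opening the diff, here is a minimal sketch of the clear-on-vanish behaviour. It is not the actual `psi` code: the `Cache` class below is a toy stand-in, plain JSON replaces the encrypted `cache.enc` format, the mtime tracking is an assumption, and returning `True` when entries were cleared is my guess rather than something the PR states.

```python
from __future__ import annotations

import json
import logging
from pathlib import Path

logger = logging.getLogger("psi.cache.sketch")  # hypothetical logger name


class Cache:
    """Toy stand-in for the real cache; JSON replaces the encrypted format."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.entries: dict[str, str] = {}
        self._mtime_ns: int | None = None  # None means nothing loaded yet

    def maybe_reload(self) -> bool:
        """Reload if the file changed; clear entries if it vanished.

        Returns True when the in-memory state changed (an assumption).
        """
        try:
            mtime_ns = self.path.stat().st_mtime_ns
            if mtime_ns == self._mtime_ns:
                return False  # file unchanged since the last load
            data = json.loads(self.path.read_text())
        except FileNotFoundError:
            # Old behaviour: return False and keep serving stale entries.
            had_state = bool(self.entries) or self._mtime_ns is not None
            if had_state:
                logger.warning(
                    "cache file %s vanished; clearing in-memory entries", self.path
                )
            self.entries.clear()
            self._mtime_ns = None
            return had_state
        self.entries = data
        self._mtime_ns = mtime_ns
        return True
```

With the entries cleared, a lookup that misses the in-memory dict has nothing stale to return, so the caller falls through to the provider.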

2. cli: refuse to clobber existing cache.enc without --force, rotate to .bak

psi cache init went straight to an atomic rename, so re-running it on a populated cache silently wiped every entry. The new behaviour:

  • If cache.enc exists, refuse unless --force is passed.
  • With --force, rename the existing file to <name>.bak-<UTC timestamp> first, then write fresh.
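
A rough sketch of the guard described in the two bullets above, assuming a helper that runs before the new cache is written. The function name, the `FileExistsError` choice, and the exact timestamp format are illustrative assumptions; the PR only specifies the `<name>.bak-<UTC timestamp>` pattern.

```python
from __future__ import annotations

from datetime import datetime, timezone
from pathlib import Path


def guard_existing_cache(path: Path, force: bool = False) -> Path | None:
    """Refuse to overwrite ``path`` unless forced; rotate it aside when forced.

    Returns the backup path if a rotation happened, otherwise None.
    (Hypothetical helper; not the real psi function.)
    """
    if not path.exists():
        return None  # nothing to protect, the caller can write freely
    if not force:
        raise FileExistsError(f"{path} already exists; pass --force to replace it")
    # Timestamp format is an assumption; any sortable UTC stamp would do.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    backup = path.with_name(f"{path.name}.bak-{stamp}")
    path.rename(backup)  # keep the old entries recoverable
    return backup
```

Rotating before writing means a mistaken `--force` still leaves the previous entries on disk next to the new file.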

3. setup: surface orphaned podman secrets and exit non-zero

_classify_secrets already detected Podman shell secrets without a mapping file — but only in --dry-run. Regular psi setup would happily report success while leaving the host in a state where the next container start fails. Adds OrphanedSecretsError, a _check_orphans pass after the workload loop, per-secret warnings, and a non-zero exit. Orphan takes precedence over drift since it produces hard container-start failures rather than missing env vars.
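
To illustrate the orphan pass, here is a simplified sketch under stated assumptions. `OrphanedSecretsError` is named in this PR, but its constructor, the `check_orphans` signature, and the `<state_dir>/<name>.json` mapping layout below are all hypothetical; the real `_check_orphans` may look quite different.

```python
from __future__ import annotations

import logging
from pathlib import Path

logger = logging.getLogger("psi.setup.sketch")  # hypothetical logger name


class OrphanedSecretsError(Exception):
    """Raised when shell-driver secrets have no backing mapping file."""

    def __init__(self, names: list[str]) -> None:
        self.names = names
        super().__init__(f"{len(names)} orphaned secret(s): {', '.join(names)}")


def check_orphans(secret_names: list[str], state_dir: Path) -> None:
    """Warn per orphan, then raise so the caller can exit non-zero.

    Assumes one mapping file per secret at <state_dir>/<name>.json.
    """
    orphans = sorted(
        name for name in secret_names if not (state_dir / f"{name}.json").is_file()
    )
    for name in orphans:
        logger.warning("podman secret %r has no mapping file in %s", name, state_dir)
    if orphans:
        raise OrphanedSecretsError(orphans)


def run_setup_sketch(secret_names: list[str], state_dir: Path) -> int:
    """Return a process exit code: 1 when orphans exist, else 0."""
    try:
        check_orphans(secret_names, state_dir)
    except OrphanedSecretsError:
        return 1
    return 0
```

Running the check after the workload loop means every workload is still processed before setup reports the failure.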

Test plan

  • uv run pytest -q — 376 passed
  • uv run ruff check psi/ tests/ — clean
  • uv run ruff format --check psi/ tests/ — clean (one auto-format applied)
  • uv run ty check — clean
  • New tests: TestMaybeReload::test_clears_entries_when_file_vanishes_after_load, TestGuardExistingCache (3 cases), TestCheckOrphans (3 cases), TestRunSetupOrphanExit (3 cases)

jdoss added 3 commits May 4, 2026 19:45
maybe_reload returned False on FileNotFoundError, so a deleted
cache.enc left the live serve process serving the entries it had
loaded at startup. Forever. Clearing on vanish forces the next
lookup through the provider, which is what the operator expects
after wiping the cache.

….bak

cache init went straight to atomic rename, so re-running it on a
populated cache wiped every entry with no recovery path. Now it
refuses if the file exists and only proceeds with --force, in which
case the previous file is moved to <name>.bak-<UTC timestamp>
before the new empty cache is written.

Adds OrphanedSecretsError and a check in run_setup that scans every
shell-driver Podman secret for a backing mapping file in state_dir.
Missing mapping means lookups return 404 and the consuming container
fails to start; before this commit setup would happily complete
without surfacing the condition. Drift detection still fires when
relevant; orphan takes precedence since it produces hard failures
rather than missing env vars.

@jdoss jdoss merged commit aa5d739 into master May 5, 2026
2 checks passed