Drop cache-hit lookups to DEBUG, document the healthcheck amplifier #31
Merged
Podman re-resolves every Secret= reference in a container's quadlet on every `podman healthcheck run` call, not just at container start. With a default HealthInterval=30s across ~50 containers, this generates a constant ~15+ lookups/sec against the PSI socket. The cache serves these from an in-memory dict in under a millisecond — the throughput cost is negligible — but logging every hit at INFO floods the journal with thousands of entries per minute. Move the cache-hit log line to DEBUG. Cache misses, provider fetches, and errors stay at INFO / WARNING / ERROR so anything interesting still surfaces. Document the behavior in the README cache section and add a troubleshooting entry in docs/secret-cache.md explaining that this is upstream Podman behavior, not a PSI regression.
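The level split described above can be sketched roughly as follows. This is an illustration only, not the PSI code: the logger name `psi.cache`, the `SecretCache` class, and the `fetch` callable are all hypothetical.

```python
import logging

# Hypothetical logger name; the real PSI logger may be named differently.
logger = logging.getLogger("psi.cache")


class SecretCache:
    """Hypothetical in-memory cache keyed by secret name."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable that asks the provider for a secret
        self._store = {}

    def lookup(self, name):
        if name in self._store:
            # Hot path: hit on every healthcheck cycle, so keep it at DEBUG.
            logger.debug("cache hit: %s", name)
            return self._store[name]
        # Misses and provider fetches are rare and interesting: keep at INFO.
        logger.info("cache miss, fetching from provider: %s", name)
        try:
            value = self._fetch(name)
        except Exception:
            logger.error("provider fetch failed: %s", name)
            raise
        self._store[name] = value
        return value
```

With the logger at `INFO`, repeated lookups of a cached secret produce no output. Setting the same logger to `logging.DEBUG` brings the per-hit lines back, which is one way to observe the hit rate in a setup like this.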
Summary
Podman re-resolves every `Secret=` reference in a container's quadlet on every `podman healthcheck run` call, not just at container start. With a default `HealthInterval=30s` across ~50 containers, this generates a constant ~15+ lookups/sec against the PSI socket. The cache serves these from an in-memory dict in under a millisecond, so the throughput cost is negligible, but logging every hit at `INFO` floods the journal with thousands of entries per minute.
Move the cache-hit log line to `DEBUG`. Cache misses, provider fetches, and errors stay at `INFO` / `WARNING` / `ERROR` so anything interesting still surfaces.
Why
Observed on the test server after PR #29 made the cache actually work: ~16 lookups/sec sustained with all containers stable. Traced to Podman healthcheck cycles — every 30s healthcheck for every container = N secrets × M containers per interval. This is upstream Podman behavior; PSI can't opt out.
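The "N secrets × M containers per interval" arithmetic can be checked with a short calculation. The container count and interval come from the description above. The figure of 10 secrets per container is an assumption chosen for illustration, not a measured value.

```python
def lookups_per_second(containers: int, secrets_per_container: int, interval_s: float) -> float:
    """Steady-state lookup rate when every healthcheck re-resolves every secret."""
    return containers * secrets_per_container / interval_s


# 50 containers and a 30s interval are from the PR text;
# 10 secrets per container is an assumed, illustrative value.
rate = lookups_per_second(containers=50, secrets_per_container=10, interval_s=30.0)
print(f"{rate:.1f} lookups/sec")  # 16.7 lookups/sec
```

Under that assumption the result lands close to the ~16 lookups/sec observed on the test server.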
Docs
- README cache section: notes why `psi serve` is busy even when containers are stable.
- `docs/secret-cache.md`: troubleshooting entry with the same explanation plus how to re-enable the verbose logging if you want to see the hit rate.
Test plan
- `pytest tests/test_serve.py`: all existing tests still pass (no tests assert on log level for cache hits).
- `ruff check` / `ty check`: clean.