Skip to content

Fix corrupted local cache eviction with logging and tests#14

Closed
cchristous wants to merge 2 commits intorichardartoul:mainfrom
cchristous:cchristous/fix-corrupted-cache-eviction
Closed

Fix corrupted local cache eviction with logging and tests#14
cchristous wants to merge 2 commits intorichardartoul:mainfrom
cchristous:cchristous/fix-corrupted-cache-eviction

Conversation

@cchristous
Copy link
Copy Markdown
Contributor

Summary

  • Evict corrupted local cache entries instead of just logging warnings. Previously, corrupted metadata (e.g., missing outputID from stale entries across builds) caused repeated warn-level log spam on every cache check. Now corrupted entries are cleaned up and treated as cache misses.
  • Add debug-level logging when evicting corrupted entries for observability without spam. Warn-level logging is used for actual eviction failures (permission denied, I/O errors).
  • Guard against empty outputID in S3 Get responses to prevent corrupted entries from propagating into the local cache. Returns a clean cache miss consistent with other metadata parse failures.
  • Add comprehensive tests covering corrupted metadata eviction, valid entry preservation, empty outputID defense-in-depth, permission error resilience, and read-only mode GET-after-PUT behavior.

Why these changes are safe

  • The eviction in check() only fires for non-ErrNotExist errors from readMetadata — genuine cache misses (file not found) are unaffected.
  • The evict() function tolerates already-removed files (ErrNotExist is silently ignored) and logs real filesystem errors at Warn level.
  • The S3 empty outputID guard follows the existing pattern for all other metadata parse failures in S3.Get() — returns miss=true, err=nil and properly closes the response body.
  • Debug logging for eviction avoids production log spam while remaining visible with -debug for troubleshooting.
  • 8 new tests validate both positive and negative cases including permission error resilience.

Test plan

  • TestCorruptedMetadata_EvictsEntry — corrupted metadata triggers eviction of both data and meta files
  • TestCorruptedMetadata_ValidEntryNotEvicted — valid entries are not affected by eviction logic
  • TestEmptyOutputID_BackendHit_LocalCacheSelfHeals — empty outputID from backend creates corrupted entry that self-heals on next check
  • TestEvict_PermissionError_LogsWarningAndContinues — eviction failure due to permissions does not panic, files remain
  • TestReadOnlyMode_GetAfterPut — GET after PUT in read-only mode hits local cache
  • All existing tests pass

When S3 backend returns objects with empty outputid metadata,
gobuildcache wrote corrupted local cache entries (outputID: with
no value). Subsequent check() calls logged 2 WARN lines per
corrupted entry, producing 45,000+ warnings per build.

Two fixes:
- s3.go: Treat empty outputid metadata as a cache miss instead of
  returning empty outputID that gets written to local cache
- localcache.go: Silently evict corrupted entries on read instead of
  logging WARN. Corrupted entries are treated as cache misses.
- Add debug-level logging when evicting corrupted local cache entries
  for observability while avoiding log spam from stale entries
- Add warn-level logging in evict() for non-ErrNotExist os.Remove
  errors (permission denied, I/O errors) instead of silently ignoring
- Add tests: GET after PUT in read-only mode, corrupted metadata
  eviction, valid entry not evicted, empty outputID defense-in-depth,
  and evict permission error handling
@cchristous cchristous closed this Mar 24, 2026
@cchristous cchristous deleted the cchristous/fix-corrupted-cache-eviction branch March 24, 2026 02:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant