Fix/registry disable leak 3854 #4178

Open
saurabh224s wants to merge 2 commits into orchestration-agent:main from saurabh224s:fix/registry-disable-leak-3854

@saurabh224s

Summary

Fixes #3854 — the capability discovery API was leaking disabled registry
entries during agent lifecycle transitions.

Root Cause

RegistryListingService.list() did not pass a status filter to the database
query, relying instead on a cache layer to hide disabled entries. During a
lifecycle transition (ACTIVE → DISABLED), a race window existed:

Thread A: cache.invalidate(id)   ← happens AFTER db.commit()
Thread B: cache miss → db.query() ← sees the stale ACTIVE entry mid-flight

Changes

| File | Change |
| --- | --- |
| src/registry/listing.py | Push status_in filter down to the DB query; invert cache-invalidation order in disable_entry() |
| tests/registry/test_listing_no_disabled_leak.py | New deterministic regression test suite (6 tests: 5 unit + 1 concurrency) |

How the Fix Works

  1. Query-layer filter: list() always passes status_in={ACTIVE} to
    store.query(), so disabled entries are excluded at the DB index level,
    not in application memory.

  2. Cache-before-DB ordering: disable_entry() now evicts the cache
    entry before writing to the DB. Any reader that gets a cache miss
    during the transition queries the DB and receives the post-write state.

  3. Audit logging: every disable emits a structured log event with
    entry_id and reason, without exposing runtime secrets.
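
For reviewers who want the shape of the change without opening the diff, the three points above can be sketched roughly as follows. This is an illustrative sketch, not the code in src/registry/listing.py: InMemoryStore, ListingCache, the Status enum, the set_status() method, and the "registry.audit" logger name are all hypothetical stand-ins. Only list(), disable_entry(), store.query(status_in=...), and cache.invalidate(id) are names taken from this PR.

```python
import logging
from enum import Enum

# Hypothetical logger name; the real audit sink may differ.
audit_log = logging.getLogger("registry.audit")


class Status(Enum):
    ACTIVE = "ACTIVE"
    DISABLED = "DISABLED"


class InMemoryStore:
    """Hypothetical stand-in for the DB-backed store."""

    def __init__(self):
        self._rows = {}

    def set_status(self, entry_id, status):
        self._rows[entry_id] = status

    def query(self, status_in):
        # In a real store this predicate would be part of the DB query.
        return sorted(e for e, s in self._rows.items() if s in status_in)


class ListingCache:
    """Hypothetical cache holding the last computed listing."""

    def __init__(self):
        self._listing = None

    def get(self):
        return self._listing

    def set(self, entries):
        self._listing = list(entries)

    def invalidate(self, entry_id):
        # Simplest possible policy: drop the whole cached listing.
        self._listing = None


class RegistryListingService:
    def __init__(self, store, cache):
        self.store = store
        self.cache = cache

    def list(self):
        cached = self.cache.get()
        if cached is not None:
            return cached
        # 1. The status filter is always pushed down to the store.
        entries = self.store.query(status_in={Status.ACTIVE})
        self.cache.set(entries)
        return entries

    def disable_entry(self, entry_id, reason):
        # 2. Evict the cache entry first, then write to the store.
        self.cache.invalidate(entry_id)
        self.store.set_status(entry_id, Status.DISABLED)
        # 3. Structured audit event carrying only entry_id and reason.
        audit_log.info(
            "registry.entry_disabled",
            extra={"entry_id": entry_id, "reason": reason},
        )
```

With two ACTIVE entries in the store, calling disable_entry() on one of them and then list() should return only the other, whether or not the listing had been cached beforehand.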

Test Coverage

tests/registry/test_listing_no_disabled_leak.py::TestDisabledEntryNotLeaked
  ✓ test_list_excludes_disabled_entries_by_default
  ✓ test_list_passes_status_filter_to_store
  ✓ test_disable_invalidates_cache_before_db_write
  ✓ test_disable_emits_audit_record
  ✓ test_audit_record_contains_no_secrets

tests/registry/test_listing_no_disabled_leak.py::TestConcurrentLifecycleTransition
  ✓ test_disabled_entry_not_visible_during_concurrent_transition
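
As a rough illustration of how two of these checks could be made deterministic with unittest.mock, consider the sketch below. The free functions list_entries() and disable_entry() are hypothetical stand-ins for the service methods, and the string statuses and set_status() name are assumptions; the actual tests in this PR may be structured differently.

```python
from unittest.mock import MagicMock, call


def list_entries(store):
    """Hypothetical stand-in for list(): always pushes the filter down."""
    return store.query(status_in={"ACTIVE"})


def disable_entry(store, cache, entry_id):
    """Hypothetical stand-in for the fixed disable_entry(): evict, then write."""
    cache.invalidate(entry_id)
    store.set_status(entry_id, "DISABLED")


def test_list_passes_status_filter_to_store():
    store = MagicMock()
    list_entries(store)
    # The store must be asked for ACTIVE entries only.
    store.query.assert_called_once_with(status_in={"ACTIVE"})


def test_disable_invalidates_cache_before_db_write():
    # Attaching both collaborators to one parent mock records their
    # calls in a single ordered list, so call order can be asserted.
    parent = MagicMock()
    disable_entry(parent.store, parent.cache, "entry-1")
    assert parent.mock_calls == [
        call.cache.invalidate("entry-1"),
        call.store.set_status("entry-1", "DISABLED"),
    ]
```

Asserting on the parent's mock_calls rather than on each mock separately is what makes the ordering check possible; two independent mocks would only show that both calls happened, not which came first.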

Acceptance Criteria Check

  • Deterministic regression test covers the capability discovery API trigger
  • Registry listing service rejects / safely defers invalid transitions
  • Logs and audit records explain decisions without exposing private runtime data

Testing Locally

pytest tests/registry/test_listing_no_disabled_leak.py -v
