Objective
Design the next runner-host infra posture after Launchplane-owned hygiene is in place: durable retention budgets, drift thresholds, and a recovery/standby strategy for chris-testing or its successor hosts.
Finish Line
Runner-host retention budgets and recovery posture are explicit after scheduled hygiene evidence.
Current Status
State: The small corrected-counter orphan BuildKit cleanup is complete. The temporary independent allowlist was set only for buildx_buildkit_launchplane-ci0_state and buildx_buildkit_verireel-ci0_state. The dry-run verified that both were dangling with 0 links, the mutate run removed them, the allowlist variable was deleted, and a final report-only run is healthy.
Next action: Leave this issue waiting for future scheduled drift evidence or an operator decision about broader retention thresholds/standby design. No immediate cleanup mutation is pending.
Blocked by: No native issue blocker.
Waiting for: Scheduled drift evidence or operator choice on broader retention/standby policy.
Last verified: 2026-05-25.
Dry-run run 26407271540 used audit key runner-host-hygiene/2026-05-25/chris-testing-orphan-volume-cleanup-dry-run-1; blockers only mutate_not_requested; targets present, dangling, links 0: buildx_buildkit_launchplane-ci0_state (1,421,000,000 bytes) and buildx_buildkit_verireel-ci0_state (3,060,000,000 bytes).
Mutate run 26407394767 used audit key runner-host-hygiene/2026-05-25/chris-testing-orphan-volume-cleanup-mutate-1; status completed; targets absent from post evidence; post report healthy; free disk 401,200,775,168 bytes; Docker reclaimable 46,014,000,000 bytes; orphan BuildKit containers 0; orphan BuildKit volumes 0; warm builders present. The temporary repo variable LAUNCHPLANE_RUNNER_HOST_HYGIENE_ALLOWED_BUILDKIT_STATE_VOLUMES was deleted after mutation.
Final steady-state dry-run run 26407633101 used audit key runner-host-hygiene/2026-05-25/chris-testing-post-orphan-cleanup-dry-run-1; report healthy; blockers only mutate_not_requested; free disk 401,630,429,184 bytes; Docker reclaimable 46,014,000,000 bytes; runner workdir 37,519,639,238 bytes; 68 images; 36 volumes; orphan BuildKit containers 0; orphan BuildKit volumes 0; warm builders present: odoo-docker:verify-devtools, odoo-docker:verify-runtime.
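For a quick sanity check of the figures in the "Last verified" evidence, the arithmetic can be sketched as below. The byte counts are copied from the evidence. The variable and function names are illustrative only and are not part of any Launchplane tooling.

```python
# Byte counts copied from the "Last verified" evidence; names are illustrative.
REMOVED_VOLUMES = {
    "buildx_buildkit_launchplane-ci0_state": 1_421_000_000,
    "buildx_buildkit_verireel-ci0_state": 3_060_000_000,
}
FREE_DISK_AFTER_MUTATE = 401_200_775_168
FREE_DISK_FINAL_DRY_RUN = 401_630_429_184
DOCKER_RECLAIMABLE = 46_014_000_000


def gib(num_bytes: int) -> float:
    """Convert bytes to GiB for human-readable output."""
    return num_bytes / 2**30


# Total size of the two orphan state volumes removed by the mutate run.
removed_total = sum(REMOVED_VOLUMES.values())

# Change in free disk between the post-mutate report and the final dry-run.
free_disk_drift = FREE_DISK_FINAL_DRY_RUN - FREE_DISK_AFTER_MUTATE

if __name__ == "__main__":
    print(f"removed volumes total: {removed_total:,} bytes ({gib(removed_total):.2f} GiB)")
    print(f"free disk drift:       {free_disk_drift:,} bytes ({gib(free_disk_drift):.2f} GiB)")
    print(f"Docker reclaimable:    {DOCKER_RECLAIMABLE:,} bytes ({gib(DOCKER_RECLAIMABLE):.2f} GiB)")
```

The removed volumes total 4,481,000,000 bytes, and free disk differs by 429,654,016 bytes between the two reports, which gives a rough sense of run-to-run noise when choosing drift thresholds.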
Scope
Define retention budgets for generic BuildKit cache, named BuildKit state volumes, image inventory, runner _work, logs, and warm Odoo builders.
Decide alert thresholds for scheduled dry-run reports: disk free bytes, Docker reclaimable bytes, volume growth, orphan BuildKit artifacts, and missing warm builders.
Decide whether chris-testing should remain a multi-role host, gain a warm standby, or be split into dedicated runner hosts.
Preserve Launchplane ownership of shared runner-host cleanup and prevent product repos from adding broad shared-host Docker prune jobs.
Keep future mutation evidence-first, named-target, and allowlisted.
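One way the alert thresholds in this scope could be expressed is a small evaluation function over a dry-run report. This is a minimal sketch, not existing Launchplane code: the `HygieneReport` and `Thresholds` types, the finding strings, and every threshold value are hypothetical placeholders, since the actual thresholds are still an open question. Only the report figures are taken from the final dry-run evidence.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class HygieneReport:
    """Hypothetical shape for one scheduled dry-run report."""
    free_disk_bytes: int
    docker_reclaimable_bytes: int
    orphan_buildkit_containers: int
    orphan_buildkit_volumes: int
    volume_count: int
    warm_builders: frozenset


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical policy inputs; the values are not decided yet."""
    min_free_disk_bytes: int
    max_reclaimable_bytes: int
    max_volume_growth: int
    required_warm_builders: frozenset


def evaluate(report: HygieneReport, previous_volume_count: int, t: Thresholds) -> list[str]:
    """Return findings that need attention; an empty list means no drift."""
    findings = []
    if report.free_disk_bytes < t.min_free_disk_bytes:
        findings.append("free_disk_low")
    if report.docker_reclaimable_bytes > t.max_reclaimable_bytes:
        findings.append("docker_reclaimable_high")
    if report.volume_count - previous_volume_count > t.max_volume_growth:
        findings.append("volume_growth")
    if report.orphan_buildkit_containers or report.orphan_buildkit_volumes:
        findings.append("orphan_buildkit_artifacts")
    missing = t.required_warm_builders - report.warm_builders
    if missing:
        findings.append("missing_warm_builders:" + ",".join(sorted(missing)))
    return findings


WARM = frozenset({"odoo-docker:verify-devtools", "odoo-docker:verify-runtime"})

# Figures from the final steady-state dry-run on 2026-05-25.
FINAL_REPORT = HygieneReport(401_630_429_184, 46_014_000_000, 0, 0, 36, WARM)

# Placeholder thresholds for illustration only.
EXAMPLE_THRESHOLDS = Thresholds(200_000_000_000, 100_000_000_000, 10, WARM)

if __name__ == "__main__":
    print(evaluate(FINAL_REPORT, previous_volume_count=36, t=EXAMPLE_THRESHOLDS))
```

Keeping thresholds as an explicit input, separate from the report, matches the preference for independent policy inputs over inferred behavior.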
Acceptance Criteria
Scheduled dry-run evidence has a documented interpretation policy for when to act.
Any broader cleanup mode is scoped, tested, and fails closed like the BuildKit state-volume lane.
The chris-testing recovery path is either accepted as rebuild-from-runbook or replaced with a concrete standby/split-host design.
Repo-local product cleanup guidance remains clear: product repos may own product-scoped cleanup, not shared runner-host Docker/BuildKit pruning.
Operator docs identify which variables, labels, service users, warm builders, and audit records are required for new runner hosts.
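The "fails closed" criterion can be illustrated with a gate that only permits mutation when every check passes. This is a sketch under stated assumptions: it is not the actual BuildKit state-volume lane. The `mutate_not_requested` blocker name appears in the audit evidence; every other blocker name, the `VolumeEvidence` type, and the function signature are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class VolumeEvidence:
    """Hypothetical per-volume evidence gathered by a dry-run."""
    name: str
    present: bool
    dangling: bool
    links: int


def mutation_blockers(
    targets: list[str],
    allowlist: set[str],
    evidence: dict[str, VolumeEvidence],
    mutate_requested: bool,
) -> list[str]:
    """Return every reason mutation must not proceed; empty means allowed."""
    blockers = []
    if not mutate_requested:
        blockers.append("mutate_not_requested")
    if not allowlist:
        # An unset or empty allowlist blocks everything (fail closed).
        blockers.append("allowlist_empty")
    if not targets:
        blockers.append("no_named_targets")
    for name in targets:
        if name not in allowlist:
            blockers.append(f"not_allowlisted:{name}")
        ev = evidence.get(name)
        if ev is None or not ev.present:
            # Missing evidence is treated as a blocker, never as permission.
            blockers.append(f"evidence_missing:{name}")
        elif not ev.dangling or ev.links != 0:
            blockers.append(f"in_use:{name}")
    return blockers


TARGETS = [
    "buildx_buildkit_launchplane-ci0_state",
    "buildx_buildkit_verireel-ci0_state",
]
EVIDENCE = {n: VolumeEvidence(n, present=True, dangling=True, links=0) for n in TARGETS}

if __name__ == "__main__":
    # Dry-run: the only blocker is that mutation was not requested.
    print(mutation_blockers(TARGETS, set(TARGETS), EVIDENCE, mutate_requested=False))
```

With the allowlist variable deleted, the same call with `mutate_requested=True` and an empty allowlist still returns blockers, which is the behavior a broader cleanup mode would need to reproduce.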
Validation
If adding policy, run local focused hygiene tests, CI/Security/CodeQL, and one report-only rehearsal before any mutation.
If provisioning a standby/successor host, prove GitHub runner labels, service user, Docker/BuildKit setup, warm builders, Launchplane variables, and hygiene dry-run evidence before cutover.
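The pre-cutover proof list for a standby or successor host could be tracked as a simple checklist that blocks cutover until every item is proven. A minimal sketch follows. The key names are hypothetical labels for the items listed above, and how each proof is collected is left open.

```python
from __future__ import annotations

# Hypothetical keys, one per item that must be proven before cutover.
REQUIRED_PROOFS = (
    "github_runner_labels",
    "service_user",
    "docker_buildkit_setup",
    "warm_builders",
    "launchplane_variables",
    "hygiene_dry_run_evidence",
)


def missing_proofs(proofs: dict[str, bool]) -> list[str]:
    """List required proofs that are absent or not yet true."""
    return [key for key in REQUIRED_PROOFS if not proofs.get(key, False)]


def ready_for_cutover(proofs: dict[str, bool]) -> bool:
    """A host is ready only when nothing is missing (unknown counts as missing)."""
    return not missing_proofs(proofs)


if __name__ == "__main__":
    partial = {"github_runner_labels": True, "service_user": True}
    print(missing_proofs(partial))
    print(ready_for_cutover(partial))
```

Treating an unrecorded proof the same as a failed one keeps the checklist consistent with the evidence-first stance used for cleanup.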
Decisions
Launchplane remains the only owner for shared runner-host hygiene.
Product repos should not run broad shared-host Docker prune jobs.
Future cleanup should favor named targets and independent policy inputs over inferred or automatic deletion.
Open Questions
What disk and Docker reclaimable thresholds should trigger attention or action?
Should Odoo warm builders keep their current linked state volumes indefinitely, or should they get explicit budgets?
Should chris-testing remain a shared multi-role host, or should Launchplane split ops-gate/hygiene from product CI lanes?
Is a warm standby worth the operational overhead now, or is the documented replacement runbook enough?