
retention: archive stale results/*.txt on startup (D half of flood fix) #354

Open

sonichi wants to merge 3 commits into main from feat/retention-sweep

Conversation

@sonichi
Owner

@sonichi commented Apr 15, 2026

Summary

First half of the D+C DM-fallback redesign you approved after the 2026-04-15 DM-flood post-mortem. Lands D (separate retention policy) as a standalone hygiene PR. C (event-driven fs.watch wiring for the fallback delivery path) will stack on top as a follow-up.

What this does

  • src/archive-stale-results.py (new, 85 lines) — one-shot script that walks results/*.txt, moves any file whose mtime is older than RETENTION_HOURS (default 24) into results/archive-YYYY-MM-DD/. Honors DRY_RUN=1 for preview. Skips files already under archive-* subdirs and the non-.txt calls/ subdirectory.
  • src/startup.sh (+6 lines) — invokes the script right after mkdir -p tasks results data, i.e. before any long-running service starts iterating results/. On fresh boots the backlog gets swept out of the delivery path at startup.
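A minimal sketch of the sweep described above, assuming a straightforward implementation (the function name sweep_stale_results and its exact structure are illustrative; only RETENTION_HOURS, DRY_RUN, and the archive-YYYY-MM-DD layout come from this PR):

```python
import shutil
import time
from datetime import date
from pathlib import Path


def sweep_stale_results(results_dir: str, retention_hours: int = 24,
                        dry_run: bool = False) -> list:
    """Move top-level results/*.txt files older than retention_hours
    into results/archive-YYYY-MM-DD/. Returns the names moved (or that
    would be moved, when dry_run is True)."""
    root = Path(results_dir)
    cutoff = time.time() - retention_hours * 3600
    archive = root / f"archive-{date.today().isoformat()}"
    moved = []
    for f in sorted(root.glob("*.txt")):  # non-recursive: subdirs like
        if not f.is_file():               # calls/ and archive-* are skipped
            continue
        if f.stat().st_mtime < cutoff:
            if not dry_run:
                archive.mkdir(exist_ok=True)
                shutil.move(str(f), str(archive / f.name))
            moved.append(f.name)
    return moved
```

Because Path.glob("*.txt") does not recurse, files already moved into an archive-* directory can never be matched again on later runs.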

Why this first, independent of any DM-fallback rewiring

Today's incident proved results/ accumulates long-dead files indefinitely (task-, question-, briefing-, insight-, and friction- prefixed files from voice work that was never spoken because voice wasn't connected). Any future code that scans results/ for delivery — reverted #352, #349's polling loop, or anything we do next — will flood again on first tick unless the directory is actually kept small. This PR makes the directory small independent of the fallback question. Even if we never wire fallback again, a bounded results/ is an unambiguous win.

Retention window = 24h

Judgment call, overridable per-run via RETENTION_HOURS=N:

  • <1h is too aggressive (kills same-day results)
  • 7d doesn't bound aggressively enough
  • 24h = "older than one full day is effectively dead mail"

Test plan

  • bash -n src/startup.sh clean
  • ast.parse clean on archive-stale-results.py
  • Dry run (DRY_RUN=1) against fresh clean results/ — "no stale files to archive"
  • Fabricated a 2-day-old results/task-test-old.txt, dry-run correctly flagged it, real run moved it to results/archive-2026-04-15/
  • results/calls/ subdirectory is skipped (f.is_file() check)

Not in this PR

  • C (event-driven fs.watch wiring) — stacks on this as follow-up
  • Any new DM-fallback delivery code — still paused per post-mortem

🤖 Generated with Claude Code

Fixes #362

First half of the D+C DM-fallback redesign owner approved after the
2026-04-15 DM-flood post-mortem. This PR lands D (separate retention
policy) as a standalone hygiene win. C (event-driven fs.watch wiring
for the DM fallback) will stack on top as a follow-up PR.

### What this does

- Adds `src/archive-stale-results.py` — a small one-shot script that
  walks `results/*.txt`, moves any file whose mtime is older than
  `$RETENTION_HOURS` (default 24) under `results/archive-YYYY-MM-DD/`.
  Files inside existing `archive-*` subdirectories are never touched.
  Honors `DRY_RUN=1` for safe preview.
- Invokes it from `src/startup.sh` right after `mkdir -p tasks results
  data` — i.e. before any long-running service starts iterating
  `results/`. Any accumulated backlog from a prior quiet day gets
  swept out of the service iteration paths at boot time.

### Why this first, before any DM fallback rewiring

Today's incident proved that `results/` accumulates long-dead files
indefinitely: task-*, question-*, briefing-*, insight-*, friction-*
from voice-originated work that was never spoken because voice wasn't
connected at the time. Any code that scans `results/` for delivery —
my reverted #352 task-bridge wiring, #349's discord-bridge polling
loop, or any future design — is going to re-deliver the entire
backlog on first tick after a cold boot unless the directory is
actually kept small.

D (retention) fixes the underlying hygiene concern independently of
the DM-fallback question. Even if we never wire fallback again, the
directory staying bounded is an unambiguous win. With D in place, C
(the event-driven follow-up) can use `fs.watch` without needing a
complex "pre-existing files at boot" escape hatch — D has already
moved them out of the way.

### Retention window rationale

Defaulted to **24 hours**. This is a judgment call:

- Too short (e.g. 1h) — risks archiving same-day results the user
  might still want the voice agent to speak when they reconnect.
- Too long (e.g. 7d) — bounds the accumulation less aggressively.
- 24h feels right: anything older than one full day is effectively
  dead mail. Overridable per-run via `RETENTION_HOURS=N`.

### Verification

- `bash -n src/startup.sh` clean
- `ast.parse` clean on `archive-stale-results.py`
- Dry run against fresh clean results/: "no stale files to archive"
- Fabricated a 2-day-old test file, DRY_RUN=1 correctly flagged it,
  then real run correctly moved it to `results/archive-2026-04-15/`.
- Does not affect `results/calls/` (subdirectory, skipped by
  `f.is_file()` check).
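The "fabricate a 2-day-old file" step above can be reproduced in a few lines (a sketch; the file name matches the PR text, everything else is illustrative):

```python
import os
import time
from pathlib import Path

# Create a results file and back-date its mtime (and atime) by two
# days, so a 24h retention cutoff will flag it as stale.
results = Path("results")
results.mkdir(exist_ok=True)
stale = results / "task-test-old.txt"
stale.write_text("fabricated stale result\n")

two_days_ago = time.time() - 2 * 86400
os.utime(stale, (two_days_ago, two_days_ago))
```

After this, a DRY_RUN=1 pass should report the file, and a real run should move it under results/archive-YYYY-MM-DD/ for the current date.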

### What's not in this PR

- **C (event-driven fs.watch wiring)** — stacks on this. Will replace
  the current `readdirSync(RESULT_DIR)` scan in task-bridge.ts with
  an fs.watch listener that only reacts to newly-created files.
  Separate PR, coming next.
- **Full architectural redo of DM fallback delivery** — out of scope;
  the pause-on-DM-fallback rule from the post-mortem still holds.
  This PR is pure hygiene.

### References

- Post-mortem: `notes/post-mortem-dm-flood-2026-04-15.md`
- Feedback memory: `feedback_results_dir_is_graveyard.md`
- Owner's D+C decision: Discord 23:20 local

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@sonichi (Owner, Author) left a comment


MacBook review: LGTM. Clean retention sweep — archives stale results/*.txt on startup before services can iterate them. Configurable via RETENTION_HOURS env var, DRY_RUN support, date-stamped archive dirs. Good defensive fix for the DM flood class of bug.

liususan091219 added a commit that referenced this pull request Apr 16, 2026
…#354

Each script reproduces the bug (before the fix) and verifies it's resolved
(after the fix). All POCs pass on current main.

- poc-pr353-open-file.sh (11/11) — 18s polling timeout in open_file
- poc-pr355-subtitled-pending.sh (9/9) — false positive subtitled_pending
- poc-pr332-team-tier-revert.sh (9/9) — team-tier -C /tmp broke codex
- poc-pr325-bodhi-dep.sh (7/7) — bodhi dep pointed at deleted repo
- poc-pr354-retention-sweep.sh — retention sweep for stale results

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@josuachavesolmos left a comment


Reviewed by Sutando proactive loop.

APPROVE — core logic is sound. All downstream consumers verified safe (discord-bridge, telegram-bridge, daily-insight, agent-api — none recurse into subdirectories). Placement in startup.sh is correct (after mkdir, before services).

Two optional hardening suggestions:

  • [MEDIUM] Guard against RETENTION_HOURS=0 (archives all in-flight results). Clamp with max(1, RETENTION_HOURS).
  • [MEDIUM] Wrap int() parse in try/except so invalid values fall back to 24h instead of silently skipping via || true.

Neither is blocking — safe to merge as-is.
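Both suggestions above could be folded into one defensive parser (a sketch; parse_retention_hours is a hypothetical name, not code from this PR):

```python
def parse_retention_hours(raw, default=24):
    """Parse a RETENTION_HOURS value defensively: fall back to the
    default on missing or non-integer input, and clamp to at least
    1 hour so RETENTION_HOURS=0 cannot archive in-flight results."""
    try:
        hours = int(raw)
    except (TypeError, ValueError):
        return default
    return max(1, hours)
```

With this, a bad value degrades to the 24h default inside Python rather than being swallowed by a shell-level || true.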

@liususan091219 (Collaborator)

POC: bash scripts/poc-pr354-retention-sweep.sh. Script in PR #358. Issue: #362

liususan091219 and others added 2 commits April 15, 2026 21:14
Reproduces the results/ unbounded growth bug and verifies the archival fix.
Run: bash scripts/test-pr354-retention-sweep.sh

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Development

Successfully merging this pull request may close these issues.

results/ directory grows unbounded — causes DM flood on scan
