discord-bridge: wire dm-result.py into result-poll flow as DM fallback by sonichi · Pull Request #349 · sonichi/sutando

sonichi · 2026-04-15T21:04:05Z

Summary

Follow-up to #347. The src/dm-result.py CLI shipped as a standalone tool with no caller. This PR wires it into the Discord bridge's polling infrastructure so voice-originated and cron-originated results are actually delivered as a DM fallback.

Adds poll_dm_fallback() — a 4th asyncio task in on_ready alongside poll_results / poll_approved / poll_proactive. Every 30s, scans results/ for task/question/briefing/insight/friction files that are:

Not in pending_replies (Discord-originated, already handled by poll_results)
Older than 90 seconds (grace period so voice-agent and telegram-bridge get first dibs)

and subprocess-calls python3 src/dm-result.py --file <f>. Delegating to the CLI keeps the voiceConnected check + Discord-DM logic in one place (the script from #347).

If dm-result.py prints "voice client connected, skipping DM", the file is left on disk for the voice agent to pick up. On any other non-zero exit, the error is logged and the file also stays for the next retry cycle.

Also drops an unused import subprocess from src/dm-result.py — spotted in my review comment on #347.

Test plan

ast.parse clean on both files
Subprocess smoke test: wrote a fake stale results/task-0000000000000.txt, ran the same subprocess call the loop makes. dm-result.py correctly detects voice disconnected and attempts delivery. Return code + stdout handled correctly.
No changes to poll_results / poll_proactive behavior. Race-free because poll_dm_fallback excludes pending_replies task IDs AND only touches files ≥90s old.

Known limitation (pre-existing, not blocking this PR)

Tested on Mac Mini end-to-end: subprocess call reaches Discord but returns HTTP 403. The hardcoded DM_CHANNEL = "1485370959870431433" in dm-result.py corresponds to a channel the Mac Mini bot doesn't have permission to POST to — it's the DM channel the MacBook bot opened with the owner. Flagged in my original #347 cross-review (#347 (comment)):

Hardcoded DM channel ID … fine for single-owner use but doesn't survive if the owner changes Discord accounts or if we ever want multi-owner setups.

This PR is correct independent of that — it wires the caller. The 403 is fixable separately by either per-node config or having each bot open its own DM channel on demand. The wiring loop already handles the error gracefully (logs it, leaves the file on disk for retry).

File changes

src/discord-bridge.py — add poll_dm_fallback() + register in on_ready + add import subprocess (+63 lines)
src/dm-result.py — drop unused import subprocess (−1 line)

Total: +64/−1 across 2 files.

🤖 Generated with Claude Code

Follow-up to PR #347. PR #347 shipped `src/dm-result.py` as a standalone CLI but didn't wire it into any polling loop, so voice-originated and cron-originated results that no one handles would still be silently dropped when the voice client is offline. Adds `poll_dm_fallback()` — a 4th asyncio task registered in `on_ready` alongside `poll_results`, `poll_approved`, and `poll_proactive`. Scans `results/` every 30s for task/question/briefing/insight/friction files that are: - not in `pending_replies` (Discord-originated, already handled) - older than 90s (grace period so voice-agent and telegram-bridge get first dibs) and subprocess-calls `python3 src/dm-result.py --file <f>`. Delegating to the CLI keeps the voiceConnected check + Discord-DM-send logic in one place (the script MacBook shipped in #347). If dm-result.py prints "voice client connected, skipping DM" the file is left on disk for the voice agent to speak when a client reconnects. On any other non-zero exit, the error is logged and the file is also left on disk for the next retry cycle. Also: drop unused `import subprocess` from `src/dm-result.py` — flagged in my cross-review on #347 (comment 4255378452). Harmless but noted. ### Known limitation (not introduced by this PR) Tested end-to-end on Mac Mini: the subprocess call reaches Discord but returns HTTP 403. dm-result.py hardcodes `DM_CHANNEL = "1485370959870431433"` which corresponds to MacBook's bot's DM channel with the owner; the Mac Mini bot doesn't have permission to POST to that channel. Flagged in my original #347 review — the fix is either a per-node config file or having the bot open its own DM channel on demand. Out of scope for this PR; the wiring is correct and will work once the DM channel is per-node. ### Verification - `ast.parse` clean on both files - Subprocess smoke test: loop correctly subprocesses out, reads rc, logs stderr when dm-result.py fails. Fake stale file cleaned up after. - No changes to existing `poll_results` / `poll_proactive` behavior — race-free because `poll_dm_fallback` excludes `pending_replies` task IDs and only touches files ≥90s old. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@me

Addresses the hardcode owner flagged after PR #349 opened. Previously `DM_CHANNEL = "1485370959870431433"` was baked into dm-result.py. That channel belonged to a specific bot's DM with the owner, so subprocess calls from a different node hit HTTP 403. Confirmed end-to-end on Mac Mini during smoke-testing of PR #349. Now the DM channel is resolved on demand: 1. `_resolve_owner_id()` picks the human owner by: a. `$SUTANDO_DM_OWNER_ID` env var (explicit override, skips API calls) b. First non-bot entry in `~/.claude/channels/discord/access.json` allowFrom, decided by a GET /users/{id} `bot` flag. allowFrom typically contains multiple bot accounts (MacBook bot, Mac Mini bot) plus the human owner — without the is-bot check we'd DM the wrong bot. 2. `_open_dm_channel()` calls `POST /users/@me/channels` with the resolved owner's user ID. Discord docs are explicit that this endpoint is idempotent — returns the existing DM channel if one exists, otherwise creates one. Cheap to call per invocation. 3. `_discord_api()` extracted as a small wrapper so the GET /users/@me, POST /users/@me/channels, POST /channels/{id}/messages paths all share a single urllib+auth call site. Side effect: send_dm() is ~40% shorter and easier to follow. ### Smoke test results Against live Discord API on Mac Mini: - `_resolve_owner_id` correctly returns `1022910063620390932` (sonichi) by skipping `1485364006297534584` (MacBook bot) via is-bot lookup. - `_open_dm_channel` returns `1490906927675474030` — a live DM channel, completely different from the old hardcoded value. - Actual send deferred to avoid spamming owner during testing. The POST path is standard Discord API and is the same byte sequence as before the refactor; only the channel ID input changed. ### Python 3.9 note The `dict | None = None` union-syntax type hint crashed at module import on the system Python 3.9 (`TypeError: unsupported operand | for type`). Switched to untyped function signatures for the helpers — docstrings carry the same information. Caught in the first smoke-test run. ### Bot ID also no longer needs to be known The previous draft of this change used GET /users/@me to discover the current bot's user ID and skip it in allowFrom. Replaced with per-user is-bot lookup, which correctly skips all bots (not just the current one) in multi-bot allowFrom lists. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

sonichi · 2026-04-15T21:13:55Z

Superseded by #350 (MacBook's parallel wiring).

MacBook shipped #350 wiring the same feature into src/task-bridge.ts startResultWatcher() instead of discord-bridge.py. That's the better layer:

task-bridge already watches results/ and already has isClientConnected(), so there's no polling loop or 90s grace window. Responds immediately to state changes.
Less surface to review: +47/-7 vs my +178.

Per feedback_two_bot_collaboration.md ("take the better version and close the other"), recommending #349 close in favor of #350. Posted the same recommendation on #350 itself with honest comparison.

One piece worth preserving from this PR

My dm-result.py refactor resolves the owner user ID and DM channel via the Discord API (walks access.json allowFrom, queries /users/{id} for bot flag, opens DM channel via /users/@me/channels). That's zero-config — no per-node .env edits required.

MacBook's #350 instead reads DISCORD_DM_CHANNEL from .env. Simpler but requires per-node config.

Follow-up plan after #350 merges: open a small PR that adds the API-discovery path as a fallback when DISCORD_DM_CHANNEL is unset. Best of both.

Owner: say the word and I'll close this PR.

…sk-bridge Follow-up to #349. Owner asked for the cleaner wiring architecture after #349 merged, because the polling-loop approach has an inherent 30s–120s latency between result-write and DM delivery (30s poll interval + 90s grace window). Net effect: −33 lines. One wiring point instead of two. ### What moves - **Removed** from `src/discord-bridge.py`: - `async def poll_dm_fallback()` (59 lines) — the polling loop that scanned `results/` every 30s and subprocess-called `dm-result.py` for stale files not in `pending_replies`. - `client.loop.create_task(poll_dm_fallback())` registration in `on_ready`. - `import subprocess` (unused after the loop is gone). - **Added** to `src/task-bridge.ts`: - `import { execSync } from 'node:child_process'`. - Inside `startResultWatcher()`'s main tick, when `!isClientConnected()` — iterate the same `files` the connected path iterates, `execSync` `python3 src/dm-result.py --file <f>` per unprocessed file, then run the same bookkeeping the connected path runs (`_sendTaskStatus`, `_deliveredResults.add`, `_pendingTasks.delete`, `/task-done` POST, 10s unlink). ### Why this is better 1. **Latency.** The poll loop had a cumulative 30s+90s worst case. task-bridge.ts runs on the same interval as the result watcher (2s tick per the file-loop's natural cadence), so the DM arrives within a single tick of the result being written. No grace period needed — task-bridge already tracks `_deliveredResults` so it can't double-send. 2. **Right layer.** task-bridge.ts is the single owner of the "voice-agent delivery" contract. Fallback is a property of that delivery contract, so it belongs next to the normal delivery path — not in a sibling process (discord-bridge.py) polling the same directory. 3. **No redundant state.** The polling loop needed `pending_replies` awareness, a file-age check, and a grace window. task-bridge.ts already has `_deliveredResults` tracking and `isClientConnected()` state — nothing to reinvent. 4. **Subprocess parity preserved.** Both the old polling loop and this path shell out to `python3 src/dm-result.py --file <f>`. `dm-result.py`'s API-based owner + DM-channel resolution from #349 is unchanged and carries over directly. ### What this PR does NOT change - `src/dm-result.py` — untouched. All the `_resolve_owner_id`, `_open_dm_channel`, and `_discord_api` helpers from #349 stay as they are. The wiring PR just calls it differently. - `src/discord-bridge.py` `poll_proactive()` — still handles `results/proactive-*.txt` with its own direct-Discord-client DM path. That's a different contract (explicitly write-proactive-to-DM) and doesn't need the voiceConnected check. ### Verification - `python3 -c "import ast; ast.parse(...)"` clean on discord-bridge.py - `npx tsc --noEmit --skipLibCheck` clean on task-bridge.ts - Net diff: −33 lines across 2 files - dm-result.py smoke test from #349 still valid — this PR doesn't touch the script Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…sk-bridge (#352) Follow-up to #349. Owner asked for the cleaner wiring architecture after #349 merged, because the polling-loop approach has an inherent 30s–120s latency between result-write and DM delivery (30s poll interval + 90s grace window). Net effect: −33 lines. One wiring point instead of two. ### What moves - **Removed** from `src/discord-bridge.py`: - `async def poll_dm_fallback()` (59 lines) — the polling loop that scanned `results/` every 30s and subprocess-called `dm-result.py` for stale files not in `pending_replies`. - `client.loop.create_task(poll_dm_fallback())` registration in `on_ready`. - `import subprocess` (unused after the loop is gone). - **Added** to `src/task-bridge.ts`: - `import { execSync } from 'node:child_process'`. - Inside `startResultWatcher()`'s main tick, when `!isClientConnected()` — iterate the same `files` the connected path iterates, `execSync` `python3 src/dm-result.py --file <f>` per unprocessed file, then run the same bookkeeping the connected path runs (`_sendTaskStatus`, `_deliveredResults.add`, `_pendingTasks.delete`, `/task-done` POST, 10s unlink). ### Why this is better 1. **Latency.** The poll loop had a cumulative 30s+90s worst case. task-bridge.ts runs on the same interval as the result watcher (2s tick per the file-loop's natural cadence), so the DM arrives within a single tick of the result being written. No grace period needed — task-bridge already tracks `_deliveredResults` so it can't double-send. 2. **Right layer.** task-bridge.ts is the single owner of the "voice-agent delivery" contract. Fallback is a property of that delivery contract, so it belongs next to the normal delivery path — not in a sibling process (discord-bridge.py) polling the same directory. 3. **No redundant state.** The polling loop needed `pending_replies` awareness, a file-age check, and a grace window. task-bridge.ts already has `_deliveredResults` tracking and `isClientConnected()` state — nothing to reinvent. 4. **Subprocess parity preserved.** Both the old polling loop and this path shell out to `python3 src/dm-result.py --file <f>`. `dm-result.py`'s API-based owner + DM-channel resolution from #349 is unchanged and carries over directly. ### What this PR does NOT change - `src/dm-result.py` — untouched. All the `_resolve_owner_id`, `_open_dm_channel`, and `_discord_api` helpers from #349 stay as they are. The wiring PR just calls it differently. - `src/discord-bridge.py` `poll_proactive()` — still handles `results/proactive-*.txt` with its own direct-Discord-client DM path. That's a different contract (explicitly write-proactive-to-DM) and doesn't need the voiceConnected check. ### Verification - `python3 -c "import ast; ast.parse(...)"` clean on discord-bridge.py - `npx tsc --noEmit --skipLibCheck` clean on task-bridge.ts - Net diff: −33 lines across 2 files - dm-result.py smoke test from #349 still valid — this PR doesn't touch the script Co-authored-by: Chi <wangchi@Chis-Mac-mini.local> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

First half of the D+C DM-fallback redesign owner approved after the 2026-04-15 DM-flood post-mortem. This PR lands D (separate retention policy) as a standalone hygiene win. C (event-driven fs.watch wiring for the DM fallback) will stack on top as a follow-up PR. ### What this does - Adds `src/archive-stale-results.py` — a small one-shot script that walks `results/*.txt`, moves any file whose mtime is older than `$RETENTION_HOURS` (default 24) under `results/archive-YYYY-MM-DD/`. Files inside existing `archive-*` subdirectories are never touched. Honors `DRY_RUN=1` for safe preview. - Invokes it from `src/startup.sh` right after `mkdir -p tasks results data` — i.e. before any long-running service starts iterating `results/`. Any accumulated backlog from a prior quiet day gets swept out of the service iteration paths at boot time. ### Why this first, before any DM fallback rewiring Today's incident proved that `results/` accumulates long-dead files indefinitely: task-*, question-*, briefing-*, insight-*, friction-* from voice-originated work that was never spoken because voice wasn't connected at the time. Any code that scans `results/` for delivery — my reverted #352 task-bridge wiring, #349's discord-bridge polling loop, or any future design — is going to re-deliver the entire backlog on first tick after a cold boot unless the directory is actually kept small. D (retention) fixes the underlying hygiene concern independently of the DM-fallback question. Even if we never wire fallback again, the directory staying bounded is an unambiguous win. With D in place, C (the event-driven follow-up) can use `fs.watch` without needing a complex "pre-existing files at boot" escape hatch — D has already moved them out of the way. ### Retention window rationale Defaulted to **24 hours**. This is a judgment call: - Too short (e.g. 1h) — risks archiving same-day results the user might still want the voice agent to speak when they reconnect. - Too long (e.g. 7d) — bounds the accumulation less aggressively. - 24h feels right: anything older than one full day is effectively dead mail. Overridable per-run via `RETENTION_HOURS=N`. ### Verification - `bash -n src/startup.sh` clean - `ast.parse` clean on `archive-stale-results.py` - Dry run against fresh clean results/: "no stale files to archive" - Fabricated a 2-day-old test file, DRY_RUN=1 correctly flagged it, then real run correctly moved it to `results/archive-2026-04-15/`. - Does not affect `results/calls/` (subdirectory, skipped by `f.is_file()` check). ### What's not in this PR - **C (event-driven fs.watch wiring)** — stacks on this. Will replace the current `readdirSync(RESULT_DIR)` scan in task-bridge.ts with an fs.watch listener that only reacts to newly-created files. Separate PR, coming next. - **Full architectural redo of DM fallback delivery** — out of scope; the pause-on-DM-fallback rule from the post-mortem still holds. This PR is pure hygiene. ### References - Post-mortem: `notes/post-mortem-dm-flood-2026-04-15.md` - Feedback memory: `feedback_results_dir_is_graveyard.md` - Owner's D+C decision: Discord 23:20 local Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Chi and others added 2 commits April 15, 2026 14:03

sonichi mentioned this pull request Apr 15, 2026

feat: wire Discord DM fallback into task bridge result watcher #350

Closed

2 tasks

sonichi merged commit c82d864 into main Apr 15, 2026
1 check passed

sonichi deleted the feat/wire-dm-fallback branch April 15, 2026 21:14

sonichi mentioned this pull request Apr 15, 2026

refactor: move DM fallback wiring from discord-bridge poll loop to task-bridge #352

Merged

4 tasks

sonichi mentioned this pull request Apr 15, 2026

retention: archive stale results/*.txt on startup (D half of flood fix) #354

Open

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

discord-bridge: wire dm-result.py into result-poll flow as DM fallback#349

discord-bridge: wire dm-result.py into result-poll flow as DM fallback#349
sonichi merged 2 commits intomainfrom
feat/wire-dm-fallback

sonichi commented Apr 15, 2026

Uh oh!

sonichi commented Apr 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

sonichi commented Apr 15, 2026

Summary

Test plan

Known limitation (pre-existing, not blocking this PR)

File changes

Uh oh!

sonichi commented Apr 15, 2026

One piece worth preserving from this PR

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant