
fix: don't drop Telegram messages when send_action times out before tmux injection#88

Merged
six-ddc merged 1 commit into
six-ddc:mainfrom
centaurverse:fix/message-drop-before-tmux-injection
Jun 3, 2026

@centaurverse

Symptom

On a flaky network (e.g. reaching api.telegram.org from mainland China), messages sent from Telegram intermittently never reach the tmux session. The user has to resend 2–3 times before one gets through, with no error shown.

Root cause

In text_handler (and photo_handler / voice_handler / forward_command_handler), a cosmetic outbound Telegram call, chat.send_action(ChatAction.TYPING) (the "typing…" indicator), runs before the message is injected into tmux via send_to_window, with no error isolation. The bot also registers no global error handler (the logs show "No error handlers are registered").

Sequence in text_handler (pre-fix):

| line | call | nature |
|------|------|--------|
| 950 | send_action(TYPING) | outbound Telegram network ← raises here |
| 951 | enqueue_status_update(...) | outbound Telegram network |
| 958 | capture_pane + interactive-UI precheck | local tmux (safe) |
| 970 | send_to_window(wid, text) | the actual injection |

When send_action raises telegram.error.TimedOut, the exception propagates out of the handler (no error handler), so execution never reaches line 970 — the message is never injected. Because the update offset has already advanced, Telegram does not redeliver, so the message is lost and the user must resend.

Evidence (from a ~7-day production log)

  • telegram.error.TimedOut raised at bot.py:950 in text_handler, aborting before injection: 129 occurrences.
  • Every one ends in telegram.error.TimedOut: Timed out inside send_chat_action.
  • For contrast, inbound get_updates timeouts (82) are harmless — they self-recover because the offset isn't advanced, so Telegram redelivers.
File ".../ccbot/bot.py", line 950, in text_handler
    await update.message.chat.send_action(ChatAction.TYPING)
File ".../telegram/_chat.py", line 1270, in send_chat_action
    ...
telegram.error.TimedOut: Timed out

Fix

Wrap every pre-injection outbound-Telegram call (send_action, enqueue_status_update) and the interactive-UI precheck in try/except so a transient network failure can never prevent the tmux injection. send_to_window is the one call that must always run. Applied to all four handlers that send TYPING before injecting. Ordering/semantics are unchanged (the UI precheck still runs first when it can); only the cosmetic calls' ability to abort the critical path is removed.
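
The guard described above can be sketched as follows. This is a minimal illustration, not the merged patch: `best_effort`, `FakeTimedOut`, `send_typing` and the simplified `text_handler` signature are hypothetical stand-ins so the example runs without python-telegram-bot or tmux.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


class FakeTimedOut(Exception):
    """Stand-in for telegram.error.TimedOut so the sketch needs no library."""


async def send_typing() -> None:
    # Stand-in for chat.send_action(ChatAction.TYPING) on a flaky link.
    raise FakeTimedOut("Timed out")


async def best_effort(coro, label: str) -> None:
    """Await a cosmetic call; log and swallow any failure (hypothetical helper)."""
    try:
        await coro
    except Exception as exc:  # broad on purpose: cosmetic calls must not abort the handler
        logger.warning("%s failed, continuing: %r", label, exc)


async def text_handler(text: str, injected: list[str]) -> None:
    # Cosmetic step: a timeout here is logged and ignored.
    await best_effort(send_typing(), "send_action(TYPING)")
    # Critical path: stands in for send_to_window(wid, text).
    injected.append(text)


injected: list[str] = []
asyncio.run(text_handler("hello", injected))
```

Even though `send_typing` raises on every call, the message still ends up in `injected`, which is the property the fix is after. Catching `Exception` rather than only the timeout class is a design choice made for this sketch; the actual patch may be narrower.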

A complementary hardening (not in this PR) would be registering a global application.add_error_handler(...) so any future handler exception is at least logged and optionally surfaced to the user.
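
A rough sketch of such a handler, under assumptions: python-telegram-bot error callbacks receive `(update, context)` with the exception available as `context.error`; the logger name and the function name `log_handler_error` are made up for illustration. Registration is left as a comment so the block runs without the library installed.

```python
import asyncio
import logging
from types import SimpleNamespace

logger = logging.getLogger("ccbot.errors")  # logger name is an assumption


async def log_handler_error(update: object, context) -> None:
    """Error callback: log whatever exception escaped a handler."""
    # python-telegram-bot exposes the raised exception as context.error.
    logger.error(
        "Unhandled exception while processing update %r",
        update,
        exc_info=context.error,
    )


# Registration in the real bot (needs python-telegram-bot, hence a comment):
# application.add_error_handler(log_handler_error)

# Dry run with a stand-in context object; no Telegram connection involved.
fake_context = SimpleNamespace(error=TimeoutError("Timed out"))
asyncio.run(log_handler_error({"update_id": 1}, fake_context))
```

This only makes failures visible; it does not by itself prevent a message from being dropped, which is why it complements the try/except guards rather than replacing them.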


Note: while diagnosing this, ccbot --version / --help were observed to boot a second daemon (no early-exit), which then fights the running instance for getUpdates and raises telegram.error.Conflict: terminated by other getUpdates request. Might be worth a real --version/--help that exits without starting the bot.
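
One possible shape for that early exit, assuming an argparse-based entry point (how ccbot actually parses its arguments is not shown in this thread). `VERSION`, `parse_args`, `main` and the `start_bot` callable are hypothetical names.

```python
import argparse

VERSION = "0.0.0"  # placeholder; the real version string is not given here


def parse_args(argv: list[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser(prog="ccbot")
    # action="version" prints the string and raises SystemExit(0) while parsing;
    # argparse's built-in --help behaves the same way.
    parser.add_argument("--version", action="version", version=f"%(prog)s {VERSION}")
    return parser.parse_args(argv)


def main(argv: list[str], start_bot) -> None:
    parse_args(argv)  # --help / --version exit here
    start_bot()       # reached only on a normal launch


started: list[bool] = []
try:
    main(["--version"], lambda: started.append(True))
except SystemExit as exc:
    exit_code = exc.code
```

Because parsing happens before `start_bot` is called, a `--version` or `--help` invocation never opens a second getUpdates poller.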

…mux injection

Wrap the cosmetic pre-injection Telegram calls (send_action/TYPING, enqueue_status_update) and the interactive-UI precheck in try/except across text_handler, photo_handler, voice_handler and forward_command_handler so a transient telegram.error.TimedOut can no longer abort the handler before send_to_window runs and silently drop the user's message.
@centaurverse

Repro environment & why send_action times out

This was reproduced on a real deployment behind a router-level transparent proxy (OpenClash / Clash), reaching api.telegram.org from mainland China. To rule out misconfiguration vs. a genuinely flaky upstream, I measured the path from the host:

  • DNS: api.telegram.org → 198.18.0.18 (Clash fake-ip), i.e. the proxy rule is matched and traffic is not leaking to a blocked direct route.
  • Connectivity (23 probes, curl https://api.telegram.org/):
    • 21/23 succeed, returning 302 with a stable TTFB of ~1.0–1.5 s.
    • 2/23 hang (curl rc=28, TCP connect never completes within 5–6 s).
    • ~9 % of connections stall completely at the proxy node; the rest are healthy.

So the timeouts aren't a config error — the proxy node intermittently drops ~1 in 11 connections. That ~9 % per-request failure is exactly what makes the unguarded send_action(TYPING) before injection drop messages:

  • P(message lost on 1st send) ≈ 9 %
  • P(still lost after 2 sends) ≈ 0.8 %, after 3 ≈ 0.07 %

which matches the observed user behavior precisely: "resend 2–3 times and it gets through." Over ~7 days this produced 129 text_handler aborts at send_action, all telegram.error.TimedOut.
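
The arithmetic behind those figures can be checked with a few lines. The sketch assumes each send fails independently with the measured stall rate of 2 out of 23 probes; `loss_probability` is a name introduced only for this illustration.

```python
def loss_probability(p_fail: float, sends: int) -> float:
    """Chance that every one of `sends` attempts hits a stalled connection."""
    return p_fail ** sends


p_fail = 2 / 23  # 2 of the 23 probes hung
for sends in (1, 2, 3):
    print(f"{sends} send(s): {loss_probability(p_fail, sends):.2%}")
```

Real failures on a proxy node are probably correlated in time, so the independence assumption gives an order-of-magnitude check rather than an exact prediction.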

The proxy flakiness itself is the user's to fix (e.g. a url-test/fallback proxy group so a stalled node auto-switches). But any non-trivial fraction of users on unstable networks will hit transient TimedOut on outbound Telegram calls — this PR makes that cosmetic failure stop eating their messages.

@six-ddc six-ddc merged commit 7c2e15c into six-ddc:main Jun 3, 2026
1 check failed
HParis added a commit to HParis/ccbot that referenced this pull request Jun 9, 2026
Cherry-picks the still-relevant upstream/main fixes, adapted to the iTerm2
backend and covered by tests:

- six-ddc#88 (7c2e15c): wrap the cosmetic pre-injection Telegram calls — send_action,
  enqueue_status_update, and the interactive-UI precheck — in try/except across
  text/photo/voice/forward handlers, so a transient TimedOut on a flaky link can
  no longer abort the handler before send_to_window and silently drop the user's
  message (the update offset has already advanced, so Telegram won't redeliver).
- six-ddc#67 (865ab89): on a hard interactive-UI edit failure, keep the old message and
  delete it only AFTER the replacement send succeeds, so a failed replacement
  never strands the user without controls. ("Message is not modified" was already
  handled on this branch.)
- Write line count (f5ddd7f): compute from the tool_use input content, not the
  result confirmation string, so it stops always showing "Wrote 1 lines".

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
00al added a commit to 00al/ccbot that referenced this pull request Jun 18, 2026
…ges on send_action timeout

Brings 7c2e15c: wrap the cosmetic pre-injection Telegram calls (send_action/TYPING,
enqueue_status_update) and the interactive-UI precheck in try/except across
text/photo/voice/forward_command handlers, so a transient telegram.error.TimedOut
can't abort the handler before send_to_window and silently drop the user's message.
Our fork had the exact unguarded pattern. Clean merge over our 13 local patches.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>