Fix shell execution degradation and session lock corruption in long conversations #468

Open
ayourk wants to merge 1 commit into op7418:main from ayourk:fix/shell-exec-and-session-locks

ayourk commented Apr 11, 2026

Summary

  • Shell commands silently stop executing during long conversations because the context window overflows, causing the model to output tool calls as text instead of using the tool API. This adds token-aware context limiting, output streaming, and sanitization of fake tool call patterns that create feedback loops.
  • Session locks left behind by crashes or force-quits leave sessions permanently stuck in the "running" state. This adds automatic stale lock cleanup on startup and a force-release mechanism for the UI.

Steps to reproduce

Shell execution degradation

  1. Start a session and have a long conversation (40+ exchanges with tool use)
  2. As context grows, the model begins outputting (used Bash: {"command":"..."}) as literal text instead of making tool API calls
  3. These fake tool call patterns get saved to conversation history
  4. On subsequent turns the model imitates the pattern, creating a feedback loop where no commands execute
  5. The user sees what looks like tool activity but nothing actually runs — "0 actions" in the UI
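
The feedback loop in steps 2 to 4 depends on the literal `(used Bash: {...})` text surviving in history. A minimal sketch of how such text could be detected and stripped follows; the regular expression and the `stripFakeToolCalls` name are illustrative assumptions, not the code in message-builder.ts.

```typescript
// Hypothetical pattern for text such as: (used Bash: {"command":"ls"})
// Assumption: the JSON payload does not itself contain the sequence "})".
const FAKE_TOOL_CALL = /\(used [A-Za-z_][\w-]*: \{[\s\S]*?\}\)/g;

interface SanitizeResult {
  cleaned: string; // text with fake tool calls removed
  removed: number; // how many fake calls were stripped
}

function stripFakeToolCalls(text: string): SanitizeResult {
  let removed = 0;
  const cleaned = text
    .replace(FAKE_TOOL_CALL, () => {
      removed += 1;
      return "";
    })
    // Collapse the double spaces left behind by a removal.
    .replace(/[ \t]{2,}/g, " ")
    .trim();
  return { cleaned, removed };
}
```

A non-zero `removed` count could also serve as a signal that the model has started emitting tool calls as text.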

Session lock corruption

  1. Start a session and run a long-running shell command
  2. Force-quit the app (kill process, crash, or close during execution)
  3. Relaunch the app and open the same session
  4. The session is stuck in "running" state with no way to recover — the lock row in the database references a process that no longer exists
  5. The user must manually edit the SQLite database or create a new session

Changes

Shell execution reliability

  • bash.ts: Validate working directory exists before spawning, stream output to client via SSE as it arrives, clean up kill timers properly on process exit
  • agent-loop.ts: Add token-aware context limiting, which estimates the token count and trims message history when it exceeds 75% of the context window. Fix doom loop detection with proper counting instead of the broken heuristic
  • context-pruner.ts: Keep the 16 most recent turns (~8 full exchanges) instead of 6, and preserve the tool name + a 200-char excerpt in pruned results instead of a generic [truncated] marker
  • claude-client.ts: Explicit instruction in fallback context to never output [Tool call: ...] as text. TypeScript type fixes for tool result content array parsing
  • message-builder.ts: Detect and clean fake tool call syntax ((used Bash: {...})) from compacted conversation history. Merge consecutive assistant messages preserving all content parts instead of dropping the earlier message
  • message-normalizer.ts: Change tool_use summary format from (used Name: ...) to [Tool call: Name — ...] so the sanitizer can distinguish real summaries from model-hallucinated fake calls
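
As an illustration of the token-aware limiting described for agent-loop.ts, here is a minimal sketch. The 4-characters-per-token estimate, the `ChatMessage` shape, and the function names are assumptions made for the example; the PR's actual estimator and trimming strategy may differ.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Rough heuristic (assumption): about 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep the newest messages whose estimated total fits within
// budgetRatio (75% by default) of the context window.
function limitHistory(
  messages: ChatMessage[],
  contextWindow: number,
  budgetRatio = 0.75,
): ChatMessage[] {
  const budget = Math.floor(contextWindow * budgetRatio);
  const kept: ChatMessage[] = [];
  let used = 0;
  // Walk newest to oldest so the most recent turns survive.
  for (const msg of [...messages].reverse()) {
    const cost = estimateTokens(msg.content);
    // Always keep at least the newest message, even if it is oversized.
    if (used + cost > budget && kept.length > 0) break;
    kept.unshift(msg);
    used += cost;
  }
  return kept;
}
```

Dropping from the oldest end keeps the trimmed history contiguous, so the latest user request is never separated from the turns just before it.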

Session lock recovery

  • db.ts: cleanupStaleLocks() deletes expired locks and resets stuck sessions to idle on startup. forceReleaseSessionLock() lets the UI break a stuck session regardless of lock ownership
  • chat/route.ts: Call cleanupStaleLocks() on first API request. Validate working directory exists before starting a session — return a 400 with INVALID_CWD instead of crashing
  • sessions/[id]/route.ts: Accept force_unlock in PATCH body to trigger forceReleaseSessionLock()
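
The recovery logic can be sketched with in-memory maps standing in for the SQLite tables. Everything here (the `SessionLock` and `SessionRow` shapes, the `expiresAt` field, and the function names) is hypothetical and only mirrors the behaviour described for `cleanupStaleLocks()` and `forceReleaseSessionLock()`; it is not the code in db.ts.

```typescript
interface SessionLock {
  sessionId: string;
  ownerId: string;   // hypothetical: identifies the process holding the lock
  expiresAt: number; // hypothetical: epoch milliseconds
}

interface SessionRow {
  id: string;
  status: "idle" | "running";
}

// Delete expired locks and reset their sessions to idle.
// Returns the ids of the sessions that were released.
function releaseExpiredLocks(
  locks: Map<string, SessionLock>,
  sessions: Map<string, SessionRow>,
  now: number = Date.now(),
): string[] {
  const released: string[] = [];
  locks.forEach((lock, sessionId) => {
    if (lock.expiresAt > now) return; // still valid, leave it alone
    locks.delete(sessionId);
    const session = sessions.get(sessionId);
    if (session && session.status === "running") session.status = "idle";
    released.push(sessionId);
  });
  return released;
}

// Break a lock regardless of who owns it or when it expires.
// Returns true if a lock was actually removed.
function forceRelease(
  locks: Map<string, SessionLock>,
  sessions: Map<string, SessionRow>,
  sessionId: string,
): boolean {
  const hadLock = locks.delete(sessionId);
  const session = sessions.get(sessionId);
  if (session) session.status = "idle";
  return hadLock;
}
```

In the real database, removing the lock and resetting the session status would presumably run in one transaction so a crash cannot leave them out of sync.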

You can edit the patches as needed before the merge.

Long conversations cause context window overflow, making the model
output tool calls as text instead of using the tool API. This adds
token-aware context limiting, output streaming, fake tool call
sanitization, and improved context pruning to prevent the feedback
loop.

Session locks left behind by crashes leave sessions permanently
stuck in "running" state. This adds automatic stale lock cleanup
on startup and a force-release endpoint for the UI.

Shell execution:
- Validate working directory before spawning (bash.ts)
- Stream tool output via SSE as it arrives (bash.ts)
- Reduce message history when context exceeds 75% of window (agent-loop.ts)
- Fix doom loop detection with proper counting (agent-loop.ts)
- Keep 16 recent turns instead of 6, preserve tool excerpts (context-pruner.ts)
- Sanitize fake tool call patterns from compacted history (message-builder.ts)
- Merge consecutive assistant messages instead of dropping (message-builder.ts)
- Distinguish real summaries from hallucinated calls (message-normalizer.ts)
- Harden fallback context instruction (claude-client.ts)

Session lock recovery:
- Clean up expired locks and reset stuck sessions on startup (db.ts)
- Force-release lock endpoint for UI recovery (sessions/[id]/route.ts)
- Validate working directory before starting session (chat/route.ts)
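
For the "preserve tool excerpts" item above, a pruned tool result might be reduced to its tool name plus a short excerpt along these lines. The `ToolResult` shape and the `[pruned ... result]` output format are assumptions for illustration; only the 200-character limit comes from the description.

```typescript
interface ToolResult {
  toolName: string;
  output: string;
}

const EXCERPT_CHARS = 200; // excerpt length stated in the PR description

// Replace a full tool result with its tool name and a short excerpt.
function summarizePrunedResult(result: ToolResult): string {
  // Flatten whitespace so the excerpt stays on one line.
  const flat = result.output.replace(/\s+/g, " ").trim();
  const excerpt =
    flat.length > EXCERPT_CHARS ? flat.slice(0, EXCERPT_CHARS) + "..." : flat;
  return `[pruned ${result.toolName} result] ${excerpt}`;
}
```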

vercel bot commented Apr 11, 2026

@ayourk is attempting to deploy a commit to the op7418's projects Team on Vercel.

A member of the Team first needs to authorize it.
