Prompt text can be misinterpreted as a mode switch #3387

@Hmbown

Problem

Sometimes when a user submits an ordinary prompt, CodeWhale appears to take part of the prompt text and interpret it as a mode-switch instruction. The result is that the session enters another mode even though the user did not explicitly request a mode change.

This should not happen for natural-language task text. Mode changes should only come from explicit mode-changing inputs such as the Tab cycle, /mode ..., a command-palette action, startup/config defaults, or another deliberately typed command surface.

Why this matters

  • The user prompt may be partially consumed, truncated, or reinterpreted instead of being sent as the task.
  • The agent can end up operating under the wrong behavioral posture.
  • If the mistaken switch affects Plan/Agent/YOLO, it can change approval, sandbox, shell, or trust behavior.
  • The bug is intermittent and easy to miss because it can look like the model decided to change mode, when the harness should be enforcing the boundary.

Observed behavior

Reported on 2026-06-22: when given a normal prompt, CodeWhale sometimes treats a fragment of that prompt as a mode switch and enters a mode the user did not request.

We do not yet have a minimal repro string, but likely trigger candidates include prompt text containing words or phrases such as mode, plan, agent, yolo, enter ... mode, or numbered aliases that overlap with /mode parsing.

Expected behavior

Natural-language prompt content must be treated as prompt content unless it is submitted through an explicit command/mode-switch path.

In particular:

  • plan, agent, yolo, and numeric aliases should only switch mode when parsed as explicit /mode arguments or through the official mode UI path.
  • Free-form prompt text should never be scanned broadly for mode names and converted into a mode transition.
  • If a prompt begins with text that resembles a command but is not an exact command invocation, it should remain user text.
  • The full submitted prompt should reach the model unchanged except for existing intentional composer normalization.
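
One way to read the rules above is as a strict parser that recognizes a mode switch only when the whole submission is an exact /mode invocation. The following is a minimal Python sketch of that idea, not CodeWhale's actual parser: `parse_mode_command`, `MODE_ALIASES`, and the numeric alias mapping (1, 2, 3) are hypothetical names and values chosen for illustration, and `AppMode` is borrowed from the issue text only as a label.

```python
import re
from enum import Enum
from typing import Optional


class AppMode(Enum):
    PLAN = "plan"
    AGENT = "agent"
    YOLO = "yolo"


# Hypothetical alias table; the real numeric aliases may differ.
MODE_ALIASES = {
    "plan": AppMode.PLAN,
    "agent": AppMode.AGENT,
    "yolo": AppMode.YOLO,
    "1": AppMode.PLAN,
    "2": AppMode.AGENT,
    "3": AppMode.YOLO,
}

# Exactly "/mode <arg>": slash in the first column, one argument,
# and nothing but optional trailing spaces or tabs after it.
MODE_COMMAND = re.compile(r"/mode[ \t]+(\S+)[ \t]*")


def parse_mode_command(submitted: str) -> Optional[AppMode]:
    """Return the requested mode only for an exact /mode invocation.

    Anything else, including text that merely mentions a mode name or
    starts with something command-like, yields None and stays user text.
    """
    match = MODE_COMMAND.fullmatch(submitted)
    if match is None:
        return None
    # Unknown arguments also yield None in this sketch.
    return MODE_ALIASES.get(match.group(1))
```

Because the pattern is applied with `fullmatch`, a submission such as "/mode plan and then refactor" is not treated as a command, and free-form text is never scanned for mode names. Whether leading whitespace should be trimmed before this check depends on the existing composer normalization, which this sketch does not model.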

Areas to inspect

  • TUI composer submit path and slash-command detection.
  • /mode command parsing and aliases.
  • Any command-palette, shortcut, or natural-language intent handling that can call set_mode.
  • Turn setup and mode-policy resolution around AppMode, especially alongside #3386 ("v0.8.65: Untangle Plan/Agent/YOLO mode cycling from permission policy").
  • Tests that simulate prompt submission text which contains mode-like words but is not an explicit command.

Acceptance criteria

  • Add regression coverage for prompt strings containing mode-like words that must not switch modes.
  • Add coverage for exact command strings that should still switch modes, e.g. /mode plan, /mode agent, /mode yolo, and supported numeric aliases.
  • Confirm that prompt submission and command dispatch have a clear boundary: only explicit command syntax reaches mode-command parsing.
  • Log or trace the source of mode transitions in a way that distinguishes user UI action, slash command, config/default startup, and runtime/internal requests.
  • Manual repro check: submit several natural-language prompts containing mode, plan, agent, yolo, and enter mode; CodeWhale should remain in the current mode unless the prompt is an explicit mode command.
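
The regression and tracing criteria above could be exercised with a small harness along these lines. This is a sketch under stated assumptions rather than the project's test suite: `Session`, `TransitionSource`, `MUST_NOT_SWITCH`, and `run_regression` are hypothetical names, the starting mode of "agent" is assumed, and the prompt strings are illustrative candidates, not confirmed repro cases.

```python
import re
from dataclasses import dataclass, field
from enum import Enum


class TransitionSource(Enum):
    UI_ACTION = "ui_action"
    SLASH_COMMAND = "slash_command"
    STARTUP_CONFIG = "startup_config"
    INTERNAL = "internal"


MODES = {"plan", "agent", "yolo"}
MODE_COMMAND = re.compile(r"/mode[ \t]+(\S+)[ \t]*")


@dataclass
class Session:
    mode: str = "agent"  # assumed starting mode
    transitions: list = field(default_factory=list)   # (source, old, new)
    sent_prompts: list = field(default_factory=list)  # text sent as the task

    def set_mode(self, new_mode: str, source: TransitionSource) -> None:
        # Every transition records where it came from, so a stray
        # switch can be traced back to its source.
        self.transitions.append((source, self.mode, new_mode))
        self.mode = new_mode

    def submit(self, text: str) -> None:
        match = MODE_COMMAND.fullmatch(text)
        if match and match.group(1) in MODES:
            self.set_mode(match.group(1), TransitionSource.SLASH_COMMAND)
        else:
            # Handling of invalid /mode arguments is out of scope here.
            self.sent_prompts.append(text)  # forwarded unchanged


# Illustrative prompts containing mode-like words.
MUST_NOT_SWITCH = [
    "Draft a plan for the migration",
    "Explain how agent mode differs from yolo mode",
    "Please enter plan mode details into the README",
    "1. rename the module 2. update imports",
    "mode",
    "/mode plan, then list the failing tests",
]


def run_regression() -> Session:
    session = Session()
    for prompt in MUST_NOT_SWITCH:
        session.submit(prompt)
        assert session.mode == "agent", prompt
    assert session.transitions == []
    assert session.sent_prompts == MUST_NOT_SWITCH
    session.submit("/mode plan")
    assert session.transitions == [
        (TransitionSource.SLASH_COMMAND, "agent", "plan")
    ]
    return session
```

Keeping the source on every recorded transition means a failing run shows not only that the mode changed but which path requested the change.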

Labels: bug, enhancement, reliability, security, tui, v0.8.65
