Windows: codex app-server leaks taskkill stdout into its own stdout (corrupts JSONL on non-English locales) #21957

@lask3802

Summary

codex (Rust binary) on Windows leaks the stdout of an internally-spawned taskkill /T /F /PID <pid> onto its own stdout. On non-English Windows locales (e.g. zh-TW / CP-950 / Big5), the leaked line is OS-localized text that breaks any consumer parsing the codex stdout as JSONL.

The downstream fallout is documented in openai/codex-plugin-cc #310: codex app-server's JSONL stream is corrupted, and the plugin's parser tears the connection down.

Reproduction

  • OS: Windows 11 Pro 26200, system locale Traditional Chinese / Taiwan (chcp = 950)
  • codex: codex-cli 0.130.0
  • Trigger: start codex app-server, then request a thread that pulls in any MCP server config that fails to start (handshake error / missing binary / dead transport)

The corrupted line on stdout is reliably:

hex   : a6 a8 a5 5c 3a 20 50 49 44 20 ac b0 20 31 32 33 34 ...
cp950 : 成功: PID 為 1234 (PID 為 5678 的子處理程序) 的處理程序已終止。
utf-8 : ���\: PID �� 1234 (PID �� 5678 ���l�B�z�{��) ...

That is exactly what taskkill /T /F /PID xxxx prints to its own stdout on a zh-TW Windows console — and it is appearing on the codex app-server JSONL stdout, not on stderr, not in a side-channel.
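The relationship between the hex row and the utf-8 row above can be checked with the standard library alone. The sketch below is illustrative only: the constant holds just the leading bytes quoted in the hex dump, and the names LEAKED_PREFIX and decode_as_utf8_lossy are made up for this example. It shows that a UTF-8-only consumer turns every Big5 lead or trail byte in the 0x80..0xBF range into U+FFFD, while the ASCII-range trail byte 0x5c survives as a backslash.

```rust
/// Leading bytes of the leaked line, copied from the hex dump in this report.
/// (Hypothetical constant name; only the quoted prefix is included.)
const LEAKED_PREFIX: [u8; 17] = [
    0xa6, 0xa8, 0xa5, 0x5c, 0x3a, 0x20, 0x50, 0x49, 0x44, 0x20, 0xac, 0xb0, 0x20, 0x31, 0x32,
    0x33, 0x34,
];

/// Decode bytes the way a UTF-8-only consumer would: every invalid
/// sequence is replaced with U+FFFD instead of being rejected.
fn decode_as_utf8_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}
```

A strict decoder (std::str::from_utf8) rejects the same bytes outright, which is closer to what a JSONL parser does when it reports an unexpected token.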

Evidence: timing

Captured by attaching a raw-byte logger to the codex stdout pipe inside the embedding process (timestamps abridged):

T+0.0  STDOUT  {"id":1,"result":{...}}                                    ← initialize ok
T+0.1  STDOUT  {"method":"remoteControl/status/changed",...}              ← ok
T+7.6  STDOUT  {"method":"mcpServer/startupStatus/updated",...starting} × N
T+7.8  STDERR  ��� ~ : ... 找不到 ...                                     ← Big5 "錯誤" path
T+7.8  STDOUT  {"method":"mcpServer/startupStatus/updated","...failed..."}  ← ok
...
T+9.6  STDERR  rmcp::transport::worker quit with fatal: Transport channel closed ...
T+9.7  STDOUT  {"method":"mcpServer/startupStatus/updated","name":"unityMCP","status":"failed","error":"... Server returned error: ..."}
...
T+9.9  STDOUT  ���\: PID �� 70260 (PID �� 1396 ���l�B�z�{��) ...           ← LEAK

The leak fires right after the codex CLI gives up on a failed MCP server and has to clean up its child process tree. So the path that calls taskkill to terminate a misbehaving / orphaned MCP child is also the path that leaks taskkill's stdout.
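For anyone who wants to reproduce the capture, a raw-byte logger along these lines should be enough. This is a minimal sketch, not the logger used for the trace above: hex_line and non_utf8_lines are hypothetical helper names, and the reader is assumed to wrap the child's stdout pipe. The key point is that lines are read as bytes, with no UTF-8 assumption, so the offending line can be dumped intact.

```rust
use std::io::{self, BufRead};

/// Format bytes as space-separated lowercase hex, e.g. "a6 a8 a5 5c".
fn hex_line(bytes: &[u8]) -> String {
    bytes
        .iter()
        .map(|b| format!("{b:02x}"))
        .collect::<Vec<_>>()
        .join(" ")
}

/// Read newline-delimited records as raw bytes and return the hex form
/// of every line that is not valid UTF-8.
fn non_utf8_lines<R: BufRead>(mut reader: R) -> io::Result<Vec<String>> {
    let mut suspicious = Vec::new();
    let mut buf: Vec<u8> = Vec::new();
    loop {
        buf.clear();
        // read_until returns 0 at end of stream.
        if reader.read_until(b'\n', &mut buf)? == 0 {
            break;
        }
        // Trim a trailing "\n" or "\r\n" before validating.
        let mut line: &[u8] = &buf[..];
        if let Some(rest) = line.strip_suffix(b"\n") {
            line = rest;
        }
        if let Some(rest) = line.strip_suffix(b"\r") {
            line = rest;
        }
        if std::str::from_utf8(line).is_err() {
            suspicious.push(hex_line(line));
        }
    }
    Ok(suspicious)
}
```

Note that this only flags non-UTF-8 lines; on an en-US system the leaked taskkill text is ASCII and would need a separate "is this JSON" check.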

The leak is absent, or much harder to spot, on:

  • en-US Windows: taskkill still prints, but the bytes are plain ASCII (SUCCESS: The process with PID 1234 ...), so JSONL parsers still hit a parse error and the failure is much harder to attribute.
  • macOS / Linux: the cleanup path doesn't shell out to a PID killer.
  • runs where every MCP server starts cleanly: no cleanup, no leak.

Suggested fix

Audit the Windows-only process-cleanup path that invokes taskkill. Options:

  1. Silence stdout / stderr explicitly. Set Stdio::null() for stdout (and stderr) when spawning taskkill; the cleanup code only needs the exit status, not the success text.
  2. Pipe and discard. If output is needed for diagnostics, capture into a Vec<u8> and (a) decode as the active OEM codepage rather than UTF-8 before logging, (b) write to the codex log file rather than stdout.
  3. Use the Win32 API directly. OpenProcess + TerminateProcess for the target PID, and CreateToolhelp32Snapshot to walk children for the /T-equivalent. No subprocess, no stdio to manage.

(1) is the smallest patch and definitively fixes the symptom.
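A minimal sketch of option (1), assuming the cleanup path spawns taskkill through std::process::Command. I have not located the actual call site, so quiet_taskkill and kill_process_tree are hypothetical names and not codex's real functions. The only part that matters is that no standard stream is left on the default "inherit" setting.

```rust
use std::io;
use std::process::{Command, ExitStatus, Stdio};

/// Build `taskkill /T /F /PID <pid>` with all three standard streams
/// detached, so the child cannot write into the parent's stdout.
/// (Hypothetical helper; the real call site may look different.)
fn quiet_taskkill(pid: u32) -> Command {
    let mut cmd = Command::new("taskkill");
    cmd.args(["/T", "/F", "/PID"])
        .arg(pid.to_string())
        .stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::null());
    cmd
}

/// Kill a process tree and report only the exit status.
/// Meaningful on Windows only; elsewhere spawning fails with an io::Error.
fn kill_process_tree(pid: u32) -> io::Result<ExitStatus> {
    quiet_taskkill(pid).status()
}
```

Command::status() inherits the parent's stdio unless told otherwise, which would explain the leak if the current code uses it (or spawn()) without overriding stdout.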

Why this surfaces in plugins, too

The codex-plugin-cc embedding parses codex app-server stdout as JSONL. When this leak fires, the parser sees a non-JSON line and tears the connection down (Failed to parse codex app-server JSONL: Unexpected token '�', "���\: PID "...). I have a corresponding plugin-side guard incoming at openai/codex-plugin-cc — it drops lines whose first non-whitespace character is not { or [, which protects every embedder from this and from similar terminal-noise leaks (e.g. zsh bracketed-paste, also reported as openai/codex-plugin-cc#23). That guard is, however, defense-in-depth; the right place to fix the corruption is here.
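For embedders written in Rust, the same guard heuristic can be sketched as below. This is not the plugin's code (the plugin is a separate project); looks_like_json_record is a hypothetical name, and the function works on raw bytes so that it never has to decode the noise it is rejecting.

```rust
/// Return true when the first non-whitespace byte is `{` or `[`,
/// i.e. the line could plausibly be a JSON object or array.
/// Blank and whitespace-only lines are rejected.
fn looks_like_json_record(line: &[u8]) -> bool {
    line.iter()
        .find(|b| !b.is_ascii_whitespace())
        .map_or(false, |b| matches!(*b, b'{' | b'['))
}
```

This is only a pre-filter: a line that passes can still be invalid JSON, so the real parser's error handling has to stay in place.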

Environment

  • Windows 11 Pro 26200 (zh-TW system locale, CP-950 console)
  • codex 0.130.0 (npm @openai/codex)
  • Reproduces via codex-plugin-cc 1.0.4 and via a standalone cargo-built consumer that just spawns codex app-server and reads stdout.

Labels: bug, mcp, windows-os