fix: cap adversarial review prompt size to stay under Codex API 1MB input limit by cardene777 · Pull Request #313 · openai/codex-plugin-cc

cardene777 · 2026-05-11T12:33:21Z

Summary

/codex:adversarial-review --wait (and therefore Skill('parallel-review', ...) flows that wrap it) fails on heavy PRs with:

Input exceeds the maximum length of 1048576 characters.

even after PR #179 capped the embedded diff at 256KB. PR #179 fixed the Node spawnSync ENOBUFS path, but the Codex API thread input is hard-limited to 1,048,576 characters, and template + 256KB diff + collection guidance + focus text can still cross that line in practice — especially when the diff contains multi-byte UTF-8.

This PR adds a small second-tier budget inside buildAdversarialReviewPrompt so the prompt as a whole always fits the API window.

Repro

Reported downstream as cardene777/claude-config#1467 and triggered on AlphaTrader heavy-tier PR +540/-31. Observed exit:

[codex] Starting Codex task thread.
[codex] Thread ready (...).
Input exceeds the maximum length of 1048576 characters.

focus_text short-circuit (200 → 100 chars) did not change the outcome, which is what pointed at the assembled prompt rather than the focus string.

Change

Inside plugins/codex/scripts/codex-companion.mjs:

Add two budgets:
- MAX_ADVERSARIAL_PROMPT_BYTES = 850 * 1024
- MAX_ADVERSARIAL_PROMPT_CHARS = 900 * 1024
  Both checked together so emoji-heavy diffs are clamped by chars and long ASCII diffs are clamped by bytes. Both stay safely under the 1,048,576-char API cap so the system-prompt / thread metadata that the API stitches on every turn has headroom.
Refactor buildAdversarialReviewPrompt into a three-step fallback chain:
1. render the full prompt, return it if it already fits;
2. otherwise binary-search the largest prefix of REVIEW_INPUT that fits, append a [truncated: REVIEW_INPUT was trimmed by N bytes ...] notice, and re-render;
3. if no useful prefix survives, drop the inline diff entirely and switch REVIEW_COLLECTION_GUIDANCE to the existing lightweight self-collect language so the reviewer is told to use read-only git commands.
USER_FOCUS is never dropped — it is small and high-signal.
Export buildAdversarialReviewPrompt, MAX_ADVERSARIAL_PROMPT_BYTES, and MAX_ADVERSARIAL_PROMPT_CHARS so they are unit-testable.
Guard the bottom-of-file main() invocation behind isDirectInvocation() so importing the module from tests no longer runs the CLI entry point. This is the minimal side-effect cleanup that the export change made necessary.

Tests

New tests/adversarial-prompt-cap.test.mjs (7 cases, all green locally):

small input passthrough — full content retained
~900KB ASCII content — prompt truncated under both budgets
truncated prompt — contains truncation marker or self-collect guidance
5MB unfittable input — falls back to self-collect mode
multi-byte UTF-8 (250k poop emojis) — both byte and char budgets respected
truncation preserves USER_FOCUS
sanity: byte/char budgets stay under the 1,048,576-char API cap

Full npm test result on this branch: 89 passed / 4 failed. The 4 failures are pre-existing and reproduce on a clean main checkout (resolveStateDir uses a temp-backed per-workspace directory, result returns the stored output for the latest finished job by default, and two status runtime tests that depend on a live Codex job). They are unrelated to this change.

Compatibility

Default behavior unchanged for prompts that already fit (still returns the full interpolated template).
No new public CLI flags.
No change to the prompts/adversarial-review.md template wording. Only the chosen REVIEW_INPUT and REVIEW_COLLECTION_GUIDANCE values can change when a fallback fires.

Non-goals

Re-tuning DEFAULT_INLINE_DIFF_MAX_BYTES (still 256KB; this PR layers an upper-bound budget on top, the two are orthogonal).
Reworking the generator-verifier / codex-review paths — they were never affected.
Token-level budget accounting — bytes + chars is enough to keep us under the documented API limit and matches how the API error message itself is phrased.

Smoke (pending downstream)

End-to-end smoke on the original heavy-tier repro PR requires a Claude Code restart so the updated plugin gets loaded; that will be run in a follow-up session and reported on cardene777/claude-config#1467. The unit test coverage above exercises every fallback branch directly without needing a live API.

Refs: cardene777/claude-config#1467

…nput limit The `adversarial-review --wait` command fails on heavy PRs with `Input exceeds the maximum length of 1048576 characters.` even after PR openai#179 capped the embedded diff to 256KB. The Codex API thread input is hard-limited to 1,048,576 characters, and the template plus 256KB diff plus collection guidance plus focus text can still cross that line in practice, especially when the diff contains multi-byte UTF-8. Add `MAX_ADVERSARIAL_PROMPT_BYTES = 850 * 1024` and `MAX_ADVERSARIAL_PROMPT_CHARS = 900 * 1024` budgets and a small fallback chain inside `buildAdversarialReviewPrompt`: 1. render the full prompt and return it if it already fits both budgets; 2. otherwise binary-search the largest prefix of `REVIEW_INPUT` that fits, append a truncation notice, and re-render; 3. if no useful prefix survives, drop the inline diff entirely and switch the collection guidance to the existing lightweight self-collect path. Both budgets are checked together, so a prompt of mostly multi-byte content (e.g. emoji-heavy diffs) is constrained by chars while a long ASCII diff is constrained by bytes. `USER_FOCUS` is never dropped, since it is small and high-signal. The helper is now exported alongside the budget constants so it can be unit-tested directly. The bottom-of-file `main()` call is guarded behind `isDirectInvocation()` so importing the module from tests no longer runs the CLI entry point. New tests in `tests/adversarial-prompt-cap.test.mjs` cover: small input passthrough, ~900KB ASCII content truncation, truncation marker presence, unfittable (5MB) input falling back to self-collect mode, multi-byte content respecting both budgets, `USER_FOCUS` preservation, and a sanity check that the byte/char budgets stay under the 1,048,576-char API cap. Refs: cardene777/claude-config#1467

cardene777 requested a review from a team May 11, 2026 12:33

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: cap adversarial review prompt size to stay under Codex API 1MB input limit#313

fix: cap adversarial review prompt size to stay under Codex API 1MB input limit#313
cardene777 wants to merge 1 commit into
openai:mainfrom
cardene777:feature/buildAdversarialReviewPrompt-1mb-cap

cardene777 commented May 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

cardene777 commented May 11, 2026

Summary

Repro

Change

Tests

Compatibility

Non-goals

Smoke (pending downstream)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant