Add ralph-github: plan execution with GitHub PR pipeline #1

Open
DanielGGordon wants to merge 12 commits into master from ralph-github-pipeline

DanielGGordon commented Mar 20, 2026

Summary

  • New /ralph-github skill and ralph-github.sh bash script that extends Ralph with full GitHub integration
  • Per-task pipeline: branch → execute → codex review → PR → bugbot check → merge
  • Each task branches off the previous task's branch, so code flows forward through the plan
  • Codex review after each task (falls back to Claude Opus 4.6 if codex unavailable)
  • Creates a GitHub PR per task, triggering bugbot (cursor[bot]) automatically
  • Polls for bugbot comments (30s intervals), examines findings, fixes issues
  • Merges previous PR and rebases current branch between task iterations
  • Special final-PR handling: waits for bugbot on the last task before merging
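
The polling step in the summary could be sketched roughly as below. This is a minimal illustration, not the script's actual code: `poll_for_comments` and its arguments are hypothetical names, and the command that fetches comments is passed in by the caller so the loop itself makes no assumptions about how bugbot comments are retrieved.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical sketch: poll until a fetch command prints something, or give up.
# Usage: poll_for_comments INTERVAL_SECONDS MAX_ATTEMPTS FETCH_CMD [ARGS...]
poll_for_comments() {
  local interval=$1 max_attempts=$2
  shift 2
  local attempt comments
  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    # An empty result means "nothing posted yet"; a failed fetch is
    # treated the same way so one bad request does not end the loop.
    comments=$("$@" 2>/dev/null) || comments=""
    if [[ -n $comments ]]; then
      printf '%s\n' "$comments"
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if ((attempt < max_attempts)); then
      sleep "$interval"
    fi
  done
  return 1  # timed out with no comments
}
```

With the 30-second interval from the summary, a call might look like `poll_for_comments 30 20 my_fetch_command "$pr_number"`, where `my_fetch_command` is whatever the caller uses to list review comments.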

Test plan

  • Run bash ~/dotfiles/claude/skills/ralph-github/ralph-github.sh --help to verify argument parsing
  • Run with --dry-run on a sample plan to verify task detection and flow
  • Test on a real plan in a GitHub repo to verify branch creation, PR, and bugbot integration
  • Verify Ctrl+C cleanup returns to original branch

🤖 Generated with Claude Code


Note

Medium Risk
Medium risk because it changes the core ralph.sh execution loop to run git auto-commits and automated review/fix steps, which can affect developer repos and workflow. Also adds model-selection flags and prompt trimming that could alter task behavior and context.

Overview
Adds a new skills/ralph-github skill + ralph-github.sh wrapper that runs ralph.sh --review, and updates the README to document the new review-enabled workflow.

Enhances skills/ralph/ralph.sh with --review/--no-review, reviewer selection (codex vs Claude fallback), auto-commit-before-review, and an automatic “fix findings” pass; also adds --model presets/effort flags, trims plan context to the current ## phase to reduce prompt size, and reduces per-task overhead by optimizing stream-json parsing.

Updates statusline-command.sh to show model effort level, render a true-color gradient context bar, add a remote-control indicator, and simplify the directory segment; adds .gitignore for the cloned skills/excalidraw-diagram/ directory.

Written by Cursor Bugbot for commit ebcdc1f. This will update automatically on new commits.

New skill that extends Ralph with full GitHub integration:
- Per-task branches (each task branches off previous)
- Codex review after each task (falls back to Claude Opus 4.6)
- Automatic PR creation per task (triggers bugbot)
- Bugbot polling + examination (cursor[bot] comments)
- Auto-fix of review/bugbot findings
- PR merge + rebase pipeline between tasks
- Final PR handling (waits for bugbot on last task)

Runs as standalone bash script:
  bash ~/dotfiles/claude/skills/ralph-github/ralph-github.sh [plan.md]

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comment thread skills/ralph-github/ralph-github.sh Outdated
1. Git push stdout corrupts PR number extraction — redirect to stderr
2. Sed non-greedy .+? doesn't work in POSIX ERE — use two-step sed
3. log_phase in run_review captured into review output — redirect to stderr
4. set -e kills script on skip/stop countdown — use || rc=$? pattern

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
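
Fixes 1, 3 and 4 above rely on two small bash idioms, illustrated here in isolation. Everything in this sketch is a stand-in: `create_pr`, `countdown`, and the URL are hypothetical, and `log_phase` is only assumed to behave like the logger named in the commit message.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Assumed stand-in for the script's logger: progress goes to stderr
# so that $(...) captures only the function's real output.
log_phase() { printf '== %s ==\n' "$*" >&2; }

# Hypothetical helper that reports progress and then prints one value.
create_pr() {
  log_phase "creating PR"
  printf '%s\n' "https://example.invalid/pull/7"
}

# Hypothetical countdown that signals "skip" with a non-zero exit code.
countdown() { return 2; }

# Holds only the URL; the progress line went to stderr.
pr_url=$(create_pr)

# Records the exit code without tripping set -e.
rc=0
countdown || rc=$?
```

Because the failing command sits on the left of `||`, `set -e` does not abort the script, and `$?` on the right still holds the original exit code.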
Comment thread skills/ralph-github/ralph-github.sh Outdated
grep -oP uses Perl regex which is unavailable on macOS BSD grep,
causing pr_num to always be empty and disabling the entire pipeline.
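
One portable alternative to `grep -oP` is `sed -E`, which both GNU and BSD sed accept. The sketch below is an assumption about how the number could be extracted, not necessarily the fix that was committed; `pr_number_from_url` is a hypothetical name.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Extract the trailing number from a PR URL without Perl regex support.
# Prints nothing if the input does not end in /pull/<digits>.
pr_number_from_url() {
  printf '%s\n' "$1" | sed -nE 's#^.*/pull/([0-9]+)/?$#\1#p'
}
```

Using `-n` with the `p` flag means non-matching input yields empty output rather than echoing the original string, so the caller can test for an empty `pr_num` explicitly.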
Comment thread skills/ralph-github/ralph-github.sh Outdated
1. Remove 2>&1 from gh pr create to prevent stderr contaminating pr_url
   (bug 39c4a506): gh writes progress messages to stderr which polluted
   PREV_PR_URL and degraded downstream prompt quality.

2. Call handle_previous_pr on exit-code failure path
   (bug b9c7e8c1): Previously, non-zero exit from run_claude skipped
   handle_previous_pr entirely, leaving the previous PR open and unmerged.

3. Clean up dangling task branch on exit-code failure
   (bug ffef6275): Checkout base branch and delete the failed task branch,
   consistent with the no-changes cleanup path.
Comment thread skills/ralph-github/ralph-github.sh Outdated
When a task fails (exit code path), handle_previous_pr was called with
PREV_BRANCH as the return branch. Since merge_pr uses --delete-branch,
PREV_BRANCH gets deleted, causing rebase_on_main to crash when it tries
to checkout the deleted branch.

Additionally, PREV_PR/PREV_PR_URL/PREV_BRANCH were never cleared after
merge, so the next iteration would try to branch off a deleted ref.

Fix: pass MAIN_BRANCH as the return branch (since there's nothing to
rebase after a failed task) and clear pipeline state after handling.
Comment thread skills/ralph-github/ralph-github.sh Outdated
- Add --review flag with codex/claude reviewer (--reviewer auto|codex|claude)
- Add --model presets (opus-high, sonnet, haiku, etc.) with effort levels
- Fix codex review: drop prompt arg incompatible with --base flag
- Fix has_review_issues: grep full output instead of just first 3 lines
- Consolidate ralph-github.sh to thin wrapper calling ralph.sh --review
- Update README, SKILL.md, statusline, .gitignore

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread skills/ralph/ralph.sh
Comment thread skills/ralph/ralph.sh Outdated
…matching, captured progress messages

- Add --no-review case to argument parser so the documented flag works
  (previously fell through to catch-all, breaking ralph-github.sh workflow)
- Narrow has_review_issues to check only last 5 lines of output and remove
  overly broad 'clean' pattern to prevent false negatives
- Redirect log_phase calls in run_review to stderr so progress messages
  display to user instead of being captured into review_out variable

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: --no-review cannot override wrapper's appended --review flag
    • Moved --review before "$@" so user-supplied flags like --no-review are processed last and can override the default.
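
The ordering fix works because a simple left-to-right parser lets the last flag win. A minimal sketch, assuming the parser sets a single variable per flag (the real `ralph.sh` parser is not shown in this thread, and `parse_review_flag` and `wrapper` are hypothetical names):

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical parser: later flags overwrite earlier ones.
parse_review_flag() {
  local review=false
  while [[ $# -gt 0 ]]; do
    case "$1" in
      --review)    review=true ;;
      --no-review) review=false ;;
      *)           ;;  # other arguments are ignored in this sketch
    esac
    shift
  done
  printf '%s\n' "$review"
}

# Hypothetical wrapper: the default goes first, so user flags can override it.
wrapper() { parse_review_flag --review "$@"; }
```

Had the wrapper appended `--review` after `"$@"`, the default would always be parsed last and `--no-review` could never take effect.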

Comment thread skills/ralph-github/ralph-github.sh Outdated
cursoragent and others added 2 commits March 20, 2026 05:19
Four performance improvements to the main task loop:

1. parse_stream: use bash pattern matching to skip frequent streaming
   events (text deltas, message lifecycle) without forking jq. Only
   fork jq for infrequent events (tool use, result). Reduces process
   spawns from hundreds to ~10-30 per task.

2. count_tasks: replace bash while-read loop with two grep -c calls.

3. find_next_task: replace bash while-read loop with grep -n | head -1.

4. trim_plan_for_task: new function that sends only the plan preamble
   plus the current phase section in the prompt, cutting completed
   phases. Reduces input tokens and time-to-first-token for large plans.
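
Item 4 might be sketched as follows. This is an illustration under stated assumptions, not the committed function: it assumes phases start at `## ` headings, treats everything before the first such heading as the preamble, and takes the line number of the current task as a hypothetical second argument.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Print the plan preamble plus the "## " section containing TASK_LINE.
# Usage: trim_plan_for_task PLAN_FILE TASK_LINE
trim_plan_for_task() {
  local plan_file=$1 task_line=$2
  awk -v target="$task_line" '
    /^## / { section++ }                 # a new phase starts here
    { sec_of[NR] = section + 0; line[NR] = $0 }
    END {
      want = sec_of[target]              # phase that holds the task
      for (i = 1; i <= NR; i++)
        if (sec_of[i] == 0 || sec_of[i] == want) print line[i]
    }
  ' "$plan_file"
}
```

Section 0 is the preamble, so it is always printed; every other phase is dropped unless it contains the target line.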

Also fixes stale plan context bug — PLAN_CONTENT was only refreshed
every 3 tasks despite the plan changing every task.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
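
For item 2, counting with `grep -c` could look like the sketch below. The checkbox syntax (`- [x]` for done, `- [ ]` for open) is an assumption about the plan format, and the output format is made up for illustration.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Print "<done> <open>" task counts for a plan file.
count_tasks() {
  local plan_file=$1 done_count open_count
  # grep -c prints 0 but exits 1 when nothing matches, hence "|| true".
  done_count=$(grep -c '^[[:space:]]*- \[x\]' "$plan_file" || true)
  open_count=$(grep -c '^[[:space:]]*- \[ \]' "$plan_file" || true)
  printf '%s %s\n' "$done_count" "$open_count"
}
```

Two `grep` processes replace one shell-level iteration per plan line, which is where the saving comes from on long plans.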
Comment thread skills/ralph/ralph.sh
- Change '|| return' to '|| return 0' in find_next_task so grep's
  exit code 1 (no matches) doesn't propagate to callers under set -e.
  Plain assignments like 'var=$(find_next_task)' would crash the script
  instead of returning an empty string.

- Restore final_text fallback in parse_stream by accumulating
  content_block_delta lines to a temp file (no per-event jq fork).
  When the result event's .result field is empty, extract text from
  the accumulated deltas. This ensures RESULT_TMPFILE is populated
  for follow-up detection even when .result is absent.
Comment thread skills/ralph/ralph.sh Outdated
Use jq -j (join) instead of jq -r when reconstructing text from
text_delta fragments. jq -r appends a newline after each output value,
causing deltas like ['Hello ', 'world'] to produce 'Hello \nworld'
instead of the correct 'Hello world'. jq -j suppresses these
inter-value newlines, matching the behavior of the old direct
string concatenation approach.
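
The difference is easy to reproduce in isolation. The two JSON lines below are hypothetical and much simpler than real stream events; only the `.delta.text` field matters for the comparison.

```bash
#!/usr/bin/env bash
set -euo pipefail

# This demo needs jq; stop quietly if it is not installed.
command -v jq >/dev/null 2>&1 || { echo "jq not installed, skipping" >&2; exit 0; }

# Two hypothetical stream lines carrying text fragments.
deltas='{"delta":{"text":"Hello "}}
{"delta":{"text":"world"}}'

# -r writes a newline after each value, splitting the sentence in two.
with_r=$(printf '%s\n' "$deltas" | jq -r '.delta.text')

# -j writes values back to back, so the fragments join cleanly.
with_j=$(printf '%s\n' "$deltas" | jq -j '.delta.text')

printf 'jq -r: %q\n' "$with_r"
printf 'jq -j: %q\n' "$with_j"
```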

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Autofix Details

Bugbot Autofix prepared fixes for both issues found in the latest run.

  • ✅ Fixed: SIGPIPE causes find_next_task to silently return empty
    • Replaced || return 0 with || true followed by a separate [[ -z "$match" ]] && return 0 check, so SIGPIPE from the grep|head pipeline no longer causes the function to discard a valid match.
  • ✅ Fixed: Inconsistent JSON matching may miss result events
    • Added alternate patterns with a space after the colon ("type": "assistant" and "type": "result") to the substring checks so both compact and pretty-printed JSON formats are matched.
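
The first fix can be sketched as below. The function body is a reconstruction from the description above, not the committed code, and the `- [ ]` checkbox syntax is an assumption about the plan format.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Print "<line-number>:<text>" for the first unchecked task, or nothing.
find_next_task() {
  local plan_file=$1 match
  # "|| true" swallows both grep's exit 1 (no match) and a 141 from
  # SIGPIPE when head closes the pipe early, while keeping the output
  # that was already captured.
  match=$(grep -n '^[[:space:]]*- \[ \]' "$plan_file" | head -1) || true
  # Decide on the captured text, not on the pipeline's exit status.
  [[ -z $match ]] && return 0
  printf '%s\n' "$match"
}
```

With `|| return 0` in place of `|| true`, a SIGPIPE exit would return before the `printf`, discarding a match that had already been captured.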

Comment thread skills/ralph/ralph.sh
Comment thread skills/ralph/ralph.sh Outdated
…_stream

- find_next_task: Use '|| true' instead of '|| return 0' to prevent
  SIGPIPE (exit 141) under pipefail from making the function return empty
  when grep | head -1 finds a match. Check for an empty match separately.

- parse_stream: Add '"type": "assistant"' and '"type": "result"'
  (with space after colon) as alternate patterns so JSON with whitespace
  after colons is still matched correctly.
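
The substring check in parse_stream might look something like this sketch. `needs_jq` is a hypothetical helper name; the four patterns are the ones named in the commit message, and the sample events in the usage note are made up.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Return 0 if a stream-json line looks like an event worth handing to jq.
# Pure shell pattern matching: no process is forked for skipped lines.
needs_jq() {
  local line=$1
  case "$line" in
    *'"type":"assistant"'* | *'"type": "assistant"'*) return 0 ;;
    *'"type":"result"'*    | *'"type": "result"'*)    return 0 ;;
    *) return 1 ;;
  esac
}
```

A caller would read the stream line by line and only pipe a line into jq when `needs_jq "$line"` succeeds, for example on `{"type": "result", ...}` but not on a `content_block_delta` event.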

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Comment thread skills/ralph/ralph.sh
${review_output}

## Working Directory
$(pwd)" 2>/dev/null || true

Fix agent ignores user-specified model flags

Low Severity

The fix_review_issues function calls claude -p without passing CLAUDE_MODEL_FLAGS, while the main task execution in run_claude does pass them. When a user specifies --model opus-max, task execution uses Opus 4.6 with max effort, but the fix agent silently falls back to whatever the user's default claude model is — potentially a much less capable model for a step that requires understanding and correctly fixing code issues identified by the reviewer.
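
A fix along these lines would pass the same flags in both places. The sketch below is only an illustration: it assumes `CLAUDE_MODEL_FLAGS` is a bash array, uses a placeholder model name, and replaces the real CLI with a hypothetical `agent_cli` stub so it runs anywhere.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Assumption: the user's model choice is kept as an array of CLI arguments.
CLAUDE_MODEL_FLAGS=(--model example-model)

# Hypothetical stand-in for the real CLI; it just echoes its arguments.
agent_cli() { printf '%s\n' "$*"; }

# Task execution already forwards the flags.
run_task() { agent_cli -p "$1" "${CLAUDE_MODEL_FLAGS[@]}"; }

# The fix step forwards the same flags instead of using the CLI default.
fix_review_issues() { agent_cli -p "$1" "${CLAUDE_MODEL_FLAGS[@]}"; }
```

Keeping the flags in one array and expanding it as `"${CLAUDE_MODEL_FLAGS[@]}"` at every call site avoids the two paths drifting apart again.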
