feat: Add cloud mode for PR review workflow #1966

xingyaoww · 2026-02-08T15:04:05Z

Summary

This PR adds a new 'cloud' mode to the PR review workflow that launches reviews in OpenHands Cloud instead of running locally in GitHub Actions. This is inspired by and builds upon PR #1927.

Changes

examples/03_github_workflows/02_pr_review/agent_script.py:
- Add MODE environment variable support (sdk or cloud)
- Add post_github_comment() helper function to post PR comments
- Add _start_cloud_conversation() helper function to call OpenHands Cloud API
- Add run_cloud_mode() function that creates a cloud conversation and posts a comment with the URL
- Refactored existing logic into run_sdk_mode() function
- Cloud mode only requires OPENHANDS_CLOUD_API_KEY (ignores LLM_MODEL and LLM_BASE_URL)
.github/actions/pr-review/action.yml:
- Add new inputs: mode, openhands-cloud-api-key, openhands-cloud-api-url
- Update validation step to check correct secrets for each mode
- Update run step to pass new environment variables
examples/03_github_workflows/02_pr_review/workflow.yml:
- Add mode input with comment explaining both modes
- Add commented cloud mode secrets section for easy configuration
examples/03_github_workflows/02_pr_review/README.md:
- Add "Two Review Modes" feature description
- Updated setup documentation with separate sections for SDK and Cloud mode
- Add Cloud Mode Benefits section
- Update Action Inputs table with new cloud mode inputs
tests/github_workflows/test_pr_review_agent.py (new):
- Add 14 tests covering cloud mode functionality

Cloud Mode Benefits

Faster CI completion: Exits immediately after starting the review
Track progress in UI: Users can follow the review in OpenHands Cloud
Interactive: Users can interact with the review conversation
Uses cloud model: The LLM model is configured in the OpenHands Cloud account, not in the workflow

Usage

SDK Mode (default):

- name: Run PR Review
  uses: OpenHands/software-agent-sdk/.github/actions/pr-review@main
  with:
      mode: sdk
      llm-api-key: ${{ secrets.LLM_API_KEY }}
      github-token: ${{ secrets.GITHUB_TOKEN }}

Cloud Mode:

- name: Run PR Review
  uses: OpenHands/software-agent-sdk/.github/actions/pr-review@main
  with:
      mode: cloud
      openhands-cloud-api-key: ${{ secrets.OPENHANDS_CLOUD_API_KEY }}
      github-token: ${{ secrets.GITHUB_TOKEN }}

Checklist

If the PR is changing/adding functionality, are there tests to reflect this?
If there is an example, have you run the example to make sure that it works?
If there are instructions on how to run the code, have you followed the instructions and made sure that it works?
If the feature is significant enough to require documentation, is there a PR open on the OpenHands/docs repository with the same branch name?
Is the github CI passing?

@xingyaoww can click here to continue refining the PR

Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant	Architectures	Base Image	Docs / Tags
java	amd64, arm64	`eclipse-temurin:17-jdk`	Link
python	amd64, arm64	`nikolaik/python-nodejs:python3.12-nodejs22`	Link
golang	amd64, arm64	`golang:1.21-bookworm`	Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:94e4b0d-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-94e4b0d-python \
  ghcr.io/openhands/agent-server:94e4b0d-python

All tags pushed for this build

ghcr.io/openhands/agent-server:94e4b0d-golang-amd64
ghcr.io/openhands/agent-server:94e4b0d-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:94e4b0d-golang-arm64
ghcr.io/openhands/agent-server:94e4b0d-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:94e4b0d-java-amd64
ghcr.io/openhands/agent-server:94e4b0d-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:94e4b0d-java-arm64
ghcr.io/openhands/agent-server:94e4b0d-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:94e4b0d-python-amd64
ghcr.io/openhands/agent-server:94e4b0d-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:94e4b0d-python-arm64
ghcr.io/openhands/agent-server:94e4b0d-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:94e4b0d-golang
ghcr.io/openhands/agent-server:94e4b0d-java
ghcr.io/openhands/agent-server:94e4b0d-python

About Multi-Architecture Support

Each variant tag (e.g., 94e4b0d-python) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., 94e4b0d-python-amd64) are also available if needed

This adds a new 'cloud' mode to the PR review workflow that launches reviews in OpenHands Cloud instead of running locally in GitHub Actions. Changes: - Add MODE environment variable (sdk/cloud) to agent_script.py - Add run_cloud_mode() function that: - Calls OpenHands Cloud API to start a conversation - Posts a comment on the PR with the cloud conversation URL - Exits without monitoring (review continues asynchronously) - Add post_github_comment() helper function - Add _start_cloud_conversation() helper function - Update composite action.yml with new inputs: - mode (sdk/cloud) - openhands-cloud-api-key - openhands-cloud-api-url - Update workflow.yml example to support cloud mode - Update README.md with cloud mode documentation - Add tests for the new functionality Cloud mode benefits: - Faster CI completion (exits after starting the review) - Track review progress in OpenHands Cloud UI - Interact with the review conversation - Uses the LLM model configured in OpenHands Cloud account Co-authored-by: openhands <openhands@all-hands.dev>

.github/actions/pr-review/action.yml

examples/03_github_workflows/02_pr_review/agent_script.py

- Extract shared setup code into _prepare_review_context() helper function to eliminate code duplication between run_cloud_mode() and run_sdk_mode() - Fix inconsistent GITHUB_TOKEN requirements: update action.yml to correctly document that GITHUB_TOKEN is required for both modes (used to post PR comments) - Remove redundant OPENHANDS_CLOUD_API_KEY validation in run_cloud_mode() (already validated in main()) - Extract comment body into CLOUD_REVIEW_COMMENT_TEMPLATE constant - Improve tests to actually test behavior by calling main() instead of just verifying list membership - Add test for _prepare_review_context() helper function

examples/03_github_workflows/02_pr_review/agent_script.py

Move _prepare_review_context call to main() and pass the prompt to both run_sdk_mode and run_cloud_mode functions. This reduces code duplication and makes the flow clearer - context preparation happens once, then the mode-specific logic runs. - run_cloud_mode now takes (pr_info, prompt) instead of (pr_info, review_style) - run_sdk_mode now takes (prompt, skill_trigger) instead of (pr_info, review_style)

xingyaoww · 2026-02-08T16:40:18Z

🧪 Cloud Mode Test Results (Bug Fix Verification)

Status: ✅ Bug fix verified - Cloud mode working correctly

Issue Found and Fixed

The _start_cloud_conversation function was attempting to pass a secrets field to the OpenHands Cloud API, but this field is not supported by the API schema (causes HTTP 422 "Extra inputs are not permitted" error).

Fix Applied

Removed the secrets field from the API request payload
Added informational log message explaining that OpenHands Cloud uses connected GitHub accounts instead

Test Results

API Key: ✅ Valid and working
Conversation Creation with github_token: ✅ Successful (no more 422 error)
Conversation ID: 5be1faf2ef364d3f84df3055922867db

📍 Test Conversation URL: https://app.all-hands.dev/conversations/5be1faf2ef364d3f84df3055922867db

This test was performed to verify the bug fix for PR #1966.

The OpenHands Cloud API does not accept a 'secrets' field in the request payload (causes HTTP 422 'Extra inputs are not permitted' error). The Cloud API has 'additionalProperties: false' in its schema, which means it only accepts the documented fields: initial_user_msg, repository, git_provider, selected_branch, etc. Instead of passing secrets directly, OpenHands Cloud uses the user's connected GitHub account for repository access. Added an informational log message to explain this behavior when github_token is provided. Co-authored-by: openhands <openhands@all-hands.dev>

Rewrote cloud mode to use OpenHandsCloudWorkspace instead of directly calling the /api/conversations endpoint. This provides: - Proper sandbox provisioning via OpenHands Cloud API - LLM configuration sent to the cloud sandbox - GITHUB_TOKEN passed to sandbox for posting review comments - Same agent capabilities as SDK mode, but running in the cloud Changes: - agent_script.py: Rewrite run_cloud_mode() to use OpenHandsCloudWorkspace - agent_script.py: Remove _start_cloud_conversation() and CLOUD_REVIEW_COMMENT_TEMPLATE - agent_script.py: Cloud mode now requires LLM_API_KEY (sent to sandbox) - action.yml: Update to require LLM_API_KEY for both modes - action.yml: Add openhands-workspace to dependencies - README.md: Update documentation for new cloud mode architecture - tests: Update tests for new requirements See: https://docs.openhands.dev/sdk/guides/agent-server/cloud-workspace Co-authored-by: openhands <openhands@all-hands.dev>

xingyaoww · 2026-02-08T16:57:26Z

🔧 Major Refactor: Cloud Mode Now Uses OpenHandsCloudWorkspace

I've refactored the cloud mode implementation to properly use OpenHandsCloudWorkspace instead of directly calling the /api/conversations endpoint.

What Changed

Before:

Cloud mode called /api/conversations directly to create a "fire and forget" conversation
The GITHUB_TOKEN could not be passed (API rejected it)
The cloud agent used the user's connected GitHub account

After:

Cloud mode uses OpenHandsCloudWorkspace to provision a proper sandbox
LLM configuration is sent to the cloud sandbox
GITHUB_TOKEN is passed to the sandbox so the agent can post review comments
Same capabilities as SDK mode, but running in a cloud-managed sandbox

New Requirements for Cloud Mode

Cloud mode now requires both:

LLM_API_KEY - Sent to the cloud sandbox
OPENHANDS_CLOUD_API_KEY - For sandbox provisioning

Documentation

See the official guide: https://docs.openhands.dev/sdk/guides/agent-server/cloud-workspace

Commits

10fcb495 - refactor: Use OpenHandsCloudWorkspace for cloud mode

examples/03_github_workflows/02_pr_review/agent_script.py

- Revert agent_script.py to main version and add cloud mode support - Share core agent definition logic between SDK and cloud modes - Cloud mode uses OpenHandsCloudWorkspace with keep_alive=True - Cloud mode calls run(blocking=False) to start agent without waiting - Add CLOUD_MODE_INSTRUCTION to instruct agent to post review comment when done - Post PR comment with conversation URL for tracking progress - Update tests to match new implementation The only difference between modes is: - SDK mode: runs locally and waits for completion - Cloud mode: runs in cloud sandbox, exits immediately after starting Co-authored-by: openhands <openhands@all-hands.dev>

xingyaoww · 2026-02-09T17:33:12Z

@OpenHands pls merge from main and resolve the conflicts

openhands-ai · 2026-02-09T17:33:22Z

I'm on it! xingyaoww can track my progress at all-hands.dev

Resolved conflicts in: - .github/actions/pr-review/action.yml: Added both cloud mode inputs and lmnr-api-key - examples/03_github_workflows/02_pr_review/agent_script.py: Kept cloud mode logic and added Laminar trace handling - examples/03_github_workflows/02_pr_review/README.md: Combined cloud mode docs with observability section Co-authored-by: openhands <openhands@all-hands.dev>

openhands-ai · 2026-02-09T17:36:42Z

Successfully merged main into feat/pr-review-cloud-mode and resolved all conflicts. The changes have been pushed to the remote branch.

Summary of Conflict Resolution

Three files had merge conflicts that were resolved:

1. `.github/actions/pr-review/action.yml`

Merged: Cloud mode inputs (openhands-cloud-api-key, openhands-cloud-api-url) AND the new lmnr-api-key input for Laminar observability
Merged: Both openhands-workspace package AND lmnr package in the install step
Merged: Both cloud mode env vars AND LMNR_PROJECT_API_KEY in the run step
Kept: github-token as required (needed for both modes)

2. `examples/03_github_workflows/02_pr_review/agent_script.py`

Kept: from pydantic import SecretStr import (needed for cloud mode)
Merged: Cloud mode logic (non-blocking run, conversation URL posting) with the improved Laminar trace handling from main (better comments explaining the trace ID usage)

3. `examples/03_github_workflows/02_pr_review/README.md`

Merged: Action inputs table now includes both cloud mode inputs AND lmnr-api-key
Kept: The "Review Evaluation (Observability)" section from main
Updated: Input descriptions to reflect that llm-model and llm-api-key are used in both modes

The merge commit has been pushed to origin/feat/pr-review-cloud-mode.

View full conversation

xingyaoww · 2026-02-09T17:37:31Z

🤖 OpenHands PR Review Started

The code review is running in OpenHands Cloud.

📍 Track progress: https://app.all-hands.dev/conversations/8cff5913-4ed8-48c9-9473-e00e443de03f

The agent will post a review comment when the analysis is complete.

xingyaoww · 2026-02-09T17:57:47Z

✅ Cloud Mode Testing Results

I tested the cloud mode feature using the BOT_OPENHANDS_API_KEY and it works successfully!

Test Configuration

Mode: cloud
LLM Model: litellm_proxy/claude-sonnet-4-5-20250929
LLM Base URL: https://llm-proxy.app.all-hands.dev
Cloud API URL: https://app.all-hands.dev
Review Style: roasted
Target PR: This PR (feat: Add cloud mode for PR review workflow #1966)

Results

Step	Status
Sandbox creation	✅ `4pwQH455yV5r8w9ODkBcga`
Sandbox ready	✅ Running at runtime URL
Agent trigger	✅ HTTP 200 OK
PR comment posted	✅ HTTP 201 Created

Cloud Conversation

The PR review is running here: https://app.all-hands.dev/conversations/8cff5913-4ed8-48c9-9473-e00e443de03f

Minor Observation

There were some validation warnings about an enable_truncation field (schema mismatch between local SDK models and the cloud server response). These are non-blocking - the conversation started and is running successfully. This might be worth investigating for cleaner integration, but it does not prevent the cloud mode from functioning.

pydantic_core._pydantic_core.ValidationError: 1 validation error for Event
system_prompt.enable_truncation
  Extra inputs are not permitted [type=extra_forbidden, input_value=True, input_type=bool]

Summary

The cloud mode implementation works as expected:

Creates a sandbox via OpenHands Cloud API
Sends the PR review prompt to the agent
Starts the agent in non-blocking mode
Posts a tracking comment on the PR with the conversation URL
Exits immediately (fast CI completion as advertised)

Great work on this feature! 🎉

xingyaoww · 2026-02-10T08:49:01Z

🤖 OpenHands PR Review Started

The code review is running in OpenHands Cloud.

📍 Track progress: https://app.all-hands.dev/conversations/578fcb531dd34ef3b87fe7fd55b29e46

The agent will post a review comment when the analysis is complete.

Instead of using OpenHandsCloudWorkspace to create a sandbox and then creating a conversation on the agent-server inside it, this change uses the OpenHands Cloud /api/conversations endpoint directly. Benefits: - No LLM credentials needed - cloud uses user's configured LLM - Proper conversation URL that works in the OpenHands Cloud UI - Simpler implementation - just one API call Changes: - Add _start_cloud_conversation() to call Cloud API directly - Update validation to only require LLM_API_KEY for sdk mode - Include GITHUB_TOKEN in the prompt for cloud mode (API doesn't support passing secrets separately) - Update action.yml descriptions to reflect mode differences Tested: Successfully created conversation 578fcb531dd34ef3b87fe7fd55b29e46 Co-authored-by: openhands <openhands@all-hands.dev>

.github/actions/pr-review/action.yml

examples/03_github_workflows/02_pr_review/workflow.yml

examples/03_github_workflows/02_pr_review/agent_script.py

tests/github_workflows/test_pr_review_agent.py

Cloud mode should NOT require LLM_API_KEY - it uses the user's cloud-configured LLM via the /api/conversations endpoint. This reverts the refactoring to use OpenHandsCloudWorkspace, which required passing LLM configuration. Cloud mode: - Uses Cloud API Direct (/api/conversations endpoint) - No LLM_API_KEY needed (uses cloud-configured LLM) - Asynchronous (fire and forget) - Posts comment with conversation URL SDK mode: - Uses local workspace with Conversation + Agent - Requires LLM_API_KEY - Synchronous (waits for completion) Co-authored-by: openhands <openhands@all-hands.dev>

xingyaoww · 2026-02-10T11:17:44Z

🤖 OpenHands PR Review Started

The code review is running in OpenHands Cloud.

📍 Track progress: https://app.all-hands.dev/conversations/16dd88f3ab9c4af89ecb83436e35a172

The agent will post review comments when the analysis is complete.

all-hands-bot · 2026-02-10T11:19:15Z

🔴 Code Review: Needs Improvement

Taste Rating: 🔴 Needs improvement - Violates fundamental principles of simplicity and good design.

[CRITICAL ISSUES] (Must fix - these break fundamental principles)

[agent_script.py, Lines 154-180] 🔨 Data Structure & Duplication:
You've created post_github_comment() and create_cloud_conversation() with nearly identical HTTP request boilerplate. This is the kind of copy-paste programming that makes codebases rot. Extract a common HTTP helper or use a proper library. The fact that you're manually constructing urllib.request.Request objects twice with the same pattern is a code smell that should have been obvious.

[agent_script.py, Lines 260-400+] 🔨 Complexity Explosion:
The main() function has become a monstrosity. You've taken what should be two separate, clean functions (run_sdk_mode() and run_cloud_mode()) and crammed them into a single function with a massive if/else block. This is >3 levels of nesting in places. The function is now doing at least 5 different things:

Environment validation
PR info gathering
Mode selection
Cloud API calls
SDK agent execution

This violates every principle of clean code. Split it up.

[agent_script.py, Lines 182-230] 🔒 Security Risk - Token in Prompt:

CLOUD_MODE_PROMPT = """...
The GITHUB_TOKEN environment variable is already set for you...
"""

You're telling an AI agent that GITHUB_TOKEN is available. While the token isn't literally in the prompt, you're instructing the agent to use it. This is a security design that relies on the agent behaving correctly. What happens when the agent decides to log debug info including environment variables? Document the threat model or redesign this.

[action.yml, Line 14] ⚠️ Breaking Change:

llm-api-key:
    description: LLM API key (required for 'sdk' mode only...)
    required: false  # WAS: required: true

You've changed llm-api-key from required to optional. Existing workflows that don't specify a mode will now fail at runtime instead of at workflow validation time. This is the kind of "we don't break userspace" violation that would get you a stern email. Add proper deprecation warnings or keep the validation tight.

[IMPROVEMENT OPPORTUNITIES] (Should fix - violates good taste)

[agent_script.py, Lines 182-230] 📝 Prompt Engineering Smell:
The CLOUD_MODE_PROMPT is a massive string with embedded instructions for using gh CLI. This is fragile - you're essentially writing a shell script inside a Python string inside an AI prompt. If the gh CLI changes its output format, this breaks silently. Consider whether the cloud agent should have a dedicated skill/tool for PR review instead of relying on prompt engineering.

[test_pr_review_agent.py, Lines 1-20] 🔧 Import Hack:

sys.path.insert(0, str(pr_review_path))

This is the kind of import hack that makes Python projects unmaintainable. If you need to test this module, either:

Make it a proper package with __init__.py
Use relative imports correctly
Install it in editable mode

Don't paper over structural problems with sys.path manipulation.

[test_pr_review_agent.py, Lines 75-95] 🤔 Trivial Tests:

def test_mode_defaults_to_sdk():
    mode = os.getenv("MODE", "sdk").lower()
    assert mode == "sdk"

You're testing that os.getenv returns a default value. This isn't testing YOUR code, it's testing Python's standard library. Write tests that verify actual behavior of your functions, not language features.

[agent_script.py, Lines 260-280] 📊 Implicit Data Structure:
The pr_info dictionary is passed around but its structure is never defined. Add a TypedDict or dataclass:

class PRInfo(TypedDict):
    number: str
    title: str
    body: str
    repo_name: str
    base_branch: str
    head_branch: str

This makes the code self-documenting and catches bugs at type-check time.

[action.yml, Lines 90-95] 🔧 Unnecessary Dependency:

uv pip install --system ./software-agent-sdk/openhands-workspace

You've added openhands-workspace as a dependency. Is this actually used in the PR review workflow? I don't see it imported in agent_script.py. Don't add dependencies "just in case."

[STYLE NOTES] (Minor - only mention if genuinely important)

[README.md, Lines 91-120] 📚 Documentation Duplication:
You have two nearly identical YAML blocks showing SDK vs Cloud configuration. Consider using a single block with comments showing the differences, or a table. The current approach will lead to documentation drift.

[agent_script.py, Line 17-45] 📝 Docstring Bloat:
The module docstring is now 45+ lines. This belongs in a README, not in the code. Keep docstrings focused on API documentation, not tutorials.

VERDICT:

❌ Needs rework - Fundamental design issues must be addressed first.

The core feature (cloud mode) is reasonable, but the implementation violates basic software engineering principles:

The main() function is doing too much
HTTP request code is duplicated
Breaking change to required inputs without migration path
Test quality is poor

KEY INSIGHT:
This PR adds a feature by cramming more code into existing functions instead of properly decomposing the problem. The result is a main() function that's impossible to test in isolation and will be a maintenance nightmare. Extract run_sdk_mode() and run_cloud_mode() as separate functions, create a shared HTTP helper, and add proper type hints for the data structures being passed around.

"Talk is cheap. Show me the code." - But in this case, show me LESS code that does the same thing.

xingyaoww · 2026-02-10T13:35:48Z

@OpenHands please help me fix this: #1966 (comment)

openhands-ai · 2026-02-10T13:35:57Z

I'm on it! xingyaoww can track my progress at all-hands.dev

- Extract common HTTP helper (_make_http_request) to reduce duplication - Refactor main() into run_sdk_mode() and run_cloud_mode() functions - Add PRInfo TypedDict for type safety - Improve test quality with meaningful tests for actual functions - Trim module docstring to focus on API documentation - Remove unused openhands-workspace dependency from action.yml Co-authored-by: openhands <openhands@all-hands.dev>

openhands-ai · 2026-02-10T13:46:17Z

I've addressed the code review feedback from the PR comment. Here's a summary of the changes:

Changes Made

1. Extracted common HTTP helper (`_make_http_request`)

Created a reusable helper function that handles HTTP requests with proper error handling
Eliminated code duplication between post_github_comment() and _start_cloud_conversation()

2. Refactored `main()` into separate functions

Split the monolithic main() into:
- run_cloud_mode(pr_info, skill_trigger) - handles cloud mode execution
- run_sdk_mode(pr_info, skill_trigger, review_style) - handles SDK mode execution
- _print_cost_summary(conversation) - prints cost metrics
- _save_laminar_trace(pr_info, commit_id, review_style) - saves trace info
- _get_required_vars_for_mode(mode) - returns required env vars per mode
- _get_pr_info() - extracts PR info from environment

3. Added `PRInfo` TypedDict for type safety

Defined a TypedDict to make the pr_info dictionary structure explicit and self-documenting

4. Improved test quality

Replaced trivial tests (like test_mode_defaults_to_sdk() that just tested Python's os.getenv) with meaningful tests that verify actual function behavior
Added tests for _make_http_request, _start_cloud_conversation, _get_required_vars_for_mode, and _get_pr_info
Organized tests into logical test classes

5. Trimmed module docstring

Reduced the 45+ line docstring to a brief 8-line description
Tutorial content belongs in README, not in code

6. Removed unused `openhands-workspace` dependency

Removed from action.yml since it's not imported or used in agent_script.py

Checklist

✅ HTTP request duplication eliminated
✅ main() refactored into smaller, testable functions
✅ Type safety added with PRInfo TypedDict
✅ Tests improved to test actual function behavior
✅ Module docstring trimmed
✅ Unused dependency removed
✅ All tests pass (15 tests)
✅ Pre-commit hooks pass (pyright, ruff, etc.)
✅ Changes pushed to feat/pr-review-cloud-mode branch

The changes have been pushed to the PR at #1966

View full conversation

examples/03_github_workflows/02_pr_review/agent_script.py

- Replace direct API calls to /api/conversations with OpenHandsCloudWorkspace - Cloud mode now uses the SDK's proper abstraction for cloud sandbox management - Cloud mode now requires both OPENHANDS_CLOUD_API_KEY and LLM_API_KEY (LLM config is sent to the cloud sandbox) - Update tests to reflect the new requirements Co-authored-by: openhands <openhands@all-hands.dev>

all-hands-bot

🔴 Needs improvement - Critical documentation/implementation mismatch

Taste Rating: 🔴 Violates fundamental principles

[CRITICAL ISSUES] (Must fix)

Documentation Lie: The README and PR description claim cloud mode doesn't need LLM_API_KEY, but the code at agent_script.py line ~143 does llm_api_key = _get_required_env("LLM_API_KEY"). Either the code is wrong or the docs are wrong. This will cause user confusion and workflow failures.
Over-Engineering: You wrote a 65-line generic HTTP client (_make_http_request) in a PR review script. Either use the requests library or stick with gh CLI. This reinvents the wheel and makes the script harder to understand. "Bad programmers worry about the code. Good programmers use libraries."
Architecture Problem: main() has two completely different execution paths based on mode. This violates single responsibility. Better design: workspace = CloudWorkspace(...) if mode == 'cloud' else LocalWorkspace(...); run_review(workspace, pr_info). One execution path, workspace abstraction handles differences.
Testing Theater: The tests mock urllib.request.urlopen and assert it was called. They don't test actual logic. What happens if the response is malformed JSON? Your tests won't catch it. Test real code paths or don't bother testing.

[IMPROVEMENT OPPORTUNITIES]

Special Case Hell: Mode validation appears in 3+ places (Python, bash, action.yml). This is a leaking abstraction. Consider a strategy pattern where each mode registers its requirements.
Missing Error Handling: In run_cloud_mode(), if conversation.run() fails, you already posted "review started" but it never completes. User is left hanging.
Breaking Change Not Highlighted: Cloud mode changes workflow from blocking to non-blocking. SDK mode waits for completion; cloud mode exits immediately. This could break CI/CD pipelines.

VERDICT: ❌ Needs rework

Fix the LLM_API_KEY documentation/code mismatch first. Then address the over-engineering and architecture issues.

KEY INSIGHT:

The fundamental problem is trying to bolt two completely different execution models (local blocking vs cloud async) onto the same codebase without proper abstraction. This creates leaky abstractions, duplicated validation logic, and contradictory documentation.

all-hands-bot · 2026-02-10T14:03:31Z

Additional Inline Comments

📍 Line-by-Line Issues

examples/03_github_workflows/02_pr_review/README.md:66
🔴 Critical: Documentation says "you don't need LLM_API_KEY" but agent_script.py:257 requires it with _get_required_env("LLM_API_KEY"). Fix the docs or fix the code.

examples/03_github_workflows/02_pr_review/agent_script.py:257
🔴 Critical: This contradicts README.md:66 which tells users they don't need LLM_API_KEY in cloud mode. The comment on line 255 says it's required for OpenHandsCloudWorkspace - if that's true, UPDATE THE DOCS.

examples/03_github_workflows/02_pr_review/agent_script.py:49-114
🟠 Important: The _make_http_request() function is 65 lines of reinvented HTTP client. Use requests library or gh CLI instead of writing this from scratch.

examples/03_github_workflows/02_pr_review/agent_script.py:381-395
🟠 Important: main() branches into two completely different code paths. Extract common logic and use workspace abstraction to unify the flow.

tests/github_workflows/test_pr_review_agent.py:29-50
🟠 Important: These tests mock the HTTP layer and assert mocks were called. They don't test real behavior. Test actual logic with real data structures.

openhands-ai · 2026-02-10T14:08:36Z

Looks like there are a few issues preventing this PR from being merged!

GitHub Actions are failing:
- Review Thread Gate

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #1966 at branch `feat/pr-review-cloud-mode`

Feel free to include any additional details that might help me get this PR into a better state.

_{^{You can manage your notification settings}}

xingyaoww · 2026-02-10T14:14:27Z

@OpenHands pls fix #1966 (review)

openhands-ai · 2026-02-10T14:14:42Z

I'm on it! xingyaoww can track my progress at all-hands.dev

- Fix documentation/code mismatch: Update README.md and action.yml to clarify that cloud mode requires both OPENHANDS_CLOUD_API_KEY and LLM_API_KEY (the LLM config is sent to the cloud sandbox) - Replace _make_http_request with requests library for simpler HTTP handling - Add error handling for cloud mode: Post failure comment if conversation.run() fails after posting 'review started' comment - Add openhands-workspace package to action.yml dependencies (required for OpenHandsCloudWorkspace) - Update tests to work with requests library Co-authored-by: openhands <openhands@all-hands.dev>

openhands-ai · 2026-02-10T14:21:20Z

I've addressed the PR review feedback from #1966 (review). The changes have been pushed to the feat/pr-review-cloud-mode branch.

Summary of Changes

1. Fixed Documentation/Code Mismatch (Critical Issue #1)

README.md: Updated to clarify that cloud mode requires both OPENHANDS_CLOUD_API_KEY and LLM_API_KEY (the LLM config is sent to the cloud sandbox)
action.yml: Updated input description for llm-api-key to indicate it's required for both modes
action.yml validation: Updated to check for LLM_API_KEY before mode-specific checks

2. Replaced Over-Engineered HTTP Client (Critical Issue #2)

Removed the 65-line _make_http_request function
Replaced with simple requests library calls in get_pr_diff_via_github_api() and post_github_comment()
Added requests to the action.yml dependencies

3. Added Error Handling for Cloud Mode (Improvement #6)

Added try/except around conversation.run() in run_cloud_mode()
If the review fails after posting "review started", a failure comment is posted to the PR with the error message and conversation URL

4. Fixed Missing Dependency

Added openhands-workspace package to action.yml dependencies (required for OpenHandsCloudWorkspace)

5. Updated Tests

Removed tests for the deleted _make_http_request function
Updated TestPostGithubComment and added TestGetPrDiffViaGithubApi to test the new requests-based implementations
All 15 tests pass

Checklist

Documentation/code mismatch fixed
Over-engineered HTTP client replaced with requests
Error handling added for cloud mode conversation failure
Tests updated and passing
Changes pushed to remote branch

View full conversation

xingyaoww · 2026-02-10T15:04:01Z

@OpenHands /codereview-roasted

openhands-ai · 2026-02-10T15:04:18Z

I'm on it! xingyaoww can track my progress at all-hands.dev

openhands-ai · 2026-02-10T15:06:26Z

Code Review Summary for PR #1966: "feat: Add cloud mode for PR review workflow"

🔴 Taste Rating: Needs Improvement

I performed a Linus Torvalds-style "roasted" code review of this PR. Here are the key findings:

Critical Issues (Must Fix)

Design Contradiction: keep_alive=True but blocking on run() (agent_script.py, lines 219-258)
- The code uses keep_alive=True (suggesting fire-and-forget) but then blocks on conversation.run(). This defeats the stated benefit of "faster CI completion". Pick one design.
Complexity: run_cloud_mode() is doing too much (~100 lines)
- Violates single responsibility: formats prompt, creates LLM config, sets up workspace, creates agent, posts comments, handles errors with nested try/except, prints cost summary. Should be split into smaller functions.
Confusing API: llm-api-key marked optional but actually required (action.yml)
- The input says required: false but validation still fails without it. Don't lie to users.

Improvement Opportunities

CLOUD_MODE_PROMPT assumes gh CLI exists - No verification that the cloud sandbox has gh installed and authenticated.
Inconsistent agent creation between modes - Cloud mode uses get_default_agent() while SDK mode uses Agent() with different configuration. This will cause different behavior.
Over-engineering: Cloud mode requires LLM_API_KEY - The whole point of cloud mode should be "just use the cloud's LLM". Requiring users to provide their own key creates unnecessary coupling.
Redundancy: Passing GITHUB_TOKEN to secrets - Comment says it's "automatically available" but code passes it anyway.
Missing integration tests - TestRunCloudMode only tests _get_required_vars_for_mode(), not the actual run_cloud_mode() function.

Verdict

❌ Needs rework before merging. The keep_alive=True + blocking run() contradiction is a fundamental design issue that undermines the stated benefits of cloud mode.

Key Insight

The cloud mode implementation tries to be both "fire-and-forget" and "wait for completion". Pick one design and commit to it.

No code changes were made - this was a review-only task as requested by /codereview-roasted.

View full conversation

xingyaoww · 2026-02-10T15:10:58Z

@OpenHands pls address #1966 (comment)

keep_alive=True, we should exit the script once we've hit the .run()

Confusing API: llm-api-key marked optional but actually required (action.yml)

We should fix the validation - we should only validate it on the SDK path

…ional Address review feedback from PR #1966: 1. Cloud mode with keep_alive=True now exits immediately after starting the conversation by using run(blocking=False). This provides the intended 'faster CI completion' benefit where the workflow exits immediately while the review continues asynchronously in the cloud. 2. LLM_API_KEY is now optional for cloud mode - the cloud uses the user's configured LLM settings from their OpenHands Cloud account. SDK mode still requires LLM_API_KEY for local execution. Changes: - agent_script.py: Use run(blocking=False) for cloud mode, make LLM_API_KEY optional with graceful handling - action.yml: Update validation to only require llm-api-key for SDK mode, update description to clarify - tests: Update tests to reflect new behavior - README.md: Update documentation to reflect LLM_API_KEY is optional for cloud mode Co-authored-by: openhands <openhands@all-hands.dev>

xingyaoww · 2026-02-10T16:19:34Z

@OpenHands /codereview-roasted

openhands-ai · 2026-02-10T16:19:48Z

I'm on it! xingyaoww can track my progress at all-hands.dev

openhands-ai · 2026-02-10T16:22:57Z

Code Review Summary for PR #1966: "feat: Add cloud mode for PR review workflow"

Taste Rating: 🔴 Needs Improvement

I provided a Linus Torvalds-style "roasted" code review of this PR. No code changes were made - this was a review-only request.

Critical Issues Identified:

run_cloud_mode() complexity (Lines 179-266): 90-line function doing 9 different things - violates single responsibility principle. Should be decomposed.
Silent failure pattern in _get_pr_info() (Lines 401-410): Returns empty strings for missing env vars instead of failing fast. This allows the code to proceed with garbage data.
Awkward LLM configuration (Lines 211-216): Conditional api_key handling creates confusion about whether it's truly optional or "sometimes optional."

Improvement Opportunities:

Verbose secrets dict construction could be simplified
CLOUD_MODE_PROMPT is a fragile 36-line string with embedded bash commands
Unused import (get_default_agent) only used in cloud mode
Test file uses sys.path.insert() hack instead of proper module structure
Duplicated validation logic between bash (action.yml) and Python

Key Insight:

The fundamental issue is that "cloud mode" and "sdk mode" have different data requirements but share the same entry point with conditional logic scattered throughout. A cleaner design would have each mode as a separate, focused module with a thin dispatcher in main().

Verdict: ❌ Needs rework

The core functionality is sound, but the run_cloud_mode() function needs decomposition, and the silent failure pattern in _get_pr_info() must be fixed before this is production-ready.

View full conversation

openhands-agent and others added 2 commits February 8, 2026 15:03

Merge branch 'main' into feat/pr-review-cloud-mode

bcd905b

xingyaoww commented Feb 8, 2026

View reviewed changes

.github/actions/pr-review/action.yml Outdated Show resolved Hide resolved

Apply suggestion from @xingyaoww

b457d09

xingyaoww commented Feb 8, 2026

View reviewed changes

examples/03_github_workflows/02_pr_review/agent_script.py Show resolved Hide resolved

xingyaoww commented Feb 8, 2026

View reviewed changes

examples/03_github_workflows/02_pr_review/agent_script.py Outdated Show resolved Hide resolved

openhands-agent added 2 commits February 8, 2026 16:40

xingyaoww commented Feb 9, 2026

View reviewed changes

examples/03_github_workflows/02_pr_review/agent_script.py Outdated Show resolved Hide resolved

This comment was marked as outdated.

Sign in to view