🧹 Replace hardcoded dummy logic in get_repo_info#172
Conversation
Co-authored-by: abhimehro <84992105+abhimehro@users.noreply.github.com>
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
|
Merging to
After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here |
PR SummaryMedium Risk Overview Tests are updated to mock Reviewed by Cursor Bugbot for commit e123015. Configure here. |
There was a problem hiding this comment.
Gates Failed
Enforce advisory code health rules
(1 file with Code Duplication)
Gates Passed
5 Quality Gates Passed
See analysis details in CodeScene
Reason for failure
| Enforce advisory code health rules | Violations | Code Health Impact | |
|---|---|---|---|
| test_code_health_scanner.py | 1 advisory rule | 10.00 → 9.39 | Suppress |
Quality Gate Profile: Pay Down Tech Debt
Install CodeScene MCP: safeguard and uplift AI-generated code. Catch issues early with our IDE extension and CLI tool.
| def test_get_repo_info_https(): | ||
| with patch("subprocess.check_output") as mock_run: | ||
| mock_run.side_effect = [ | ||
| "https://github.com/account/project.git\n", | ||
| "hash\n" | ||
| ] | ||
| account, project, commit_hash = get_repo_info() | ||
| assert account == "account" | ||
| assert project == "project" | ||
| assert commit_hash == "hash" |
There was a problem hiding this comment.
❌ New issue: Code Duplication
The module contains 3 functions with similar structure: test_get_repo_info_https,test_get_repo_info_no_dot_git,test_get_repo_info_ssh
| # Matches formats: | ||
| # https://github.com/account/project.git | ||
| # git@github.com:account/project.git | ||
| match = re.search(r"[:/]([^/:]+)/([^/:]+?)(?:\.git)?$", origin_url) |
There was a problem hiding this comment.
📝 Info: Lazy quantifier in regex is safe due to $ anchor
The regex r"[:/]([^/:]+)/([^/:]+?)(?:\.git)?$" at code_health_scanner.py:29 uses a lazy quantifier +? for the project capture group. This might appear risky since a lazy match combined with an optional (?:\.git)? group could theoretically under-match. However, the $ anchor forces the engine to backtrack until the entire remaining string is consumed, so ([^/:]+?) always expands to capture the full project name (minus the optional .git suffix). Verified empirically with URLs like https://github.com/account/project.git, git@github.com:account/project.git, https://github.com/account/project, and edge cases like account/my.gitproject.git — all produce correct results.
Was this helpful? React with 👍 or 👎 to provide feedback.
| env = os.environ.copy() | ||
| env.pop("GH_TOKEN", None) |
There was a problem hiding this comment.
🚩 Only GH_TOKEN is stripped from subprocess environment
The security hardening at code_health_scanner.py:14-15 strips only GH_TOKEN from the environment before invoking git subprocesses. Other potentially sensitive variables (e.g., GITHUB_TOKEN, AWS_SECRET_ACCESS_KEY, GITLAB_TOKEN) remain in the child process environment. The comment says this prevents exfiltration "if subprocess is compromised," but a compromised subprocess could still access other secrets. This is a defense-in-depth measure and the risk is low (the subprocess is just git with hardcoded arguments), but it may be worth considering a more restrictive allowlist approach if the threat model warrants it.
Was this helpful? React with 👍 or 👎 to provide feedback.
| assert project == "project" | ||
| assert commit_hash == "hash" | ||
| def test_get_repo_info_https(): | ||
| with patch("subprocess.check_output") as mock_run: |
There was a problem hiding this comment.
📝 Info: Test mock patches subprocess at module level — correct for this import style
The tests use patch("subprocess.check_output") rather than patch("code_health_scanner.subprocess.check_output"). This works correctly because code_health_scanner.py uses import subprocess (importing the module) and calls subprocess.check_output(...) at runtime, so the attribute lookup on the module object picks up the mock. If the import style were ever changed to from subprocess import check_output, these tests would silently stop mocking and could start making real git calls.
Was this helpful? React with 👍 or 👎 to provide feedback.
| import os | ||
| import subprocess | ||
| from unittest.mock import patch | ||
| import pytest |
There was a problem hiding this comment.
🎯 What: Replaced the dummy
get_repo_infofunction with actual logic that retrieves the repository account, project name, and current commit hash usinggitcommands.💡 Why: Returning hardcoded values indicates incomplete logic and hinders the scanner's ability to provide accurate repository context. Implementing actual retrieval improves the maintainability and accuracy of the tool.
✅ Verification: Added comprehensive unit tests in
tests/test_code_health_scanner.pyusing mocks to verify parsing of various git URL formats (HTTPS, SSH) and proper error handling. All 12 tests passed successfully.✨ Result:
get_repo_infonow dynamically retrieves repository information from the environment, making the code health scanner more robust and ready for real-world use.═════ ELIR ═════
PURPOSE: Implements actual git repository info retrieval in
get_repo_infousingsubprocessandre.SECURITY: Strips
GH_TOKENfrom environment before calling git and logs only exception types to prevent info disclosure.FAILS IF:
gitis not installed or the directory is not a git repo (returns "unknown" values).VERIFY: Run
pytest tests/test_code_health_scanner.py.MAINTAIN: Regex handles standard GitHub HTTPS/SSH URL formats.
PR created automatically by Jules for task 14172787027687589562 started by @abhimehro