🧹 Refactor: Extract response validation from _gh_get#802
Conversation
|
Merging to
After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here |
| allowed_types = ["application/json", "text/json", "text/plain"] | ||
| if not any(t in content_type for t in allowed_types): | ||
| raise ValueError( | ||
| f"Invalid Content-Type from {sanitize_for_log(url)}: {content_type}. " |
There was a problem hiding this comment.
📝 Info: Refactoring fixes an unsanitized URL in the 304 fallback error message
The old 304 retry fallback code at the now-deleted line used {url} directly in the Content-Type error message (f"Invalid Content-Type from {url}: {content_type}.") while the main path used sanitize_for_log(url). The new shared _parse_and_validate_response function consistently uses sanitize_for_log(url) (line 1454), which is a correctness improvement — the old code was inconsistent.
Was this helpful? React with 👍 or 👎 to provide feedback.
| chunks = [] | ||
| current_size = 0 | ||
| # Optimization: Use 16KB chunks to reduce loop overhead/appends for large files | ||
| for chunk in r.iter_bytes(chunk_size=16 * 1024): |
There was a problem hiding this comment.
📝 Info: Chunk size now consistent across both code paths
The old 304 retry fallback path called r_retry.iter_bytes() without a chunk_size argument (defaulting to httpx's internal default), while the main 200 OK path used chunk_size=16 * 1024. The new shared function uses chunk_size=16 * 1024 for both paths (main.py:1480), making behavior consistent. This is a minor improvement in the retry path.
Was this helpful? React with 👍 or 👎 to provide feedback.
| cl = r.headers.get("Content-Length") | ||
| if cl: | ||
| try: | ||
| if int(cl) > MAX_RESPONSE_SIZE: |
There was a problem hiding this comment.
🚩 MAX_RESPONSE_SIZE is defined twice in main.py
Pre-existing issue: MAX_RESPONSE_SIZE is defined at both main.py:892 (10MB) and main.py:1068 (10MB). Both values are identical so there's no runtime impact, but the duplication is unnecessary and could lead to divergence if one is updated without the other. Not introduced by this PR but worth noting since the extracted function depends on this constant.
Was this helpful? React with 👍 or 👎 to provide feedback.
Co-authored-by: abhimehro <84992105+abhimehro@users.noreply.github.com>
There was a problem hiding this comment.
Code Health Improved
(1 files improve in Code Health)
Gates Passed
6 Quality Gates Passed
See analysis details in CodeScene
View Improvements
| File | Code Health Impact | Categories Improved |
|---|---|---|
| main.py | 1.66 → 1.70 | Lines of Code in a Single File, Complex Method, Bumpy Road Ahead, Overall Code Complexity |
Quality Gate Profile: Pay Down Tech Debt
Install CodeScene MCP: safeguard and uplift AI-generated code. Catch issues early with our IDE extension and CLI tool.
|
|
||
| # _api_stats_lock, _api_get, _api_delete, _api_post, _api_post_form, | ||
| # retry_with_jitter, _retry_request imported from api_client above | ||
| def _parse_and_validate_response(r: httpx.Response, url: str) -> dict: |
There was a problem hiding this comment.
📝 Info: Function placement after unrelated import comment is slightly misleading
The new _parse_and_validate_response function is defined immediately after the comment # _api_stats_lock, _api_get, _api_delete, _api_post, _api_post_form, # retry_with_jitter, _retry_request imported from api_client above at main.py:1444-1445. This comment describes imports and is unrelated to the new function, making it read as if the function is part of that import block. A blank line or moving the comment would improve readability.
Was this helpful? React with 👍 or 👎 to provide feedback.
🎯 What: Extracted the repetitive
_parse_and_validate_responsehelper from_gh_getto handle content validation, size checking, and JSON parsing.💡 Why: The
_gh_getfunction was too long and complex due to nested try-catches and duplicate validation blocks for the 304 fallback branch. This extraction drastically reduces complexity and enhances readability and maintainability.✅ Verification: Verified by running the formatters/linters (
ruff format/ruff check) and the full test suite (uv run pytest), ensuring 100% of the tests passed including benchmark thresholds.✨ Result:
_gh_getis much shorter, more focused, and easier to follow without altering any core caching or HTTP behavior.