🛡️ Sentinel: [security improvement] Sanitize Content-Type in ValueError messages#783
🛡️ Sentinel: [security improvement] Sanitize Content-Type in ValueError messages#783abhimehro wants to merge 1 commit into
Conversation
|
Merging to
After your PR is submitted to the merge queue, this comment will be automatically updated with its status. If the PR fails, failure details will also be posted here |
There was a problem hiding this comment.
Gates Passed
6 Quality Gates Passed
See analysis details in CodeScene
Quality Gate Profile: Pay Down Tech Debt
Install CodeScene MCP: safeguard and uplift AI-generated code. Catch issues early with our IDE extension and CLI tool.
There was a problem hiding this comment.
Pull request overview
This PR mitigates potential log injection by sanitizing attacker-controlled Content-Type header values before embedding them in ValueError messages raised by the GitHub fetch helper (_gh_get) in main.py.
Changes:
- Sanitized
content_typein both_gh_getContent-Type validation error paths using the existingsanitize_for_log()helper. - Made the fallback (retry) branch’s error message consistent by also sanitizing the URL before logging/raising.
There was a problem hiding this comment.
Code Review
This pull request improves logging security in main.py by ensuring that url and content_type are sanitized using sanitize_for_log() before being included in ValueError messages. The reviewer pointed out that the Content-Type validation logic is duplicated in two separate locations within the _gh_get function and suggested refactoring it into a helper function or a module-level constant to improve maintainability.
| allowed_types = ["application/json", "text/json", "text/plain"] | ||
| if not any(t in content_type for t in allowed_types): | ||
| raise ValueError( | ||
| f"Invalid Content-Type from {url}: {content_type}. " | ||
| f"Invalid Content-Type from {sanitize_for_log(url)}: {sanitize_for_log(content_type)}. " | ||
| f"Expected one of: {', '.join(allowed_types)}" | ||
| ) |
There was a problem hiding this comment.
The logic for validating the Content-Type and raising a ValueError is duplicated in two places within _gh_get (here and around line 1594). This duplication can lead to inconsistencies if the allowed types or the error message format need to be updated in the future. Consider refactoring this into a helper function or at least moving allowed_types to a module-level constant to improve maintainability and ensure consistent validation across both code paths.
Severity: Medium
Vulnerability: External input (
content_type) was passed intoValueErrorstring without sanitization. Since exceptions are commonly logged, including raw, potentially attacker-controlled HTTP header data could lead to Log Injection attacks.Impact: Attackers could inject arbitrary log entries or control characters, confusing monitoring systems or hiding traces of malicious activity.
Fix: Used the existing
sanitize_for_log()function to sanitize thecontent_typevariables before embedding them in the exception message.Verification: Verified using
uv run pytest tests/test_content_type.pyand the full test suite.