[spark-compete] Narrow bare except in 4 filesystem-walker helpers in system_map.py to OSError#536
Conversation
… OSError inspect_file_metadata (L1387, stat()), count_files_under (L1513, rglob), count_schema_files (L1543, glob), and summarize_memory_run_artifacts (L2229, iterdir + child.stat()) all wrap a single Path-traversal call in try/except Exception and surface the failure as out['error']. Each helper's try body contains only stdlib Path traversal: stat(), rglob(), glob(), iterdir(). The only exceptions those can raise are OSError subclasses (PermissionError when a subtree is denied, FileNotFoundError when the path vanishes mid-walk, NotADirectoryError when the path is replaced with a regular file). The bare Exception swallows any unrelated runtime error (NameError from a refactor, TypeError from a logic bug) into the same out['error'] field; the operator's system-map snapshot reports a filesystem error and they assume the directory is broken when really a programming bug in the helper hid the real failure. Narrow tuple is exhaustive for stdlib filesystem traversal; except body is byte-identical pre/post for all four helpers. Four focused tests use unittest.mock.patch to inject PermissionError into stat/rglob/glob/iterdir and assert each helper still routes through the narrow tuple with a non-generic class name.
QA write-up — for reviewer eyesTL;DR. Four filesystem-walker helpers in Beforedef inspect_file_metadata(path: Path) -> dict[str, Any]:
...
try:
stat = path.stat()
except Exception as exc:
out["error"] = f"{type(exc).__name__}: {exc}"
return out
...
def count_files_under(path: Path, *, max_files: int = 5000) -> dict[str, Any]:
...
try:
for child in path.rglob("*"):
...
except Exception as exc:
out["error"] = f"{type(exc).__name__}: {exc}"
return out
...
def count_schema_files(path: Path, *, max_files: int = 500) -> dict[str, Any]:
...
try:
for child in sorted(path.glob("*.schema.json"), ...):
...
except Exception as exc:
...
def summarize_memory_run_artifacts(builder_home: Path) -> dict[str, Any]:
...
try:
for prefix in prefixes:
runs = [child for child in artifacts.iterdir() if ...]
latest_mtime = max((child.stat().st_mtime for child in runs), default=None)
...
except Exception as exc:
out["error"] = f"{type(exc).__name__}: {exc}"
return outIf a typo or refactor lands a NameError or AttributeError inside one of these try bodies, the operator sees After except OSError as exc:
out["error"] = f"{type(exc).__name__}: {exc}"Same SmokeFour unittest.mock.patch tests pass: What didn't changeOne file changed for the fix (four lines), plus tests. Except bodies byte-identical pre/post for all four helpers. The |
|
Thanks, this was useful. We maintainer-adopted this exception-boundary improvement in #603 and merged it into What shipped:
Verification on the adopted PR:
Source credit is recorded for this PR, and #603 itself should not receive participant credit. Points are still gated because this PR body had Closing this PR as adopted into #603. No code action is needed unless you believe #603 missed part of the issue. |
|
R22/R23 final scoring update: this PR did receive final R22/R23 public leaderboard credit. Earlier gate, credit-review, or points-lock wording described the pre-final review state, not a final rejection or permanent zero. Final R22/R23 credit for this PR: 48 points. Thanks for the contribution. |
{
"schema": "spark-compete-hotfix-v1",
"event": "spark-compete-first-event",
"submission_mode": "public_repo_pr",
"submission_target_url": "#536",
"team": {
"name": "SparkThisUp",
"members": ["ValHallaBuilder", "Baz707", "DanFireDash"],
"github_accounts": ["4gjnbzb4zf-sudo"],
"llm_device_holder": "ValHallaBuilder",
"device_holder_github": "4gjnbzb4zf-sudo"
},
"target_repo": {
"id": "vibeforge1111/spark-cli",
"source": "https://github.com/vibeforge1111/spark-cli",
"owner_surface": "spark-cli"
},
"issue": {
"type": "bug",
"severity": "low",
"title": "Four filesystem-walker helpers in system_map.py swallow programming bugs as generic Exception while reporting them as a filesystem error",
"actual_behavior": "src/spark_cli/system_map.py has four small filesystem-traversal helpers used to fill out the operator-facing system-map snapshot: inspect_file_metadata (L1377, single stat() call), count_files_under (L1488, rglob walker), count_schema_files (L1523, glob walker), and summarize_memory_run_artifacts (L2198, iterdir + per-child stat). Each helper wraps a single stdlib Path traversal call in
try: ... except Exception as exc: out['error'] = f'{type(exc).__name__}: {exc}'. The try body never calls anything beyond stdlib filesystem APIs (stat / rglob / glob / iterdir / Path.is_dir / Path.is_file / Path.startswith / Counter.update), all of which raise only OSError subclasses. The bare Exception folds unrelated runtime errors (NameError from a refactor, TypeError from a logic bug, AttributeError from a typo) into the same 'filesystem error' field; the operator sees out['error'] in the system-map snapshot and assumes a directory is broken or unreadable when really a programming bug in the helper hid the real failure.","expected_behavior": "The same out['error'] short-circuit fires only for the documented failure modes: PermissionError when a subtree is denied, FileNotFoundError when the path vanishes mid-walk, NotADirectoryError when the path is replaced with a regular file, and the broader OSError parent. Programming bugs propagate as a real traceback.",
"repro_steps": [
"gh pr checkout ",
"PYTHONPATH=src python3 -m unittest tests.test_system_map.SparkSystemMapTests.test_inspect_file_metadata_records_os_error_when_stat_fails tests.test_system_map.SparkSystemMapTests.test_count_files_under_records_os_error_when_rglob_fails tests.test_system_map.SparkSystemMapTests.test_count_schema_files_records_os_error_when_glob_fails tests.test_system_map.SparkSystemMapTests.test_summarize_memory_run_artifacts_records_os_error_when_iterdir_fails -> 4 tests OK; each asserts a PermissionError still routes through the narrow OSError tuple with a non-generic class name.",
"Subclass check: PYTHONPATH=src python3 -c "print(issubclass(PermissionError, OSError), issubclass(FileNotFoundError, OSError), issubclass(NotADirectoryError, OSError))" -> True True True.",
"Inverse: drop a Python-side typo into one of the try bodies (e.g. rename
extension_counts[suffix] += 1toextension_counts[suffix] +1); before the change, the SyntaxError or AttributeError gets swallowed and the operator seesout['error']: ...as if the directory were unreadable. After the change, the AttributeError propagates so the operator sees the real bug."],
"affected_workflow": "system_map.py is the system-map compile module driven by
spark report system-map. The four helpers feed the per-section snapshot the operator looks at to confirm Spark's local layout: inspect_file_metadata reports per-config-file size and modified-time, count_files_under reports overall directory shape and extension distribution, count_schema_files reports declared schemas under the schemas directory, and summarize_memory_run_artifacts reports memory-QA run counts and timestamps. When any one of these silently swallows a programming bug, the operator sees a filesystem error in the section it produced and assumes the local directory is broken -- they never see the real Python traceback that would point at the broken helper, so they spend time auditing filesystem permissions instead of reading the code."},
"evidence": {
"safe_links_only": true,
"before_after_proof": "Four paired try/except patterns. Before:
except Exception as exc: out['error'] = f'{type(exc).__name__}: {exc}'. After:except OSError as exc: out['error'] = f'{type(exc).__name__}: {exc}'. Except bodies byte-identical; only the catch class changes. OSError covers PermissionError, FileNotFoundError, NotADirectoryError, IsADirectoryError, and FileExistsError, which are the only classes stdlib Path traversal (stat / rglob / glob / iterdir) can raise. Diff: 4 insertions, 4 deletions in src/spark_cli/system_map.py plus 70 insertions in tests/test_system_map.py for four focused unittest.mock.patch tests.","links": ["https://github.com//pull/536"],
"forbidden": ["pdf","zip","exe","unknown downloads","shortened links","archives","binaries","tokens","browser cookies","wallet material","raw logs","raw conversations","raw memory","raw patches","private repo maps","private scoring details"]
},
"proposed_fix": {
"approach": "Replace
except Exception as exc:withexcept OSError as exc:in all four helpers. The narrow class is exhaustive for stdlib Path traversal: stat / rglob / glob / iterdir / Path.is_dir / Path.is_file all raise only OSError subclasses, and the helpers do nothing else inside their try blocks beyond those traversal calls plus pure-Python dict/counter updates. The except bodies are byte-identical pre/post. Four unittest.mock.patch tests are added in tests/test_system_map.py; each test injects PermissionError into stat / rglob / glob / iterdir respectively and asserts the helper still routes through the narrow tuple with a non-generic class name.","files_expected": ["src/spark_cli/system_map.py", "tests/test_system_map.py"],
"tests_or_smoke": "Manual smoke: missing-path inputs still return early via the .exists() fast-path (no traceback, no error key). The four new unittest.mock.patch tests pass: PYTHONPATH=src python3 -m unittest tests.test_system_map.SparkSystemMapTests.test_inspect_file_metadata_records_os_error_when_stat_fails tests.test_system_map.SparkSystemMapTests.test_count_files_under_records_os_error_when_rglob_fails tests.test_system_map.SparkSystemMapTests.test_count_schema_files_records_os_error_when_glob_fails tests.test_system_map.SparkSystemMapTests.test_summarize_memory_run_artifacts_records_os_error_when_iterdir_fails -> 4 tests OK. AST parse of system_map.py is clean."
},
"pr": {
"branch": "sentinel/exception-narrow/system-map-filesystem-walkers",
"title_prefix": "[spark-compete]",
"author_github": "4gjnbzb4zf-sudo",
"body_must_include": ["packet","team","pr_author","repo","actual_behavior","expected_behavior","repro_steps","before_after_proof","tests_or_smoke","duplicate_notes","risk_notes","review_claim"],
"url": "#536"
},
"review_claim": {
"impact_claim": "low",
"evidence_types": ["redacted_terminal_excerpt"],
"duplicate_notes": "Pre-flight
gh pr list --search 'inspect_file_metadata OR count_schema_files OR summarize_memory_run_artifacts narrow'returned zero hits. Our own #526, #532, #533, #534, #535 narrowed other categories in system_map.py (six sqlite helpers, read_json + read_toml, inspect_spawner_prd_auto_trace, git_summary + run_git, inspect_builder_trace_groups + inspect_builder_trace_health). These four filesystem-walker helpers (L1387, L1513, L1543, L2229) are a distinct category and were not touched by any prior PR. No competitor PR touches them either.","risk_notes": "Local scope: one file changed for the fix (four lines), plus tests. The except bodies are byte-identical pre/post for all four helpers. The .exists() fast-path is unchanged in each. The OSError narrow is exhaustive for stdlib Path traversal -- all the exceptions stat / rglob / glob / iterdir can raise are OSError subclasses, and the try bodies do nothing else beyond pure-Python aggregation. The only observable behavior change is that programming bugs no longer get silently swallowed.",
"review_state_requested": "pr_review"
}
}