[spark-compete] fix: respect explicit empty SPARK_HOME instead of writing sandbox paths under the current working directory#719
Conversation
…ting sandbox paths under the current working directory
QA write-up -- for reviewer eyesTL;DR. Bug# src/spark_cli/sandbox/paths.py:20-21 (pre-patch)
def spark_home() -> Path:
return Path(os.environ.get("SPARK_HOME", Path.home() / ".spark")).expanduser()
Fix# src/spark_cli/sandbox/paths.py:20-29 (post-patch)
def spark_home() -> Path:
# Honor missing-vs-explicit-empty: only fall back to the default when
# SPARK_HOME is unset, not when the operator set it to an empty string.
# os.environ.get("SPARK_HOME", default) returns "" (not the default)
# when SPARK_HOME == '', and Path("").expanduser() resolves to the
# current working directory, which is surprising and unsafe.
configured = os.environ.get("SPARK_HOME")
if not configured:
return (Path.home() / ".spark").expanduser()
return Path(configured).expanduser()
Before / after3/3 boundary cases. Empty and unset coalesce to the default; non-empty explicit value is honored unchanged. VerificationCompiles clean before and after. Scope
Why this matters
|
{
"schema": "spark-compete-hotfix-v1",
"event": "spark-compete-first-event",
"submission_mode": "public_repo_pr",
"submission_target_url": "#719",
"team": {
"name": "SparkThisUp",
"members": ["ValHallaBuilder", "Baz707", "DanFireDash"],
"github_accounts": ["4gjnbzb4zf-sudo"],
"llm_device_holder": "ValHallaBuilder",
"device_holder_github": "4gjnbzb4zf-sudo"
},
"target_repo": {
"id": "vibeforge1111/spark-cli",
"source": "https://github.com/vibeforge1111/spark-cli",
"owner_surface": "spark-cli"
},
"issue": {
"type": "bug",
"severity": "low",
"title": "sandbox/paths.py spark_home() resolves to the current working directory when the operator exports SPARK_HOME='' because os.environ.get(K, default) returns the empty string instead of the default",
"actual_behavior": "src/spark_cli/sandbox/paths.py:20 defines
def spark_home(): return Path(os.environ.get('SPARK_HOME', Path.home() / '.spark')).expanduser(). The two-argument form ofos.environ.getonly returns the default when the key is missing from os.environ; when the key is present but the value is the empty string (operator-explicitexport SPARK_HOME=, common after a stale shellrc clear or a docker-composeenvironment: SPARK_HOME: ''block), the function returns''.Path('').expanduser()then returnsPath('.'), and the first downstream caller that does.resolve()(or that joins onto it and resolves the result) anchors the spark home at whatever directory spark was launched from. sandbox_config_dir() and sandbox_log_dir() both compose onto spark_home(), so 'config' and 'logs/remote' get written under the operator's cwd instead of under ~/.spark.","expected_behavior": "When SPARK_HOME is set to the empty string (operator-explicit) OR when SPARK_HOME is unset, spark_home() returns the standard default of
~/.spark(since empty string is not a meaningful spark-home path). When SPARK_HOME is set to a non-empty path, that path is honored exactly as before.","repro_steps": [
"gh pr checkout ",
"python3 -c "import os; from pathlib import Path; os.environ['SPARK_HOME']=''; v = os.environ.get('SPARK_HOME', Path.home() / '.spark'); p = Path(v).expanduser(); print('pre-patch:', repr(p), '->', p.resolve())" -> prints 'pre-patch: PosixPath('.') -> /tmp' (resolves to whatever cwd happens to be).",
"python3 -c "import sys; sys.path.insert(0, 'src'); import os; os.environ['SPARK_HOME']=''; from spark_cli.sandbox.paths import spark_home; print('post-patch:', spark_home())" -> prints 'post-patch: /home//.spark' (matches the unset default).",
"python3 -c "import sys; sys.path.insert(0, 'src'); import os; os.environ.pop('SPARK_HOME', None); from spark_cli.sandbox.paths import spark_home; print('post-patch unset:', spark_home())" -> prints 'post-patch unset: /home//.spark' (default unchanged).",
"python3 -c "import sys; sys.path.insert(0, 'src'); import os; os.environ['SPARK_HOME']='/tmp/spark-custom'; from spark_cli.sandbox.paths import spark_home; print('post-patch set:', spark_home())" -> prints 'post-patch set: /tmp/spark-custom' (explicit non-empty unchanged).",
"python3 -m py_compile src/spark_cli/sandbox/paths.py -> compiles cleanly before and after the patch."
],
"affected_workflow": "spark_home() is the canonical anchor for the sandbox subsystem: sandbox_config_dir(home) returns
home / 'config', sandbox_log_dir(home) returnshome / 'logs' / 'remote'. When the operator clears SPARK_HOME and runsspark sandbox docker doctor(or any sandbox command) from a checkout directory, the doctor probe writes its temp / cache state under that checkout dir instead of under ~/.spark. Subsequentspark verifyinvocations from a different directory then can't find the previously-written sandbox config and report 'not configured', confusing the operator who thinks they have a working sandbox install. The downstream chip-labs and spawner-ui pipelines that read sandbox config (via the same spark_home() helper indirectly through sandbox_config_dir) also lose track of the operator's actual configuration. Honoring the empty-string env case makes the sandbox subsystem robust against operator-explicit env clears."},
"evidence": {
"safe_links_only": true,
"before_after_proof": "Diff in src/spark_cli/sandbox/paths.py replaces the inline
os.environ.get('SPARK_HOME', Path.home() / '.spark')with an explicit capture-and-falsy-check that treats both None (unset) and '' (operator-explicit empty) as 'use the default'. Non-empty explicit values continue to be honored. The change is local to the spark_home() helper; sandbox_config_dir() and sandbox_log_dir() both consume it unchanged. python3 -m py_compile is clean before and after.","links": ["https://github.com//pull/719"],
"forbidden": ["pdf","zip","exe","unknown downloads","shortened links","archives","binaries","tokens","browser cookies","wallet material","raw logs","raw conversations","raw memory","raw patches","private repo maps","private scoring details"]
},
"proposed_fix": {
"approach": "Replace
Path(os.environ.get('SPARK_HOME', Path.home() / '.spark')).expanduser()with an explicit capture: readconfigured = os.environ.get('SPARK_HOME'); ifnot configured(covers both None for unset and '' for operator-explicit empty), return(Path.home() / '.spark').expanduser(); otherwise returnPath(configured).expanduser(). Single function, single file. Empty string is not a meaningful spark-home path, so coalescing it with the unset case matches operator intent.","files_expected": ["src/spark_cli/sandbox/paths.py"],
"tests_or_smoke": "Python reproducer: pre-patch SPARK_HOME='' -> Path('.') -> .resolve() == cwd (bug); post-patch SPARK_HOME='' -> ~/.spark (default). Unset and non-empty-explicit cases are unchanged. python3 -m py_compile clean."
},
"pr": {
"branch": "sentinel/truthy-empty-env/spark-home-empty",
"title_prefix": "[spark-compete]",
"author_github": "4gjnbzb4zf-sudo",
"body_must_include": ["packet","team","pr_author","repo","actual_behavior","expected_behavior","repro_steps","before_after_proof","tests_or_smoke","duplicate_notes","risk_notes","review_claim"],
"url": "#719"
},
"review_claim": {
"impact_claim": "low",
"evidence_types": ["redacted_terminal_excerpt"],
"duplicate_notes": "Pre-flight
gh pr list --repo vibeforge1111/spark-cli --state open --search 'SPARK_HOME'returned PRs #79, #155, #194, #346, #581, #306, #314, #484 and others -- all about SPARK_HOME write-deny prefixes, secret-ref restrictions, or resolve-from-spark-home for desktop-fallback. None touch the empty-string-vs-unset semantics of spark_home() in sandbox/paths.py. No overlap with this PR.","risk_notes": "Local scope: one helper, one file, ~9 lines added. Behavior change is narrow: SPARK_HOME='' now resolves to the same default as unset (instead of resolving to cwd). Unset SPARK_HOME continues to default to ~/.spark (unchanged). Non-empty SPARK_HOME continues to be honored exactly as before. The module-level
SPARK_HOMEconstant in src/spark_cli/cli.py:54 still uses the two-argos.environ.get('SPARK_HOME', Path.home() / '.spark')form -- this PR intentionally limits scope to the sandbox helper which is the higher-traffic, runtime-evaluated path; the module-level constant is evaluated once at import time. python3 -m py_compile clean before and after.","review_state_requested": "pr_review"
}
}