Harden local state storage boundaries#3438
Conversation
Confine task artifact and automation/run storage IDs before path construction, enable SQLite foreign keys for every state connection, create file-backed secrets with private mode at first write, and keep explicit CODEWHALE_HOME isolated from ambient legacy state. Regression coverage exercises path-like task IDs, automation and run IDs, thread deletion cascades, secure file-store round trips, and explicit home isolation while preserving default legacy backfill/migration behavior. Verification: - cargo fmt --all -- --check - git diff --check - cargo test -p codewhale-tui --bin codewhale-tui --locked write_task_artifact_rejects_path_like_task_ids - cargo test -p codewhale-tui --bin codewhale-tui --locked automation_ - cargo test -p codewhale-state --locked - cargo test -p codewhale-secrets --locked - cargo test -p codewhale-config --locked - cargo test -p codewhale-tui --bin codewhale-tui --locked
There was a problem hiding this comment.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
There was a problem hiding this comment.
Code Review
This pull request introduces isolation support via the CODEWHALE_HOME environment variable, enables SQLite foreign keys with cascading deletes, secures file writing permissions in the keyring store, and adds path traversal validation for storage IDs. Review feedback suggests ensuring default_tasks_dir() also respects CODEWHALE_HOME to maintain isolation, resolving a potential TOCTOU race condition in the keyring store by operating directly on the file descriptor, and deduplicating the path validation logic across modules.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| } | ||
|
|
||
| fn write_artifact(&self, task_id: &str, label: &str, content: &str) -> Result<PathBuf> { | ||
| validate_task_storage_id(task_id)?; |
There was a problem hiding this comment.
The default_tasks_dir() function (defined around line 1695) does not check the CODEWHALE_HOME environment variable, unlike default_automations_dir() in automation_manager.rs and default_state_db_path() in state/src/lib.rs. This breaks the isolation boundary of CODEWHALE_HOME for background tasks, causing tasks from different isolated environments to potentially collide in ~/.codewhale/tasks.
Please update default_tasks_dir() to check CODEWHALE_HOME:
pub fn default_tasks_dir() -> PathBuf {
if let Ok(path) = std::env::var("DEEPSEEK_TASKS_DIR")
&& !path.trim().is_empty()
{
return PathBuf::from(path);
}
if let Some(home) = std::env::var_os("CODEWHALE_HOME").filter(|value| !value.is_empty()) {
return PathBuf::from(home).join("tasks");
}
dirs::home_dir()
.map(|home| default_tasks_dir_for_home(&home))
.unwrap_or_else(|| PathBuf::from(".codewhale").join("tasks"))
}| if let Ok(meta) = fs::metadata(&self.path) { | ||
| let mut perms = meta.permissions(); | ||
| perms.set_mode(0o600); | ||
| let _ = fs::set_permissions(&self.path, perms); | ||
| } |
There was a problem hiding this comment.
To avoid a potential Time-of-Check to Time-of-Use (TOCTOU) race condition and unnecessary path resolution syscalls, perform the metadata and permission operations directly on the opened file descriptor rather than querying the file path again.
| if let Ok(meta) = fs::metadata(&self.path) { | |
| let mut perms = meta.permissions(); | |
| perms.set_mode(0o600); | |
| let _ = fs::set_permissions(&self.path, perms); | |
| } | |
| if let Ok(meta) = file.metadata() { | |
| let mut perms = meta.permissions(); | |
| perms.set_mode(0o600); | |
| let _ = file.set_permissions(perms); | |
| } |
| validate_storage_component("task id", task_id) | ||
| } | ||
|
|
||
| fn validate_storage_component(label: &str, value: &str) -> Result<()> { |
| fn validate_storage_id(label: &str, value: &str) -> Result<()> { | ||
| let value = value.trim(); | ||
| if value.is_empty() { | ||
| bail!("{label} must not be empty"); | ||
| } | ||
| if value.contains('/') || value.contains('\\') { | ||
| bail!("{label} must not contain path separators: {value}"); | ||
| } | ||
| let mut components = Path::new(value).components(); | ||
| let Some(component) = components.next() else { | ||
| bail!("{label} must not be empty"); | ||
| }; | ||
| if components.next().is_some() { | ||
| bail!("{label} must be a single path component: {value}"); | ||
| } | ||
| if !matches!(component, std::path::Component::Normal(_)) { | ||
| bail!("{label} must be a normal path component: {value}"); | ||
| } | ||
| Ok(()) | ||
| } |
There was a problem hiding this comment.
|
Closing this duplicate branch in favor of #3433. #3433 already carries the same local-state hardening scope and now also includes the stricter Relevant verification preserved on #3433:
|
Summary
CODEWHALE_HOMEisolated from ambient legacy state/config while preserving default-home legacy backfill and migrationVerification
cargo fmt --all -- --checkgit diff --checkcargo test -p codewhale-tui --bin codewhale-tui --locked write_task_artifact_rejects_path_like_task_idscargo test -p codewhale-tui --bin codewhale-tui --locked automation_cargo test -p codewhale-state --lockedcargo test -p codewhale-secrets --lockedcargo test -p codewhale-config --lockedcargo test -p codewhale-tui --bin codewhale-tui --locked(5117 passed, 2 ignored)Risk
todo/task/automation behavior is preserved for normal UUID IDs. Path-like IDs now fail before storage paths are built.$HOME/.codewhaleinstalls still read/migrate legacy.deepseekstate; only explicitCODEWHALE_HOMEis treated as an isolation boundary.