Harden local state storage boundaries#3433
Conversation
Constrain task and automation storage identifiers before joining them into local state paths, enable SQLite foreign-key enforcement on each state connection, create file-backed secrets with private Unix mode at first write, and keep explicit CODEWHALE_HOME isolated from legacy DeepSeek fallback state/config. Verification: - cargo fmt --all -- --check - cargo test -p codewhale-tui --bin codewhale-tui --locked artifact_writer_rejects_path_component_task_ids - cargo test -p codewhale-tui --bin codewhale-tui --locked rejects_storage_ids_with_path_components - cargo test -p codewhale-state --locked - cargo test -p codewhale-secrets --locked file_store - cargo test -p codewhale-config --locked explicit_codewhale_home_does_not_read_or_migrate_legacy - cargo test -p codewhale-config --locked migrate_config_reports_copied_legacy_path
There was a problem hiding this comment.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
There was a problem hiding this comment.
Code Review
This pull request introduces several security and robustness improvements across multiple crates. Key changes include adding path traversal validation for automation, run, and task IDs to prevent directory escape, enabling SQLite foreign key constraints to ensure cascading deletes of child rows, and restricting file permissions (0o600) when writing secrets on Unix. Additionally, configuration logic is refactored to skip legacy migrations when an explicit CODEWHALE_HOME is set. Feedback on these changes suggests performing an atomic write when saving secrets to prevent potential data corruption or loss of API keys if the write operation is interrupted.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| #[cfg(unix)] | ||
| { | ||
| use std::io::Write; | ||
| use std::os::unix::fs::OpenOptionsExt; | ||
| let mut file = fs::OpenOptions::new() | ||
| .create(true) | ||
| .truncate(true) | ||
| .write(true) | ||
| .mode(0o600) | ||
| .open(&self.path)?; | ||
| file.write_all(body.as_bytes())?; | ||
| } | ||
| #[cfg(not(unix))] | ||
| { | ||
| fs::write(&self.path, body)?; | ||
| } |
There was a problem hiding this comment.
Writing directly to the target secrets file (self.path) can lead to data corruption or loss of all stored API keys if the write operation is interrupted (e.g., due to a sudden power loss, disk full error, or process termination). Since this file contains highly sensitive credentials, it is much safer to perform an atomic write by writing to a temporary file in the same directory and then renaming it to the target path.
let tmp_path = self.path.with_extension("tmp");
#[cfg(unix)]
{
use std::io::Write;
use std::os::unix::fs::OpenOptionsExt;
let mut file = fs::OpenOptions::new()
.create(true)
.truncate(true)
.write(true)
.mode(0o600)
.open(&tmp_path)?;
file.write_all(body.as_bytes())?;
}
#[cfg(not(unix))]
{
fs::write(&tmp_path, &body)?;
}
fs::rename(&tmp_path, &self.path)?;Resolve caller-supplied PR attempt task IDs through TaskManager before writing the diff artifact so the write path proves the durable task exists and uses the canonical task ID. Keep the no-manager active-task fallback path path-component validated, and add a regression that a missing manager task fails before an artifact directory is created. Verification: - cargo fmt --all -- --check - git diff --check - cargo test -p codewhale-tui --bin codewhale-tui --locked task_artifact_helper_requires_existing_manager_task_before_write - cargo test -p codewhale-tui --bin codewhale-tui --locked artifact_writer_rejects_path_component_task_ids
There was a problem hiding this comment.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
Goal
Tighten local state/storage boundaries for the v0.8.65 release lane without changing public workflow behavior.
Changes
pr_attempt_recordtask IDs throughTaskManagerbefore writing the diff artifact, so caller-supplied IDs must identify an existing durable task and the artifact path uses the canonical task ID.CODEWHALE_HOMEas an isolation boundary for generic state/config fallback and migration.Verification
cargo fmt --all -- --checkgit diff --checkcargo test -p codewhale-tui --bin codewhale-tui --locked artifact_writer_rejects_path_component_task_idscargo test -p codewhale-tui --bin codewhale-tui --locked task_artifact_helper_requires_existing_manager_task_before_writecargo test -p codewhale-tui --bin codewhale-tui --locked rejects_storage_ids_with_path_componentscargo test -p codewhale-state --lockedcargo test -p codewhale-secrets --locked file_storecargo test -p codewhale-config --locked explicit_codewhale_home_does_not_read_or_migrate_legacycargo test -p codewhale-config --locked migrate_config_reports_copied_legacy_pathRisks
CODEWHALE_HOMEnow bypasses legacy~/.deepseekconfig/state fallback entirely. That matches the isolation contract, while default$HOMEinstalls still preserve legacy migration/backfill behavior.Issue
Part of the v0.8.65 local-state hardening work.