Skip to content

Harden local state storage boundaries#3433

Draft
Hmbown wants to merge 2 commits into
mainfrom
codex/v0.8.65-storage-hardening
Draft

Harden local state storage boundaries#3433
Hmbown wants to merge 2 commits into
mainfrom
codex/v0.8.65-storage-hardening

Conversation

@Hmbown

@Hmbown Hmbown commented Jun 23, 2026

Copy link
Copy Markdown
Owner

Goal

Tighten local state/storage boundaries for the v0.8.65 release lane without changing public workflow behavior.

Changes

  • Reject path-shaped task IDs before writing task artifacts.
  • Resolve pr_attempt_record task IDs through TaskManager before writing the diff artifact, so caller-supplied IDs must identify an existing durable task and the artifact path uses the canonical task ID.
  • Reject path-shaped automation/run IDs before reading, writing, listing, or deleting automation storage.
  • Enable SQLite foreign-key enforcement for every state-store connection so thread deletion cascades to child rows.
  • Create file-backed secret files with private Unix mode at first write.
  • Treat explicit CODEWHALE_HOME as an isolation boundary for generic state/config fallback and migration.

Verification

  • cargo fmt --all -- --check
  • git diff --check
  • cargo test -p codewhale-tui --bin codewhale-tui --locked artifact_writer_rejects_path_component_task_ids
  • cargo test -p codewhale-tui --bin codewhale-tui --locked task_artifact_helper_requires_existing_manager_task_before_write
  • cargo test -p codewhale-tui --bin codewhale-tui --locked rejects_storage_ids_with_path_components
  • cargo test -p codewhale-state --locked
  • cargo test -p codewhale-secrets --locked file_store
  • cargo test -p codewhale-config --locked explicit_codewhale_home_does_not_read_or_migrate_legacy
  • cargo test -p codewhale-config --locked migrate_config_reports_copied_legacy_path

Risks

  • Explicit CODEWHALE_HOME now bypasses legacy ~/.deepseek config/state fallback entirely. That matches the isolation contract, while default $HOME installs still preserve legacy migration/backfill behavior.
  • Automation/task IDs are now required to be single path components. Existing generated IDs already satisfy this.

Issue

Part of the v0.8.65 local-state hardening work.

Constrain task and automation storage identifiers before joining them into local state paths, enable SQLite foreign-key enforcement on each state connection, create file-backed secrets with private Unix mode at first write, and keep explicit CODEWHALE_HOME isolated from legacy DeepSeek fallback state/config.

Verification:

- cargo fmt --all -- --check

- cargo test -p codewhale-tui --bin codewhale-tui --locked artifact_writer_rejects_path_component_task_ids

- cargo test -p codewhale-tui --bin codewhale-tui --locked rejects_storage_ids_with_path_components

- cargo test -p codewhale-state --locked

- cargo test -p codewhale-secrets --locked file_store

- cargo test -p codewhale-config --locked explicit_codewhale_home_does_not_read_or_migrate_legacy

- cargo test -p codewhale-config --locked migrate_config_reports_copied_legacy_path

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces several security and robustness improvements across multiple crates. Key changes include adding path traversal validation for automation, run, and task IDs to prevent directory escape, enabling SQLite foreign key constraints to ensure cascading deletes of child rows, and restricting file permissions (0o600) when writing secrets on Unix. Additionally, configuration logic is refactored to skip legacy migrations when an explicit CODEWHALE_HOME is set. Feedback on these changes suggests performing an atomic write when saving secrets to prevent potential data corruption or loss of API keys if the write operation is interrupted.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread crates/secrets/src/lib.rs
Comment on lines +483 to +498
#[cfg(unix)]
{
use std::io::Write;
use std::os::unix::fs::OpenOptionsExt;
let mut file = fs::OpenOptions::new()
.create(true)
.truncate(true)
.write(true)
.mode(0o600)
.open(&self.path)?;
file.write_all(body.as_bytes())?;
}
#[cfg(not(unix))]
{
fs::write(&self.path, body)?;
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Writing directly to the target secrets file (self.path) can lead to data corruption or loss of all stored API keys if the write operation is interrupted (e.g., due to a sudden power loss, disk full error, or process termination). Since this file contains highly sensitive credentials, it is much safer to perform an atomic write by writing to a temporary file in the same directory and then renaming it to the target path.

        let tmp_path = self.path.with_extension("tmp");
        #[cfg(unix)]
        {
            use std::io::Write;
            use std::os::unix::fs::OpenOptionsExt;
            let mut file = fs::OpenOptions::new()
                .create(true)
                .truncate(true)
                .write(true)
                .mode(0o600)
                .open(&tmp_path)?;
            file.write_all(body.as_bytes())?;
        }
        #[cfg(not(unix))]
        {
            fs::write(&tmp_path, &body)?;
        }
        fs::rename(&tmp_path, &self.path)?;

Resolve caller-supplied PR attempt task IDs through TaskManager before writing the diff artifact so the write path proves the durable task exists and uses the canonical task ID.

Keep the no-manager active-task fallback path path-component validated, and add a regression that a missing manager task fails before an artifact directory is created.

Verification:

- cargo fmt --all -- --check

- git diff --check

- cargo test -p codewhale-tui --bin codewhale-tui --locked task_artifact_helper_requires_existing_manager_task_before_write

- cargo test -p codewhale-tui --bin codewhale-tui --locked artifact_writer_rejects_path_component_task_ids

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant