SQLite log WAL grows rapidly from self-watched CODEX_HOME inotify events

This is a sanitized report for a severe SQLite WAL growth issue observed while running the `codext` fork. A related upstream Codex issue is open at https://github.com/openai/codex/issues/28997, but the newest evidence was collected from `codext`/vendored-`codex` processes, so this may be fork-specific or fork-amplified.

Additional sanitized findings from a deeper investigation:

The strongest evidence now points to a self-watch / self-log loop around `$CODEX_HOME`.

## Likely loop

1. Codex writes a diagnostic row to `logs_2.sqlite`.
2. SQLite updates `logs_2.sqlite-wal`.
3. A Codex inotify watcher sees `logs_2.sqlite-wal` modified.
4. Codex records a `TRACE` row for that inotify event into the same `logs_2.sqlite`.
5. That write modifies `logs_2.sqlite-wal` again.
6. Repeat.
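The steps above can be modelled as a toy queue. This is only a sketch of the suspected feedback, not Codex code: `simulate` and its parameters are hypothetical, and it assumes each logged row produces exactly one `MODIFY` event that is logged back as one `TRACE` row.

```python
from collections import deque


def simulate(initial_writes: int, max_rows: int, log_own_events: bool = True) -> int:
    """Count rows written to the log DB in a toy model of the loop.

    Assumption: one inserted row -> one WAL MODIFY event -> (optionally)
    one TRACE row describing that event, written to the same DB.
    """
    pending = deque(["diagnostic row"] * initial_writes)
    rows = 0
    while pending and rows < max_rows:
        pending.popleft()
        rows += 1  # steps 1-2: row inserted, WAL modified
        if log_own_events:
            # steps 3-4: watcher sees the WAL change and logs it to the same DB
            pending.append("TRACE inotify event")
    return rows


if __name__ == "__main__":
    print(simulate(1, max_rows=1000))                        # stops only at the cap
    print(simulate(1, max_rows=1000, log_own_events=False))  # stops after one row
```

In this model, a single diagnostic write never drains the queue unless events for the log database itself are dropped.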

## Important environment note

The newest evidence was collected while the affected session was being run through `codext`, which vendors/runs a `codex` binary. The process model involved a `codext` app-server plus remote `codext` sessions, each launching vendored `codex` processes. This may be fork-specific, fork-amplified, or caused by an interaction between upstream Codex file watching/logging code and `codext`'s app-server/remote process model.

## Observed state

Sanitized paths:

```text
CODEX_HOME=/home/<user>/.codex-work
SQLite DB=/home/<user>/.codex-work/logs_2.sqlite
SQLite WAL=/home/<user>/.codex-work/logs_2.sqlite-wal
```

After cleanup, the WAL grew back to roughly 12 GB:

```text
11,884,873,392  /home/<user>/.codex-work/logs_2.sqlite-wal
115,650,560     /home/<user>/.codex-work/logs_2.sqlite
23,101,440      /home/<user>/.codex-work/logs_2.sqlite-shm
```

Earlier samples taken during active sessions showed the WAL growing by roughly 35 MB/s:

```text
2,492,175,672 -> 2,842,709,392 bytes in 10 seconds
5,953,070,432 -> 6,316,977,672 bytes in 10 seconds
8,921,312,072 -> 9,101,376,672 bytes in 5 seconds
```
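For reference, the per-second rates implied by those three samples can be computed directly. The numbers are copied from the samples above; the helper names are illustrative only.

```python
# (start_bytes, end_bytes, seconds) copied from the growth samples above.
SAMPLES = [
    (2_492_175_672, 2_842_709_392, 10),
    (5_953_070_432, 6_316_977_672, 10),
    (8_921_312_072, 9_101_376_672, 5),
]


def growth_rate(start: int, end: int, seconds: int) -> float:
    """Return WAL growth in bytes per second for one sample."""
    return (end - start) / seconds


def rates_mb_per_s(samples):
    """Return growth rates in MB/s (1 MB = 10**6 bytes)."""
    return [growth_rate(*s) / 1e6 for s in samples]


if __name__ == "__main__":
    for (start, end, secs), rate in zip(SAMPLES, rates_mb_per_s(SAMPLES)):
        print(f"{start:,} -> {end:,} in {secs}s: {rate:.1f} MB/s")
```

All three samples land between 35 and 36.5 MB/s. If that rate were sustained, it would be on the order of 125 to 130 GB per hour.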

This happened after a previous stale/closed WAL had reached 219 GB and filled the `/home` filesystem.

## SQLite log-table evidence

Counts from the affected `logs_2.sqlite`:

```text
total_rows=47,565
TRACE=42,794
INFO=3,533
DEBUG=864
WARN=356
ERROR=18
inotify_rows=39,532
logs_2.sqlite-wal mentions=28,727
```

Top repeated inotify messages:

```text
28,699  inotify event: Event { wd: WatchDescriptor { id: 1, fd: (Weak) }, mask: EventMask(MODIFY), cookie: 0, name: Some("logs_2.sqlite-wal") }
 8,049  inotify event: Event { wd: WatchDescriptor { id: 1, fd: (Weak) }, mask: EventMask(MODIFY), cookie: 0, name: Some("logs_2.sqlite") }
   147  inotify event: Event { wd: WatchDescriptor { id: 1, fd: (Weak) }, mask: EventMask(MODIFY), cookie: 0, name: Some("state_5.sqlite-wal") }
```
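Counts like these can be reproduced with a short script. The sketch below assumes a table named `logs` with `level` and `message` columns; those names are guesses, not the confirmed schema of `logs_2.sqlite`, so check the real schema (for example with `.schema` in the `sqlite3` shell) and pass the actual names.

```python
import sqlite3


def summarize_logs(conn, table="logs", level_col="level", msg_col="message"):
    """Return (rows per level, inotify row count, WAL-mention count).

    The default table/column names are ASSUMPTIONS, not the known schema.
    Identifiers are interpolated into the SQL, so pass trusted names only.
    """
    by_level = dict(conn.execute(
        f"SELECT {level_col}, COUNT(*) FROM {table} GROUP BY {level_col}"
    ))
    inotify_rows = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {msg_col} LIKE 'inotify event:%'"
    ).fetchone()[0]
    # '_' is a single-character wildcard in LIKE; it still matches the literal
    # underscore here, so the slight over-match is harmless for this check.
    wal_mentions = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {msg_col} LIKE '%logs_2.sqlite-wal%'"
    ).fetchone()[0]
    return by_level, inotify_rows, wal_mentions
```

To avoid adding writes to an already affected database, it is probably safest to run this against a copy, or to open the file read-only with `sqlite3.connect("file:<path>?mode=ro", uri=True)`.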

There are also unrelated file-open watcher events for system files such as `ld.so.cache`, `locale.alias`, and `passwd`, which appear to come from a separate `/etc` watch. The disk-filling loop is the one involving `logs_2.sqlite*`.

## Kernel inotify evidence

The active affected process had an inotify file descriptor with:

```text
FD 30 anon_inode:inotify
  inotify wd:1 ino:c6860b ...
```

The inode maps to `$CODEX_HOME`:

```text
hex c6860b == decimal 13010443
13010443 /home/<user>/.codex-work
```
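The same check can be scripted. On Linux, `/proc/<pid>/fdinfo/<fd>` for an inotify descriptor lists one `inotify wd:… ino:…` line per watch, with both fields in hexadecimal. The sketch below parses that text and converts the inode to decimal; the function names are illustrative.

```python
import os
import re

# Matches lines such as: "inotify wd:1 ino:c6860b sdev:... mask:..."
INOTIFY_LINE = re.compile(r"^inotify wd:(?P<wd>[0-9a-f]+) ino:(?P<ino>[0-9a-f]+)")


def parse_inotify_watches(fdinfo_text: str):
    """Return (watch descriptor, inode) pairs, both as decimal integers."""
    watches = []
    for line in fdinfo_text.splitlines():
        match = INOTIFY_LINE.match(line.strip())
        if match:
            watches.append((int(match["wd"], 16), int(match["ino"], 16)))
    return watches


def inode_of(path: str) -> int:
    """Return the inode number of a path, for comparison with a watch inode."""
    return os.stat(path).st_ino


if __name__ == "__main__":
    print(parse_inotify_watches("inotify wd:1 ino:c6860b"))
```

Comparing `inode_of(os.environ["CODEX_HOME"])` against the parsed inode confirms whether watch descriptor 1 is rooted at `$CODEX_HOME`. The inode is only unique per filesystem, so the comparison assumes the watch and the path are on the same device.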

The repeated SQLite log rows also use `WatchDescriptor { id: 1, ... }`, and the row names are `logs_2.sqlite-wal` / `logs_2.sqlite`. That ties the logged file events directly to a watcher rooted at `$CODEX_HOME`, not just to a random project directory.

## Open file holders

The affected `codext`/vendored `codex` process held open handles to:

```text
/home/<user>/.codex-work/logs_2.sqlite
/home/<user>/.codex-work/logs_2.sqlite-wal
/home/<user>/.codex-work/logs_2.sqlite-shm
```

Other active `codext` app-server/remote processes held handles to a separate profile's `logs_2.sqlite-wal`.

## Process model clue

The process tree included:

```text
node .../bin/codext ... app-server --listen ws://127.0.0.1:<port>
.../codex ... app-server --listen ws://127.0.0.1:<port>
node .../bin/codext ... --remote ws://127.0.0.1:<port> -C /home/<user>/Projects/<repo>
.../codex ... --remote ws://127.0.0.1:<port> -C /home/<user>/Projects/<repo>
node .../bin/codext ... --sandbox danger-full-access --ask-for-approval never
.../codex ... --sandbox danger-full-access --ask-for-approval never
```

The exact project names have been omitted intentionally. Non-Codex dev-server processes running in one project did not hold handles to `logs_2.sqlite*`; only Codex/codext processes did.

## Trust/root clue

The current Git repo root resolved correctly to a project directory under `/home/<user>/Projects/<repo>`, but the UI reportedly displayed a trust warning for `/home/<user>`, not the project root. If a broad home directory becomes a trusted/watch surface, it can include `$CODEX_HOME` and therefore Codex's own SQLite state.

Even without that clue, the kernel inotify evidence above shows a watcher rooted at `/home/<user>/.codex-work`.

## Relevant source areas

These upstream files look relevant:

```text
codex-rs/file-watcher/src/lib.rs
codex-rs/app-server/src/fs_watch.rs
codex-rs/app-server/src/skills_watcher.rs
codex-rs/tui/src/onboarding/onboarding_screen.rs
```

Specifically:

- `file-watcher` can watch requested paths and can fall back to the nearest existing ancestor.
- app-server exposes `fs/watch`.
- skills roots can be watched recursively.
- trust onboarding falls back to `cwd` if no Git root is resolved.
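To illustrate why the ancestor fallback matters, here is a sketch of what "nearest existing ancestor" resolution could look like. It is not the upstream implementation (which is Rust); `nearest_existing_ancestor` is a hypothetical helper that only shows how a watch requested on a missing path under `$CODEX_HOME` could end up as a watch on `$CODEX_HOME` itself.

```python
from pathlib import Path


def nearest_existing_ancestor(path: str) -> Path:
    """Walk up from `path` until a component that exists is found."""
    current = Path(path)
    # Stop at the filesystem root, where parent == self.
    while not current.exists() and current != current.parent:
        current = current.parent
    return current
```

Under this model, a request to watch `$CODEX_HOME/<missing-dir>` resolves to `$CODEX_HOME`, which contains `logs_2.sqlite*`.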

## Expected behavior

Codex should not log file watcher events for its own SQLite diagnostic/state files into the same SQLite log sink.

At minimum, file watcher trace logging should suppress:

```text
logs_2.sqlite
logs_2.sqlite-wal
logs_2.sqlite-shm
state_5.sqlite
state_5.sqlite-wal
state_5.sqlite-shm
goals_1.sqlite*
memories_1.sqlite*
```
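A minimal sketch of such a suppression check, written in Python for brevity rather than as a patch to the Rust watcher. The pattern list is derived from the names above; `should_log_event` is a hypothetical helper, not an existing Codex function.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Glob patterns covering the main DB file plus its -wal/-shm siblings.
SUPPRESSED_PATTERNS = (
    "logs_2.sqlite*",
    "state_5.sqlite*",
    "goals_1.sqlite*",
    "memories_1.sqlite*",
)


def should_log_event(name: Optional[str]) -> bool:
    """Return False for watcher events that refer to Codex's own SQLite files."""
    if name is None:
        # Events on the watched directory itself carry no file name.
        return True
    return not any(fnmatchcase(name, pattern) for pattern in SUPPRESSED_PATTERNS)
```

Matching on exact versioned names means the list would need updating whenever a schema version suffix changes; a broader pattern such as `*.sqlite*` would avoid that, at the cost of hiding events for unrelated SQLite files.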

Better fixes:

1. Never watch `$CODEX_HOME` as a filesystem watch root unless explicitly required.
2. Never log watcher events for Codex's own SQLite files into the SQLite log sink.
3. Respect log-level filtering before inserting TRACE rows into `logs_2.sqlite`.
4. Add WAL size limits, checkpointing, rotation, or emergency safeguards so diagnostic logs cannot consume hundreds of GB.
5. If a broad home directory trust target is selected, exclude Codex state directories from any watch surfaces.
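For item 4, SQLite already has knobs that bound WAL size after a checkpoint. The sketch below uses the standard `journal_size_limit`, `wal_autocheckpoint`, and `wal_checkpoint(TRUNCATE)` pragmas through Python's `sqlite3` module. The function names and the 64 MiB default are illustrative, and this is not how Codex currently configures its log database as far as this report can tell.

```python
import sqlite3


def open_bounded_log_db(path: str, wal_limit_bytes: int = 64 * 1024 * 1024):
    """Open a WAL-mode database with a cap on the leftover WAL file size."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    # After a checkpoint, truncate the WAL file down to at most this size.
    conn.execute(f"PRAGMA journal_size_limit={int(wal_limit_bytes)}")
    # Attempt an automatic checkpoint roughly every 1000 WAL pages.
    conn.execute("PRAGMA wal_autocheckpoint=1000")
    return conn


def truncate_wal(conn) -> bool:
    """Checkpoint and truncate the WAL; return True if it was not blocked."""
    busy, _log_frames, _checkpointed = conn.execute(
        "PRAGMA wal_checkpoint(TRUNCATE)"
    ).fetchone()
    return busy == 0
```

These settings only shrink the WAL once a checkpoint completes. If checkpoints are starved, for example by a long-lived reader or by several processes sharing the database, the WAL can still grow without bound, so an explicit size check or rotation would still be needed as an emergency safeguard.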

## Local mitigation being tested

The local mitigation is to move SQLite-backed runtime state outside the watched `CODEX_HOME` path:

```toml
sqlite_home = "/home/<user>/.local/share/codex-sqlite/work"
```

This was added after the issue was observed. Already-running Codex/codext processes can continue holding the old `logs_2.sqlite*` files until restarted.
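To see which processes still hold the old files after changing the setting, the `/proc/<pid>/fd` symlinks can be scanned on Linux. This is a sketch: `find_holders` is a hypothetical helper, and it only sees processes whose `fd` directory is readable by the current user.

```python
import os


def find_holders(path_prefix: str, proc_root: str = "/proc"):
    """Map PID -> set of open file targets that start with `path_prefix`."""
    holders = {}
    for pid in os.listdir(proc_root):
        if not pid.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        fd_dir = os.path.join(proc_root, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited or permission denied
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed while scanning
            # Prefix match also catches targets shown with a " (deleted)" suffix.
            if target.startswith(path_prefix):
                holders.setdefault(int(pid), set()).add(target)
    return holders
```

For example, `find_holders("/home/<user>/.codex-work/logs_2.sqlite")` should return an empty mapping once every affected process has been restarted.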

