fix(ingest): clear agent_event by primary id so index drift can't crash re-ingest #141
Merged
The per-session clear ran `DELETE agent_event WHERE agent_session = X`, which
SurrealDB plans through the `agent_event_session_seq` (agent_session, seq)
UNIQUE index. A long-lived DB can accumulate stale/ghost index entries (seen
across a SurrealDB version change + prior partial ingests). When that happens
the index-driven DELETE silently skips the drifted rows, but their entries
still block the fresh `(agent_session, seq)` INSERT, so the next ingest crashes:
Database index `agent_event_session_seq` already contains
[agent_session:codex__…, 181], with record `agent_event:…__compacted_181__…`
Observed live: 27 codex sessions held ~19k duplicate `(agent_session, seq)`
rows the index let in; on one session a bare WHERE-delete took 190 rows down
to 10, leaving 10 ghosts behind, while a SELECT and a delete-by-id both
saw/removed every row.
Fix: delete via `id IN (SELECT VALUE id FROM agent_event WHERE agent_session = X)`
so the inner SELECT reliably enumerates every row and the outer DELETE removes
them by PRIMARY id, never consulting the corruptible secondary index. This also
self-heals a corrupt session on its next re-ingest.
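
A minimal sketch of the two statement shapes described above, for comparison. The function names (`buildBareClear`, `buildClearByPrimaryId`, `isBareWhereDelete`) and the `$session` query parameter are assumptions made for illustration; only the SurrealQL shape itself comes from this commit message, and the real builder in the repository may look different.

```typescript
// Hypothetical helpers: names and the `$session` parameter are assumptions,
// not the repository's actual builder API.
const TABLE = "agent_event";

/** The old form: a bare WHERE-delete, which is planned through the secondary index. */
function buildBareClear(): string {
  return `DELETE ${TABLE} WHERE agent_session = $session;`;
}

/**
 * The new form: the inner SELECT enumerates the ids for the session,
 * and the outer DELETE removes those rows by primary id.
 */
function buildClearByPrimaryId(): string {
  return `DELETE ${TABLE} WHERE id IN (SELECT VALUE id FROM ${TABLE} WHERE agent_session = $session);`;
}

/** Guard in the spirit of the strengthened builder test: detect the bare form. */
function isBareWhereDelete(sql: string): boolean {
  return /^\s*DELETE\s+agent_event\s+WHERE\s+agent_session\s*=/i.test(sql);
}
```

A test along these lines could assert that `isBareWhereDelete(buildClearByPrimaryId())` is false, so a regression to the bare form fails loudly.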
Adds scripts/repair-agent-event-index.ts (+ unit-tested `planSessionDedup`) to
globally dedupe + rebuild the index for an already-corrupt DB, and strengthens
the builder test to forbid regressing to the bare-WHERE delete.
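
As a rough illustration of what a dedup plan for duplicate `(agent_session, seq)` rows could look like, here is a self-contained sketch. It is not the actual `planSessionDedup` from the repair script: the `EventRow` and `DedupPlan` types, the name `planDedupSketch`, and the keep-the-first-row-seen policy are all assumptions.

```typescript
// Hypothetical types and policy; the real planSessionDedup may differ.
interface EventRow {
  id: string;
  agent_session: string;
  seq: number;
}

interface DedupPlan {
  keep: string[];          // ids of the row retained for each (agent_session, seq)
  remove: string[];        // ids of the excess rows to delete by primary id
  duplicateGroups: number; // number of (agent_session, seq) keys with more than one row
}

function planDedupSketch(rows: EventRow[]): DedupPlan {
  const kept = new Map<string, string>(); // key -> id of the first row seen
  const dupKeys = new Set<string>();
  const remove: string[] = [];

  for (const row of rows) {
    // NUL separator so distinct (session, seq) pairs cannot collide.
    const key = `${row.agent_session}\u0000${row.seq}`;
    if (kept.has(key)) {
      remove.push(row.id); // assumption: later rows are the excess ones
      dupKeys.add(key);
    } else {
      kept.set(key, row.id);
    }
  }

  return {
    keep: Array.from(kept.values()),
    remove,
    duplicateGroups: dupKeys.size,
  };
}
```

Keeping the planner pure (rows in, ids out) makes it easy to unit-test and to back a dry-run mode, since nothing is deleted until the caller acts on `remove`.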
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Deploying ax with
| Latest commit: | 30c406c |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://c1d11d7b.ax-62d.pages.dev |
| Branch Preview URL: | https://fix-agent-event-index-clear.ax-62d.pages.dev |
Problem
`ax update` / `ax ingest` crashed mid-pipeline with:
Database index `agent_event_session_seq` already contains
[agent_session:codex__…, 181], with record `agent_event:…__compacted_181__…`

Root cause
The per-session clear ran `DELETE agent_event WHERE agent_session = X`. SurrealDB plans that through the `agent_event_session_seq` `(agent_session, seq)` UNIQUE index. A long-lived DB had drifted: it held genuine duplicate `(agent_session, seq)` rows the index let in, and the index-driven DELETE silently skipped the drifted rows, yet their ghost entries still blocked the fresh `(agent_session, seq)` INSERT, crashing the next ingest.

Verified live:
- `DELETE … WHERE agent_session=X` took one session 190→10 rows (10 ghosts survived with the correct `agent_session` link).
- `SELECT … WHERE agent_session=X` and `DELETE agent_event:<id>` (by primary id) both saw/removed every row.
- Duplicate `(session, seq)` groups: ~19,089 excess rows across 27 codex sessions.

Fix
Delete via `id IN (SELECT VALUE id FROM agent_event WHERE agent_session = X)`: the inner SELECT reliably enumerates every row; the outer DELETE removes them by primary id, never consulting the corruptible secondary index. This also self-heals a corrupt session on its next re-ingest.

Plus `scripts/repair-agent-event-index.ts` (+ unit-tested `planSessionDedup`) to globally dedupe + rebuild the index for an already-corrupt DB (`--dry-run` supported), and a strengthened builder test that forbids regressing to the bare-WHERE delete.

Verification
- `ax ingest` ran end-to-end, `EXIT=0`.
- `bun test` provider-events + repair script: 9 pass.
- `--dry-run` against repaired DB: 0 duplicates.

🤖 Generated with Claude Code