fix: gate cortex memory injection to real user turns#23
Conversation
📝 WalkthroughWalkthroughAdds a classifier-based memory gating API and synthetic-turn detection, implements a GBrain "shadow" observe-only JSONL event logger and wires shadow event emission into recall and capture flows; adds tests for gating, extraction, and relevance functions. Changes
Sequence Diagram(s)sequenceDiagram
actor Client
participant Memory as Memory Lane
participant Shadow as GBrain Shadow
participant CLI as CLI Search
participant Log as JSONL Log
rect rgba(100,150,200,0.5)
Note over Memory: Memory Recall Phase
Client->>Memory: Request with prompt + ctx
Memory->>Memory: classifyTurnForMemory(prompt, messages, ctx)
alt allowed
Memory->>Memory: perform authoritative recall/injection
end
alt gbrainShadowEnabled && gbrainShadowRecall
Memory->>Shadow: spawn shadow search
Shadow->>CLI: run search CLI (e.g., bun run src/cli.ts)
CLI-->>Shadow: search results
Shadow->>Log: append GBrainShadowEvent (best-effort)
end
Memory->>Client: return response
end
rect rgba(150,100,200,0.5)
Note over Memory: Memory Capture Phase
Client->>Memory: Agent end / turn content
Memory->>Memory: classifyTurnForMemory(turnText, messages, ctx)
alt allowed
Memory->>Memory: fire-and-forget client.remember() (authoritative)
end
alt gbrainShadowEnabled && gbrainShadowCapture
Memory->>Shadow: spawn shadow search with extracted text
Shadow->>CLI: run search CLI
CLI-->>Shadow: search results
Shadow->>Log: append GBrainShadowEvent (best-effort)
end
Memory->>Client: complete capture flow
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
|
@coderabbitai review\n\nPlease focus on the real-user-turn gating logic in , especially synthetic control-turn exclusions for plan decisions, question answers, heartbeat, boot, and session bootstrap, plus any regression risk in the generated dist artifacts. |
|
Focusing on:
✅ Actions performedFull review triggered. |
|
Handled the remaining actionable review item too. Latest fix pushed:
I also resolved the already-addressed review threads and left the informational-only one open with a reply. |
| if (cfg.gbrainShadowEnabled && cfg.gbrainShadowRecall) { | ||
| maybeLogGBrainShadowEvent(cfg, createGBrainShadowEvent({ | ||
| phase: "recall", | ||
| sessionId: ctx.sessionKey, | ||
| providerBaseUrl: cfg.gbrainShadowProviderBaseUrl, | ||
| model: cfg.gbrainShadowModel, | ||
| prompt: event.prompt, | ||
| maxPromptChars: cfg.gbrainShadowMaxPromptChars, | ||
| status: "simulated", | ||
| note: "Observe-only recall seam. Shadow metadata recorded without affecting authoritative injection.", | ||
| divergenceFromAuthoritative: "GBrain remains non-authoritative here. This branch only records shadow metadata, not live retrieval results.", | ||
| })); | ||
| } |
There was a problem hiding this comment.
🚩 Synchronous file I/O on the critical recall path when gbrainShadow is enabled
At src/index.ts:2143-2155, maybeLogGBrainShadowEvent is called inside the async before_agent_start handler. This invokes appendGBrainShadowLog (src/index.ts:1442-1464) which performs appendFileSync, then readFileSync + writeFileSync for log trimming — all synchronous, blocking the Node.js event loop on the user-facing recall critical path. The same pattern occurs in the agent_end handler at src/index.ts:2216-2228, though that path is less latency-sensitive (fire-and-forget). The feature is off by default (gbrainShadowEnabled: false) and files are bounded (200 entries), but when enabled under concurrency, the sync I/O could create tail latency. Consider using async variants (fs.promises.appendFile, etc.) or deferring the write with setImmediate/process.nextTick.
Was this helpful? React with 👍 or 👎 to provide feedback.
|
Pushed final follow-up in Addressed remaining actionable review points:
I am leaving only the informational/architectural notes that are not worth expanding this PR for. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7199cffa2d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (role !== "user") continue; | ||
| const text = extractMessageText(message); | ||
| return isSyntheticPromptText(text); |
There was a problem hiding this comment.
Check newest user message for synthetic-turn detection
hasSyntheticUserTurn iterates from oldest to newest but returns on the first user message it sees, so in a normal transcript (e.g., prior real user turn followed by a new [PLAN_DECISION] user control turn) it evaluates the older real message and exits early. That causes classifyTurnForMemory to allow recall/capture for synthetic turns, which defeats this commit’s turn-gating goal and can inject/store stale conversation context.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
🚩 Compiled dist/ artifacts included in PR diff
The PR includes changes to dist/__tests__/turn-gating.test.js, dist/__tests__/turn-gating.test.d.ts, dist/index.js, and dist/index.d.ts. Committing compiled output is a deliberate pattern in this repo (the dist/ directory was already tracked), but the compiled test files in dist/__tests__/ may be unintentional — test files are typically not distributed.
Was this helpful? React with 👍 or 👎 to provide feedback.
| if (maxEntries > 0) { | ||
| try { | ||
| const lines = readFileSync(logPath, "utf8").split("\n").filter(Boolean); | ||
| if (lines.length > maxEntries) { | ||
| writeFileSync(logPath, `${lines.slice(-maxEntries).join("\n")}\n`, "utf8"); | ||
| } | ||
| } catch { | ||
| // best-effort retention trim only | ||
| } |
There was a problem hiding this comment.
📝 Info: Shadow log retention trim has a read-after-write race window
In appendGBrainShadowLog (src/index.ts:1450-1461), the pattern is: appendFileSync → readFileSync → writeFileSync. If multiple concurrent calls run (e.g., rapid agent_end events), two calls could both read the file after their respective appends, and both decide to trim. The last writer wins, potentially dropping entries from the other call. Since this is observe-only logging with best-effort semantics and wrapped in try/catch, this won't cause functional issues, but the log could briefly exceed maxEntries or lose a few entries under high concurrency.
Was this helpful? React with 👍 or 👎 to provide feedback.
| function buildGBrainShadowLogPath(ownerId: string): string { | ||
| const pluginDir = typeof __dirname === "string" ? __dirname : dirname(__filename); | ||
| const safeOwnerId = sanitizeShadowLogKey(ownerId); | ||
| return join(pluginDir, "shadow-logs", `gbrain-shadow-${safeOwnerId}.jsonl`); |
There was a problem hiding this comment.
🚩 Shadow log files written inside dist/plugin directory
buildGBrainShadowLogPath at src/index.ts:1438-1441 resolves the log path relative to __dirname (which resolves to the dist/ directory at runtime). This means shadow logs are written into dist/shadow-logs/. This directory may get wiped on rebuild/redeploy, and writing into the build output directory is unconventional. Consider whether a more persistent location (e.g., a configurable path or a well-known data directory) would be more appropriate for production use.
Was this helpful? React with 👍 or 👎 to provide feedback.
|
Final CodeRabbit follow-up pushed in Addressed:
That should clear the remaining CodeRabbit change requests. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a803aac082
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
|
||
| function isSyntheticPromptText(prompt: string | undefined): boolean { | ||
| if (!prompt) return false; | ||
| if (isBootPrompt(prompt) || SYSTEM_EVENT_PATTERNS.test(prompt)) return true; |
There was a problem hiding this comment.
Narrow synthetic-pattern checks before rejecting user turns
This new classifier treats any prompt matching SYSTEM_EVENT_PATTERNS as synthetic, but that regex includes broad unanchored phrases such as source: subagent and type: subagent task. A real user asking about those log fields will now be misclassified as synthetic, which suppresses both recall injection and capture for that turn. Restricting this check to explicit wrapper formats (or anchoring to known system-event envelopes) avoids dropping legitimate technical conversations.
Useful? React with 👍 / 👎.
| appendFileSync(logPath, line, "utf8"); | ||
|
|
||
| if (maxEntries > 0) { | ||
| try { | ||
| const lines = readFileSync(logPath, "utf8").split("\n").filter(Boolean); |
There was a problem hiding this comment.
Avoid synchronous shadow logging on the hot recall path
Shadow logging now performs synchronous filesystem writes/reads for every logged event (appendFileSync plus retention reads), and this path is invoked during before_agent_start before the 3s injection cutoff. When shadow mode is enabled on slower disks or busy hosts, this observe-only logging can consume recall budget and cause authoritative memory injection to be skipped. Moving log I/O off the request path (or making it async/buffered) would preserve the non-interference goal.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/index.ts (1)
1519-1523:⚠️ Potential issue | 🟠 Major
systemcontrol turns can still leak into capture.
classifyTurnForMemory(undefined, event.messages, hookCtx)only screens synthetic prompt text and the newest syntheticuserturn. ThenextractMessagesdrops non-user/assistantmessages before the fallback boot/system checks run. If session bootstrap or other synthetic control markers arrive assystemmessages, they bypass classification and disappear before Lines 2215-2218, so this path can still callremember().Run the synthetic boot/system checks on raw
event.messages, or preservesystemmarkers until capture gating is finished.Also applies to: 1553-1556, 2200-2218
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/index.ts` around lines 1519 - 1523, The bug is that synthetic "system" control turns can be dropped by extractMessages before classifyTurnForMemory/boot-system gating runs, allowing control messages to leak into remember(); fix by running the synthetic boot/system checks against the raw event.messages (or modifying extractMessages to preserve system-role messages until capture gating completes) so classifyTurnForMemory only evaluates safe, pre-filtered turns; update calls around classifyTurnForMemory(event.messages, hookCtx) and extractMessages(...) to ensure system markers are either checked first or retained until after remember() gating is done.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/index.ts`:
- Around line 147-149: The shadow-log behavior currently defaults to enabled and
writes serialized prompt/conversation previews via createGBrainShadowEvent;
change gbrainShadowLogEnabled default to false (explicit opt-in) and/or redact
sensitive content before persistence in createGBrainShadowEvent (e.g., truncate
or replace previews beyond gbrainShadowMaxPromptChars and remove identifiable
text) and ensure writes respect gbrainShadowLogEnabled and
gbrainShadowLogMaxEntries limits; update any related code paths
(createGBrainShadowEvent and the config entries gbrainShadowLogEnabled,
gbrainShadowMaxPromptChars, gbrainShadowLogMaxEntries) so shadow-logs/
persistence only occurs when explicitly enabled and redaction/truncation is
applied.
---
Outside diff comments:
In `@src/index.ts`:
- Around line 1519-1523: The bug is that synthetic "system" control turns can be
dropped by extractMessages before classifyTurnForMemory/boot-system gating runs,
allowing control messages to leak into remember(); fix by running the synthetic
boot/system checks against the raw event.messages (or modifying extractMessages
to preserve system-role messages until capture gating completes) so
classifyTurnForMemory only evaluates safe, pre-filtered turns; update calls
around classifyTurnForMemory(event.messages, hookCtx) and extractMessages(...)
to ensure system markers are either checked first or retained until after
remember() gating is done.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7fd4ed6c-6649-4810-89ca-03794e7fde3b
⛔ Files ignored due to path filters (3)
dist/index.d.ts.mapis excluded by!**/dist/**,!**/*.mapdist/index.jsis excluded by!**/dist/**dist/index.js.mapis excluded by!**/dist/**,!**/*.map
📒 Files selected for processing (1)
src/index.ts
📜 Review details
🔇 Additional comments (1)
src/index.ts (1)
920-986: Thedist/bundle is properly updated with the new gating logic. All new markers (classifyTurnForMemory,synthetic-user-turn,real-user-turn) appear in compiled output, and both hook call sites (lines 1735, 1872 in dist/index.js) use the updated classifier. The generated artifacts are current and will ship the correct behavior.
| gbrainShadowLogEnabled: true, | ||
| gbrainShadowMaxPromptChars: 2000, | ||
| gbrainShadowLogMaxEntries: 200, |
There was a problem hiding this comment.
Make shadow log persistence explicit opt-in.
createGBrainShadowEvent serializes prompt and conversation previews, and gbrainShadowLogEnabled defaults to true. As soon as gbrainShadowEnabled is turned on, this writes a second plaintext copy of user content under shadow-logs/, outside Cortex retention/deletion controls.
Default file persistence to off, or redact previews before append.
🔐 Minimal safeguard
- gbrainShadowLogEnabled: true,
+ gbrainShadowLogEnabled: false,Also applies to: 1498-1514
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/index.ts` around lines 147 - 149, The shadow-log behavior currently
defaults to enabled and writes serialized prompt/conversation previews via
createGBrainShadowEvent; change gbrainShadowLogEnabled default to false
(explicit opt-in) and/or redact sensitive content before persistence in
createGBrainShadowEvent (e.g., truncate or replace previews beyond
gbrainShadowMaxPromptChars and remove identifiable text) and ensure writes
respect gbrainShadowLogEnabled and gbrainShadowLogMaxEntries limits; update any
related code paths (createGBrainShadowEvent and the config entries
gbrainShadowLogEnabled, gbrainShadowMaxPromptChars, gbrainShadowLogMaxEntries)
so shadow-logs/ persistence only occurs when explicitly enabled and
redaction/truncation is applied.
| if (isBootPrompt(prompt)) return { skip: true, lane: "boot" }; | ||
| return { skip: false, lane: runKind }; | ||
| if (ctx?.isHeartbeat) return { allow: false, reason: "heartbeat" }; | ||
| if (runKind !== "main" && runKind !== "unknown") return { allow: false, reason: runKind }; |
There was a problem hiding this comment.
🚩 Behavioral change: 'unknown' runKind now permits memory injection
The old shouldSkipMemoryInjection blocked all non-"main" runKinds including "unknown" (src/index.ts:970). The new classifyTurnForMemory explicitly allows "unknown" through: if (runKind !== "main" && runKind !== "unknown"). This means that contexts with no session key and no runKind (which resolve to "unknown" via resolveHookRunKind at src/index.ts:904-912) now allow memory injection in before_agent_start, whereas previously they were conservatively blocked. This only affects the recall path — the old capture path (agent_end) never checked runKind. The change is likely intentional to avoid over-blocking when context is minimal, but it relaxes a safety guard.
Was this helpful? React with 👍 or 👎 to provide feedback.
| function hasSyntheticUserTurn(messages: unknown[] | undefined): boolean { | ||
| if (!Array.isArray(messages) || messages.length === 0) return false; | ||
| for (let index = messages.length - 1; index >= Math.max(0, messages.length - 10); index -= 1) { | ||
| const message = messages[index]; | ||
| if (!message || typeof message !== "object") continue; | ||
| const role = (message as Record<string, unknown>).role; | ||
| if (role !== "user") continue; | ||
| const text = extractMessageText(message); | ||
| return isSyntheticPromptText(text); | ||
| } | ||
| return false; | ||
| } |
There was a problem hiding this comment.
📝 Info: hasSyntheticUserTurn checks only the most recent user message, not all user messages
The hasSyntheticUserTurn function (src/index.ts:950-961) scans backwards through the last 10 messages and returns the synthetic status of the FIRST user message it encounters (the most recent one). It does not check whether ANY user message is synthetic — only the latest one. This means if a conversation has [user: PLAN_DECISION, assistant: ok, user: real question], the function finds real question first and returns false (allow). This is the intended design per the commit message fix: check newest user turn for synthetic gating, but worth noting: a conversation that starts with synthetic control messages but ends with a real user turn will be allowed through for capture, with the synthetic messages later filtered by extractMessages.
Was this helpful? React with 👍 or 👎 to provide feedback.
| // Skip synthetic control/system wrapper user turns, but keep assistant/system markers | ||
| // long enough for downstream capture-skip guards to see them. | ||
| if (role === "user" && isSyntheticPromptText(text)) continue; |
There was a problem hiding this comment.
📝 Info: extractMessages now only filters synthetic patterns on user role, not assistant
A previous commit in this PR changed extractMessages from if (isSyntheticPromptText(text)) continue (which filtered ALL roles) to if (role === "user" && isSyntheticPromptText(text)) continue (src/index.ts:1553-1555). The comment says "keep assistant/system markers long enough for downstream capture-skip guards to see them." This means if an assistant message happens to contain text matching SYNTHETIC_TURN_PATTERNS (e.g., an assistant echoing [PLAN_DECISION]: approved), it will now be kept in the extracted messages and sent to Cortex for capture. This is likely correct since assistant messages reflecting on control prompts are legitimate conversation content, but it's a subtle behavioral change.
Was this helpful? React with 👍 or 👎 to provide feedback.
| const SYNTHETIC_TURN_PATTERNS: RegExp[] = [ | ||
| /^\s*\[PLAN_DECISION\]:/i, | ||
| /^\s*\[QUESTION_ANSWER\]:/i, | ||
| /^\s*\(session bootstrap\)\s*$/i, | ||
| /^\s*HEARTBEAT_OK\b/i, | ||
| /^\s*Read HEARTBEAT\.md if it exists\s*$/i, | ||
| ]; |
There was a problem hiding this comment.
📝 Info: Regex patterns changed from /im to /i — multiline matching removed
The SYNTHETIC_TURN_PATTERNS regexes were changed from /im flags to /i only (src/index.ts:920-926). Without the m flag, ^ only matches the start of the entire string, not the start of each line. Since extractMessageText joins multi-block content with \n, a message like "preamble\n[PLAN_DECISION]: approved" would NOT be caught as synthetic by ^\s*\[PLAN_DECISION\]:/i (without /m). This tightening is appropriate — synthetic control messages should be standalone, and the prefix should be at position 0 — but it differs from the previous behavior where any line starting with the pattern would match.
Was this helpful? React with 👍 or 👎 to provide feedback.
Summary
Verification
Summary by CodeRabbit
Tests
New Features