fix: gate cortex memory injection to real user turns by 100yenadmin · Pull Request #23 · electricsheephq/evaos-cortex-plugin

100yenadmin · 2026-04-18T18:35:16Z

Summary

add a shared real-user-turn classifier for Cortex memory recall and capture
exclude synthetic control turns like plan decisions, question answers, heartbeat, boot, and session bootstrap from memory injection/capture
add regression coverage for turn gating and rebuild the plugin dist artifacts

Verification

npm run build
earlier in-session live deployment/restart verification confirmed no fresh MEMORY_PREAMBLE hook failures and boot turns were skipped

Summary by CodeRabbit

Tests
- Added tests for turn-gating, synthetic-turn filtering, transcript-based detection, and memory relevance decisions.
New Features
- Turn-gating to control which conversation turns are captured or injected into memory.
- Automatic detection and exclusion of synthetic/system-generated messages from captures.
- Optional observe-only shadow logging with per-owner event records, phase-specific capture/recall entries, retention controls, and preview sizing.
- Memory relevance check made available for external use.

coderabbitai · 2026-04-18T18:35:31Z

📝 Walkthrough

Walkthrough

Adds a classifier-based memory gating API and synthetic-turn detection, implements a GBrain "shadow" observe-only JSONL event logger and wires shadow event emission into recall and capture flows; adds tests for gating, extraction, and relevance functions.

Changes

Cohort / File(s)	Summary
Tests `src/__tests__/turn-gating.test.ts`	New test suite for `classifyTurnForMemory`, `extractMessages`, and `isMemoryRelevant` covering synthetic control markers, heartbeat/system text, synthetic user turns, and real user queries.
Memory gating & extraction `src/index.ts`	Adds `classifyTurnForMemory(prompt, messages, ctx) -> {allow, reason}` and synthetic-text/user-turn detection; updates `extractMessages` to filter synthetic control/user turns before capture; makes `isMemoryRelevant` exported; replaces inline skip logic with classifier-driven checks.
GBrain shadow logging & helpers `src/index.ts`	Introduces GBrain shadow observe-only configuration, `getGBrainShadowLogPath`, `createGBrainShadowEvent`, and best-effort JSONL append with optional trimming; integrates shadow event emission into recall and capture paths (failures are swallowed).
Exports & plugin metadata `src/index.ts`	Exports new utilities (`classifyTurnForMemory`, `isMemoryRelevant`, `getGBrainShadowLogPath`, `createGBrainShadowEvent`); adds observe-only config fields and provider/model metadata to plugin config/registration.

Sequence Diagram(s)

sequenceDiagram
    actor Client
    participant Memory as Memory Lane
    participant Shadow as GBrain Shadow
    participant CLI as CLI Search
    participant Log as JSONL Log

    rect rgba(100,150,200,0.5)
    Note over Memory: Memory Recall Phase
    Client->>Memory: Request with prompt + ctx
    Memory->>Memory: classifyTurnForMemory(prompt, messages, ctx)
    alt allowed
        Memory->>Memory: perform authoritative recall/injection
    end
    alt gbrainShadowEnabled && gbrainShadowRecall
        Memory->>Shadow: spawn shadow search
        Shadow->>CLI: run search CLI (e.g., bun run src/cli.ts)
        CLI-->>Shadow: search results
        Shadow->>Log: append GBrainShadowEvent (best-effort)
    end
    Memory->>Client: return response
    end

    rect rgba(150,100,200,0.5)
    Note over Memory: Memory Capture Phase
    Client->>Memory: Agent end / turn content
    Memory->>Memory: classifyTurnForMemory(turnText, messages, ctx)
    alt allowed
        Memory->>Memory: fire-and-forget client.remember() (authoritative)
    end
    alt gbrainShadowEnabled && gbrainShadowCapture
        Memory->>Shadow: spawn shadow search with extracted text
        Shadow->>CLI: run search CLI
        CLI-->>Shadow: search results
        Shadow->>Log: append GBrainShadowEvent (best-effort)
    end
    Memory->>Client: complete capture flow
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

feat: add v2 memory injection formatting #10 — Modifies memory injection screening and formatting in src/index.ts, related to classifier-driven skip logic and shadow logging integration.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 11.11% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title directly describes the main change: gating cortex memory injection to exclude synthetic turns and only process real user turns.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/cortex-real-user-turn-gating-small

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

100yenadmin · 2026-04-18T18:35:36Z

@coderabbitai review\n\nPlease focus on the real-user-turn gating logic in , especially synthetic control-turn exclusions for plan decisions, question answers, heartbeat, boot, and session bootstrap, plus any regression risk in the generated dist artifacts.

coderabbitai · 2026-04-18T18:35:49Z

@100yenadmin, I'll kick off a focused review on the real-user-turn gating logic — specifically the synthetic control-turn exclusions (plan decisions, question answers, heartbeat, boot, session bootstrap) and any regression risk in the dist artifacts.

Focusing on:

Real-user-turn classifier correctness and edge cases in the gating logic
Completeness of synthetic control-turn exclusions
Regression coverage in the new test cases
Risk surface introduced by committing generated dist artifacts

✅ Actions performed

Full review triggered.

100yenadmin · 2026-04-18T18:50:56Z

Handled the remaining actionable review item too.

Latest fix pushed: 701eeb6

preserve assistant/system event markers for downstream capture-skip detection
only drop synthetic user control turns during extraction

I also resolved the already-addressed review threads and left the informational-only one open with a reply.

devin-ai-integration

Devin Review found 3 new potential issues.

devin-ai-integration · 2026-04-18T18:58:50Z

+          if (cfg.gbrainShadowEnabled && cfg.gbrainShadowRecall) {
+            maybeLogGBrainShadowEvent(cfg, createGBrainShadowEvent({
+              phase: "recall",
+              sessionId: ctx.sessionKey,
+              providerBaseUrl: cfg.gbrainShadowProviderBaseUrl,
+              model: cfg.gbrainShadowModel,
+              prompt: event.prompt,
+              maxPromptChars: cfg.gbrainShadowMaxPromptChars,
+              status: "simulated",
+              note: "Observe-only recall seam. Shadow metadata recorded without affecting authoritative injection.",
+              divergenceFromAuthoritative: "GBrain remains non-authoritative here. This branch only records shadow metadata, not live retrieval results.",
+            }));
+          }


🚩 Synchronous file I/O on the critical recall path when gbrainShadow is enabled

At src/index.ts:2143-2155, maybeLogGBrainShadowEvent is called inside the async before_agent_start handler. This invokes appendGBrainShadowLog (src/index.ts:1442-1464) which performs appendFileSync, then readFileSync + writeFileSync for log trimming — all synchronous, blocking the Node.js event loop on the user-facing recall critical path. The same pattern occurs in the agent_end handler at src/index.ts:2216-2228, though that path is less latency-sensitive (fire-and-forget). The feature is off by default (gbrainShadowEnabled: false) and files are bounded (200 entries), but when enabled under concurrency, the sync I/O could create tail latency. Consider using async variants (fs.promises.appendFile, etc.) or deferring the write with setImmediate/process.nextTick.

Was this helpful? React with 👍 or 👎 to provide feedback.

100yenadmin · 2026-04-18T19:13:52Z

Pushed final follow-up in 7199cff.

Addressed remaining actionable review points:

allow capture when hook context resolves to unknown instead of dropping the turn
align synthetic-user-turn scanning with the same recent-message window as extraction
tighten synthetic control regexes to anchored whole-turn matches
honor configured preview caps for shadow conversation previews
document non-text/empty content behavior in extractMessageText
replace broad hook-context spread with explicit field selection

I am leaving only the informational/architectural notes that are not worth expanding this PR for.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7199cffa2d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-18T19:16:11Z

+    if (role !== "user") continue;
+    const text = extractMessageText(message);
+    return isSyntheticPromptText(text);


Check newest user message for synthetic-turn detection

hasSyntheticUserTurn iterates from oldest to newest but returns on the first user message it sees, so in a normal transcript (e.g., prior real user turn followed by a new [PLAN_DECISION] user control turn) it evaluates the older real message and exits early. That causes classifyTurnForMemory to allow recall/capture for synthetic turns, which defeats this commit’s turn-gating goal and can inject/store stale conversation context.

Useful? React with 👍 / 👎.

devin-ai-integration

Devin Review found 4 new potential issues.

devin-ai-integration · 2026-04-18T19:17:59Z

🚩 Compiled dist/ artifacts included in PR diff

The PR includes changes to dist/__tests__/turn-gating.test.js, dist/__tests__/turn-gating.test.d.ts, dist/index.js, and dist/index.d.ts. Committing compiled output is a deliberate pattern in this repo (the dist/ directory was already tracked), but the compiled test files in dist/__tests__/ may be unintentional — test files are typically not distributed.

Was this helpful? React with 👍 or 👎 to provide feedback.

devin-ai-integration · 2026-04-18T19:18:02Z

+    if (maxEntries > 0) {
+      try {
+        const lines = readFileSync(logPath, "utf8").split("\n").filter(Boolean);
+        if (lines.length > maxEntries) {
+          writeFileSync(logPath, `${lines.slice(-maxEntries).join("\n")}\n`, "utf8");
+        }
+      } catch {
+        // best-effort retention trim only
+      }


📝 Info: Shadow log retention trim has a read-after-write race window

In appendGBrainShadowLog (src/index.ts:1450-1461), the pattern is: appendFileSync → readFileSync → writeFileSync. If multiple concurrent calls run (e.g., rapid agent_end events), two calls could both read the file after their respective appends, and both decide to trim. The last writer wins, potentially dropping entries from the other call. Since this is observe-only logging with best-effort semantics and wrapped in try/catch, this won't cause functional issues, but the log could briefly exceed maxEntries or lose a few entries under high concurrency.

Was this helpful? React with 👍 or 👎 to provide feedback.

devin-ai-integration · 2026-04-18T19:18:03Z

+function buildGBrainShadowLogPath(ownerId: string): string {
+  const pluginDir = typeof __dirname === "string" ? __dirname : dirname(__filename);
+  const safeOwnerId = sanitizeShadowLogKey(ownerId);
+  return join(pluginDir, "shadow-logs", `gbrain-shadow-${safeOwnerId}.jsonl`);


🚩 Shadow log files written inside dist/plugin directory

buildGBrainShadowLogPath at src/index.ts:1438-1441 resolves the log path relative to __dirname (which resolves to the dist/ directory at runtime). This means shadow logs are written into dist/shadow-logs/. This directory may get wiped on rebuild/redeploy, and writing into the build output directory is unconventional. Consider whether a more persistent location (e.g., a configurable path or a well-known data directory) would be more appropriate for production use.

Was this helpful? React with 👍 or 👎 to provide feedback.

100yenadmin · 2026-04-18T19:19:46Z

Final CodeRabbit follow-up pushed in d1e37ce.

Addressed:

hasSyntheticUserTurn now scans the bounded recent window instead of returning on the first user message
removed the redundant second classifyTurnForMemory call in recall path and reuse a single decision

That should clear the remaining CodeRabbit change requests.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a803aac082

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-18T19:26:28Z

+
+function isSyntheticPromptText(prompt: string | undefined): boolean {
+  if (!prompt) return false;
+  if (isBootPrompt(prompt) || SYSTEM_EVENT_PATTERNS.test(prompt)) return true;


Narrow synthetic-pattern checks before rejecting user turns

This new classifier treats any prompt matching SYSTEM_EVENT_PATTERNS as synthetic, but that regex includes broad unanchored phrases such as source: subagent and type: subagent task. A real user asking about those log fields will now be misclassified as synthetic, which suppresses both recall injection and capture for that turn. Restricting this check to explicit wrapper formats (or anchoring to known system-event envelopes) avoids dropping legitimate technical conversations.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-04-18T19:26:28Z

+    appendFileSync(logPath, line, "utf8");
+
+    if (maxEntries > 0) {
+      try {
+        const lines = readFileSync(logPath, "utf8").split("\n").filter(Boolean);


Avoid synchronous shadow logging on the hot recall path

Shadow logging now performs synchronous filesystem writes/reads for every logged event (appendFileSync plus retention reads), and this path is invoked during before_agent_start before the 3s injection cutoff. When shadow mode is enabled on slower disks or busy hosts, this observe-only logging can consume recall budget and cause authoritative memory injection to be skipped. Moving log I/O off the request path (or making it async/buffered) would preserve the non-interference goal.

Useful? React with 👍 / 👎.

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/index.ts (1)
1519-1523: ⚠️ Potential issue | 🟠 Major

system control turns can still leak into capture.

classifyTurnForMemory(undefined, event.messages, hookCtx) only screens synthetic prompt text and the newest synthetic user turn. Then extractMessages drops non-user/assistant messages before the fallback boot/system checks run. If session bootstrap or other synthetic control markers arrive as system messages, they bypass classification and disappear before Lines 2215-2218, so this path can still call remember().

Run the synthetic boot/system checks on raw event.messages, or preserve system markers until capture gating is finished.

Also applies to: 1553-1556, 2200-2218
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/index.ts` around lines 1519 - 1523, The bug is that synthetic "system"
control turns can be dropped by extractMessages before
classifyTurnForMemory/boot-system gating runs, allowing control messages to leak
into remember(); fix by running the synthetic boot/system checks against the raw
event.messages (or modifying extractMessages to preserve system-role messages
until capture gating completes) so classifyTurnForMemory only evaluates safe,
pre-filtered turns; update calls around classifyTurnForMemory(event.messages,
hookCtx) and extractMessages(...) to ensure system markers are either checked
first or retained until after remember() gating is done.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/index.ts`:
- Around line 147-149: The shadow-log behavior currently defaults to enabled and
writes serialized prompt/conversation previews via createGBrainShadowEvent;
change gbrainShadowLogEnabled default to false (explicit opt-in) and/or redact
sensitive content before persistence in createGBrainShadowEvent (e.g., truncate
or replace previews beyond gbrainShadowMaxPromptChars and remove identifiable
text) and ensure writes respect gbrainShadowLogEnabled and
gbrainShadowLogMaxEntries limits; update any related code paths
(createGBrainShadowEvent and the config entries gbrainShadowLogEnabled,
gbrainShadowMaxPromptChars, gbrainShadowLogMaxEntries) so shadow-logs/
persistence only occurs when explicitly enabled and redaction/truncation is
applied.

---

Outside diff comments:
In `@src/index.ts`:
- Around line 1519-1523: The bug is that synthetic "system" control turns can be
dropped by extractMessages before classifyTurnForMemory/boot-system gating runs,
allowing control messages to leak into remember(); fix by running the synthetic
boot/system checks against the raw event.messages (or modifying extractMessages
to preserve system-role messages until capture gating completes) so
classifyTurnForMemory only evaluates safe, pre-filtered turns; update calls
around classifyTurnForMemory(event.messages, hookCtx) and extractMessages(...)
to ensure system markers are either checked first or retained until after
remember() gating is done.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7fd4ed6c-6649-4810-89ca-03794e7fde3b

📥 Commits

Reviewing files that changed from the base of the PR and between d1e37ce and a803aac.

⛔ Files ignored due to path filters (3)

dist/index.d.ts.map is excluded by !**/dist/**, !**/*.map
dist/index.js is excluded by !**/dist/**
dist/index.js.map is excluded by !**/dist/**, !**/*.map

📒 Files selected for processing (1)

src/index.ts

📜 Review details

🔇 Additional comments (1)

src/index.ts (1)

920-986: The dist/ bundle is properly updated with the new gating logic. All new markers (classifyTurnForMemory, synthetic-user-turn, real-user-turn) appear in compiled output, and both hook call sites (lines 1735, 1872 in dist/index.js) use the updated classifier. The generated artifacts are current and will ship the correct behavior.

coderabbitai · 2026-04-18T19:27:43Z

+    gbrainShadowLogEnabled: true,
+    gbrainShadowMaxPromptChars: 2000,
+    gbrainShadowLogMaxEntries: 200,


⚠️ Potential issue | 🟠 Major

Make shadow log persistence explicit opt-in.

createGBrainShadowEvent serializes prompt and conversation previews, and gbrainShadowLogEnabled defaults to true. As soon as gbrainShadowEnabled is turned on, this writes a second plaintext copy of user content under shadow-logs/, outside Cortex retention/deletion controls.

Default file persistence to off, or redact previews before append.

🔐 Minimal safeguard

- gbrainShadowLogEnabled: true, + gbrainShadowLogEnabled: false,

Also applies to: 1498-1514

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/index.ts` around lines 147 - 149, The shadow-log behavior currently defaults to enabled and writes serialized prompt/conversation previews via createGBrainShadowEvent; change gbrainShadowLogEnabled default to false (explicit opt-in) and/or redact sensitive content before persistence in createGBrainShadowEvent (e.g., truncate or replace previews beyond gbrainShadowMaxPromptChars and remove identifiable text) and ensure writes respect gbrainShadowLogEnabled and gbrainShadowLogMaxEntries limits; update any related code paths (createGBrainShadowEvent and the config entries gbrainShadowLogEnabled, gbrainShadowMaxPromptChars, gbrainShadowLogMaxEntries) so shadow-logs/ persistence only occurs when explicitly enabled and redaction/truncation is applied.

devin-ai-integration

Devin Review found 4 new potential issues.

devin-ai-integration · 2026-04-18T19:28:38Z

-  if (isBootPrompt(prompt)) return { skip: true, lane: "boot" };
-  return { skip: false, lane: runKind };
+  if (ctx?.isHeartbeat) return { allow: false, reason: "heartbeat" };
+  if (runKind !== "main" && runKind !== "unknown") return { allow: false, reason: runKind };


🚩 Behavioral change: 'unknown' runKind now permits memory injection

The old shouldSkipMemoryInjection blocked all non-"main" runKinds including "unknown" (src/index.ts:970). The new classifyTurnForMemory explicitly allows "unknown" through: if (runKind !== "main" && runKind !== "unknown"). This means that contexts with no session key and no runKind (which resolve to "unknown" via resolveHookRunKind at src/index.ts:904-912) now allow memory injection in before_agent_start, whereas previously they were conservatively blocked. This only affects the recall path — the old capture path (agent_end) never checked runKind. The change is likely intentional to avoid over-blocking when context is minimal, but it relaxes a safety guard.

Was this helpful? React with 👍 or 👎 to provide feedback.

devin-ai-integration · 2026-04-18T19:28:39Z

+function hasSyntheticUserTurn(messages: unknown[] | undefined): boolean {
+  if (!Array.isArray(messages) || messages.length === 0) return false;
+  for (let index = messages.length - 1; index >= Math.max(0, messages.length - 10); index -= 1) {
+    const message = messages[index];
+    if (!message || typeof message !== "object") continue;
+    const role = (message as Record<string, unknown>).role;
+    if (role !== "user") continue;
+    const text = extractMessageText(message);
+    return isSyntheticPromptText(text);
+  }
+  return false;
+}


📝 Info: hasSyntheticUserTurn checks only the most recent user message, not all user messages

The hasSyntheticUserTurn function (src/index.ts:950-961) scans backwards through the last 10 messages and returns the synthetic status of the FIRST user message it encounters (the most recent one). It does not check whether ANY user message is synthetic — only the latest one. This means if a conversation has [user: PLAN_DECISION, assistant: ok, user: real question], the function finds real question first and returns false (allow). This is the intended design per the commit message fix: check newest user turn for synthetic gating, but worth noting: a conversation that starts with synthetic control messages but ends with a real user turn will be allowed through for capture, with the synthetic messages later filtered by extractMessages.

Was this helpful? React with 👍 or 👎 to provide feedback.

devin-ai-integration · 2026-04-18T19:28:40Z

+    // Skip synthetic control/system wrapper user turns, but keep assistant/system markers
+    // long enough for downstream capture-skip guards to see them.
+    if (role === "user" && isSyntheticPromptText(text)) continue;


📝 Info: extractMessages now only filters synthetic patterns on user role, not assistant

A previous commit in this PR changed extractMessages from if (isSyntheticPromptText(text)) continue (which filtered ALL roles) to if (role === "user" && isSyntheticPromptText(text)) continue (src/index.ts:1553-1555). The comment says "keep assistant/system markers long enough for downstream capture-skip guards to see them." This means if an assistant message happens to contain text matching SYNTHETIC_TURN_PATTERNS (e.g., an assistant echoing [PLAN_DECISION]: approved), it will now be kept in the extracted messages and sent to Cortex for capture. This is likely correct since assistant messages reflecting on control prompts are legitimate conversation content, but it's a subtle behavioral change.

Was this helpful? React with 👍 or 👎 to provide feedback.

devin-ai-integration · 2026-04-18T19:28:41Z

+const SYNTHETIC_TURN_PATTERNS: RegExp[] = [
+  /^\s*\[PLAN_DECISION\]:/i,
+  /^\s*\[QUESTION_ANSWER\]:/i,
+  /^\s*\(session bootstrap\)\s*$/i,
+  /^\s*HEARTBEAT_OK\b/i,
+  /^\s*Read HEARTBEAT\.md if it exists\s*$/i,
+];


📝 Info: Regex patterns changed from /im to /i — multiline matching removed

The SYNTHETIC_TURN_PATTERNS regexes were changed from /im flags to /i only (src/index.ts:920-926). Without the m flag, ^ only matches the start of the entire string, not the start of each line. Since extractMessageText joins multi-block content with \n, a message like "preamble\n[PLAN_DECISION]: approved" would NOT be caught as synthetic by ^\s*\[PLAN_DECISION\]:/i (without /m). This tightening is appropriate — synthetic control messages should be standalone, and the prefix should be at position 0 — but it differs from the previous behavior where any line starting with the pattern would match.

Was this helpful? React with 👍 or 👎 to provide feedback.

fix: gate cortex memory injection to real user turns

c33c2eb

Copilot AI review requested due to automatic review settings April 18, 2026 18:35

Copilot started reviewing on behalf of 100yenadmin April 18, 2026 18:35 View session