Skip to content

fix: gate cortex memory injection to real user turns#23

Open
100yenadmin wants to merge 6 commits into
mainfrom
fix/cortex-real-user-turn-gating-small
Open

fix: gate cortex memory injection to real user turns#23
100yenadmin wants to merge 6 commits into
mainfrom
fix/cortex-real-user-turn-gating-small

Conversation

@100yenadmin
Copy link
Copy Markdown
Member

@100yenadmin 100yenadmin commented Apr 18, 2026

Summary

  • add a shared real-user-turn classifier for Cortex memory recall and capture
  • exclude synthetic control turns like plan decisions, question answers, heartbeat, boot, and session bootstrap from memory injection/capture
  • add regression coverage for turn gating and rebuild the plugin dist artifacts

Verification

  • npm run build
  • earlier in-session live deployment/restart verification confirmed no fresh MEMORY_PREAMBLE hook failures and boot turns were skipped

Open in Devin Review

Summary by CodeRabbit

  • Tests

    • Added tests for turn-gating, synthetic-turn filtering, transcript-based detection, and memory relevance decisions.
  • New Features

    • Turn-gating to control which conversation turns are captured or injected into memory.
    • Automatic detection and exclusion of synthetic/system-generated messages from captures.
    • Optional observe-only shadow logging with per-owner event records, phase-specific capture/recall entries, retention controls, and preview sizing.
    • Memory relevance check made available for external use.

Copilot AI review requested due to automatic review settings April 18, 2026 18:35
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 18, 2026

📝 Walkthrough

Walkthrough

Adds a classifier-based memory gating API and synthetic-turn detection, implements a GBrain "shadow" observe-only JSONL event logger and wires shadow event emission into recall and capture flows; adds tests for gating, extraction, and relevance functions.

Changes

Cohort / File(s) Summary
Tests
src/__tests__/turn-gating.test.ts
New test suite for classifyTurnForMemory, extractMessages, and isMemoryRelevant covering synthetic control markers, heartbeat/system text, synthetic user turns, and real user queries.
Memory gating & extraction
src/index.ts
Adds classifyTurnForMemory(prompt, messages, ctx) -> {allow, reason} and synthetic-text/user-turn detection; updates extractMessages to filter synthetic control/user turns before capture; makes isMemoryRelevant exported; replaces inline skip logic with classifier-driven checks.
GBrain shadow logging & helpers
src/index.ts
Introduces GBrain shadow observe-only configuration, getGBrainShadowLogPath, createGBrainShadowEvent, and best-effort JSONL append with optional trimming; integrates shadow event emission into recall and capture paths (failures are swallowed).
Exports & plugin metadata
src/index.ts
Exports new utilities (classifyTurnForMemory, isMemoryRelevant, getGBrainShadowLogPath, createGBrainShadowEvent); adds observe-only config fields and provider/model metadata to plugin config/registration.

Sequence Diagram(s)

sequenceDiagram
    actor Client
    participant Memory as Memory Lane
    participant Shadow as GBrain Shadow
    participant CLI as CLI Search
    participant Log as JSONL Log

    rect rgba(100,150,200,0.5)
    Note over Memory: Memory Recall Phase
    Client->>Memory: Request with prompt + ctx
    Memory->>Memory: classifyTurnForMemory(prompt, messages, ctx)
    alt allowed
        Memory->>Memory: perform authoritative recall/injection
    end
    alt gbrainShadowEnabled && gbrainShadowRecall
        Memory->>Shadow: spawn shadow search
        Shadow->>CLI: run search CLI (e.g., bun run src/cli.ts)
        CLI-->>Shadow: search results
        Shadow->>Log: append GBrainShadowEvent (best-effort)
    end
    Memory->>Client: return response
    end

    rect rgba(150,100,200,0.5)
    Note over Memory: Memory Capture Phase
    Client->>Memory: Agent end / turn content
    Memory->>Memory: classifyTurnForMemory(turnText, messages, ctx)
    alt allowed
        Memory->>Memory: fire-and-forget client.remember() (authoritative)
    end
    alt gbrainShadowEnabled && gbrainShadowCapture
        Memory->>Shadow: spawn shadow search with extracted text
        Shadow->>CLI: run search CLI
        CLI-->>Shadow: search results
        Shadow->>Log: append GBrainShadowEvent (best-effort)
    end
    Memory->>Client: complete capture flow
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 11.11% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly describes the main change: gating cortex memory injection to exclude synthetic turns and only process real user turns.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/cortex-real-user-turn-gating-small

Comment @coderabbitai help to get the list of available commands and usage tips.

@100yenadmin
Copy link
Copy Markdown
Member Author

@coderabbitai review\n\nPlease focus on the real-user-turn gating logic in , especially synthetic control-turn exclusions for plan decisions, question answers, heartbeat, boot, and session bootstrap, plus any regression risk in the generated dist artifacts.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 18, 2026

@100yenadmin, I'll kick off a focused review on the real-user-turn gating logic — specifically the synthetic control-turn exclusions (plan decisions, question answers, heartbeat, boot, session bootstrap) and any regression risk in the dist artifacts.

Focusing on:

  • Real-user-turn classifier correctness and edge cases in the gating logic
  • Completeness of synthetic control-turn exclusions
  • Regression coverage in the new test cases
  • Risk surface introduced by committing generated dist artifacts
✅ Actions performed

Full review triggered.

devin-ai-integration[bot]

This comment was marked as resolved.

coderabbitai[bot]

This comment was marked as resolved.

chatgpt-codex-connector[bot]

This comment was marked as resolved.

This comment was marked as resolved.

coderabbitai[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

chatgpt-codex-connector[bot]

This comment was marked as resolved.

@100yenadmin
Copy link
Copy Markdown
Member Author

Handled the remaining actionable review item too.

Latest fix pushed: 701eeb6

  • preserve assistant/system event markers for downstream capture-skip detection
  • only drop synthetic user control turns during extraction

I also resolved the already-addressed review threads and left the informational-only one open with a reply.

coderabbitai[bot]

This comment was marked as resolved.

chatgpt-codex-connector[bot]

This comment was marked as resolved.

Copy link
Copy Markdown

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 3 new potential issues.

Open in Devin Review

Comment thread src/index.ts
Comment on lines +2143 to +2155
if (cfg.gbrainShadowEnabled && cfg.gbrainShadowRecall) {
maybeLogGBrainShadowEvent(cfg, createGBrainShadowEvent({
phase: "recall",
sessionId: ctx.sessionKey,
providerBaseUrl: cfg.gbrainShadowProviderBaseUrl,
model: cfg.gbrainShadowModel,
prompt: event.prompt,
maxPromptChars: cfg.gbrainShadowMaxPromptChars,
status: "simulated",
note: "Observe-only recall seam. Shadow metadata recorded without affecting authoritative injection.",
divergenceFromAuthoritative: "GBrain remains non-authoritative here. This branch only records shadow metadata, not live retrieval results.",
}));
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 Synchronous file I/O on the critical recall path when gbrainShadow is enabled

At src/index.ts:2143-2155, maybeLogGBrainShadowEvent is called inside the async before_agent_start handler. This invokes appendGBrainShadowLog (src/index.ts:1442-1464) which performs appendFileSync, then readFileSync + writeFileSync for log trimming — all synchronous, blocking the Node.js event loop on the user-facing recall critical path. The same pattern occurs in the agent_end handler at src/index.ts:2216-2228, though that path is less latency-sensitive (fire-and-forget). The feature is off by default (gbrainShadowEnabled: false) and files are bounded (200 entries), but when enabled under concurrency, the sync I/O could create tail latency. Consider using async variants (fs.promises.appendFile, etc.) or deferring the write with setImmediate/process.nextTick.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment thread src/index.ts
Comment thread src/index.ts
@100yenadmin
Copy link
Copy Markdown
Member Author

Pushed final follow-up in 7199cff.

Addressed remaining actionable review points:

  • allow capture when hook context resolves to unknown instead of dropping the turn
  • align synthetic-user-turn scanning with the same recent-message window as extraction
  • tighten synthetic control regexes to anchored whole-turn matches
  • honor configured preview caps for shadow conversation previews
  • document non-text/empty content behavior in extractMessageText
  • replace broad hook-context spread with explicit field selection

I am leaving only the informational/architectural notes that are not worth expanding this PR for.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7199cffa2d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/index.ts
Comment on lines +956 to +958
if (role !== "user") continue;
const text = extractMessageText(message);
return isSyntheticPromptText(text);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Check newest user message for synthetic-turn detection

hasSyntheticUserTurn iterates from oldest to newest but returns on the first user message it sees, so in a normal transcript (e.g., prior real user turn followed by a new [PLAN_DECISION] user control turn) it evaluates the older real message and exits early. That causes classifyTurnForMemory to allow recall/capture for synthetic turns, which defeats this commit’s turn-gating goal and can inject/store stale conversation context.

Useful? React with 👍 / 👎.

coderabbitai[bot]

This comment was marked as resolved.

Copy link
Copy Markdown

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 4 new potential issues.

Open in Devin Review

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 Compiled dist/ artifacts included in PR diff

The PR includes changes to dist/__tests__/turn-gating.test.js, dist/__tests__/turn-gating.test.d.ts, dist/index.js, and dist/index.d.ts. Committing compiled output is a deliberate pattern in this repo (the dist/ directory was already tracked), but the compiled test files in dist/__tests__/ may be unintentional — test files are typically not distributed.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment thread src/index.ts
Comment thread src/index.ts
Comment on lines +1453 to +1461
if (maxEntries > 0) {
try {
const lines = readFileSync(logPath, "utf8").split("\n").filter(Boolean);
if (lines.length > maxEntries) {
writeFileSync(logPath, `${lines.slice(-maxEntries).join("\n")}\n`, "utf8");
}
} catch {
// best-effort retention trim only
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 Info: Shadow log retention trim has a read-after-write race window

In appendGBrainShadowLog (src/index.ts:1450-1461), the pattern is: appendFileSync → readFileSync → writeFileSync. If multiple concurrent calls run (e.g., rapid agent_end events), two calls could both read the file after their respective appends, and both decide to trim. The last writer wins, potentially dropping entries from the other call. Since this is observe-only logging with best-effort semantics and wrapped in try/catch, this won't cause functional issues, but the log could briefly exceed maxEntries or lose a few entries under high concurrency.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment thread src/index.ts
Comment on lines +1438 to +1441
function buildGBrainShadowLogPath(ownerId: string): string {
const pluginDir = typeof __dirname === "string" ? __dirname : dirname(__filename);
const safeOwnerId = sanitizeShadowLogKey(ownerId);
return join(pluginDir, "shadow-logs", `gbrain-shadow-${safeOwnerId}.jsonl`);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 Shadow log files written inside dist/plugin directory

buildGBrainShadowLogPath at src/index.ts:1438-1441 resolves the log path relative to __dirname (which resolves to the dist/ directory at runtime). This means shadow logs are written into dist/shadow-logs/. This directory may get wiped on rebuild/redeploy, and writing into the build output directory is unconventional. Consider whether a more persistent location (e.g., a configurable path or a well-known data directory) would be more appropriate for production use.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

@100yenadmin
Copy link
Copy Markdown
Member Author

Final CodeRabbit follow-up pushed in d1e37ce.

Addressed:

  • hasSyntheticUserTurn now scans the bounded recent window instead of returning on the first user message
  • removed the redundant second classifyTurnForMemory call in recall path and reuse a single decision

That should clear the remaining CodeRabbit change requests.

coderabbitai[bot]

This comment was marked as resolved.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a803aac082

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/index.ts

function isSyntheticPromptText(prompt: string | undefined): boolean {
if (!prompt) return false;
if (isBootPrompt(prompt) || SYSTEM_EVENT_PATTERNS.test(prompt)) return true;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Narrow synthetic-pattern checks before rejecting user turns

This new classifier treats any prompt matching SYSTEM_EVENT_PATTERNS as synthetic, but that regex includes broad unanchored phrases such as source: subagent and type: subagent task. A real user asking about those log fields will now be misclassified as synthetic, which suppresses both recall injection and capture for that turn. Restricting this check to explicit wrapper formats (or anchoring to known system-event envelopes) avoids dropping legitimate technical conversations.

Useful? React with 👍 / 👎.

Comment thread src/index.ts
Comment on lines +1451 to +1455
appendFileSync(logPath, line, "utf8");

if (maxEntries > 0) {
try {
const lines = readFileSync(logPath, "utf8").split("\n").filter(Boolean);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid synchronous shadow logging on the hot recall path

Shadow logging now performs synchronous filesystem writes/reads for every logged event (appendFileSync plus retention reads), and this path is invoked during before_agent_start before the 3s injection cutoff. When shadow mode is enabled on slower disks or busy hosts, this observe-only logging can consume recall budget and cause authoritative memory injection to be skipped. Moving log I/O off the request path (or making it async/buffered) would preserve the non-interference goal.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/index.ts (1)

1519-1523: ⚠️ Potential issue | 🟠 Major

system control turns can still leak into capture.

classifyTurnForMemory(undefined, event.messages, hookCtx) only screens synthetic prompt text and the newest synthetic user turn. Then extractMessages drops non-user/assistant messages before the fallback boot/system checks run. If session bootstrap or other synthetic control markers arrive as system messages, they bypass classification and disappear before Lines 2215-2218, so this path can still call remember().

Run the synthetic boot/system checks on raw event.messages, or preserve system markers until capture gating is finished.

Also applies to: 1553-1556, 2200-2218

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/index.ts` around lines 1519 - 1523, The bug is that synthetic "system"
control turns can be dropped by extractMessages before
classifyTurnForMemory/boot-system gating runs, allowing control messages to leak
into remember(); fix by running the synthetic boot/system checks against the raw
event.messages (or modifying extractMessages to preserve system-role messages
until capture gating completes) so classifyTurnForMemory only evaluates safe,
pre-filtered turns; update calls around classifyTurnForMemory(event.messages,
hookCtx) and extractMessages(...) to ensure system markers are either checked
first or retained until after remember() gating is done.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/index.ts`:
- Around line 147-149: The shadow-log behavior currently defaults to enabled and
writes serialized prompt/conversation previews via createGBrainShadowEvent;
change gbrainShadowLogEnabled default to false (explicit opt-in) and/or redact
sensitive content before persistence in createGBrainShadowEvent (e.g., truncate
or replace previews beyond gbrainShadowMaxPromptChars and remove identifiable
text) and ensure writes respect gbrainShadowLogEnabled and
gbrainShadowLogMaxEntries limits; update any related code paths
(createGBrainShadowEvent and the config entries gbrainShadowLogEnabled,
gbrainShadowMaxPromptChars, gbrainShadowLogMaxEntries) so shadow-logs/
persistence only occurs when explicitly enabled and redaction/truncation is
applied.

---

Outside diff comments:
In `@src/index.ts`:
- Around line 1519-1523: The bug is that synthetic "system" control turns can be
dropped by extractMessages before classifyTurnForMemory/boot-system gating runs,
allowing control messages to leak into remember(); fix by running the synthetic
boot/system checks against the raw event.messages (or modifying extractMessages
to preserve system-role messages until capture gating completes) so
classifyTurnForMemory only evaluates safe, pre-filtered turns; update calls
around classifyTurnForMemory(event.messages, hookCtx) and extractMessages(...)
to ensure system markers are either checked first or retained until after
remember() gating is done.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7fd4ed6c-6649-4810-89ca-03794e7fde3b

📥 Commits

Reviewing files that changed from the base of the PR and between d1e37ce and a803aac.

⛔ Files ignored due to path filters (3)
  • dist/index.d.ts.map is excluded by !**/dist/**, !**/*.map
  • dist/index.js is excluded by !**/dist/**
  • dist/index.js.map is excluded by !**/dist/**, !**/*.map
📒 Files selected for processing (1)
  • src/index.ts
📜 Review details
🔇 Additional comments (1)
src/index.ts (1)

920-986: The dist/ bundle is properly updated with the new gating logic. All new markers (classifyTurnForMemory, synthetic-user-turn, real-user-turn) appear in compiled output, and both hook call sites (lines 1735, 1872 in dist/index.js) use the updated classifier. The generated artifacts are current and will ship the correct behavior.

Comment thread src/index.ts
Comment on lines +147 to +149
gbrainShadowLogEnabled: true,
gbrainShadowMaxPromptChars: 2000,
gbrainShadowLogMaxEntries: 200,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Make shadow log persistence explicit opt-in.

createGBrainShadowEvent serializes prompt and conversation previews, and gbrainShadowLogEnabled defaults to true. As soon as gbrainShadowEnabled is turned on, this writes a second plaintext copy of user content under shadow-logs/, outside Cortex retention/deletion controls.

Default file persistence to off, or redact previews before append.

🔐 Minimal safeguard
-    gbrainShadowLogEnabled: true,
+    gbrainShadowLogEnabled: false,

Also applies to: 1498-1514

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/index.ts` around lines 147 - 149, The shadow-log behavior currently
defaults to enabled and writes serialized prompt/conversation previews via
createGBrainShadowEvent; change gbrainShadowLogEnabled default to false
(explicit opt-in) and/or redact sensitive content before persistence in
createGBrainShadowEvent (e.g., truncate or replace previews beyond
gbrainShadowMaxPromptChars and remove identifiable text) and ensure writes
respect gbrainShadowLogEnabled and gbrainShadowLogMaxEntries limits; update any
related code paths (createGBrainShadowEvent and the config entries
gbrainShadowLogEnabled, gbrainShadowMaxPromptChars, gbrainShadowLogMaxEntries)
so shadow-logs/ persistence only occurs when explicitly enabled and
redaction/truncation is applied.

Copy link
Copy Markdown

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 4 new potential issues.

Open in Devin Review

Comment thread src/index.ts
if (isBootPrompt(prompt)) return { skip: true, lane: "boot" };
return { skip: false, lane: runKind };
if (ctx?.isHeartbeat) return { allow: false, reason: "heartbeat" };
if (runKind !== "main" && runKind !== "unknown") return { allow: false, reason: runKind };
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚩 Behavioral change: 'unknown' runKind now permits memory injection

The old shouldSkipMemoryInjection blocked all non-"main" runKinds including "unknown" (src/index.ts:970). The new classifyTurnForMemory explicitly allows "unknown" through: if (runKind !== "main" && runKind !== "unknown"). This means that contexts with no session key and no runKind (which resolve to "unknown" via resolveHookRunKind at src/index.ts:904-912) now allow memory injection in before_agent_start, whereas previously they were conservatively blocked. This only affects the recall path — the old capture path (agent_end) never checked runKind. The change is likely intentional to avoid over-blocking when context is minimal, but it relaxes a safety guard.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment thread src/index.ts
Comment on lines +950 to +961
function hasSyntheticUserTurn(messages: unknown[] | undefined): boolean {
if (!Array.isArray(messages) || messages.length === 0) return false;
for (let index = messages.length - 1; index >= Math.max(0, messages.length - 10); index -= 1) {
const message = messages[index];
if (!message || typeof message !== "object") continue;
const role = (message as Record<string, unknown>).role;
if (role !== "user") continue;
const text = extractMessageText(message);
return isSyntheticPromptText(text);
}
return false;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 Info: hasSyntheticUserTurn checks only the most recent user message, not all user messages

The hasSyntheticUserTurn function (src/index.ts:950-961) scans backwards through the last 10 messages and returns the synthetic status of the FIRST user message it encounters (the most recent one). It does not check whether ANY user message is synthetic — only the latest one. This means if a conversation has [user: PLAN_DECISION, assistant: ok, user: real question], the function finds real question first and returns false (allow). This is the intended design per the commit message fix: check newest user turn for synthetic gating, but worth noting: a conversation that starts with synthetic control messages but ends with a real user turn will be allowed through for capture, with the synthetic messages later filtered by extractMessages.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment thread src/index.ts
Comment on lines +1553 to +1555
// Skip synthetic control/system wrapper user turns, but keep assistant/system markers
// long enough for downstream capture-skip guards to see them.
if (role === "user" && isSyntheticPromptText(text)) continue;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 Info: extractMessages now only filters synthetic patterns on user role, not assistant

A previous commit in this PR changed extractMessages from if (isSyntheticPromptText(text)) continue (which filtered ALL roles) to if (role === "user" && isSyntheticPromptText(text)) continue (src/index.ts:1553-1555). The comment says "keep assistant/system markers long enough for downstream capture-skip guards to see them." This means if an assistant message happens to contain text matching SYNTHETIC_TURN_PATTERNS (e.g., an assistant echoing [PLAN_DECISION]: approved), it will now be kept in the extracted messages and sent to Cortex for capture. This is likely correct since assistant messages reflecting on control prompts are legitimate conversation content, but it's a subtle behavioral change.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment thread src/index.ts
Comment on lines +920 to +926
const SYNTHETIC_TURN_PATTERNS: RegExp[] = [
/^\s*\[PLAN_DECISION\]:/i,
/^\s*\[QUESTION_ANSWER\]:/i,
/^\s*\(session bootstrap\)\s*$/i,
/^\s*HEARTBEAT_OK\b/i,
/^\s*Read HEARTBEAT\.md if it exists\s*$/i,
];
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 Info: Regex patterns changed from /im to /i — multiline matching removed

The SYNTHETIC_TURN_PATTERNS regexes were changed from /im flags to /i only (src/index.ts:920-926). Without the m flag, ^ only matches the start of the entire string, not the start of each line. Since extractMessageText joins multi-block content with \n, a message like "preamble\n[PLAN_DECISION]: approved" would NOT be caught as synthetic by ^\s*\[PLAN_DECISION\]:/i (without /m). This tightening is appropriate — synthetic control messages should be standalone, and the prefix should be at position 0 — but it differs from the previous behavior where any line starting with the pattern would match.

Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants