fix: restore Cortex memory injection preamble#9

Open
100yenadmin wants to merge 4 commits into main from sub/fix-memory-preamble

Conversation

@100yenadmin
Member

@100yenadmin 100yenadmin commented Apr 14, 2026

Summary

  • restore the v2 MEMORY_PREAMBLE constant
  • restore rich memory injection formatting with score, metadata, id suffix, and footer
  • pass total memory count into formatMemoryContext for the footer

Verification

  • npm run build
  • copied built dist/index.js into local cortex extension
  • verified built output contains the restored preamble, footer, and strip regex

Summary by CodeRabbit

  • New Features

    • Enhanced memory safety with configurable injection screening controls to better protect against malicious inputs.
    • Improved memory context display with confidence scoring and item identification in results.
  • Chores

    • Added deployment verification to ensure artifact consistency across environments.

Copilot AI review requested due to automatic review settings April 14, 2026 15:35
@coderabbitai

coderabbitai Bot commented Apr 14, 2026

Walkthrough

Added a deployment parity verification script and extended EvaMemoryConfig with injection-screening controls. Introduced APIs for detecting injection modes and filtering retrieved items based on configurable thresholds, hard floors, and deterministic suppression rules.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Deployment Tooling: scripts/check-deploy-parity.sh | New Bash script enabling strict mode to compare built plugin artifact against locally deployed version. Exits with status code 1 and prints "DRIFT" message if files differ; prints "PLUGIN PARITY OK" on match. |
| Injection Screening Feature: src/index.ts | Extended EvaMemoryConfig with injection-screening controls (enableInjectionScreening, injectionHardFloor, per-mode thresholds). Added exported APIs: detectInjectionMode() for prompt classification (critical/technical/personal) and screenInjectionCandidates() for filtering items using hard rules, floor thresholds, and dynamic confidence scoring. Integrated screening into formatMemoryContext() pipeline with enhanced output formatting (percentage scores, item IDs, memory count summary). |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title describes restoring a memory injection preamble, which is the main change in src/index.ts. However, the PR also adds a new deployment parity check script and introduces significant injection screening functionality, neither of which are reflected in the title. |

Copilot AI left a comment

Pull request overview

Restores the Cortex memory injection preamble/formatting and introduces an injection-screening step to filter retrieved memories before they’re injected into the agent context.

Changes:

  • Reintroduce a rich MEMORY_PREAMBLE, add per-memory formatting (score, metadata, id suffix), and add a footer showing how many memories were injected.
  • Add configurable “injection screening” (mode detection + thresholding / contradiction suppression) and wire it into the recall injection pipeline.
  • Add a local deploy parity check script and update built dist/ artifacts accordingly.

Reviewed changes

Copilot reviewed 2 out of 6 changed files in this pull request and generated 1 comment.

Show a summary per file
| File | Description |
| --- | --- |
| src/index.ts | Restores memory injection formatting/preamble/footer and adds injection screening logic + config fields. |
| scripts/check-deploy-parity.sh | Adds a helper script to detect drift between repo build output and a locally deployed plugin build. |
| dist/index.js | Rebuilt JS output reflecting the updated injection formatting and screening. |
| dist/index.js.map | Rebuilt source map for updated JS output. |
| dist/index.d.ts | Rebuilt type declarations reflecting new config fields and exported helpers. |
| dist/index.d.ts.map | Rebuilt source map for updated type declarations. |

Comment thread src/index.ts
Comment on lines +922 to +935
const CRITICAL_KEYWORDS = /bench-\d{8}-\d{6}|deploy|cortex error|config operation|migration|fly deploy|openclaw gateway|prod/i;
const RUN_ID_RE = /bench-\d{8}-\d{6}/g;
const GIT_TOKENS = /\b(git|PR #|commit|branch)\b/i;
const FILE_PATH_RE = /[./\\][a-zA-Z0-9_\-./\\]{2,}/;
const LIVENESS_CLAIM = /\b(still active|still running|is running|is active|is alive|currently running)\b/i;
const DEATH_CLAIM = /\b(was killed|is dead|died|crashed|no listener|restarted|dead\b|killed\b|stalled)\b/i;

/**
* Classify the current turn into an injection mode.
* critical > technical > personal (first match wins).
*/
export function detectInjectionMode(promptText: string): InjectionMode {
if (CRITICAL_KEYWORDS.test(promptText) || RUN_ID_RE.test(promptText)) return "critical";
if (

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0af21a6beb

Comment thread src/index.ts
type InjectionMode = "critical" | "technical" | "personal";

const TECHNICAL_KEYWORDS = /\b(bench|cortex|debug|config|log|error|exception|script|deploy|git|commit|branch|pytest|migration|run|adapter)\b/i;
const CRITICAL_KEYWORDS = /bench-\d{8}-\d{6}|deploy|cortex error|config operation|migration|fly deploy|openclaw gateway|prod/i;

P1 Badge Restrict prod keyword to whole-word critical matches

The critical-mode classifier currently uses prod as an unbounded substring, so ordinary prompts like “product roadmap” or “productivity tips” are treated as critical. In this path, screenInjectionCandidates applies the highest threshold (injectionCriticalThreshold, default 0.75), which can drop otherwise relevant memories and make recall appear broken in non-critical conversations. Use a word-bounded token (for example \bprod\b) or a more specific production phrase to avoid these false positives.
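
A minimal sketch of the word-boundary fix suggested above, assuming every other alternative in the pattern stays unchanged. The name CRITICAL_KEYWORDS_STRICT and the sample prompts are illustrative and not part of the PR; CRITICAL_KEYWORDS_ORIGINAL reproduces the pattern quoted in this thread for comparison.

```typescript
// Pattern as quoted in the PR: "prod" matches anywhere, including inside "product".
const CRITICAL_KEYWORDS_ORIGINAL =
  /bench-\d{8}-\d{6}|deploy|cortex error|config operation|migration|fly deploy|openclaw gateway|prod/i;

// Hypothetical tightened variant: "prod" or "production" only as a whole word.
const CRITICAL_KEYWORDS_STRICT =
  /bench-\d{8}-\d{6}|deploy|cortex error|config operation|migration|fly deploy|openclaw gateway|\bprod(uction)?\b/i;

const samplePrompts = [
  "product roadmap",
  "productivity tips",
  "push to prod",
  "production outage",
];

// Print how each pattern classifies each prompt (neither regex uses /g, so .test() is stateless).
for (const prompt of samplePrompts) {
  const before = CRITICAL_KEYWORDS_ORIGINAL.test(prompt);
  const after = CRITICAL_KEYWORDS_STRICT.test(prompt);
  console.log(`${prompt}: original=${before} strict=${after}`);
}
```

With the tightened variant, "product roadmap" and "productivity tips" no longer trip critical mode, while "push to prod" and "production outage" still do.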

Comment thread src/index.ts
const CRITICAL_KEYWORDS = /bench-\d{8}-\d{6}|deploy|cortex error|config operation|migration|fly deploy|openclaw gateway|prod/i;
const RUN_ID_RE = /bench-\d{8}-\d{6}/g;
const GIT_TOKENS = /\b(git|PR #|commit|branch)\b/i;
const FILE_PATH_RE = /[./\\][a-zA-Z0-9_\-./\\]{2,}/;

P1 Badge Narrow file-path regex to avoid classifying normal text as technical

FILE_PATH_RE is broad enough to match many non-path snippets (for example 1.23, U.S., or ...), so personal prompts that contain punctuation can be misclassified as technical. Because technical mode raises screening thresholds and enables category-lane filtering, this can suppress recall in everyday chats unrelated to code/files. Tightening this pattern to require path-like structure (such as a slash-separated segment or extension pattern) would prevent widespread false technical classification.
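
One possible tightening, sketched under assumptions: STRICT_FILE_PATH_RE is a hypothetical name, and the extension list (ts, js, json, sh, md) is an illustrative guess rather than anything the PR specifies. It requires either a separator-delimited segment or a bare filename with a known extension.

```typescript
// Pattern as quoted in the PR: any ".", "/" or "\" followed by two path-ish characters.
const FILE_PATH_RE_ORIGINAL = /[./\\][a-zA-Z0-9_\-./\\]{2,}/;

// Hypothetical stricter variant:
//   1) one or more "segment + separator" groups followed by a final segment, or
//   2) a bare filename ending in one of a few assumed extensions.
const STRICT_FILE_PATH_RE =
  /(?:[\w.-]+[\/\\])+[\w.-]+|\b[\w-]+\.(?:ts|js|json|sh|md)\b/;

const samples = [
  "version 1.23 is out",
  "the U.S. market",
  "wait...",
  "see src/index.ts",
  "open index.ts",
];

// Compare both patterns on each sample.
for (const text of samples) {
  console.log(
    `${text}: original=${FILE_PATH_RE_ORIGINAL.test(text)} strict=${STRICT_FILE_PATH_RE.test(text)}`,
  );
}
```

This variant is not free of false positives: slash-joined words such as "and/or" still match the first alternative, so a production pattern would likely need further constraints.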

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 7 potential issues.

Comment thread src/index.ts

🟡 Log message reports pre-screening count instead of actual injected count

After injection screening was added, the log at line 1691 still reports filtered.length (the count before screening) instead of screened.length (the count after screening). When screening drops memories, the log will overstate how many memories are being injected, making debugging injection issues misleading. For example, if filtered has 8 items but screening drops 5, the log says "injecting 8 memories" when only 3 survived.

(Refers to line 1691)

Comment thread src/index.ts
* critical > technical > personal (first match wins).
*/
export function detectInjectionMode(promptText: string): InjectionMode {
if (CRITICAL_KEYWORDS.test(promptText) || RUN_ID_RE.test(promptText)) return "critical";

🟡 Stateful /g regex used with .test() causes non-deterministic mode detection

RUN_ID_RE at src/index.ts:923 is declared with the global flag (/g) because it's needed for .matchAll() at lines 974 and 985. However, it's also used with .test() at line 934 inside the exported detectInjectionMode. When .test() is called on a global regex and finds a match, it advances lastIndex, causing subsequent .test() calls to start from a non-zero position and potentially miss the match. In the current code this is mitigated by short-circuit evaluation: CRITICAL_KEYWORDS (line 922) contains the same bench-\d{8}-\d{6} pattern and is tested first, so RUN_ID_RE.test() is only reached when there's no bench run ID (returning false, not advancing lastIndex). However, this is fragile — if CRITICAL_KEYWORDS is ever modified or the || order changed, the bug activates, causing detectInjectionMode to alternate between returning "critical" and "technical"/"personal" on consecutive calls with the same input containing a bench ID.

Suggested change
-if (CRITICAL_KEYWORDS.test(promptText) || RUN_ID_RE.test(promptText)) return "critical";
+if (CRITICAL_KEYWORDS.test(promptText) || /bench-\d{8}-\d{6}/.test(promptText)) return "critical";
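
The lastIndex behavior described in this comment can be reproduced in isolation. The sketch below uses the same pattern as RUN_ID_RE; the constant names and the sample prompt are illustrative only.

```typescript
// Same pattern and flag as RUN_ID_RE in the PR.
const RUN_ID_GLOBAL = /bench-\d{8}-\d{6}/g;
// Non-global variant, as in the suggested change.
const RUN_ID_PLAIN = /bench-\d{8}-\d{6}/;

const prompt = "status of bench-20260414-153500?";

// With /g, a successful .test() leaves lastIndex just past the match,
// so the next call starts there, finds nothing, and resets lastIndex to 0.
const globalResults = [
  RUN_ID_GLOBAL.test(prompt),
  RUN_ID_GLOBAL.test(prompt),
  RUN_ID_GLOBAL.test(prompt),
];

// Without /g, .test() always starts from index 0.
const plainResults = [
  RUN_ID_PLAIN.test(prompt),
  RUN_ID_PLAIN.test(prompt),
  RUN_ID_PLAIN.test(prompt),
];

console.log(globalResults); // [ true, false, true ]
console.log(plainResults);  // [ true, true, true ]
```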

Comment thread src/index.ts
type InjectionMode = "critical" | "technical" | "personal";

const TECHNICAL_KEYWORDS = /\b(bench|cortex|debug|config|log|error|exception|script|deploy|git|commit|branch|pytest|migration|run|adapter)\b/i;
const CRITICAL_KEYWORDS = /bench-\d{8}-\d{6}|deploy|cortex error|config operation|migration|fly deploy|openclaw gateway|prod/i;

🚩 CRITICAL_KEYWORDS regex lacks word boundaries on prod

The CRITICAL_KEYWORDS regex at src/index.ts:922 matches prod without \b word boundaries. This means prompts containing "product", "productive", "reproduce", or "productivity" will be classified as critical mode, applying the highest threshold (0.75 by default) and aggressively filtering out memories. If this plugin is used in conversations that naturally reference products or productivity, relevant memories could be silently dropped. The intent is likely to match "prod" as shorthand for production environment. Consider adding \bprod\b or \bprod(uction)?\b to be more precise.

Comment thread src/index.ts
// --- Layer 1.2: Category lane filter (technical sessions) ---
if (mode === "technical" || mode === "critical") {
const isPersonalCategory = category === "episodic" || category === "personal" || category === "relational" || category === "identity";
if (isPersonalCategory && score < 0.70) {

📝 Info: Layer 1.2 category filter uses hardcoded 0.70 threshold, not configurable

The category lane filter at src/index.ts:1030 uses a hardcoded 0.70 threshold for suppressing personal/episodic memories in technical/critical mode. All other thresholds (hard floor, critical, technical, personal) are configurable via cfg. This inconsistency means operators cannot tune this specific behavior. If 0.70 is intentional as a fixed rule, a comment explaining why would help; otherwise, it should be a config field like the others.
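
If the threshold were made configurable, the lane check might look roughly like the sketch below. The field name injectionPersonalLaneFloor, the LaneConfig interface, and the helper dropByCategoryLane are all hypothetical; none of them exist in the PR, and only the category names and the 0.70 default come from the quoted code.

```typescript
type InjectionMode = "critical" | "technical" | "personal";

// Hypothetical config slice; the PR hardcodes 0.70 instead of reading a field.
interface LaneConfig {
  injectionPersonalLaneFloor: number;
}

// Categories treated as personal in the quoted Layer 1.2 code.
const PERSONAL_CATEGORIES = new Set(["episodic", "personal", "relational", "identity"]);

// Returns true when a memory should be dropped by the category lane filter.
function dropByCategoryLane(
  mode: InjectionMode,
  category: string,
  score: number,
  cfg: LaneConfig,
): boolean {
  // The lane filter only applies in technical and critical modes.
  if (mode === "personal") return false;
  return PERSONAL_CATEGORIES.has(category) && score < cfg.injectionPersonalLaneFloor;
}

const cfg: LaneConfig = { injectionPersonalLaneFloor: 0.7 };
console.log(dropByCategoryLane("technical", "episodic", 0.65, cfg)); // true
console.log(dropByCategoryLane("personal", "episodic", 0.1, cfg));   // false
```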

Comment thread src/index.ts
const GIT_TOKENS = /\b(git|PR #|commit|branch)\b/i;
const FILE_PATH_RE = /[./\\][a-zA-Z0-9_\-./\\]{2,}/;
const LIVENESS_CLAIM = /\b(still active|still running|is running|is active|is alive|currently running)\b/i;
const DEATH_CLAIM = /\b(was killed|is dead|died|crashed|no listener|restarted|dead\b|killed\b|stalled)\b/i;

📝 Info: Redundant \b anchors inside DEATH_CLAIM alternatives

The DEATH_CLAIM regex at src/index.ts:927 has dead\b and killed\b as alternatives inside a group that already has a trailing \b: /\b(was killed|is dead|died|crashed|no listener|restarted|dead\b|killed\b|stalled)\b/i. The inner \b on dead\b and killed\b is redundant with the outer \b — two consecutive \b at the same position are semantically identical to one. This is harmless but reduces readability. The intent seems to be matching standalone "dead" and "killed" (without the prefix like "is dead" or "was killed"), which the outer \b already handles.
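
A quick spot check of the redundancy claim, comparing the quoted pattern with a variant that drops the inner \b anchors. DEATH_CLAIM_SIMPLIFIED and the sample strings are illustrative; agreement on a handful of samples is a sanity check, not a proof of equivalence.

```typescript
// Pattern as quoted in the PR, with inner \b on "dead" and "killed".
const DEATH_CLAIM_ORIGINAL =
  /\b(was killed|is dead|died|crashed|no listener|restarted|dead\b|killed\b|stalled)\b/i;

// Hypothetical simplified variant relying only on the outer \b anchors.
const DEATH_CLAIM_SIMPLIFIED =
  /\b(was killed|is dead|died|crashed|no listener|restarted|dead|killed|stalled)\b/i;

const samples = [
  "the run is dead",
  "deadline moved to Friday",
  "worker killed at noon",
  "process restarted",
  "all good",
];

// Collect any sample on which the two patterns disagree.
const mismatches = samples.filter(
  (s) => DEATH_CLAIM_ORIGINAL.test(s) !== DEATH_CLAIM_SIMPLIFIED.test(s),
);

console.log(mismatches.length); // 0
```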

Comment thread src/index.ts
Comment on lines +1081 to +1082
lines.push(MEMORY_PREAMBLE);
let charCount = MEMORY_PREAMBLE.length;

📝 Info: MEMORY_PREAMBLE length counted against maxInjectionChars budget

The MEMORY_PREAMBLE constant (~800 chars) is now counted against the maxInjectionChars budget (src/index.ts:1082). With the default budget of 8000 chars, this consumes ~10% of the budget before any actual memory content is added. This is a behavioral change from the previous implementation where no preamble existed and the full budget went to memory items. If the budget is set low (e.g., 2000 chars), the preamble alone takes ~40%, leaving room for significantly fewer memories. This is likely intentional but worth noting for operators tuning maxInjectionChars.
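
The budget arithmetic in this note can be checked directly. The sketch assumes the approximate 800-character preamble quoted above; the helper name preambleShare is illustrative.

```typescript
// Approximate preamble size quoted in the review (not measured from the source).
const PREAMBLE_CHARS = 800;

// Fraction of the injection budget consumed by the preamble alone.
function preambleShare(preambleChars: number, maxInjectionChars: number): number {
  return preambleChars / maxInjectionChars;
}

for (const budget of [8000, 2000]) {
  const share = preambleShare(PREAMBLE_CHARS, budget);
  const remaining = budget - PREAMBLE_CHARS;
  console.log(`budget=${budget} share=${share} remaining=${remaining}`);
}
// budget=8000 share=0.1 remaining=7200
// budget=2000 share=0.4 remaining=1200
```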

Comment thread src/index.ts
const screened = cfg.enableInjectionScreening
? screenInjectionCandidates(filtered, event.prompt ?? "", cfg, (msg) => api.logger.info(msg))
: filtered;
const context = formatMemoryContext(screened, cfg.maxInjectionChars, filtered.length, cfg.maxInjectedMemories, cfg.minRelevanceScore);

📝 Info: Screening applies after echo filter but log reports pre-screening count

The integration at src/index.ts:1680-1684 correctly chains echo filtering → injection screening → formatting. However, totalCount passed to formatMemoryContext is filtered.length (post-echo, pre-screening), which means the "X of Y memories shown" footer message tells the LLM how many memories were available before screening. This seems intentional — it lets the LLM know there are more memories available via cortex_search even if screening dropped them. The separate log message count issue was reported as a bug.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/check-deploy-parity.sh`:
- Around line 5-10: The script uses fixed paths (REPO_DIST="dist/index.js" and a
hardcoded cd) which break when run outside that checkout; make the script
location-independent by resolving the repository root relative to the script
(e.g., use git rev-parse --show-toplevel or $(dirname "$0") to compute REPO_DIR)
and then set REPO_DIST="$REPO_DIR/dist/index.js" and update the suggested fix
command to reference that same REPO_DIR instead of cd ~/repos/.... Also ensure
LOCAL_DIST remains quoted and allow overriding via env vars if desired so diff
compares the correct absolute paths when invoked from any CWD.

In `@src/index.ts`:
- Around line 137-141: The numeric injection thresholds (injectionHardFloor,
injectionCriticalThreshold, injectionTechnicalThreshold,
injectionPersonalThreshold) must be validated/coerced to the [0,1] range inside
parseConfig: when c.<field> is a number, clamp it to Math.max(0, Math.min(1,
c.<field>)) and treat non-numbers/NaN as the corresponding defaults; preserve
current behavior for enableInjectionScreening. Also update the corresponding
schema entries to include minimum: 0 and maximum: 1 for each threshold so the
JSON/schema validation mirrors the runtime clamping.
- Around line 1007-1017: The log lines that print memory content previews (using
content.slice) are a privacy leak; update the two places that call
log?.(`[cortex-inject] ... ${content.slice(0, 80)}`) to avoid emitting stored
memory text and instead log non-sensitive metadata (e.g., memory ID,
category/tag, and incremented dropped count). Locate the block guarded by
LIVENESS_CLAIM and DEATH_CLAIM and the earlier stale run-state branch where
dropped++ occurs, replace content.slice usage with a safe identifier or category
(e.g., memory.id or memory.type) and a concise message like "[cortex-inject]
dropped contradicted memory: id=<id> category=<cat> totalDropped=<n>" so only
IDs/counts are logged.
- Around line 984-1005: The current logic sets isStale true whenever
promptHasDeathClaim is true, which bypasses the per-run checks; change the
conditional so that promptHasDeathClaim is only considered in the context of the
run IDs found in contentRunIds. Specifically, remove the unconditional branch
that sets isStale when promptHasDeathClaim is true and instead evaluate death
claims per run: iterate contentRunIds and for each runId check (1) if
promptRunIds.has(runId) && DEATH_CLAIM.test(promptText) then mark isStale, and
(2) if !promptRunIds.has(runId) && promptHasDeathClaim then mark isStale for
that run; ensure the outer check still requires LIVENESS_CLAIM.test(content) and
preserve use of symbols contentRunIds, LIVENESS_CLAIM, promptHasDeathClaim,
promptRunIds, DEATH_CLAIM, promptText, and isStale.
- Around line 923-940: The RUN_ID_RE regex is declared with the global 'g' flag
so using it in detectInjectionMode(promptText) via .test() makes the regex
stateful and breaks later .matchAll() usage; fix by adding a non-global variant
(e.g., RUN_ID_TEST_RE = /bench-\d{8}-\d{6}/) and change detectInjectionMode to
use RUN_ID_TEST_RE while keeping the original RUN_ID_RE (with 'g') for any
.matchAll() calls; update references so detectInjectionMode, RUN_ID_RE, and
RUN_ID_TEST_RE are used as described.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0230bd26-618e-48ea-b7c2-10ab893ce053

📥 Commits

Reviewing files that changed from the base of the PR and between 4f87a34 and 0af21a6.

⛔ Files ignored due to path filters (4)
  • dist/index.d.ts is excluded by !**/dist/**
  • dist/index.d.ts.map is excluded by !**/dist/**, !**/*.map
  • dist/index.js is excluded by !**/dist/**
  • dist/index.js.map is excluded by !**/dist/**, !**/*.map
📒 Files selected for processing (2)
  • scripts/check-deploy-parity.sh
  • src/index.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Upload results
🔇 Additional comments (1)
src/index.ts (1)

1050-1110: Restored memory block formatting looks coherent.

Always adding MEMORY_PREAMBLE and updating the empty-state guard to lines.length <= 2 keeps the restored footer path safe when no memory line fits.

Comment on lines +5 to +10
REPO_DIST="dist/index.js"
LOCAL_DIST="$HOME/.openclaw/extensions/cortex/dist/index.js"

if ! diff -q "$REPO_DIST" "$LOCAL_DIST" >/dev/null 2>&1; then
echo "DRIFT: repo dist differs from local deployed"
echo "Run: cd ~/repos/evaos-cortex-plugin && npm run build && cp -r dist/* ~/.openclaw/extensions/cortex/dist/"

⚠️ Potential issue | 🟡 Minor

Make the parity check location-independent.

REPO_DIST="dist/index.js" and the hardcoded cd ~/repos/evaos-cortex-plugin assume the script is always run from one specific checkout. Invoking it from any other working directory will report false drift.

Suggested fix
-REPO_DIST="dist/index.js"
-LOCAL_DIST="$HOME/.openclaw/extensions/cortex/dist/index.js"
+SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd)"
+REPO_ROOT="$(cd -- "${SCRIPT_DIR}/.." && pwd)"
+REPO_DIST="${REPO_ROOT}/dist/index.js"
+LOCAL_DIST="${HOME}/.openclaw/extensions/cortex/dist/index.js"
+
+[[ -f "$REPO_DIST" ]] || { echo "Missing repo artifact: $REPO_DIST"; exit 1; }
+[[ -f "$LOCAL_DIST" ]] || { echo "Missing deployed artifact: $LOCAL_DIST"; exit 1; }
 
 if ! diff -q "$REPO_DIST" "$LOCAL_DIST" >/dev/null 2>&1; then
   echo "DRIFT: repo dist differs from local deployed"
-  echo "Run: cd ~/repos/evaos-cortex-plugin && npm run build && cp -r dist/* ~/.openclaw/extensions/cortex/dist/"
+  echo "Run: cd \"${REPO_ROOT}\" && npm run build && cp -r dist/* \"${HOME}/.openclaw/extensions/cortex/dist/\""
   exit 1
 fi

Comment thread src/index.ts
Comment on lines +137 to +141
enableInjectionScreening: c.enableInjectionScreening !== false,
injectionHardFloor: typeof c.injectionHardFloor === "number" ? c.injectionHardFloor : defaults.injectionHardFloor,
injectionCriticalThreshold: typeof c.injectionCriticalThreshold === "number" ? c.injectionCriticalThreshold : defaults.injectionCriticalThreshold,
injectionTechnicalThreshold: typeof c.injectionTechnicalThreshold === "number" ? c.injectionTechnicalThreshold : defaults.injectionTechnicalThreshold,
injectionPersonalThreshold: typeof c.injectionPersonalThreshold === "number" ? c.injectionPersonalThreshold : defaults.injectionPersonalThreshold,

⚠️ Potential issue | 🟠 Major

Clamp the new screening thresholds to [0, 1].

These fields are interpreted as unit scores. Right now values like 75, -1, or NaN parse cleanly and can silently disable recall or let everything through. Validate them in parseConfig and mirror that with minimum/maximum in the schema.

Suggested fix
+  const clampUnit = (value: unknown, fallback: number): number =>
+    typeof value === "number" && Number.isFinite(value)
+      ? Math.min(1, Math.max(0, value))
+      : fallback;
+
   return {
@@
-    injectionHardFloor: typeof c.injectionHardFloor === "number" ? c.injectionHardFloor : defaults.injectionHardFloor,
-    injectionCriticalThreshold: typeof c.injectionCriticalThreshold === "number" ? c.injectionCriticalThreshold : defaults.injectionCriticalThreshold,
-    injectionTechnicalThreshold: typeof c.injectionTechnicalThreshold === "number" ? c.injectionTechnicalThreshold : defaults.injectionTechnicalThreshold,
-    injectionPersonalThreshold: typeof c.injectionPersonalThreshold === "number" ? c.injectionPersonalThreshold : defaults.injectionPersonalThreshold,
+    injectionHardFloor: clampUnit(c.injectionHardFloor, defaults.injectionHardFloor),
+    injectionCriticalThreshold: clampUnit(c.injectionCriticalThreshold, defaults.injectionCriticalThreshold),
+    injectionTechnicalThreshold: clampUnit(c.injectionTechnicalThreshold, defaults.injectionTechnicalThreshold),
+    injectionPersonalThreshold: clampUnit(c.injectionPersonalThreshold, defaults.injectionPersonalThreshold),
   };

Also applies to: 1186-1190
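
To illustrate how the suggested helper treats the problem values named above (75, -1, NaN), here is a standalone copy of clampUnit exercised on a few inputs. The fallback of 0.75 is the default critical threshold mentioned elsewhere in this review; the sample inputs are illustrative.

```typescript
// Standalone copy of the helper from the suggested fix.
const clampUnit = (value: unknown, fallback: number): number =>
  typeof value === "number" && Number.isFinite(value)
    ? Math.min(1, Math.max(0, value))
    : fallback;

const fallback = 0.75;

console.log(clampUnit(75, fallback));    // 1    (clamped down to the upper bound)
console.log(clampUnit(-1, fallback));    // 0    (clamped up to the lower bound)
console.log(clampUnit(NaN, fallback));   // 0.75 (non-finite falls back to the default)
console.log(clampUnit("0.5", fallback)); // 0.75 (strings are not coerced)
console.log(clampUnit(0.6, fallback));   // 0.6  (in-range values pass through)
```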

Comment thread src/index.ts
Comment on lines +923 to +940
const RUN_ID_RE = /bench-\d{8}-\d{6}/g;
const GIT_TOKENS = /\b(git|PR #|commit|branch)\b/i;
const FILE_PATH_RE = /[./\\][a-zA-Z0-9_\-./\\]{2,}/;
const LIVENESS_CLAIM = /\b(still active|still running|is running|is active|is alive|currently running)\b/i;
const DEATH_CLAIM = /\b(was killed|is dead|died|crashed|no listener|restarted|dead\b|killed\b|stalled)\b/i;

/**
 * Classify the current turn into an injection mode.
 * critical > technical > personal (first match wins).
 */
export function detectInjectionMode(promptText: string): InjectionMode {
  if (CRITICAL_KEYWORDS.test(promptText) || RUN_ID_RE.test(promptText)) return "critical";
  if (
    TECHNICAL_KEYWORDS.test(promptText) ||
    GIT_TOKENS.test(promptText) ||
    FILE_PATH_RE.test(promptText)
  ) return "technical";
  return "personal";

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

Scripts executed against repository 100yenadmin/evaos-cortex-plugin:

  • cat -n src/index.ts | sed -n '920,950p' (length of output: 1722)
  • rg "RUN_ID_RE" src/index.ts -n (length of output: 378)
  • rg "matchAll" src/index.ts -n -B2 -A2 (length of output: 617)
  • wc -l src/index.ts (length of output: 92)
  • cat -n src/index.ts | sed -n '1,100p' (length of output: 4220)
  • rg "const RUN_ID_RE|const CRITICAL_KEYWORDS|const TECHNICAL_KEYWORDS|const GIT_TOKENS|const FILE_PATH_RE|const LIVENESS_CLAIM|const DEATH_CLAIM" src/index.ts -n (length of output: 735)
  • rg "detectInjectionMode" src/index.ts -n -B5 -A15 (length of output: 1800)
  • cat -n src/index.ts | sed -n '900,960p' (length of output: 3028)

Use a separate regex without the g flag for mode detection.

RUN_ID_RE has the g flag, which mutates lastIndex on .test() calls. This makes detectInjectionMode() stateful—the regex position persists across invocations. Worse, the same instance is later reused in .matchAll() calls (lines 974, 985), which will start from the wrong position instead of the beginning, causing dropped or incorrect matches.

Create a test-only variant without the g flag:

Fix
const RUN_ID_RE = /bench-\d{8}-\d{6}/g;
+const RUN_ID_TEST_RE = /bench-\d{8}-\d{6}/;

Then use RUN_ID_TEST_RE in detectInjectionMode() and keep RUN_ID_RE for matchAll().
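
The interaction with matchAll() can be shown in isolation: String.prototype.matchAll works on a clone of the regex, but that clone starts from the original's current lastIndex. The sample text below is illustrative; the pattern matches RUN_ID_RE as quoted.

```typescript
const RUN_ID_RE = /bench-\d{8}-\d{6}/g;
const text = "bench-20260101-000000 then bench-20260102-000000";

// A successful .test() on the global regex advances lastIndex past the first ID.
RUN_ID_RE.test(text);

// matchAll() copies that lastIndex into its internal clone, so the first ID is skipped.
const afterTest = Array.from(text.matchAll(RUN_ID_RE), (m) => m[0]);

// Resetting lastIndex restores a full scan from the start of the string.
RUN_ID_RE.lastIndex = 0;
const afterReset = Array.from(text.matchAll(RUN_ID_RE), (m) => m[0]);

console.log(afterTest);  // [ 'bench-20260102-000000' ]
console.log(afterReset); // [ 'bench-20260101-000000', 'bench-20260102-000000' ]
```

Keeping a separate non-global regex for .test() avoids having to reason about lastIndex at every call site.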

Comment thread src/index.ts
Comment on lines +984 to +1005
// --- Layer 1.1: Stale run-state filter ---
const contentRunIds = [...(content.matchAll(RUN_ID_RE) ?? [])].map(m => m[0]);
if (contentRunIds.length > 0 && LIVENESS_CLAIM.test(content)) {
// If the prompt already contains a death claim, or the run ID isn’t a live process,
// drop this memory (it was captured when the run was alive, now stale).
let isStale = false;
if (promptHasDeathClaim) {
isStale = true;
} else {
// Check if prompt explicitly references this run as dead / a different run took over
for (const runId of contentRunIds) {
if (promptRunIds.has(runId) && DEATH_CLAIM.test(promptText)) {
isStale = true;
break;
}
// Also stale if prompt never mentions this run ID at all but does mention death
if (!promptRunIds.has(runId) && promptHasDeathClaim) {
isStale = true;
break;
}
}
}

⚠️ Potential issue | 🟠 Major

Narrow stale-run suppression to the referenced run IDs.

The unconditional if (promptHasDeathClaim) makes the ID-specific checks below effectively dead code. Any prompt containing a death token drops every liveness memory with a run ID, even when the prompt is about a different process.

Suggested fix
-      if (promptHasDeathClaim) {
-        isStale = true;
-      } else {
-        // Check if prompt explicitly references this run as dead / a different run took over
-        for (const runId of contentRunIds) {
-          if (promptRunIds.has(runId) && DEATH_CLAIM.test(promptText)) {
-            isStale = true;
-            break;
-          }
-          // Also stale if prompt never mentions this run ID at all but does mention death
-          if (!promptRunIds.has(runId) && promptHasDeathClaim) {
-            isStale = true;
-            break;
-          }
-        }
-      }
+      for (const runId of contentRunIds) {
+        if (promptRunIds.has(runId) && promptHasDeathClaim) {
+          isStale = true;
+          break;
+        }
+      }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// --- Layer 1.1: Stale run-state filter ---
const contentRunIds = [...(content.matchAll(RUN_ID_RE) ?? [])].map(m => m[0]);
if (contentRunIds.length > 0 && LIVENESS_CLAIM.test(content)) {
// Drop this memory only when the prompt references one of its run IDs and also
// contains a death claim (it was captured when the run was alive, now stale).
let isStale = false;
for (const runId of contentRunIds) {
if (promptRunIds.has(runId) && promptHasDeathClaim) {
isStale = true;
break;
}
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/index.ts` around lines 984 - 1005, The current logic sets isStale true
whenever promptHasDeathClaim is true, which bypasses the per-run checks; change
the conditional so that promptHasDeathClaim is only considered in the context of
the run IDs found in contentRunIds. Specifically, remove the unconditional
branch that sets isStale when promptHasDeathClaim is true, and also remove the
fallback that marks a run stale when !promptRunIds.has(runId) &&
promptHasDeathClaim, since it reintroduces the same over-broad match. Instead,
iterate contentRunIds and mark isStale only when promptRunIds.has(runId) &&
promptHasDeathClaim; ensure the outer check still requires
LIVENESS_CLAIM.test(content) and preserve use of symbols contentRunIds,
LIVENESS_CLAIM, promptHasDeathClaim, promptRunIds, and isStale.
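
The narrowed rule can be sketched as a small pure function. This is an illustration under assumptions, not the plugin's code: the LIVENESS_CLAIM and DEATH_CLAIM definitions are not shown in this thread, so the two keyword patterns below are hypothetical stand-ins, and `runIdsIn` and `isStaleRunMemory` are names invented for the example.

```typescript
// Run ID pattern as quoted in the review; only ever used with matchAll() here.
const RUN_ID_PATTERN = /bench-\d{8}-\d{6}/g;
// Hypothetical stand-ins for LIVENESS_CLAIM and DEATH_CLAIM (real patterns not shown).
const LIVENESS_PATTERN = /\b(running|active|in progress)\b/i;
const DEATH_PATTERN = /\b(dead|killed|crashed|terminated)\b/i;

// Collect every run ID mentioned in a piece of text.
function runIdsIn(text: string): string[] {
  return Array.from(text.matchAll(RUN_ID_PATTERN), (m) => m[0]);
}

// A memory is stale only if it claims a run is alive AND the prompt
// both mentions that same run ID and contains a death claim.
function isStaleRunMemory(content: string, promptText: string): boolean {
  const contentRunIds = runIdsIn(content);
  if (contentRunIds.length === 0 || !LIVENESS_PATTERN.test(content)) return false;
  if (!DEATH_PATTERN.test(promptText)) return false;
  const promptRunIds = new Set(runIdsIn(promptText));
  return contentRunIds.some((id) => promptRunIds.has(id));
}

const memoryText = "bench-20260414-153500 is running";
const sameRunDead = isStaleRunMemory(memoryText, "bench-20260414-153500 crashed, restart it");
const otherRunDead = isStaleRunMemory(memoryText, "bench-20260101-000000 was killed; how is the other run?");
const noDeathClaim = isStaleRunMemory(memoryText, "is bench-20260414-153500 still going?");

console.log({ sameRunDead, otherRunDead, noDeathClaim });
```

With this shape, a death claim about an unrelated run leaves the memory in place, which is the behavior the comment title asks for.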

Comment thread src/index.ts
Comment on lines +1007 to +1017
log?.(`[cortex-inject] dropped stale run-state memory: ${content.slice(0, 80)}`);
dropped++;
continue;
}
}

// --- Bonus: Contradiction suppression ---
// Memory claims something is active, but prompt says it’s dead
if (LIVENESS_CLAIM.test(content) && DEATH_CLAIM.test(promptText)) {
log?.(`[cortex-inject] dropped contradicted memory (active claim vs dead context): ${content.slice(0, 80)}`);
dropped++;

⚠️ Potential issue | 🟠 Major

Avoid logging memory content at info level.

These messages emit stored memory text into logs. This plugin handles long-term user memory, so logging even an 80-character preview is a privacy leak. Log IDs/categories/counts instead.

Suggested fix
-        log?.(`[cortex-inject] dropped stale run-state memory: ${content.slice(0, 80)}`);
+        log?.(
+          `[cortex-inject] dropped stale run-state memory (item=${item.item_id.slice(0, 8)}, category=${category || "unknown"})`,
+        );
@@
-      log?.(`[cortex-inject] dropped contradicted memory (active claim vs dead context): ${content.slice(0, 80)}`);
+      log?.(
+        `[cortex-inject] dropped contradicted memory (item=${item.item_id.slice(0, 8)}, category=${category || "unknown"})`,
+      );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/index.ts` around lines 1007 - 1017, The log lines that print memory
content previews (using content.slice) are a privacy leak; update the two places
that call log?.(`[cortex-inject] ... ${content.slice(0, 80)}`) to avoid emitting
stored memory text and instead log non-sensitive metadata (e.g., a truncated
item ID and the category). Locate the block guarded by LIVENESS_CLAIM and
DEATH_CLAIM and the earlier stale run-state branch where dropped++ occurs, and
replace the content.slice usage with a safe identifier and category (e.g.,
item.item_id.slice(0, 8) and category, as in the suggested fix), using a concise
message like "[cortex-inject] dropped contradicted memory (item=<id>,
category=<cat>)" so only IDs and categories are logged.
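
One way to keep memory text out of the logs is to route both messages through a single formatter that never receives the content. The sketch below is hypothetical: the `DroppedItemMeta` shape and the `describeDrop` helper are invented for illustration, and only the `item_id` and `category` names come from the suggested fix above.

```typescript
// Hypothetical metadata shape; deliberately has no content field,
// so the formatter cannot leak memory text even by accident.
interface DroppedItemMeta {
  item_id: string;
  category?: string;
}

// Build a log line from non-sensitive metadata only.
function describeDrop(reason: string, item: DroppedItemMeta): string {
  const shortId = item.item_id.slice(0, 8);    // truncated ID, as in the suggested fix
  const category = item.category || "unknown"; // fall back when no category is set
  return `[cortex-inject] dropped ${reason} memory (item=${shortId}, category=${category})`;
}

// Example with a made-up item ID and no category.
const staleLine = describeDrop("stale run-state", { item_id: "a1b2c3d4e5f6" });
// Example with a made-up category.
const contradictedLine = describeDrop("contradicted", { item_id: "0123456789ab", category: "project" });

console.log(staleLine);
console.log(contradictedLine);
```

Because the helper takes only an ID and a category, a later edit cannot reintroduce a content preview without changing its signature.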
