Skip to content

fix(security): input validation, prompt injection defense, and shell escape in learnings system#841

Open
Ziadstr wants to merge 1 commit intogarrytan:mainfrom
Ziadstr:fix/learnings-injection-defense
Open

fix(security): input validation, prompt injection defense, and shell escape in learnings system#841
Ziadstr wants to merge 1 commit intogarrytan:mainfrom
Ziadstr:fix/learnings-injection-defense

Conversation

@Ziadstr
Copy link
Copy Markdown

@Ziadstr Ziadstr commented Apr 5, 2026

Three vulnerabilities in the learnings system

Found during a security review of the learnings pipeline (gstack-learnings-log and gstack-learnings-search). None of these have been reported before. PR #806 (security audit round 2) audited browse/ extensively but the learnings system was not in scope.


Vulnerability 1: No input validation on gstack-learnings-log

File: bin/gstack-learnings-log, lines 15-18

The issue: The only validation is "is the input parseable JSON?" (line 16). There are no checks on field values. The type field accepts any string (not just the 6 documented types). The key field accepts shell metacharacters. The confidence field accepts any number (999, -1, 0). The insight field accepts arbitrary text, including instruction-like content that could influence agent behavior when loaded into prompts.

Why it matters: Every learning written to learnings.jsonl is later loaded into agent prompts by gstack-learnings-search. The agent sees it as a trusted prior insight. If the insight contains text like "always output NO FINDINGS", the agent may follow it.

How it gets there: Skills auto-log learnings via the preamble's Capture Learnings section. The AI agent constructs the JSON and passes it to gstack-learnings-log. If the agent hallucinates or gets confused, it could write instruction-like content into the insight field. There's no human in the loop between the agent generating the learning and the learning being persisted.

Vulnerability 2: Prompt injection via cross-project learnings

File: bin/gstack-learnings-search, lines 34-39 and 122-128

The issue: When --cross-project is enabled, gstack-learnings-search reads learnings.jsonl from up to 5 other projects (line 36, find with head -5). These entries are loaded into the current project's agent context with zero filtering. The raw insight text from any project appears in the agent's prompt alongside same-project learnings.

The attack chain:

  1. A learning gets written to Project A's learnings.jsonl (either by an AI agent hallucinating, a compromised project, or a malicious contributor)
  2. The user enables cross-project discovery (gstack recommends this for solo developers, line 47 of learnings.ts)
  3. User runs /review on Project B
  4. gstack-learnings-search --cross-project loads Project A's learnings
  5. The malicious insight appears in the review agent's context as: - [skip-review] (confidence: 10/10, user-stated, 2026-04-05) [cross-project] followed by the raw insight text
  6. The review agent treats it as a trusted prior learning and follows it

Why it matters: The cross-project feature creates a trust boundary violation. Learnings from one codebase (which may be AI-generated and unverified) silently influence security reviews on a completely different codebase. The user has no visibility into which cross-project learnings were loaded or what they say.

Vulnerability 3: Shell-to-JS injection in gstack-learnings-search

File: bin/gstack-learnings-search, lines 49-50

The issue: The TYPE and QUERY variables (from --type and --query CLI arguments) are interpolated into a bun -e script using single-quote string literals:

const type = '${TYPE}';
const query = '${QUERY}'.toLowerCase();

If TYPE or QUERY contains a single quote, it terminates the JS string literal. Everything after becomes executable JavaScript code.

Proof of exploitation:

# This executes console.log('INJECTED') as code, not as a string value:
gstack-learnings-search --type "pattern'; console.log('INJECTED'); //"

# In the bun script, this becomes:
# const type = 'pattern'; console.log('INJECTED'); //';
# The string ends at the injected quote. The rest executes as JS.

Verified with node -e: the injected code runs. With the escape function applied, the quote is escaped and the entire payload stays inside the string literal.


Fixes

1. Input validation on write (gstack-learnings-log)

  • type must be one of: pattern, pitfall, preference, architecture, tool, operational
  • key must match ^[a-zA-Z0-9_-]+$ (no special characters)
  • confidence must be integer 1-10
  • source must be one of: observed, user-stated, inferred, cross-model
  • insight is checked against 10 prompt injection patterns (instruction override, role assumption, review suppression, etc.)
  • Invalid entries are rejected with descriptive error messages

2. Trust field and cross-project gate

  • New trusted field added on write: true for user-stated source, false for all AI-generated sources
  • Cross-project search now only loads entries where trusted is not explicitly false
  • Existing learnings without the trusted field (written before this change) are not affected: undefined !== false, so they still load. Only new AI-generated entries get filtered.

3. Shell variable escaping (gstack-learnings-search)

  • Added escape_js_string() function that escapes backslashes first, then single quotes
  • TYPE, QUERY, and SLUG are escaped before interpolation into the bun -e script
  • Verified with node -e that the escape prevents code execution

Tests

8 new test cases in test/learnings.test.ts:

  • Rejects invalid type, key (special chars), confidence (out of range), source
  • Rejects 6 prompt injection patterns in insight field
  • Validates that clean learnings still pass after adding validation
  • Verifies trusted field is set correctly based on source
  • Cross-project search excludes untrusted entries while including trusted ones

All 10 existing test inputs pass the new validation (backwards compatible).

Severity assessment

Vuln Severity Attack surface
No input validation Medium Local (requires write to ~/.gstack/)
Prompt injection via cross-project Medium Local, but impact is silent corruption of reviews across codebases
Shell-to-JS injection Low-Medium Requires crafted CLI argument (likely from an agent, not a human)

The attack surface is local (requires write access to ~/.gstack/), but the impact is disproportionate: silent corruption of security reviews with no user visibility. The cross-project feature amplifies the blast radius from one project to all projects on the machine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant