Local source: coven-code/issues/09-block-untrusted-pr-input-from-durable-memory.md
Summary
Hosted review mode must prevent untrusted PR/fork content from directly creating durable memory. This is the core memory-poisoning defense.
Current Evidence
query/src/lib.rs triggers session memory extraction after enough messages and writes to <working_dir>/.coven-code/AGENTS.md.
- The extraction prompt analyzes the conversation transcript.
- There is no approval gate or actor trust check in that write path.
Problem
Contributor-controlled text can appear in:
- PR descriptions.
- Commit messages.
- Review comments.
- Changed files.
- Test output.
- Tool output from repository commands.
If those inputs are in the conversation, the extractor can persist false rules such as "this repo intentionally skips auth checks." The next review may trust that memory.
Proposed Design
In hosted mode:
- Mark sessions as trusted or untrusted based on GitHub actor and fork status.
- Disable durable memory writes for untrusted sessions.
- Allow session-local memory only for the current job.
- Optionally emit memory candidates for maintainer approval.
Acceptance Criteria
- Hosted sessions with untrusted input cannot write durable memory automatically.
- Untrusted sessions may produce memory candidates in the artifact ledger.
- Maintainer-approved candidates can be promoted to durable memory.
- Tests cover fork PR sessions not writing
.coven-code/AGENTS.md.
- Tests cover maintainer sessions writing only when policy allows.
Audit Requirements
Rejected memory candidates should be recorded with reason codes, but not loaded into future prompts.
Local source:
coven-code/issues/09-block-untrusted-pr-input-from-durable-memory.mdSummary
Hosted review mode must prevent untrusted PR/fork content from directly creating durable memory. This is the core memory-poisoning defense.
Current Evidence
query/src/lib.rstriggers session memory extraction after enough messages and writes to<working_dir>/.coven-code/AGENTS.md.Problem
Contributor-controlled text can appear in:
If those inputs are in the conversation, the extractor can persist false rules such as "this repo intentionally skips auth checks." The next review may trust that memory.
Proposed Design
In hosted mode:
Acceptance Criteria
.coven-code/AGENTS.md.Audit Requirements
Rejected memory candidates should be recorded with reason codes, but not loaded into future prompts.