fix(security): resolve OWASP scanner bypass via argument poisoning by SHAURYASANYAL3 · Pull Request #89 · sreerevanth/AgentWatch

SHAURYASANYAL3 · 2026-05-29T16:22:26Z

Summary

This PR fixes a security gap in OwaspScanner where tool call arguments were not included in heuristic analysis. As a result, malicious payloads embedded within nested JSON structures could bypass safety checks that were only inspecting raw command strings.

What Changed

Recursive Argument Flattening

Added _flatten_values() to OwaspScanner.
Recursively traverses dictionaries and lists to extract string content from nested tool arguments.

Expanded Scan Coverage

Updated _blob_of() to include flattened argument values in the generated scan payload.
Ensures security heuristics evaluate both command content and tool arguments.

Improved Threat Detection

The scanner can now detect:

A01: Prompt Injection
A04: Unsafe Code Execution
A05: Data Exfiltration

even when malicious content is embedded deep within structured tool arguments rather than exposed in the top-level command text.

Validation

Added the following test cases in tests/test_owasp_security.py:

test_owasp_detects_exfil_in_poisoned_arguments
test_owasp_detects_prompt_injection_in_arguments
test_owasp_detects_nested_unsafe_code

All three tests pass with this change and fail against the previous implementation, confirming the vulnerability and validating the fix.

Summary by CodeRabbit

New Features
- Claude Code adapter with CLI observation, real-time trace streaming, and safety enforcement.
- LangChain adapter with callback integration and advanced configuration.
Documentation
- Recommend editable dev-mode install for local development.
- Added detailed adapter guides and a universal one-liner in getting-started.
- Updated API description to clarify trace ingestion, sessions, safety, and dashboard updates.
Bug Fixes / Tests
- Enhanced security scanner to detect threats in nested/cyclic tool arguments.
- Added OWASP/security regression tests covering nested/cycle cases.
Chores
- New PR test workflow to run tests, linting, and post results.

…ts in scan

coderabbitai · 2026-05-29T16:22:39Z

Caution

Review failed

Failed to post review comments

📝 Walkthrough

Walkthrough

Enhances OWASP scanning to inspect nested tool-call arguments, adds tests for cycle-safe flattening and OWASP vector detection, introduces Claude Code and LangChain adapter docs, updates README/getting-started and CONTRIBUTING setup, expands FastAPI metadata, and adds a PR test GitHub Actions workflow.

Changes

OWASP Security Enhancement and Adapter Documentation

Layer / File(s)	Summary
OWASP Nested Argument Scanning `agentwatch/security/owasp.py`	Introduces `_flatten_values` helper that recursively extracts string-like content from nested data structures and updates `_blob_of` to include flattened `tool_call.arguments` in security scanning.
OWASP Vector Detection Tests `tests/test_owasp_security.py`	Adds test helper `_tool_event` and three pytest cases that validate detection of data exfiltration, prompt injection, and unsafe code execution when those threats are embedded in tool call arguments.
Cycle Handling & _flatten_values Tests `tests/test_safety.py`	Adds tests ensuring `OwaspScanner.scan` and `_flatten_values` handle self-referential/cyclical data without raising RecursionError and skip cycles while returning flattened values; also updates related test imports.
Adapter Documentation Pages `docs/adapters/claude-code.md`, `docs/adapters/langchain.md`	New documentation for Claude Code adapter (CLI usage, reasoning trace parsing, safety enforcement, WebSocket streaming) and LangChain adapter (installation, callback handler usage, features, event bus configuration).
Documentation Links and Getting Started `docs/getting-started.md`, `README.md`	Adds universal wrapper example (`watch()` usage) in getting-started and updates README Supported Frameworks section with links to new adapter guides.
Setup Instructions and API Metadata `CONTRIBUTING.md`, `agentwatch/api/server.py`	Replaces `requirements.txt` install with editable mode (`python -m pip install -e ".[dev]"`) in development setup and expands FastAPI application description metadata.
PR Tests Workflow `.github/workflows/test-on-pr.yml`	Adds a GitHub Actions workflow to run tests with coverage, run ruff, and post/update a PR comment with results; workflow fails if tests fail.
Test lint suppressions and import tweaks `tests/load/locustfile.py`, `tests/test_models.py`	Adds `noqa` suppressions to load tests and refactors test imports/`noqa` annotations in model tests.

Sequence Diagram(s)

sequenceDiagram
  participant AgentEvent
  participant OwaspScanner
  participant ScanRules
  AgentEvent->>OwaspScanner: submit tool_call (raw_command + arguments)
  OwaspScanner->>OwaspScanner: _flatten_values(arguments)
  OwaspScanner->>ScanRules: evaluate scan blob (raw_command + flattened values)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

sreerevanth/AgentWatch#69: Both PRs modify OWASP scanning behavior by changing agentwatch/security/owasp.py's _blob_of handling of tool-call arguments.
sreerevanth/AgentWatch#29: Overlapping README adapter documentation and getting-started changes.

Suggested reviewers

sreerevanth

Poem

🐰 I nibble nested args with nimble paws,

Flattening secrets so scanners spot the flaws.
Tests hop round cycles, careful and spry,
Docs bloom for adapters, neat and dry,
CI hums—keep watch beneath moonlit sky.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 28.57% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title clearly and specifically describes the main security fix: resolving an OWASP scanner bypass vulnerability caused by argument poisoning. It accurately reflects the primary change throughout the changeset.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@agentwatch/security/owasp.py`:
- Around line 144-157: The _flatten_values method can infinite-recurse on cyclic
structures; modify it to track seen object ids and skip already-visited
containers: add an optional parameter (e.g., seen: set[int] | None = None)
inside _flatten_values, initialize seen if None, check id(data) and return [] if
already in seen, add id(data) to seen before recursing into
dicts/lists/tuples/sets, and pass seen through all recursive calls so
self-referential payloads are ignored and RecursionError is avoided.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 058502d4-aac3-4c0e-9d00-8f2d8653fd1a

📥 Commits

Reviewing files that changed from the base of the PR and between c0e464b and f14a4e4.

📒 Files selected for processing (8)

CONTRIBUTING.md
README.md
agentwatch/api/server.py
agentwatch/security/owasp.py
docs/adapters/claude-code.md
docs/adapters/langchain.md
docs/getting-started.md
tests/test_owasp_security.py

Adds issues: write permission required to post comments on pull requests.

sreerevanth

Great catch on the OWASP argument-poisoning bypass and the added test coverage.

Before merge, please address one blocker:

_flatten_values() performs recursive traversal without cycle detection. Self-referential dict/list structures can trigger infinite recursion and raise RecursionError, causing scans to fail unexpectedly.

Suggested approach:

Track visited object IDs during traversal.
Skip already-visited containers.
Preserve current behavior for normal nested payloads.

Once cycle protection is added, this looks ready for final review.

…t recursion error

…solve conflicts

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (1)

.github/workflows/test-on-pr.yml (1)
19-19: ⚡ Quick win

Pin third-party actions to commit SHAs.

actions/checkout@v4, actions/setup-python@v5, and actions/github-script@v7 are pinned to mutable tags. Static analysis flags this against the repo's pinning policy, and pinning to a full commit SHA is especially important here because this workflow runs with write permissions. Pin each uses: to a SHA (optionally with the version in a trailing comment).

Also applies to: 27-27, 51-51
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/test-on-pr.yml at line 19, Replace mutable action tags
with immutable commit SHAs for every "uses:" entry shown (e.g., replace
actions/checkout@v4, actions/setup-python@v5, and actions/github-script@v7) so
the workflow uses exact commit SHAs; keep the human-readable version as an
optional trailing comment if desired. Locate the three "uses:" occurrences in
the workflow file (the ones currently referencing actions/checkout,
actions/setup-python, and actions/github-script) and update each to the
corresponding full commit SHA from the action's repository release commit,
ensuring the SHA is the exact full-length commit hash.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/test-on-pr.yml:
- Around line 4-5: The workflow uses the privileged pull_request_target trigger
while checking out and running PR-controlled code (e.g., actions/checkout of the
PR head, pip install -e ".[dev]", pytest, ruff), which allows token
exfiltration; fix it by splitting into two workflows: 1) a "PR Tests" workflow
triggered on pull_request (not pull_request_target) with permissions.contents:
read that checks out the PR, runs pip install -e ".[dev]", pytest and ruff, and
uploads results as an artifact; and 2) a privileged "PR Test Comment" workflow
triggered by workflow_run for the "PR Tests" workflow that never checks out PR
code, has permissions.pull-requests: write and issues: write, downloads the
uploaded artifact and posts the comment—ensure actions/checkout is removed from
the privileged workflow and artifacts are used to pass results.
- Around line 18-24: The checkout step uses actions/checkout@v4 with the default
credential persistence, which leaves GITHUB_TOKEN in .git/config; update the
Checkout step (actions/checkout@v4) to set persist-credentials: false to prevent
persisting credentials for this PR-test job since the job does not push and
should not expose GITHUB_TOKEN. Ensure the changed Checkout configuration is
applied where ref: ${{ github.event.pull_request.head.sha }} is used.

In `@tests/test_safety.py`:
- Around line 368-385: The test uses the typing symbol Any in annotations inside
test_flatten_values_cycle_detection (and other tests) but there is no import;
add the missing import "from typing import Any" to the test module so
annotations in test_flatten_values_cycle_detection, and any uses of Any
elsewhere, resolve and CI (F821) no longer fails; locate the
tests/test_safety.py file and insert the import near other typing imports or at
top of the file.

---

Nitpick comments:
In @.github/workflows/test-on-pr.yml:
- Line 19: Replace mutable action tags with immutable commit SHAs for every
"uses:" entry shown (e.g., replace actions/checkout@v4, actions/setup-python@v5,
and actions/github-script@v7) so the workflow uses exact commit SHAs; keep the
human-readable version as an optional trailing comment if desired. Locate the
three "uses:" occurrences in the workflow file (the ones currently referencing
actions/checkout, actions/setup-python, and actions/github-script) and update
each to the corresponding full commit SHA from the action's repository release
commit, ensuring the SHA is the exact full-length commit hash.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 7d8bb7c4-0ebb-480c-8e90-61e351c5d90f

📥 Commits

Reviewing files that changed from the base of the PR and between f14a4e4 and 312848c.

📒 Files selected for processing (3)

.github/workflows/test-on-pr.yml
agentwatch/security/owasp.py
tests/test_safety.py

🚧 Files skipped from review as they are similar to previous changes (1)

agentwatch/security/owasp.py

coderabbitai · 2026-05-30T15:59:21Z

+  pull_request_target:
    branches: [main]


⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Critical: pull_request_target that checks out and executes untrusted PR code enables token theft / RCE.

This workflow combines the privileged pull_request_target trigger (which runs with the base repo's write-capable GITHUB_TOKEN and issues: write / pull-requests: write) with checking out the PR head (Line 21) and then executing arbitrary PR-controlled code via pip install -e ".[dev]" (Line 33), pytest (Lines 39-42, including any conftest.py/test code in the PR), and ruff config from the PR.

Any external contributor can modify those files to run arbitrary commands during the job and exfiltrate the token or abuse the write permissions. The inline comment on Lines 22-24 documents the risk but does not mitigate it. This directly undermines the security intent of this PR.

The safe pattern is to split into two workflows: run untrusted code under the unprivileged pull_request trigger (no secrets/write token), and post the comment from a separate workflow_run workflow (or an actions/upload-artifact + download) that never checks out PR code.

🔒 Recommended two-workflow structure

# test-on-pr.yml — runs untrusted code, no write token name: PR Tests on: pull_request: branches: [main] permissions: contents: read jobs: test: runs-on: ubuntu-latest steps: - uses: actions/checkout@<sha> - uses: actions/setup-python@<sha> with: { python-version: "3.12", cache: pip } - run: pip install -e ".[dev]" - run: pytest tests/ -v --tb=short --cov=agentwatch --cov-report=json continue-on-error: true id: pytest - run: ruff check . continue-on-error: true # upload coverage.json + outcomes as an artifact for the commenting workflow

# pr-comment.yml — privileged, never checks out PR code name: PR Test Comment on: workflow_run: workflows: ["PR Tests"] types: [completed] permissions: pull-requests: write issues: write # download the artifact, then post/update the comment via github-script

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/test-on-pr.yml around lines 4 - 5, The workflow uses the privileged pull_request_target trigger while checking out and running PR-controlled code (e.g., actions/checkout of the PR head, pip install -e ".[dev]", pytest, ruff), which allows token exfiltration; fix it by splitting into two workflows: 1) a "PR Tests" workflow triggered on pull_request (not pull_request_target) with permissions.contents: read that checks out the PR, runs pip install -e ".[dev]", pytest and ruff, and uploads results as an artifact; and 2) a privileged "PR Test Comment" workflow triggered by workflow_run for the "PR Tests" workflow that never checks out PR code, has permissions.pull-requests: write and issues: write, downloads the uploaded artifact and posts the comment—ensure actions/checkout is removed from the privileged workflow and artifacts are used to pass results.

coderabbitai · 2026-05-30T15:59:22Z

      - name: Checkout
        uses: actions/checkout@v4
+        with:
+          ref: ${{ github.event.pull_request.head.sha }}
+          # Note: pull_request_target runs in the context of the base repo.
+          # Checking out the PR head is necessary to test the changes,
+          # but use caution as this runs with a write-capable GITHUB_TOKEN.


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Disable credential persistence on checkout.

Since this job runs untrusted PR code under a privileged trigger, the default persist-credentials: true leaves the GITHUB_TOKEN in .git/config where the executed code can read and exfiltrate it. Set persist-credentials: false (you don't push from this job). This hardens the checkout even after restructuring the trigger.

🛡️ Proposed fix

- name: Checkout uses: actions/checkout@v4 with: ref: ${{ github.event.pull_request.head.sha }} + persist-credentials: false

🧰 Tools

🪛 zizmor (1.25.2)

[warning] 18-24: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false

(artipacked)

[error] 19-19: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/test-on-pr.yml around lines 18 - 24, The checkout step uses actions/checkout@v4 with the default credential persistence, which leaves GITHUB_TOKEN in .git/config; update the Checkout step (actions/checkout@v4) to set persist-credentials: false to prevent persisting credentials for this PR-test job since the job does not push and should not expose GITHUB_TOKEN. Ensure the changed Checkout configuration is applied where ref: ${{ github.event.pull_request.head.sha }} is used.

sreerevanth

❌ Changes Requested

Thanks for addressing the original blocker. The cycle-detection fix in _flatten_values() resolves the recursion concern and the security regression tests are appreciated.

Before merge, two additional issues need to be addressed:

Critical workflow security issue

The workflow now uses pull_request_target while checking out and executing PR-controlled code (pip install, pytest, etc.). This introduces a privileged-code-execution path and is a higher-severity security concern than the OWASP scanner issue being fixed.

Please revert to a safe workflow design (e.g. unprivileged pull_request testing plus a separate reporting workflow) or otherwise eliminate execution of untrusted PR code under a privileged token context.

Resolve merge conflicts

GitHub currently reports an unresolved conflict in:

docs/adapters/langchain.md

Please rebase on the latest main, resolve conflicts, and push the updated branch.

Once the workflow security issue and merge conflict are resolved, this can be reviewed again.

SHAURYASANYAL3 added 2 commits May 29, 2026 21:35

docs: upgrade documentation quality and add framework-specific guides

2b496cd

fix(security): resolve OWASP scanner bypass by including tool argumen…

f14a4e4

…ts in scan

coderabbitai Bot reviewed May 29, 2026

View reviewed changes

Comment thread agentwatch/security/owasp.py Outdated

ci: fix github actions permission for posting PR comments

db43e55

Adds issues: write permission required to post comments on pull requests.

sreerevanth self-requested a review May 30, 2026 15:13

sreerevanth requested changes May 30, 2026

View reviewed changes

sreerevanth added bug Something isn't working level: advanced NSoC 26' level3 security labels May 30, 2026

SHAURYASANYAL3 added 2 commits May 30, 2026 21:18

fix(security): implement cycle detection in _flatten_values to preven…

830297a

…t recursion error

ci: fix 403 error for PR comments by using pull_request_target

312848c

SHAURYASANYAL3 force-pushed the fix/owasp-argument-bypass branch from 85a2073 to 312848c Compare May 30, 2026 15:53

SHAURYASANYAL3 added 2 commits May 30, 2026 21:25

ci: fix 403 error for PR comments by using pull_request_target and re…

aa4e11a

…solve conflicts

ci: fix PR comment permissions and resolve lint issues

41714ac

coderabbitai Bot reviewed May 30, 2026

View reviewed changes

SHAURYASANYAL3 requested a review from sreerevanth May 30, 2026 16:00

sreerevanth requested changes May 30, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(security): resolve OWASP scanner bypass via argument poisoning#89

fix(security): resolve OWASP scanner bypass via argument poisoning#89
SHAURYASANYAL3 wants to merge 7 commits into
sreerevanth:mainfrom
SHAURYASANYAL3:fix/owasp-argument-bypass

SHAURYASANYAL3 commented May 29, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 29, 2026 •

edited

Loading

Review failed

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

sreerevanth left a comment

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 30, 2026

Uh oh!

coderabbitai Bot May 30, 2026

Uh oh!

Uh oh!

sreerevanth left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

SHAURYASANYAL3 commented May 29, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What Changed

Recursive Argument Flattening

Expanded Scan Coverage

Improved Threat Detection

Validation

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

sreerevanth left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 30, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 30, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

sreerevanth left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

SHAURYASANYAL3 commented May 29, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 29, 2026 •

edited

Loading