overreach

Catch code that overreaches before it merges.

overreach is a fast, local CLI that scans a diff, file, or repo for capability drift: outbound network calls, subprocess spawns, sensitive-file reads, curl | sh, disabled TLS verification, and hardcoded secrets. Think ripgrep, but for what this code is allowed to touch.

It is built for AI-assisted code review. The risky part of an agent PR is often not the feature itself — it's the quiet fetch, execSync, or .env read that appeared beside it. Treat overreach as a tripwire for capability drift, not a containment boundary: it favors recall and runs on every push, but a determined adversary can evade regex (string concatenation, base64, dynamic eval/import, or moving the call into a dependency) — so it's a fast first pass, not a security guarantee.

cargo install --git https://github.com/Conalh/overreach --tag v0.2.0 --locked
git diff | overreach --diff

No signup. No daemon. No telemetry. No network at scan time. It reads your diff and exits.

flowchart LR
    A["git diff (--diff)<br/>· file · repo"] --> S["Line scanner<br/>added lines · UTF-8 · 8 MiB cap"]
    S --> D["Detectors<br/>pipe-to-shell · secrets · sensitive-fs<br/>network · subprocess · TLS-off"]
    S --> G["Coverage gaps<br/>unreadable · non-UTF-8 · oversize"]
    D --> R["Graded findings<br/>critical · high · medium · low"]
    G --> R
    R --> O["Report<br/>human · --json"]
    O --> F{"--fail-on"}
    F -->|below or clean| P["exit 0<br/>pass"]
    F -->|at or above| X["exit 1<br/>fail"]
    S -.->|unreadable entrypoint| T["exit 2<br/>couldn't scan"]

    classDef in fill:#1e293b,stroke:#334155,color:#e2e8f0
    classDef core fill:#0f172a,stroke:#1e293b,color:#e2e8f0,stroke-width:2px
    classDef out fill:#0c4a6e,stroke:#0369a1,color:#e0f2fe
    class A in
    class S,D,G,R,O core
    class F,P,X,T out

$ git diff | overreach --diff
CRITICAL  src/util.js:15  [pipe_to_shell]
          Downloads a script and pipes it straight into a shell
CRITICAL  src/util.js:16  [hardcoded_secret]
          Possible hardcoded Anthropic credential (value redacted)
    HIGH  src/util.js:13  [network_call]
          Makes an outbound network call

3 finding(s): 2 critical, 1 high, 0 medium, 0 low
FAIL (findings at/above critical)

See also: SECURITY.md for the threat model and self-guarantees · CHANGELOG.md for release history · part of the agent-gov suite.

Why

Code review optimizes for "is the feature correct?" — not "did this diff quietly gain a new capability?" As autonomous agents get write access to real repositories, the second question is the one that bites. overreach is a fast, zero-config first pass that answers it, locally, before anything merges.

Diff-aware. Scans only the added lines of a unified diff, so you see what a change introduces, not what was already there.
Secrets are reported, never echoed. A hardcoded key is flagged by provider ("Anthropic", "AWS") with the literal value redacted — overreach never prints a credential back at you.
CI-ready. --json output and a configurable --fail-on severity make it a one-line PR gate.
Dependency-light. Three crates (regex, serde, serde_json). No network access at scan time, no telemetry, nothing phones home.
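
To make "diff-aware" concrete, here is a minimal Python sketch of pulling only the added lines out of a unified diff. It is an illustration of the idea, not overreach's actual Rust implementation: the function name `added_lines` and the choice to trust the hunk-header line counts are assumptions made for this example. Counting lines per hunk is one way to make sure an added line whose content starts with `++` is not confused with a `+++` file header.

```python
import re
from typing import Iterator, Optional, Tuple

# Hunk header, e.g. "@@ -12,2 +12,4 @@". A missing count means 1.
HUNK = re.compile(r"^@@ -\d+(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def added_lines(diff_text: str) -> Iterator[Tuple[Optional[str], int, str]]:
    """Yield (path, new_line_number, text) for every added line.

    Hypothetical helper for illustration; not part of overreach.
    """
    path: Optional[str] = None
    new_no = 0
    old_left = new_left = 0  # lines still expected in the current hunk

    for raw in diff_text.split("\n"):
        if old_left > 0 or new_left > 0:
            # Inside a hunk body: the first character decides the line type.
            if raw.startswith("+"):
                yield path, new_no, raw[1:]
                new_no += 1
                new_left -= 1
            elif raw.startswith("-"):
                old_left -= 1
            elif raw.startswith("\\"):
                pass  # "\ No newline at end of file" marker
            else:
                # Context line: present in both old and new file.
                new_no += 1
                old_left -= 1
                new_left -= 1
            continue

        # Outside a hunk: look for file headers and hunk headers only.
        if raw.startswith("+++ "):
            target = raw[4:].split("\t")[0].strip()
            path = None if target == "/dev/null" else re.sub(r"^b/", "", target)
            continue
        m = HUNK.match(raw)
        if m:
            old_left = int(m.group(1) or 1)
            new_no = int(m.group(2))
            new_left = int(m.group(3) or 1)
```

Each yielded tuple carries the line number in the new file, which is what a finding such as `src/util.js:15` would point at.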

Install

# Recommended: install the pinned release
cargo install --git https://github.com/Conalh/overreach --tag v0.2.0 --locked

# From a local checkout
cargo install --path . --locked

# From main, if you want unreleased changes
cargo install --git https://github.com/Conalh/overreach --locked

Produces a single static-ish binary; drop it anywhere on PATH. A crates.io release (cargo install overreach) is planned but not yet published — install from source or Git for now.

Usage

overreach [PATH]                  # scan a file or directory (default: .)
git diff | overreach --diff       # scan only the added lines of a diff
overreach --diff --json           # machine-readable output for CI
overreach . --fail-on high        # also fail on new network calls / subprocess spawns

| Flag | Effect |
| --- | --- |
| `--diff` | Read a unified diff from stdin; scan added lines only |
| `--json` | Emit findings + summary as JSON |
| `--fail-on <level>` | Exit non-zero at/above `low`\|`medium`\|`high`\|`critical` (default `critical`) |
| `-h`, `--help` | Help |
| `-V`, `--version` | Version |

Exit codes — so it slots straight into a pipeline:

| Code | Meaning |
| --- | --- |
| `0` | scanned; nothing at/above `--fail-on` |
| `1` | findings at/above `--fail-on` |
| `2` | could not scan (unreadable entrypoint) or invalid invocation |

By default the gate fails only on critical findings (pipe_to_shell, hardcoded_secret, sensitive_fs_read). Adding a network call or shelling out is normal feature work, so network_call and subprocess_spawn are reported as high but do not fail the build unless you opt into --fail-on high — a gate that red-X'd every routine PR is a gate that gets switched off.
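
The gate described above can be summarized in a few lines. The following Python sketch is only a model of the documented exit-code contract; the names `SEVERITY_RANK` and `exit_code` are hypothetical and do not come from the overreach source.

```python
from typing import Iterable

# Ordering of the four documented severity levels, lowest to highest.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


def exit_code(
    finding_severities: Iterable[str],
    fail_on: str = "critical",
    scanned: bool = True,
) -> int:
    """Model of the documented contract (illustrative, not overreach's code).

    2: the entrypoint could not be scanned, or the invocation was invalid
    1: at least one finding is at or above the --fail-on level
    0: scanned, and nothing reached the --fail-on level
    """
    if not scanned or fail_on not in SEVERITY_RANK:
        return 2
    threshold = SEVERITY_RANK[fail_on]
    hit = any(SEVERITY_RANK[s] >= threshold for s in finding_severities)
    return 1 if hit else 0
```

Under this model a diff that only adds a network call (`high`) passes with the default threshold and fails once `--fail-on high` is set.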

What it detects

| Kind | Severity | What it flags |
| --- | --- | --- |
| `pipe_to_shell` | critical | `curl`/`wget … \| sh`: downloading and executing a script in one breath |
| `hardcoded_secret` | critical | Provider-prefixed credentials (Anthropic, OpenAI, GitHub, AWS, Slack, Google, GitLab, Stripe). Value redacted. |
| `sensitive_fs_read` | critical | References to `.ssh/`, `id_rsa`, `/etc/passwd`, `.aws/credentials`, `.env`, `.npmrc` |
| `network_call` | high | `fetch`, `axios`, `XMLHttpRequest`, `WebSocket`, raw sockets; Python `requests`/`urllib`/`httpx`/`aiohttp`; Node/Go `http.get`/`http.Get`; Rust `reqwest`/`TcpStream`; JVM/.NET `HttpClient`/`openConnection` |
| `subprocess_spawn` | high | `child_process`, `execSync`/`execFile`, top-level `exec(`, `.spawn(`; Python `subprocess.*`/`os.system`; Rust `process::Command`; Go `exec.Command`; Java `Runtime.exec`/`ProcessBuilder`; C# `Process.Start` |
| `tls_verification_disabled` | medium | `rejectUnauthorized: false`, `verify=False`, `InsecureSkipVerify: true`, `NODE_TLS_REJECT_UNAUTHORIZED=0` (the insecure value only), Python `ssl._create_unverified_context` |
| `file_too_large_to_scan`, `file_not_utf8`, `file_unreadable`, `directory_unreadable`, `path_unreadable` | low | Coverage gaps, not content findings: a path that was skipped mid-walk (over the 8 MiB cap, not UTF-8, or unreadable) is surfaced so a clean report can't hide something unscanned. An unreadable entrypoint is a hard error (exit `2`), not a low finding. |
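
As an illustration of "reported, never echoed", the sketch below flags an AWS access key ID by its well-known `AKIA` prefix and emits a finding that names the provider but not the value. The pattern and the `report_secret` helper are assumptions for this example; overreach's real patterns and output code may differ.

```python
import re
from typing import Optional

# AWS access key IDs are "AKIA" followed by 16 uppercase letters or digits.
# Simplified, illustrative pattern; not copied from overreach.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def report_secret(path: str, line_no: int, line: str) -> Optional[str]:
    """Return a redacted finding for the line, or None if nothing matched.

    The matched text is deliberately never interpolated into the message.
    """
    if AWS_KEY_ID.search(line) is None:
        return None
    return (
        f"CRITICAL  {path}:{line_no}  [hardcoded_secret]\n"
        "          Possible hardcoded AWS credential (value redacted)"
    )
```

A canary-style check then amounts to planting a sentinel key and asserting that it never appears in the rendered string.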

This is a fast, regex-based first pass — it favors recall over perfect precision. It is not a full taint analysis; treat findings as "look here," not "proven exploit." And because matching is literal regex, a determined adversary can evade it — string concatenation, base64, dynamic eval/import, or relocating the call into a dependency all slip past — so overreach surfaces drift and carelessness, not a motivated attacker. It deliberately does not flag method-call lookalikes such as JS regex.exec(str) or Rust tokio::spawn(...) as subprocesses, and process.env.X is not treated as a .env file read.

Matching is line-literal: overreach scans raw source lines and does not strip comments or string literals, so a comment or string that merely mentions a trigger token (a TODO about axios, a docstring naming subprocess.run) can be flagged. That's the trade-off for a zero-config first pass that favors recall — eyeball low-context findings before wiring it as a hard gate on comment-heavy code, and prefer scanning --diff (added lines only) over whole files.
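
A small Python demonstration of both limits, using a deliberately simplified, hypothetical pattern (not one of overreach's real detectors): a comment that merely names a trigger token is flagged, while a call assembled by string concatenation is missed.

```python
import re

# Hypothetical, simplified stand-in for a network_call pattern.
NETWORK = re.compile(r"\bfetch\s*\(|\baxios\b")


def flags(line: str) -> bool:
    """True if the raw source line matches the illustrative pattern."""
    return NETWORK.search(line) is not None


# A genuine call is caught.
real_call = flags("const r = await fetch(url);")

# False positive: the comment only mentions the token.
comment_mention = flags("// TODO: replace axios with the internal client")

# Evasion: the identifier never appears literally on the line.
concatenated = flags('const f = globalThis["fe" + "tch"]; f(url);')
```

Here `real_call` and `comment_mention` are both true and `concatenated` is false, which is the recall-over-precision trade-off in miniature.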

Language coverage

Being regex-based, overreach is language-agnostic in that it scans any UTF-8 text file — but the detector patterns are tuned per language. Today's coverage, by detector:

| Language | `network_call` | `subprocess_spawn` | `tls_verification_disabled` |
| --- | --- | --- | --- |
| JS / TS | ✅ | ✅ | ✅ |
| Python | ✅ | ✅ | ✅ |
| Go | ✅ | ✅ | ✅ |
| Rust | ✅ | ✅ | — |
| Java | ✅ | ✅ | — |
| C# / .NET | ✅ | ✅ | — |

pipe_to_shell, sensitive_fs_read, and hardcoded_secret are language-independent — they match shell invocations, sensitive file paths, and provider key formats regardless of language. Coverage is additive: adding a language means adding patterns, never rewriting the engine. Gaps (Ruby, PHP, C/C++, shell beyond curl|sh, and the dashes above) are good first contributions.
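
One way to picture "additive" coverage is a table of patterns per detector that a fixed engine compiles and runs. The sketch below is an assumption about the general shape, written in Python for brevity; the table contents, `compile_all`, and `scan_line` are hypothetical and do not mirror overreach's Rust source.

```python
import re
from typing import Dict, List, Pattern

# Illustrative pattern table: detector kind -> list of regex fragments.
DETECTORS: Dict[str, List[str]] = {
    "network_call": [r"\bfetch\s*\(", r"\brequests\.(?:get|post)\s*\("],
    "subprocess_spawn": [r"\bexecSync\s*\(", r"\bsubprocess\.\w+\s*\("],
}


def compile_all(table: Dict[str, List[str]]) -> Dict[str, Pattern[str]]:
    """Join each detector's fragments into one alternation and compile it."""
    return {
        kind: re.compile("|".join(f"(?:{p})" for p in patterns))
        for kind, patterns in table.items()
    }


def scan_line(compiled: Dict[str, Pattern[str]], line: str) -> List[str]:
    """Return the detector kinds that match the line. The engine never changes."""
    return [kind for kind, rx in compiled.items() if rx.search(line)]


before = scan_line(compile_all(DETECTORS), "Net::HTTP.get(uri)")

# Adding a language is a data change: append a pattern, recompile.
DETECTORS["network_call"].append(r"\bNet::HTTP\b")
after = scan_line(compile_all(DETECTORS), "Net::HTTP.get(uri)")
```

In this sketch the Ruby line goes from unflagged (`before`) to flagged as `network_call` (`after`) without touching `scan_line`.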

How it compares

overreach is deliberately narrow: a fast, zero-config first pass for capability drift in a diff. It does not try to beat a real static analyzer at depth or a dedicated secret scanner at secrets — it tries to be the thing you can drop into any PR as one binary with no rules to write. The honest landscape, assuming typical out-of-the-box or common PR-gate usage:

| Tool | Capability drift (net · subprocess · TLS) | Hardcoded secrets | Diff-aware first pass | Zero-config | Analysis depth | Footprint |
| --- | --- | --- | --- | --- | --- | --- |
| overreach | ✅ built-in | ⚠️ 8 providers, no entropy/verify | ✅ added lines only | ✅ no rules | ❌ regex, no taint/dataflow | ✅ one static binary · 3 deps · no network at scan time |
| `ripgrep` / `grep` | ⚠️ DIY patterns | ⚠️ DIY patterns | ⚠️ `git diff \| rg` | ✅ | ❌ | ✅ |
| Semgrep | ✅ via rules | ✅ (validated in Pro) | ✅ baseline scan | ⚠️ rules required | ✅ AST + dataflow | ⚠️ heavier · cloud for best |
| gitleaks / TruffleHog | ❌ secrets only | ✅ entropy · git history · live verify | ✅ | ✅ | — | ✅ |
| CodeQL | ✅ via queries | ⚠️ secondary; not its primary use | ⚠️ PR alerts, whole-DB scan | ❌ build DB + query packs | ✅ semantic dataflow | ❌ minutes per run · heavy |

The table lists classic tooling, but in 2026 the louder competition is the cloud/ML AI-SAST wave — DryRun, Aikido, Snyk Code, Checkmarx, GitHub Advanced Security. They go far deeper than overreach, and that's the point: they are the depth tier. overreach lives at the opposite end — local, deterministic, no-AI, no-telemetry, free, and instant. It never competes on analysis depth; it competes on being the zero-friction tripwire you can run on every push with nothing sent anywhere.

Reach for something else when:

you need semantic dataflow or taint analysis (Semgrep, CodeQL);
you need full secret-scanning coverage across git history (gitleaks, TruffleHog);
you need live key validation;
you need policy-as-code rules maintained by a security team.

Reach for overreach when:

you want a fast, local tripwire for new network, subprocess, TLS-disable, sensitive-file, or hardcoded-secret patterns in a diff.

The capability grid above is a positioning sketch — every tool is more configurable than one table can show, and the right answer is often overreach and one of them. Speed, on the other hand, is measured:

Speed

overreach scans a 40 MiB / 3,000-file tree for all six detector families in ~0.13 s. That is roughly twice ripgrep's parallel time (127 ms vs 68 ms) while running every detector family, and well ahead of gitleaks, TruffleHog, grep, and Semgrep on the same corpus. Fast enough to run on every push.

The benchmark below is reproducible from bench/; it scans an identical generated corpus and reports wall-clock time. These tools do not all solve the same problem — gitleaks/TruffleHog do secrets only (with entropy and live verification); Semgrep does the AST/dataflow analysis overreach deliberately skips — so slower does not mean worse.

| Tool | Wall-clock time (40 MiB / 3k-file scan) | What it is |
| --- | --- | --- |
| `ripgrep` (parallel) | 68 ms | the fastest line scanner |
| `overreach` (parallel) | 127 ms | all six detector families |
| `ripgrep -j1` | 215 ms | single-threaded, same engine |
| `gitleaks` | 654 ms | secrets only · entropy, git history |
| `TruffleHog` | 1.11 s | secrets only · live key verification |
| `grep -rP` | 2.82 s | naive recursive baseline |
| `Semgrep` | 34.8 s | AST framework running equivalent pattern checks |

"Clean" never means "didn't scan"

The worst failure mode for a security tool is a silent gap that reads as a pass. overreach is built so a clean report is trustworthy:

Symlink-safe walk. Mid-walk, symlinks are skipped via symlink_metadata, so a hostile checkout can't escape the scan root (link -> /etc) or loop forever (link -> ..). A user-supplied entrypoint is followed exactly once, as deliberate intent.
Skips are surfaced, not swallowed. A file over the 8 MiB cap, a file that isn't valid UTF-8, an unreadable file or directory, and a path that can't be stat'd mid-walk each become a low-severity coverage-gap finding (file_too_large_to_scan, file_not_utf8, file_unreadable, directory_unreadable, path_unreadable) rather than vanishing. An attacker can't hide a key in a 9 MB file, a binary blob, or a locked-down file and get back "clean."
An unscannable entrypoint is a hard failure, not a pass. If the path you point overreach at can't be read at all (missing, permission denied), it exits 2 — distinct from 1 (findings at/above --fail-on) and 0 (scanned clean). A security gate must never exit 0 on something it couldn't scan.
Redaction is pinned by a canary test. rendered_output_never_echoes_a_credential_value plants a sentinel secret and asserts it never appears in any rendered output, so a future change can't accidentally start printing credentials. The exit-code contract, diff line numbering, the ++-vs-header edge case (an added line whose content begins with ++ must not be mistaken for a +++ file header), every coverage gap above, and the false-positive guards (regex.exec, tokio::spawn) are all pinned by unit and CLI integration tests.
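
The walk rules above can be sketched in Python as follows. This is a simplified model under stated assumptions, not overreach's Rust code: `walk` and `MAX_BYTES` are hypothetical names, `os.lstat` stands in for Rust's `symlink_metadata`, and an unreadable entrypoint is modeled by letting the `OSError` propagate to the caller (which would map it to exit `2`).

```python
import os
import stat
from typing import List, Tuple

MAX_BYTES = 8 * 1024 * 1024  # the documented 8 MiB cap


def walk(root: str) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str]]]:
    """Return (texts, gaps): scannable (path, text) pairs and (kind, path) gaps."""
    texts: List[Tuple[str, str]] = []
    gaps: List[Tuple[str, str]] = []
    stack = [root]
    while stack:
        directory = stack.pop()
        try:
            names = sorted(os.listdir(directory))
        except OSError:
            if directory == root:
                raise  # unreadable entrypoint: a hard failure, not a gap
            gaps.append(("directory_unreadable", directory))
            continue
        for name in names:
            path = os.path.join(directory, name)
            try:
                st = os.lstat(path)  # does not follow symlinks
            except OSError:
                gaps.append(("path_unreadable", path))
                continue
            if stat.S_ISLNK(st.st_mode):
                continue  # never follow links found mid-walk
            if stat.S_ISDIR(st.st_mode):
                stack.append(path)
                continue
            if not stat.S_ISREG(st.st_mode):
                continue
            if st.st_size > MAX_BYTES:
                gaps.append(("file_too_large_to_scan", path))
                continue
            try:
                with open(path, "rb") as handle:
                    data = handle.read()
            except OSError:
                gaps.append(("file_unreadable", path))
                continue
            try:
                texts.append((path, data.decode("utf-8")))
            except UnicodeDecodeError:
                gaps.append(("file_not_utf8", path))
    return texts, gaps
```

Every skip ends up in `gaps`, so a caller can render it as a low-severity finding instead of silently reporting a clean scan.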

In CI

# .github/workflows/overreach.yml
name: overreach
on: pull_request

# Minimum-privilege token: this job only needs to read source.
permissions:
  contents: read

jobs:
  overreach:
    runs-on: ubuntu-latest
    steps:
      # Pin third-party actions to commit SHAs (with the tag as a comment)
      # so a compromised upstream tag can't silently change what runs in CI.
      - uses: actions/checkout@93cb6efe18208431cddfb8368fd83d5badbf9bfd # v5
        with:
          fetch-depth: 0
          # Don't leave GITHUB_TOKEN in .git/config — `cargo build` runs
          # build scripts from every transitive dep.
          persist-credentials: false
      - uses: dtolnay/rust-toolchain@29eef336d9b2848a0b548edc03f92a220660cdb8 # stable
      # --locked enforces the committed Cargo.lock.
      - run: cargo build --release --locked
      - name: Scan the PR diff
        env:
          # Route the trigger value through env so it can't be interpolated
          # into the shell script body.
          BASE_REF: ${{ github.base_ref }}
        # pipefail so a failed `git diff` can't be masked by the trailing pipe
        # and let the scan report "clean" on an unscanned diff.
        run: |
          set -euo pipefail
          # The default gate fails only on critical findings (curl|sh, secrets,
          # sensitive-file reads). Add --fail-on high to also block new network
          # calls and subprocess spawns.
          git diff "origin/$BASE_REF...HEAD" | ./target/release/overreach --diff

Using the composite action

The repo also ships a composite action (action.yml) that builds overreach and scans the PR diff. It does not check out your code — you must run actions/checkout first, with fetch-depth: 0 so the base ref exists to diff against:

name: overreach
on: pull_request

permissions:
  contents: read

jobs:
  overreach:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@93cb6efe18208431cddfb8368fd83d5badbf9bfd # v5
        with:
          fetch-depth: 0
          persist-credentials: false
      - uses: Conalh/overreach@v0.2.0
        # Defaults to scanning the PR diff and failing on critical findings.
        # with:
        #   fail-on: high   # also block new network calls / subprocess spawns
        #   path: src/      # scan a path in full-tree mode instead of the PR diff

Where this fits

overreach is the standalone, language-agnostic cousin of CapabilityEcho from the agent-gov suite — the same idea (catch capability drift in a diff), repackaged as one fast binary with no Node and no suite to adopt. Use overreach for a quick gate anywhere; reach for the full agent-gov suite when you want cross-tool consolidation and a single PR verdict.
