Catch code that overreaches before it merges.
overreach is a fast, local CLI that scans a diff, file, or repo for capability drift: outbound network calls, subprocess spawns, sensitive-file reads, curl | sh, disabled TLS verification, and hardcoded secrets. Think ripgrep, but for what this code is allowed to touch.
It is built for AI-assisted code review. The risky part of an agent PR is often not the feature itself — it's the quiet fetch, execSync, or .env read that appeared beside it. Treat overreach as a tripwire for capability drift, not a containment boundary: it favors recall and runs on every push, but a determined adversary can evade regex (string concatenation, base64, dynamic eval/import, or moving the call into a dependency) — so it's a fast first pass, not a security guarantee.
```bash
cargo install --git https://github.com/Conalh/overreach --tag v0.2.0 --locked
git diff | overreach --diff
```

No signup. No daemon. No telemetry. No network at scan time. It reads your diff and exits.
```mermaid
flowchart LR
  A["git diff (--diff)<br/>· file · repo"] --> S["Line scanner<br/>added lines · UTF-8 · 8 MiB cap"]
  S --> D["Detectors<br/>pipe-to-shell · secrets · sensitive-fs<br/>network · subprocess · TLS-off"]
  S --> G["Coverage gaps<br/>unreadable · non-UTF-8 · oversize"]
  D --> R["Graded findings<br/>critical · high · medium · low"]
  G --> R
  R --> O["Report<br/>human · --json"]
  O --> F{"--fail-on"}
  F -->|below or clean| P["exit 0<br/>pass"]
  F -->|at or above| X["exit 1<br/>fail"]
  S -.->|unreadable entrypoint| T["exit 2<br/>couldn't scan"]
  classDef in fill:#1e293b,stroke:#334155,color:#e2e8f0
  classDef core fill:#0f172a,stroke:#1e293b,color:#e2e8f0,stroke-width:2px
  classDef out fill:#0c4a6e,stroke:#0369a1,color:#e0f2fe
  class A in
  class S,D,G,R,O core
  class F,P,X,T out
```
```text
$ git diff | overreach --diff
CRITICAL src/util.js:15 [pipe_to_shell]
Downloads a script and pipes it straight into a shell
CRITICAL src/util.js:16 [hardcoded_secret]
Possible hardcoded Anthropic credential (value redacted)
HIGH src/util.js:13 [network_call]
Makes an outbound network call
3 finding(s): 2 critical, 1 high, 0 medium, 0 low
FAIL (findings at/above critical)
```
See also: SECURITY.md for the threat model and self-guarantees · CHANGELOG.md for release history · part of the agent-gov suite.
Code review optimizes for "is the feature correct?" — not "did this diff quietly gain a new capability?" As autonomous agents get write access to real repositories, the second question is the one that bites. overreach is a fast, zero-config first pass that answers it, locally, before anything merges.
- Diff-aware. Scans only the added lines of a unified diff, so you see what a change introduces, not what was already there.
- Secrets are reported, never echoed. A hardcoded key is flagged by provider ("Anthropic", "AWS") with the literal value redacted; `overreach` never prints a credential back at you.
- CI-ready. `--json` output and a configurable `--fail-on` severity make it a one-line PR gate.
- Dependency-light. Three crates (`regex`, `serde`, `serde_json`). No network access at scan time, no telemetry, nothing phones home.
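
To make the diff-aware point concrete, here is a minimal Rust sketch of pulling added lines, with their new-file line numbers, out of a unified diff. It is not `overreach`'s actual parser: `AddedLine`, `added_lines`, and the two helper functions are hypothetical names, and the sketch assumes well-formed `git diff` output. It relies on the line counts in each hunk header to know when a hunk ends, which is one way to keep an added line that begins with `++` from being mistaken for a `+++` file header.

```rust
/// One added line from a unified diff, positioned in the new file.
/// (Hypothetical type for illustration.)
#[derive(Debug, PartialEq)]
pub struct AddedLine {
    pub path: String,
    pub line_no: usize,
    pub text: String,
}

/// Parse one side of a hunk header, e.g. "-12,2" or "+12", into (start, count).
/// The caller guarantees `tok` starts with an ASCII '-' or '+'.
fn parse_range(tok: &str) -> Option<(usize, usize)> {
    let mut parts = tok[1..].split(',');
    let start = parts.next()?.parse().ok()?;
    let count = match parts.next() {
        Some(c) => c.parse().ok()?,
        None => 1, // the count is omitted when it is 1
    };
    Some((start, count))
}

/// Parse "@@ -a,b +c,d @@ ..." into (old_count, new_start, new_count).
fn parse_hunk_header(line: &str) -> Option<(usize, usize, usize)> {
    let mut toks = line.split_whitespace();
    if toks.next()? != "@@" {
        return None;
    }
    let old = toks.next()?;
    let new = toks.next()?;
    if !old.starts_with('-') || !new.starts_with('+') {
        return None;
    }
    let (_, old_count) = parse_range(old)?;
    let (new_start, new_count) = parse_range(new)?;
    Some((old_count, new_start, new_count))
}

/// Collect only the added lines of a unified diff.
pub fn added_lines(diff: &str) -> Vec<AddedLine> {
    let mut out = Vec::new();
    let mut path = String::new();
    let mut line_no = 0usize;
    let (mut old_left, mut new_left) = (0usize, 0usize);

    for raw in diff.lines() {
        if old_left == 0 && new_left == 0 {
            // Outside a hunk: only here can "+++ " be a file header.
            if let Some(p) = raw.strip_prefix("+++ ") {
                path = p.strip_prefix("b/").unwrap_or(p).to_string();
            } else if let Some((oc, ns, nc)) = parse_hunk_header(raw) {
                old_left = oc;
                new_left = nc;
                line_no = ns;
            }
            continue;
        }
        if let Some(text) = raw.strip_prefix('+') {
            out.push(AddedLine { path: path.clone(), line_no, text: text.to_string() });
            line_no += 1;
            new_left = new_left.saturating_sub(1);
        } else if raw.starts_with('-') {
            // Removed lines do not advance the new-file line counter.
            old_left = old_left.saturating_sub(1);
        } else if raw.starts_with('\\') {
            // "\ No newline at end of file" marker: not a content line.
        } else {
            // Context line: present in both old and new files.
            line_no += 1;
            old_left = old_left.saturating_sub(1);
            new_left = new_left.saturating_sub(1);
        }
    }
    out
}
```

Feeding the result to the detectors, rather than the whole file, is what limits a report to what the change introduces.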
```bash
# Recommended: install the pinned release
cargo install --git https://github.com/Conalh/overreach --tag v0.2.0 --locked
# From a local checkout
cargo install --path . --locked
# From main, if you want unreleased changes
cargo install --git https://github.com/Conalh/overreach --locked
```

Produces a single static-ish binary; drop it anywhere on PATH. A crates.io release (`cargo install overreach`) is planned but not yet published; install from source or Git for now.
```bash
overreach [PATH]              # scan a file or directory (default: .)
git diff | overreach --diff   # scan only the added lines of a diff
overreach --diff --json       # machine-readable output for CI
overreach . --fail-on high    # also fail on new network calls / subprocess spawns
```

| Flag | Effect |
|---|---|
| `--diff` | Read a unified diff from stdin; scan added lines only |
| `--json` | Emit findings + summary as JSON |
| `--fail-on <level>` | Exit non-zero at/above `low` \| `medium` \| `high` \| `critical` (default `critical`) |
| `-h`, `--help` | Help |
| `-V`, `--version` | Version |
Exit codes — so it slots straight into a pipeline:
| Code | Meaning |
|---|---|
| `0` | scanned; nothing at/above `--fail-on` |
| `1` | findings at/above `--fail-on` |
| `2` | could not scan (unreadable entrypoint) or invalid invocation |
By default the gate fails only on critical findings (pipe_to_shell, hardcoded_secret, sensitive_fs_read). Adding a network call or shelling out is normal feature work, so network_call and subprocess_spawn are reported as high but do not fail the build unless you opt into --fail-on high — a gate that red-X'd every routine PR is a gate that gets switched off.
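
The exit-code contract above can be sketched as a small pure function. This is an illustration under assumptions, not the tool's source: `Severity`, `ScanOutcome`, and `exit_code` are hypothetical names, and the sketch leaves out the invalid-invocation case, which the table also maps to `2`.

```rust
/// Severity levels declared in ascending order, so the derived `Ord`
/// gives Low < Medium < High < Critical.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity {
    Low,
    Medium,
    High,
    Critical,
}

/// Hypothetical scan result: either the scan ran and produced findings
/// (possibly none), or the entrypoint could not be read at all.
pub enum ScanOutcome {
    Scanned(Vec<Severity>),
    Unscannable,
}

/// Map a scan outcome and a `--fail-on` threshold to the documented exit codes.
pub fn exit_code(outcome: &ScanOutcome, fail_on: Severity) -> i32 {
    match outcome {
        // Never report success for something that was not scanned.
        ScanOutcome::Unscannable => 2,
        ScanOutcome::Scanned(findings) => {
            if findings.iter().any(|s| *s >= fail_on) {
                1 // at least one finding at or above the threshold
            } else {
                0 // scanned; nothing at or above the threshold
            }
        }
    }
}
```

With the default threshold of `Severity::Critical`, a diff whose only finding is a `High` network call maps to `0`; the same diff under `--fail-on high` maps to `1`.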
| Kind | Severity | What it flags |
|---|---|---|
| `pipe_to_shell` | critical | `curl`/`wget … \| sh`: downloading and executing a script in one breath |
| `hardcoded_secret` | critical | Provider-prefixed credentials (Anthropic, OpenAI, GitHub, AWS, Slack, Google, GitLab, Stripe). Value redacted. |
| `sensitive_fs_read` | critical | References to `.ssh/`, `id_rsa`, `/etc/passwd`, `.aws/credentials`, `.env`, `.npmrc` |
| `network_call` | high | `fetch`, `axios`, `XMLHttpRequest`, `WebSocket`, raw sockets; Python `requests`/`urllib`/`httpx`/`aiohttp`; Node/Go `http.get`/`http.Get`; Rust `reqwest`/`TcpStream`; JVM/.NET `HttpClient`/`openConnection` |
| `subprocess_spawn` | high | `child_process`, `execSync`/`execFile`, top-level `exec(`, `.spawn(`; Python `subprocess.*`/`os.system`; Rust `process::Command`; Go `exec.Command`; Java `Runtime.exec`/`ProcessBuilder`; C# `Process.Start` |
| `tls_verification_disabled` | medium | `rejectUnauthorized: false`, `verify=False`, `InsecureSkipVerify: true`, `NODE_TLS_REJECT_UNAUTHORIZED=0` (the insecure value only), Python `ssl._create_unverified_context` |
| `file_too_large_to_scan`, `file_not_utf8`, `file_unreadable`, `directory_unreadable`, `path_unreadable` | low | Coverage gaps, not content findings: a path that was skipped mid-walk (over the 8 MiB cap, not UTF-8, or unreadable) is surfaced so a clean report can't hide something unscanned. An unreadable entrypoint is a hard error (exit 2), not a low finding. |
This is a fast, regex-based first pass — it favors recall over perfect precision. It is not a full taint analysis; treat findings as "look here," not "proven exploit." And because matching is literal regex, a determined adversary can evade it — string concatenation, base64, dynamic eval/import, or relocating the call into a dependency all slip past — so overreach surfaces drift and carelessness, not a motivated attacker. It deliberately does not flag method-call lookalikes such as JS regex.exec(str) or Rust tokio::spawn(...) as subprocesses, and process.env.X is not treated as a .env file read.
Matching is line-literal: overreach scans raw source lines and does not strip comments or string literals, so a comment or string that merely mentions a trigger token (a TODO about axios, a docstring naming subprocess.run) can be flagged. That's the trade-off for a zero-config first pass that favors recall — eyeball low-context findings before wiring it as a hard gate on comment-heavy code, and prefer scanning --diff (added lines only) over whole files.
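
The two behaviors described above, skipping method-call lookalikes while still matching inside comments, can be illustrated with a standard-library-only sketch. `overreach`'s real detectors are regex patterns that this README does not reproduce, so `has_top_level_call` is a hypothetical stand-in: it treats `name(` as a hit only when the preceding character is not `.`, `_`, or alphanumeric.

```rust
/// Return true if `line` contains `name(` as a bare call, i.e. not as a
/// method call (`regex.exec(`) and not as the tail of a longer identifier
/// (`myexec(`). Hypothetical helper; it works on the raw line and does
/// not strip comments or string literals.
pub fn has_top_level_call(line: &str, name: &str) -> bool {
    let needle = format!("{name}(");
    let mut from = 0;
    while let Some(pos) = line[from..].find(needle.as_str()) {
        let at = from + pos;
        // Character immediately before the match, if any.
        let prev = line[..at].chars().next_back();
        let guarded = matches!(prev, Some(c) if c == '.' || c == '_' || c.is_alphanumeric());
        if !guarded {
            return true;
        }
        // Keep searching after this guarded occurrence.
        from = at + needle.len();
    }
    false
}
```

Under this sketch `const out = exec(cmd);` is a hit and `const m = regex.exec(str);` is not, but `// TODO: replace exec(cmd) later` is also a hit, which is the comment false positive the paragraph above warns about.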
Being regex-based, overreach is language-agnostic in that it scans any UTF-8 text file — but the detector patterns are tuned per language. Today's coverage, by detector:
| Language | `network_call` | `subprocess_spawn` | `tls_verification_disabled` |
|---|---|---|---|
| JS / TS | ✅ | ✅ | ✅ |
| Python | ✅ | ✅ | ✅ |
| Go | ✅ | ✅ | ✅ |
| Rust | ✅ | ✅ | — |
| Java | ✅ | ✅ | — |
| C# / .NET | ✅ | ✅ | — |
pipe_to_shell, sensitive_fs_read, and hardcoded_secret are language-independent — they match shell invocations, sensitive file paths, and provider key formats regardless of language. Coverage is additive: adding a language means adding patterns, never rewriting the engine. Gaps (Ruby, PHP, C/C++, shell beyond curl|sh, and the dashes above) are good first contributions.
overreach is deliberately narrow: a fast, zero-config first pass for capability drift in a diff. It does not try to beat a real static analyzer at depth or a dedicated secret scanner at secrets — it tries to be the thing you can drop into any PR as one binary with no rules to write. The honest landscape, assuming typical out-of-the-box or common PR-gate usage:
| | Capability drift (net · subprocess · TLS) | Hardcoded secrets | Diff-aware first pass | Zero-config | Analysis depth | Footprint |
|---|---|---|---|---|---|---|
| `overreach` | ✅ built-in | | ✅ added lines only | ✅ no rules | ❌ regex, no taint/dataflow | ✅ one static binary · 3 deps · no network at scan time |
| `ripgrep` / `grep` | | | `git diff \| rg` | ✅ | ❌ | ✅ |
| Semgrep | ✅ via rules | ✅ (validated in Pro) | ✅ baseline scan | | ✅ AST + dataflow | |
| gitleaks / TruffleHog | ❌ secrets only | ✅ entropy · git history · live verify | ✅ | ✅ | — | ✅ |
| CodeQL | ✅ via queries | | | ❌ build DB + query packs | ✅ semantic dataflow | ❌ minutes per run · heavy |
The table lists classic tooling, but in 2026 the louder competition is the cloud/ML AI-SAST wave — DryRun, Aikido, Snyk Code, Checkmarx, GitHub Advanced Security. They go far deeper than overreach, and that's the point: they are the depth tier. overreach lives at the opposite end — local, deterministic, no-AI, no-telemetry, free, and instant. It never competes on analysis depth; it competes on being the zero-friction tripwire you can run on every push with nothing sent anywhere.
Reach for something else when:
- you need semantic dataflow or taint analysis (Semgrep, CodeQL);
- you need full secret-scanning coverage across git history (gitleaks, TruffleHog);
- you need live key validation;
- you need policy-as-code rules maintained by a security team.
Reach for overreach when:
- you want a fast, local tripwire for new network, subprocess, TLS-disable, sensitive-file, or hardcoded-secret patterns in a diff.
The capability grid above is a positioning sketch: every tool is more configurable than one table can show, and the right answer is often `overreach` and one of them. Speed, on the other hand, is measured:
overreach scans a 40 MiB / 3,000-file tree for all six detector families in ~0.13 s — near ripgrep-class speed while running every detector family, and far ahead of gitleaks, TruffleHog, grep, and Semgrep. Fast enough to run on every push.
The benchmark below is reproducible from bench/; it scans an identical generated corpus and reports wall-clock time. These tools do not all solve the same problem — gitleaks/TruffleHog do secrets only (with entropy and live verification); Semgrep does the AST/dataflow analysis overreach deliberately skips — so slower does not mean worse.
| Tool | 40 MiB / 3k-file scan | what it is |
|---|---|---|
| ripgrep (parallel) | 68 ms | the fastest line scanner |
| overreach (parallel) | 127 ms | all six detector families |
| `ripgrep -j1` | 215 ms | single-threaded, same engine |
| gitleaks | 654 ms | secrets only · entropy, git history |
| TruffleHog | 1.11 s | secrets only · live key verification |
| `grep -rP` | 2.82 s | naive recursive baseline |
| Semgrep | 34.8 s | AST framework running equivalent pattern checks |
The worst failure mode for a security tool is a silent gap that reads as a pass. overreach is built so a clean report is trustworthy:
- Symlink-safe walk. Mid-walk, symlinks are skipped via `symlink_metadata`, so a hostile checkout can't escape the scan root (`link -> /etc`) or loop forever (`link -> ..`). A user-supplied entrypoint is followed exactly once, as deliberate intent.
- Skips are surfaced, not swallowed. A file that's over the 8 MiB cap, not valid UTF-8, or unreadable, an unreadable directory, or a path that can't be stat'd mid-walk each becomes a low-severity coverage-gap finding (`file_too_large_to_scan`, `file_not_utf8`, `file_unreadable`, `directory_unreadable`, `path_unreadable`) rather than vanishing. An attacker can't hide a key in a 9 MB blob, a binary blob, or a locked-down file and get back "clean."
- An unscannable entrypoint is a hard failure, not a pass. If the path you point `overreach` at can't be read at all (missing, permission denied), it exits `2`, distinct from `1` (findings at/above `--fail-on`) and `0` (scanned clean). A security gate must never exit `0` on something it couldn't scan.
- Redaction is pinned by a canary test. `rendered_output_never_echoes_a_credential_value` plants a sentinel secret and asserts it never appears in any rendered output, so a future change can't accidentally start printing credentials. The exit-code contract, diff line-numbering, the `++`-vs-header edge case, every coverage gap above, and the false-positive guards (`regex.exec`, `tokio::spawn`) are all pinned by unit and CLI integration tests.
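
The first two guarantees in the list above can be sketched together in standard-library Rust. This is a hedged illustration, not the project's walker: `WalkItem`, `walk`, and `MAX_FILE_BYTES` are hypothetical names, while the gap labels are the finding kinds listed above. The key call is `std::fs::symlink_metadata`, which reports on the link itself instead of following it.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// 8 MiB size cap, as described in the text.
const MAX_FILE_BYTES: u64 = 8 * 1024 * 1024;

/// What the walk yields: scannable text, or a named coverage gap.
/// (Hypothetical type for illustration.)
#[derive(Debug, PartialEq)]
pub enum WalkItem {
    Text(PathBuf, String),
    Gap(PathBuf, &'static str),
}

/// Recursively walk `dir`, never following symlinks found mid-walk and
/// recording every skipped path as a gap instead of dropping it.
pub fn walk(dir: &Path, out: &mut Vec<WalkItem>) {
    let entries = match fs::read_dir(dir) {
        Ok(entries) => entries,
        Err(_) => {
            out.push(WalkItem::Gap(dir.to_path_buf(), "directory_unreadable"));
            return;
        }
    };
    for entry in entries {
        let path = match entry {
            Ok(entry) => entry.path(),
            Err(_) => {
                out.push(WalkItem::Gap(dir.to_path_buf(), "directory_unreadable"));
                continue;
            }
        };
        // symlink_metadata does not traverse the link, so `link -> /etc`
        // and `link -> ..` are seen as symlinks and skipped.
        let meta = match fs::symlink_metadata(&path) {
            Ok(meta) => meta,
            Err(_) => {
                out.push(WalkItem::Gap(path, "path_unreadable"));
                continue;
            }
        };
        let kind = meta.file_type();
        if kind.is_symlink() {
            continue;
        }
        if kind.is_dir() {
            walk(&path, out);
            continue;
        }
        if !kind.is_file() {
            continue; // sockets, FIFOs, devices
        }
        if meta.len() > MAX_FILE_BYTES {
            out.push(WalkItem::Gap(path, "file_too_large_to_scan"));
            continue;
        }
        match fs::read(&path) {
            Err(_) => out.push(WalkItem::Gap(path, "file_unreadable")),
            Ok(bytes) => match String::from_utf8(bytes) {
                Ok(text) => out.push(WalkItem::Text(path, text)),
                Err(_) => out.push(WalkItem::Gap(path, "file_not_utf8")),
            },
        }
    }
}
```

Because `read_dir` resolves the path it is handed, a symlinked entrypoint passed to `walk` is still followed once, while links discovered below it are not. The sketch checks size before reading and does not guard against a file changing between the two calls.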
```yaml
# .github/workflows/overreach.yml
name: overreach
on: pull_request

# Minimum-privilege token: this job only needs to read source.
permissions:
  contents: read

jobs:
  overreach:
    runs-on: ubuntu-latest
    steps:
      # Pin third-party actions to commit SHAs (with the tag as a comment)
      # so a compromised upstream tag can't silently change what runs in CI.
      - uses: actions/checkout@93cb6efe18208431cddfb8368fd83d5badbf9bfd # v5
        with:
          fetch-depth: 0
          # Don't leave GITHUB_TOKEN in .git/config: `cargo build` runs
          # build scripts from every transitive dep.
          persist-credentials: false
      - uses: dtolnay/rust-toolchain@29eef336d9b2848a0b548edc03f92a220660cdb8 # stable
      # --locked enforces the committed Cargo.lock.
      - run: cargo build --release --locked
      - name: Scan the PR diff
        env:
          # Route the trigger value through env so it can't be interpolated
          # into the shell script body.
          BASE_REF: ${{ github.base_ref }}
        # pipefail so a failed `git diff` can't be masked by the trailing pipe
        # and let the scan report "clean" on an unscanned diff.
        run: |
          set -euo pipefail
          # The default gate fails only on critical findings (curl|sh, secrets,
          # sensitive-file reads). Add --fail-on high to also block new network
          # calls and subprocess spawns.
          git diff "origin/$BASE_REF...HEAD" | ./target/release/overreach --diff
```

The repo also ships a composite action (`action.yml`) that builds `overreach` and scans the PR diff. It does not check out your code; you must run `actions/checkout` first, with `fetch-depth: 0` so the base ref exists to diff against:
```yaml
name: overreach
on: pull_request
permissions:
  contents: read
jobs:
  overreach:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@93cb6efe18208431cddfb8368fd83d5badbf9bfd # v5
        with:
          fetch-depth: 0
          persist-credentials: false
      - uses: Conalh/overreach@v0.2.0
        # Defaults to scanning the PR diff and failing on critical findings.
        # with:
        #   fail-on: high   # also block new network calls / subprocess spawns
        #   path: src/      # scan a path in full-tree mode instead of the PR diff
```

`overreach` is the standalone, language-agnostic cousin of CapabilityEcho from the agent-gov suite: the same idea (catch capability drift in a diff), repackaged as one fast binary with no Node and no suite to adopt. Use `overreach` for a quick gate anywhere; reach for the full agent-gov suite when you want cross-tool consolidation and a single PR verdict.
MIT © Conal Hickey
