feat(detector): detect GitHub stateless (JWT-format) ghs_ installation tokens by cemililik · Pull Request #15 · HodeTech/Leakwatch

cemililik · 2026-05-25T11:43:32Z

Summary

Detects GitHub's new stateless (JWT-format) ghs_ installation tokens (rolled out from April 2026 to the Actions GITHUB_TOKEN and App installation tokens). The new format is ghs_APPID_<jwt> — a ghs_-prefixed JWT of ~520 chars containing exactly two dots; segments are base64url (A-Za-z0-9, _, -).

The old pattern gh[orus]_[A-Za-z0-9_]{36,} had no . in its body class, so a new token was either truncated at the first dot (its JWT body falling to the generic jwt detector → one secret, two wrong findings) or missed entirely when a base64url - appeared before the body reached 36 chars (only flagged as a generic JWT). Both failure modes were reproduced before the change and proven fixed after.

What changed

github-oauth-token detector (github_oauth.go) — ordered alternation: a stateless branch ghs_[A-Za-z0-9_-]{8,}(?:\.[A-Za-z0-9_-]{8,}){2} (listed first so it wins over the opaque branch at a ghs_ start) plus the unchanged opaque branch gh[orus]_[A-Za-z0-9_]{36,}. The whole token is captured as a single finding. gho_/ghu_/ghr_/legacy ghs_ are unchanged and never start eating dots; ghp_ stays with the separate github-token detector. Kept as one raw-string literal so the tools/site-build AST extractor can still read it.
jwt detector (jwt.go) — suppresses a JWT that is the body of a ghs_ token (walks back over the contiguous token run and checks it contains ghs_; RE2 has no lookbehind), so the secret is reported exactly once by the more specific detector. Sound because the GitHub stateless floors ({8,}) are ≤ the JWT floors ({10,}), so whenever this fires the GitHub detector also matches the whole token — suppression can only drop a duplicate, never a secret.
Docs — README detector table, en/tr detector catalogs, CHANGELOG; site bundle regenerated (site/js/manuals/{en,tr}.js, site/js/detectors.js).
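The two failure modes and the effect of branch order can be reproduced with a small sketch. It assumes the two quoted branches are simply joined with `|` (the exact literal in github_oauth.go is not shown here), and `fakeToken` is a hypothetical helper that mirrors the obviously-fake token builder used in the tests.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// oldPattern is the pre-change opaque pattern quoted in the description.
var oldPattern = regexp.MustCompile(`gh[orus]_[A-Za-z0-9_]{36,}`)

// newPattern joins the two quoted branches with "|" (an assumed form of the
// combined literal). Go's regexp prefers the first alternative that matches
// at a given start, so the stateless branch is listed first.
var newPattern = regexp.MustCompile(
	`ghs_[A-Za-z0-9_-]{8,}(?:\.[A-Za-z0-9_-]{8,}){2}|gh[orus]_[A-Za-z0-9_]{36,}`)

// fakeToken (hypothetical helper) assembles an obviously fake
// ghs_APPID_<jwt> token at runtime from repeated filler chunks.
func fakeToken(headerTail string) string {
	return "ghs_12345678_eyJ" + headerTail +
		".eyJ" + strings.Repeat("Gh1Ij2Kl", 30) +
		"." + strings.Repeat("Mn3Op4Qr", 12)
}

func main() {
	long := fakeToken(strings.Repeat("Ab9Cd0Ef", 5))
	dash := fakeToken("Ab-Cd0Ef9Gh")
	opaque := "gho_" + strings.Repeat("Abc1D678", 5)

	// Old pattern: truncates at the first dot, or misses the dash-early token.
	fmt.Println(oldPattern.FindString(long) == long[:strings.Index(long, ".")])
	fmt.Println(oldPattern.FindString(dash) == "")

	// New pattern: the whole token is one match; opaque tokens stop before a dot.
	fmt.Println(newPattern.FindString(long) == long)
	fmt.Println(newPattern.FindString(dash) == dash)
	fmt.Println(newPattern.FindString(opaque+".example.com") == opaque)
}
```

Each comparison should print `true`. A legacy opaque ghs_ token still matches through the second branch, because the stateless branch fails without two dotted segments and the engine falls back to the opaque alternative.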

Design decisions

ghu_ left opaque on purpose — GitHub has signalled a later ghu_ format change but has not published it; speculating risks false matches. The code documents exactly where to extend (gh[su]_…).
Verification unaffected & not misleading — a bogus token → /user 401 → inactive; a live installation token → 403 → verify-error (not a false active/revoked). Sending the whole token is strictly more correct than the old truncated fragment.
Redaction still reveals only the trailing four characters, safe for a ~520-char token.
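The status mapping described above can be illustrated with a minimal sketch. The type, constants, and function name below are hypothetical and are not the verifier's real API; only the 200/401/403 outcomes come from the description, and routing every other code to verify-error is an assumption.

```go
package main

import (
	"fmt"
	"net/http"
)

// Status is a hypothetical verdict type; the labels follow the PR wording.
type Status string

const (
	StatusActive      Status = "active"
	StatusInactive    Status = "inactive"
	StatusVerifyError Status = "verify-error"
)

// classifyUserResponse (hypothetical) maps the HTTP status returned by
// GitHub's /user endpoint to a verification verdict.
func classifyUserResponse(code int) Status {
	switch code {
	case http.StatusOK:
		return StatusActive
	case http.StatusUnauthorized:
		return StatusInactive
	default:
		// Assumption: anything else, including the 403 a live installation
		// token receives, is inconclusive rather than active or revoked.
		return StatusVerifyError
	}
}

func main() {
	for _, code := range []int{200, 401, 403} {
		fmt.Printf("%d -> %s\n", code, classifyUserResponse(code))
	}
}
```

The point of the third case is that a 403 is never reported as active or as "invalid or revoked".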

Review follow-ups (in this PR)

After a multi-angle review, three findings were actioned (the rest accepted with rationale: inherent benign over-capture, a rare FP class already tighter than GitHub's own regex, and the documented ghu_ extension point):

#1 fix(jwt): suppression made robust to a base64url char glued before ghs_ (Contains instead of HasPrefix); removes a contrived double-report.
#2 test(detector): a guard pins the generated detectors.js to the live registry, so a future non-AST-extractable pattern fails CI instead of silently vanishing from the web playground (the previous protection was only a code comment). Verified it fails on a simulated drop.
#5 test: an explicit GitHub /user 403 case documenting the installation-token verify-error behavior.

Verification

gofumpt -l . empty · go vet ./... · go build ./... · go test -race ./... (no failures/races) · golangci-lint run ./... --config .golangci.yml → 0 issues · github(det)/jwt(det)/github(verifier) 100% coverage · tools/site-build regeneration leaves the tree clean.

Notes

Independent of the in-flight GitHub Marketplace Action PR — branched off main, no shared concerns.
Follow-up suggestion: open a tracking issue to extend the stateless branch to ghu_ once GitHub publishes that format (#6).

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

New Features
- Added detection support for GitHub stateless installation tokens (JWT-formatted ghs_ tokens).
Bug Fixes
- Eliminated duplicate findings when both GitHub OAuth and JWT detectors matched the same stateless token.
Documentation
- Updated detector catalog and README to reflect expanded GitHub token type coverage.

… tokens

From April 2026 GitHub issues installation tokens (including the Actions GITHUB_TOKEN) in a new ghs_APPID_<jwt> format: a ghs_-prefixed JWT of ~520 chars containing exactly two dots. The previous github-oauth-token pattern `gh[orus]_[A-Za-z0-9_]{36,}` had no dot in its body class, so it truncated such a token at the first dot (and missed it entirely when a base64url '-' appeared before the body reached 36 chars), while the JWT body fell through to the generic jwt detector. One secret was reported as two wrong findings (or one wrong one).

Changes:
- github-oauth-token now matches an ordered alternation: a stateless branch `ghs_[A-Za-z0-9_-]{8,}(?:\.[A-Za-z0-9_-]{8,}){2}` (listed first so it wins over the opaque branch) plus the unchanged opaque branch `gh[orus]_[A-Za-z0-9_]{36,}`. The whole token is captured as one finding. Opaque gho_/ghu_/ghr_/legacy ghs_ are unchanged and never start eating dots; ghp_ stays with the separate github-token detector.
- jwt detector suppresses a JWT that is the body of a ghs_ token (walks back over the preceding token run and checks for a "ghs_" prefix; RE2 has no lookbehind), so the secret is reported exactly once by the more specific detector.

Decisions:
- ghu_ (user-to-server) is intentionally left opaque: GitHub has signalled a later format change but has not published it; speculating risks false matches. A note in the code says where to extend when documented.
- Verification is unaffected: a bogus token -> 401 -> inactive; a live installation token -> 403 -> verify-error (not a false active/revoked). Redaction still reveals only the trailing four characters.

Tests are table-driven and cover full capture of both failure modes (long-header truncation, dash-early miss), opaque regressions, no over-capture into trailing context, the exactly-one-detector invariant, and jwt suppression. github and jwt detector packages remain at 100% coverage.

Docs: README detector table, en/tr detector catalogs, and CHANGELOG updated; site bundle regenerated (site/js/manuals/{en,tr}.js, site/js/detectors.js).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

isGitHubStatelessBody required the contiguous token run before a JWT to BEGIN with "ghs_". When a base64url char is glued directly in front (e.g. "xghs_APPID_eyJ...eyJ...sig" with no delimiter), the run was "xghs_APPID_", so the JWT was not recognised as a ghs_ body and was reported again, while the unanchored github-oauth-token pattern still matched "ghs_..." mid-string, double-reporting the same secret.

Use bytes.Contains instead of HasPrefix. Wherever "ghs_" appears in the run, the run has no dots, so it is glued onto the JWT and forms a "ghs_...eyJ.eyJ.sig" shape the github detector captures in full (its segment floors {8,} are <= this detector's {10,}); suppressing can therefore only drop a duplicate, never a secret. Realistic delimiters (=, ", space, newline, :, /) are not token bytes, so this only tightens a contrived edge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
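The walk-back check can be exercised in isolation with a small sketch. The two helper functions follow the shape shown in the diff; `jwtPattern` is a simplified stand-in that keeps only the {10,} segment floors mentioned above, the value of `ghsPrefix` is assumed to be the literal "ghs_", and `standaloneJWTs` is a hypothetical driver, not the detector's real Scan method.

```go
package main

import (
	"bytes"
	"fmt"
	"regexp"
)

// jwtPattern is a simplified stand-in for the jwt detector's pattern.
var jwtPattern = regexp.MustCompile(
	`eyJ[A-Za-z0-9_-]{10,}\.eyJ[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_-]{10,}`)

// ghsPrefix is assumed to hold the literal prefix checked by the helper.
var ghsPrefix = []byte("ghs_")

// isTokenByte reports whether b can be part of a contiguous token run.
func isTokenByte(b byte) bool {
	switch {
	case b >= 'a' && b <= 'z', b >= 'A' && b <= 'Z', b >= '0' && b <= '9':
		return true
	case b == '_', b == '-':
		return true
	default:
		return false
	}
}

// isGitHubStatelessBody walks back over the token run that ends at start
// and reports whether that run contains "ghs_".
func isGitHubStatelessBody(data []byte, start int) bool {
	i := start
	for i > 0 && isTokenByte(data[i-1]) {
		i--
	}
	return bytes.Contains(data[i:start], ghsPrefix)
}

// standaloneJWTs (hypothetical driver) returns the JWT matches that are not
// the body of a ghs_ token.
func standaloneJWTs(data []byte) []string {
	var out []string
	for _, loc := range jwtPattern.FindAllIndex(data, -1) {
		if isGitHubStatelessBody(data, loc[0]) {
			continue // left to the more specific github-oauth-token detector
		}
		out = append(out, string(data[loc[0]:loc[1]]))
	}
	return out
}

func main() {
	// An obviously fake JWT-shaped string.
	jwt := "eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJmYWtlLXVzZXIifQ.c2lnbmF0dXJlLWZha2U"
	fmt.Println(len(standaloneJWTs([]byte("token=" + jwt))))
	fmt.Println(len(standaloneJWTs([]byte("ghs_12345678_" + jwt))))
	fmt.Println(len(standaloneJWTs([]byte("xghs_12345678_" + jwt))))
}
```

This prints 1, 0, 0: the JWT after `=` is kept because `=` is not a token byte, while both ghs_-prefixed inputs are suppressed, including the one with a stray leading `x` that a HasPrefix check would have let through.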

GH-2: tools/site-build extracts each detector's regex from the AST and only emits a detector with a single regexp.MustCompile(`literal`); a concatenated / const / fmt.Sprintf pattern silently vanishes from the web playground while the registry count test still passes. Add TestDetectorsJS_CoversEveryRegisteredDetector (in the golden-count test that already blank-imports every detector) to pin the bundle to detector.All() minus the documented skips (generic). Verified it fails with an actionable message when an entry is dropped. This converts the previous "keep it one raw-string literal" code comment into an enforced invariant.

GH-5: add an explicit GitHub /user 403 verifier case. A live stateless installation token authenticates as an app installation, so /user answers 403, which is neither active (200) nor inactive (401) and maps to verify-error, documenting that it is never mislabelled active or "invalid or revoked".

Both are test-only; no production behavior changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
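The GH-2 guard boils down to a set comparison between the registry and the generated bundle. The sketch below is one way to express that comparison; `diffIDs` is a hypothetical helper, and the real test obtains its inputs from detector.All() and from parsing site/js/detectors.js, neither of which is reproduced here. The ID `retired-detector` is made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// diffIDs (hypothetical) compares registered detector IDs with the IDs found
// in the generated bundle. missing lists registered, non-skipped IDs absent
// from the bundle; stale lists bundle IDs that are no longer registered.
func diffIDs(registered, bundled []string, skips map[string]bool) (missing, stale []string) {
	inBundle := make(map[string]bool, len(bundled))
	for _, id := range bundled {
		inBundle[id] = true
	}
	inRegistry := make(map[string]bool, len(registered))
	for _, id := range registered {
		inRegistry[id] = true
		if !skips[id] && !inBundle[id] {
			missing = append(missing, id)
		}
	}
	for _, id := range bundled {
		if !inRegistry[id] {
			stale = append(stale, id)
		}
	}
	sort.Strings(missing)
	sort.Strings(stale)
	return missing, stale
}

func main() {
	registered := []string{"github-oauth-token", "github-token", "jwt", "generic"}
	bundled := []string{"github-token", "jwt", "retired-detector"}
	skips := map[string]bool{"generic": true} // documented skip

	missing, stale := diffIDs(registered, bundled, skips)
	fmt.Println("missing:", missing)
	fmt.Println("stale:", stale)
}
```

Here the output is `missing: [github-oauth-token]` and `stale: [retired-detector]`; a guard test would fail whenever either slice is non-empty.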

sourcery-ai

Sorry @cemililik, your pull request is larger than the review limit of 500000 diff characters

coderabbitai · 2026-05-25T11:43:46Z

📝 Walkthrough

Extended GitHub token detection to support stateless ghs_ installation tokens in JWT format. Updated OAuth detector regex, suppressed duplicate JWT detector matches for embedded bodies, synchronized web playground detector, added registry validation, and updated all documentation.

Changes

GitHub Stateless Token Detection

| Layer / File(s) | Summary |
| --- | --- |
| GitHub OAuth Detector: Stateless Token Pattern & Tests `internal/detector/github/github_oauth.go`, `internal/detector/github/github_oauth_test.go` | Extended `oauthTokenPattern` regex to match stateless JWT-shaped `ghs_` tokens and legacy opaque formats (`gho_`, `ghu_`, `ghr_`, `ghs_`). Updated detector description. Added tests for full-token capture, redaction, boundary handling (no over-capture across dots), legacy backward compatibility, and regression check ensuring no detector overlap on stateless tokens. |
| JWT Detector: Suppress Embedded Stateless Token Bodies `internal/detector/jwt/jwt.go`, `internal/detector/jwt/jwt_test.go` | Reworked JWT scanner to use byte-range iteration and detect `ghs_` prefix in preceding context, suppressing JWT matches that form the body of stateless tokens. Added helper functions `isGitHubStatelessBody` and `isTokenByte`. Prevents duplicate findings while preserving standalone JWT detection with comprehensive test coverage across embedding scenarios. |
| Verifier: HTTP 403 Status Mapping for Stateless Tokens `internal/verifier/github/github_oauth_verifier_test.go` | Added test documenting that stateless installation tokens receive HTTP 403 on GitHub `/user` endpoint and that `Verify` correctly returns `StatusVerifyError` with status code in message. |
| Web Playground Detector: Frontend Pattern Update `site/js/detectors.js` | Updated JavaScript detector regex to include stateless `ghs_` token format (dot-separated base64url segments) alongside opaque token patterns, keeping frontend detector synchronized with backend. |
| Detector Registry Cross-Check: Prevent Silent Detector Drops `internal/detector/registry_count_test.go` | Added `TestDetectorsJS_CoversEveryRegisteredDetector` with helpers to validate that generated `site/js/detectors.js` bundle includes all registered detectors (except documented skips) and contains no stale IDs. Prevents silent detector removal due to build or bundling issues. |
| Documentation & Changelog Updates `CHANGELOG.md`, `README.md`, `docs/user-manuals/en/detectors/detector-catalog.md`, `docs/user-manuals/tr/detectors/detector-catalog.md` | Updated all user-facing documentation to describe stateless `ghs_` JWT-format token support, documenting the fix for duplicate JWT detection and the new registry validation test, and expanding detector catalog entries across English and Turkish user manuals. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A rabbit hops through GitHub tokens with delight,
Stateless ghs_ twins now shine so bright,
No more double-reports from JWT's keen eye,
Registry guards ensure none slip by! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and specifically describes the main change: adding detection for GitHub stateless JWT-format ghs_ installation tokens, which is the central focus across all modified detector code and tests. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

gemini-code-assist

Code Review

This pull request enhances the github-oauth-token detector to support GitHub's new stateless (JWT-format) ghs_ installation tokens. The changes include updating the regex pattern to capture these tokens in full and modifying the jwt detector to suppress duplicate findings when a JWT is identified as part of a GitHub stateless token. Additionally, the PR includes comprehensive tests for these changes, updates documentation, and introduces a new test to ensure the web playground bundle remains synchronized with the detector registry. I have no feedback to provide as there were no review comments to assess.

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (1)

internal/verifier/github/github_oauth_verifier_test.go (1)

100-124: ⚡ Quick win

Use a table-driven case for this new verifier scenario.

This adds another single-case function in a _test.go file; please express it as a table entry (or fold it into a shared status-mapping table) to stay compliant and keep scenarios easier to extend.

As per coding guidelines, "Use table-driven tests in test files".

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/verifier/github/github_oauth_verifier_test.go` around lines 100 -
124, Convert the single-case test TestOAuthVerify_Forbidden_ReturnsVerifyError
into a table-driven test by creating a test table (slice of structs) with a
descriptive name field and entries for the forbidden scenario (and any existing
scenarios) and iterate with t.Run for each case; inside each case construct the
httptest server (as currently done), instantiate OAuthVerifier (apiURL,
httpClient), build the detector.RawFinding (using oauthDetectorID and the same
Raw/Redacted values), call v.Verify(ctx, raw) and assert expected result.Status
and result.Message for that case; update references to
TestOAuthVerify_Forbidden_ReturnsVerifyError to the new table-driven test
function name and ensure each case is self-contained so tests remain isolated.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/detector/github/github_oauth_test.go`:
- Around line 100-225: The PR fails the repository-wide detector test-coverage
gate because several detector packages are below 95% (internal/detector, gcp,
generic, custom, privatekey, snowflake, stripe, testutil); add unit tests that
exercise each package's exported detector Scan implementations and helper
functions to raise coverage: create table-driven tests (mirroring patterns in
internal/detector/github_oauth_test.go) that call detectors'
Scan(context.Background(), []byte(...)) for expected matches and non-matches,
assert finding counts and fields (use symbols like OAuthDetector.Scan,
Token.Scan, and any package-specific detector types such as GCPDetector or
StripeDetector), and add tests for utility code in testutil to cover edge cases
and error paths; aim to hit the uncovered branches (negative cases, boundary
inputs, and redaction logic) so each listed package reaches ≥95% coverage and
the gate passes.

In `@internal/detector/jwt/jwt.go`:
- Around line 66-99: The detector packages overall miss the 95% coverage gate;
add focused unit tests to exercise uncovered branches in the detector packages
(e.g., custom, gcp, generic, heroku, privatekey, snowflake, stripe and testutil)
so overall detector coverage rises above 95%. Specifically, create table-driven
tests that exercise positive and negative detection paths and edge cases
(including token boundary and separator behavior similar to
jwt.isGitHubStatelessBody and jwt.isTokenByte), add mocks or sample inputs for
cloud/provider-specific detectors (GCP, Heroku, Stripe, Snowflake, private keys,
custom patterns), and include tests for testutil helpers so they are executed;
run `go test ./internal/detector/... -coverprofile` to verify coverage and
iterate until the detector package coverage meets the 95% threshold.

In `@internal/detector/registry_count_test.go`:
- Around line 199-202: The raw error returned from os.Getwd should be wrapped
with context before returning; update the failing branch that calls os.Getwd
(the block returning "", err) to return a wrapped error using
fmt.Errorf("getting working directory: %w", err) (and add a fmt import if
missing) so the function (the helper that calls os.Getwd in
registry_count_test.go) preserves call-site context per the repo error-wrapping
rule.

---

Nitpick comments:
In `@internal/verifier/github/github_oauth_verifier_test.go`:
- Around line 100-124: Convert the single-case test
TestOAuthVerify_Forbidden_ReturnsVerifyError into a table-driven test by
creating a test table (slice of structs) with a descriptive name field and
entries for the forbidden scenario (and any existing scenarios) and iterate with
t.Run for each case; inside each case construct the httptest server (as
currently done), instantiate OAuthVerifier (apiURL, httpClient), build the
detector.RawFinding (using oauthDetectorID and the same Raw/Redacted values),
call v.Verify(ctx, raw) and assert expected result.Status and result.Message for
that case; update references to TestOAuthVerify_Forbidden_ReturnsVerifyError to
the new table-driven test function name and ensure each case is self-contained
so tests remain isolated.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3c1fc6cb-7da4-4eac-b9ec-e52c8406ee15

📥 Commits

Reviewing files that changed from the base of the PR and between cbe8c4d and 8a4fbeb.

📒 Files selected for processing (13)

CHANGELOG.md
README.md
docs/user-manuals/en/detectors/detector-catalog.md
docs/user-manuals/tr/detectors/detector-catalog.md
internal/detector/github/github_oauth.go
internal/detector/github/github_oauth_test.go
internal/detector/jwt/jwt.go
internal/detector/jwt/jwt_test.go
internal/detector/registry_count_test.go
internal/verifier/github/github_oauth_verifier_test.go
site/js/detectors.js
site/js/manuals/en.js
site/js/manuals/tr.js

coderabbitai · 2026-05-25T11:50:16Z

+// fakeStatelessToken builds an obviously-fake GitHub stateless installation
+// token of the ghs_APPID_<jwt> form (header.payload.signature). It is assembled
+// from parts at runtime so the source file never contains a contiguous,
+// real-looking token literal that secret push-protection could flag.
+func fakeStatelessToken(headerTail string) string {
+	const appID = "12345678"
+	header := "eyJ" + headerTail
+	payload := "eyJ" + strings.Repeat("Gh1Ij2Kl", 30)
+	signature := strings.Repeat("Mn3Op4Qr", 12)
+	return "ghs_" + appID + "_" + header + "." + payload + "." + signature
+}
+
+// TestOAuthDetector_Scan_StatelessToken_CapturesWholeToken proves the new
+// ghs_APPID_<jwt> stateless installation tokens are captured in full by a single
+// github-oauth-token finding (the pre-2026 behaviour truncated them at the first
+// dot or missed them entirely when a base64url '-' appeared early).
+func TestOAuthDetector_Scan_StatelessToken_CapturesWholeToken(t *testing.T) {
+	tests := []struct {
+		name  string
+		token string
+	}{
+		{
+			name:  "long alphanumeric header segment",
+			token: fakeStatelessToken(strings.Repeat("Ab9Cd0Ef", 5)),
+		},
+		{
+			name:  "base64url dash early in header",
+			token: fakeStatelessToken("Ab-Cd0Ef9Gh"),
+		},
+		{
+			name:  "base64url underscore in header",
+			token: fakeStatelessToken("Ab_Cd0Ef9Gh_Ij"),
+		},
+		{
+			name:  "short app id",
+			token: "ghs_42_" + "eyJ" + strings.Repeat("Ab9Cd0Ef", 4) + "." + "eyJ" + strings.Repeat("Gh1Ij2Kl", 20) + "." + strings.Repeat("Mn3Op4Qr", 10),
+		},
+	}
+
+	d := &OAuthDetector{}
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			findings := d.Scan(context.Background(), []byte(tt.token))
+			require.Len(t, findings, 1, "stateless token must yield exactly one finding")
+
+			f := findings[0]
+			assert.Equal(t, "github-oauth-token", f.DetectorID)
+			// The whole token is captured, not just the header segment.
+			assert.Equal(t, tt.token, string(f.Raw), "must capture the entire token")
+			assert.Greater(t, len(f.Raw), 100, "stateless tokens are long")
+
+			// Redaction stays safe for a long token: only the last four
+			// characters are ever revealed.
+			assert.Equal(t, "****"+tt.token[len(tt.token)-4:], f.Redacted)
+			assert.Len(t, f.Redacted, len("****")+4)
+			assert.NotContains(t, f.Redacted, tt.token[:len(tt.token)-4],
+				"redaction must not expose the token body")
+		})
+	}
+}
+
+// TestOAuthDetector_Scan_NoOverCapture guards the greedy branches against eating
+// surrounding context: opaque tokens must not start consuming dots, and a
+// stateless token must stop at its third (signature) segment.
+func TestOAuthDetector_Scan_NoOverCapture(t *testing.T) {
+	suffix40 := strings.Repeat("Abc1D678", 5)
+	stateless := fakeStatelessToken(strings.Repeat("Ab9Cd0Ef", 5))
+
+	tests := []struct {
+		name  string
+		input string
+		want  string // expected single captured match
+	}{
+		{
+			name:  "opaque gho_ followed by dotted domain",
+			input: "gho_" + suffix40 + ".example.com",
+			want:  "gho_" + suffix40,
+		},
+		{
+			name:  "stateless token at end of a sentence",
+			input: "leaked token: " + stateless + ". Please rotate it.",
+			want:  stateless,
+		},
+		{
+			name:  "stateless token followed by a fourth dotted segment",
+			input: stateless + "." + strings.Repeat("Qq11Ww22", 8),
+			want:  stateless,
+		},
+	}
+
+	d := &OAuthDetector{}
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			findings := d.Scan(context.Background(), []byte(tt.input))
+			require.Len(t, findings, 1)
+			assert.Equal(t, tt.want, string(findings[0].Raw))
+		})
+	}
+}
+
+// TestOAuthDetector_Scan_LegacyOpaqueUnchanged confirms the legacy opaque shapes
+// (including legacy opaque ghs_) are still captured whole and unchanged.
+func TestOAuthDetector_Scan_LegacyOpaqueUnchanged(t *testing.T) {
+	suffix40 := strings.Repeat("Abc1D678", 5)
+	for _, prefix := range []string{"gho_", "ghu_", "ghr_", "ghs_"} {
+		t.Run(prefix, func(t *testing.T) {
+			token := prefix + suffix40
+			findings := (&OAuthDetector{}).Scan(context.Background(), []byte(token))
+			require.Len(t, findings, 1)
+			assert.Equal(t, token, string(findings[0].Raw))
+		})
+	}
+}
+
+// TestGitHubDetectors_StatelessNoPrefixOverlap ensures a stateless ghs_ token is
+// still claimed by exactly one of the two GitHub detectors (never the ghp_
+// personal-access-token detector).
+func TestGitHubDetectors_StatelessNoPrefixOverlap(t *testing.T) {
+	token := []byte(fakeStatelessToken(strings.Repeat("Ab9Cd0Ef", 5)))
+
+	tokenFindings := (&Token{}).Scan(context.Background(), token)
+	oauthFindings := (&OAuthDetector{}).Scan(context.Background(), token)
+
+	assert.Empty(t, tokenFindings, "ghp_ detector must not claim a ghs_ token")
+	require.Len(t, oauthFindings, 1, "oauth detector must claim the ghs_ token")
+}


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check test coverage for the github detector package
cd internal/detector/github
go test -cover -coverprofile=coverage.out
go tool cover -func=coverage.out | grep total

Repository: HodeTech/Leakwatch

Length of output: 474

🏁 Script executed:

#!/bin/bash
set -euo pipefail
pkgs=$(go list ./internal/detector/...)
echo "Detector packages:"
echo "$pkgs"
fail=0
for pkg in $pkgs; do
  echo "== $pkg =="
  tmpfile="/tmp/cover_${pkg//\//_}.out"
  go test -cover -coverprofile="$tmpfile" "$pkg" >/tmp/cover_stdout.txt 2>/tmp/cover_stderr.txt || {
    echo "go test failed for $pkg"
    cat /tmp/cover_stderr.txt
    fail=1
    continue
  }
  total=$(go tool cover -func="$tmpfile" | awk '/total:/{print $3}')
  echo "total: $total"
  # total is like 100.0%
  pct=${total%%%}
  if awk "BEGIN{exit !($pct < 95)}"; then
    echo "Coverage below 95% for $pkg: $total"
    fail=1
  fi
done
if [ "$fail" -ne 0 ]; then
  echo "One or more detector packages are below 95% coverage."
  exit 1
fi
echo "All detector packages meet 95%+ coverage."

Repository: HodeTech/Leakwatch

Length of output: 9032

Test coverage gate for detector packages fails (95% not met overall)
internal/detector/github is at 100%, but multiple detector packages are below the 95% minimum (e.g., internal/detector 81.2%, gcp 88.7%, generic 83.3%, custom 92.9%, privatekey 93.8%, snowflake 92.3%, stripe 92.3%, and testutil 0.0%), so the detector coverage requirement is not satisfied.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/detector/github/github_oauth_test.go` around lines 100 - 225, The PR fails the repository-wide detector test-coverage gate because several detector packages are below 95% (internal/detector, gcp, generic, custom, privatekey, snowflake, stripe, testutil); add unit tests that exercise each package's exported detector Scan implementations and helper functions to raise coverage: create table-driven tests (mirroring patterns in internal/detector/github_oauth_test.go) that call detectors' Scan(context.Background(), []byte(...)) for expected matches and non-matches, assert finding counts and fields (use symbols like OAuthDetector.Scan, Token.Scan, and any package-specific detector types such as GCPDetector or StripeDetector), and add tests for utility code in testutil to cover edge cases and error paths; aim to hit the uncovered branches (negative cases, boundary inputs, and redaction logic) so each listed package reaches ≥95% coverage and the gate passes.

coderabbitai · 2026-05-25T11:50:17Z

+// isGitHubStatelessBody reports whether the JWT beginning at start is the body
+// of a GitHub stateless installation token (ghs_APPID_<jwt>). RE2 has no
+// lookbehind, so it walks back over the contiguous token run (base64url plus the
+// ghs_/app-ID separators) immediately preceding the match and checks whether
+// that run contains the literal "ghs_".
+//
+// Contains rather than HasPrefix: the run may carry leading base64url bytes with
+// no delimiter (e.g. "xghs_APPID_"). Wherever "ghs_" appears in the run, the run
+// has no dots (dots are not token bytes) so it is glued straight onto this JWT,
+// forming a "ghs_...eyJ.eyJ.sig" shape that the github-oauth-token detector
+// captures in full — its per-segment floors ({8,}) are at or below this
+// detector's ({10,}). Suppressing here therefore only removes a duplicate of a
+// secret the github detector already reports; it can never drop one. (This
+// assumes the github-oauth-token detector is active, which it is by default.)
+func isGitHubStatelessBody(data []byte, start int) bool {
+	i := start
+	for i > 0 && isTokenByte(data[i-1]) {
+		i--
+	}
+	return bytes.Contains(data[i:start], ghsPrefix)
+}
+
+// isTokenByte reports whether b is part of a contiguous token run: a base64url
+// character or one of the separators ('_', '-') that appear in a ghs_ token.
+func isTokenByte(b byte) bool {
+	switch {
+	case b >= 'a' && b <= 'z', b >= 'A' && b <= 'Z', b >= '0' && b <= '9':
+		return true
+	case b == '_', b == '-':
+		return true
+	default:
+		return false
+	}
+}


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check test coverage for the jwt detector package
cd internal/detector/jwt
go test -cover -coverprofile=coverage.out
go tool cover -func=coverage.out | grep total

Repository: HodeTech/Leakwatch

Length of output: 267

🏁 Script executed:

#!/bin/bash
set -euo pipefail
# List all detector packages
packages=$(go list ./internal/detector/... | tr '\n' ' ')
echo "Detector packages:"
echo "$packages"
for pkg in $(go list ./internal/detector/...); do
  echo "== $pkg =="
  # Create a temp coverprofile per package
  tmpfile=$(mktemp)
  # Run tests with coverage
  go test -cover -coverprofile="$tmpfile" "$pkg" >/dev/null
  # Extract total statement coverage
  total=$(go tool cover -func="$tmpfile" | awk '/total:/ {print $3}')
  echo "statement_coverage: $total"
  rm -f "$tmpfile"
done

Repository: HodeTech/Leakwatch

Length of output: 9124

Coverage gate: detector packages are below the 95% minimum

github.com/HodeTech/leakwatch/internal/detector overall statement coverage is 81.2%

Multiple detector packages are under 95% (e.g., custom 92.9%, gcp 88.7%, generic 83.3%, heroku 92.9%, privatekey 93.8%, snowflake 92.3%, stripe 92.3%, testutil 0.0%)

internal/detector/jwt itself is 100%, but the stated requirement is not satisfied across detector packages.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/detector/jwt/jwt.go` around lines 66 - 99, The detector packages overall miss the 95% coverage gate; add focused unit tests to exercise uncovered branches in the detector packages (e.g., custom, gcp, generic, heroku, privatekey, snowflake, stripe and testutil) so overall detector coverage rises above 95%. Specifically, create table-driven tests that exercise positive and negative detection paths and edge cases (including token boundary and separator behavior similar to jwt.isGitHubStatelessBody and jwt.isTokenByte), add mocks or sample inputs for cloud/provider-specific detectors (GCP, Heroku, Stripe, Snowflake, private keys, custom patterns), and include tests for testutil helpers so they are executed; run `go test ./internal/detector/... -coverprofile` to verify coverage and iterate until the detector package coverage meets the 95% threshold.

coderabbitai · 2026-05-25T11:50:17Z

+	dir, err := os.Getwd()
+	if err != nil {
+		return "", err
+	}


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Wrap the returned os.Getwd error with context.

Line 201 returns a raw error, which drops call-site context and violates the repo’s error-wrapping rule.

Suggested patch

 func repoRoot() (string, error) {
 	dir, err := os.Getwd()
 	if err != nil {
-		return "", err
+		return "", fmt.Errorf("get working directory: %w", err)
 	}

As per coding guidelines, "Wrap every error with fmt.Errorf("context: %w", err) before returning".

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-	dir, err := os.Getwd()
-	if err != nil {
-		return "", err
-	}
+	dir, err := os.Getwd()
+	if err != nil {
+		return "", fmt.Errorf("get working directory: %w", err)
+	}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/detector/registry_count_test.go` around lines 199 - 202, The raw error returned from os.Getwd should be wrapped with context before returning; update the failing branch that calls os.Getwd (the block returning "", err) to return a wrapped error using fmt.Errorf("getting working directory: %w", err) (and add a fmt import if missing) so the function (the helper that calls os.Getwd in registry_count_test.go) preserves call-site context per the repo error-wrapping rule.

cemililik and others added 3 commits May 25, 2026 13:59

sourcery-ai Bot reviewed May 25, 2026

gemini-code-assist Bot reviewed May 25, 2026

coderabbitai Bot reviewed May 25, 2026

cemililik wants to merge 3 commits into main from feat/github-stateless-ghs-token
