feat(detector): detect GitHub stateless (JWT-format) ghs_ installation tokens #15
cemililik wants to merge 3 commits
From April 2026 GitHub issues installation tokens (including the Actions
GITHUB_TOKEN) in a new ghs_APPID_<jwt> format: a ghs_-prefixed JWT of ~520
chars containing exactly two dots. The previous github-oauth-token pattern
`gh[orus]_[A-Za-z0-9_]{36,}` had no dot in its body class, so it truncated
such a token at the first dot — and missed it entirely when a base64url '-'
appeared before the body reached 36 chars — while the JWT body fell through
to the generic jwt detector. One secret was reported as two wrong findings
(or one wrong one).
Changes:
- github-oauth-token now matches an ordered alternation: a stateless branch
`ghs_[A-Za-z0-9_-]{8,}(?:\.[A-Za-z0-9_-]{8,}){2}` (listed first so it wins
over the opaque branch) plus the unchanged opaque branch
`gh[orus]_[A-Za-z0-9_]{36,}`. The whole token is captured as one finding.
Opaque gho_/ghu_/ghr_/legacy ghs_ are unchanged and never start eating
dots; ghp_ stays with the separate github-token detector.
- jwt detector suppresses a JWT that is the body of a ghs_ token (walks back
over the preceding token run and checks for a "ghs_" prefix; RE2 has no
lookbehind), so the secret is reported exactly once by the more specific
detector.
Decisions:
- ghu_ (user-to-server) is intentionally left opaque: GitHub has signalled a
later format change but has not published it; speculating risks false
matches. A note in the code says where to extend when documented.
- Verification is unaffected: a bogus token -> 401 -> inactive; a live
installation token -> 403 -> verify-error (not a false active/revoked).
Redaction still reveals only the trailing four characters.
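
The status mapping described above can be sketched as follows. `statusFor` and its string results are hypothetical names, not the verifier's actual API, and treating every code other than 200 and 401 as verify-error is an assumption; the text only states the 200, 401, and 403 cases.

```go
package main

import (
	"fmt"
	"net/http"
)

// statusFor maps the HTTP status of a GitHub /user response to a
// verification result. Hypothetical helper for illustration only.
func statusFor(code int) string {
	switch code {
	case http.StatusOK:
		return "active"
	case http.StatusUnauthorized:
		return "inactive"
	default:
		// Assumption: anything else, including 403, is inconclusive.
		return "verify-error"
	}
}

func main() {
	for _, code := range []int{200, 401, 403} {
		fmt.Println(code, statusFor(code))
	}
}
```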
Tests are table-driven and cover full capture of both failure modes
(long-header truncation, dash-early miss), opaque regressions, no
over-capture into trailing context, the exactly-one-detector invariant, and
jwt suppression. github and jwt detector packages remain at 100% coverage.
Docs: README detector table, en/tr detector catalogs, and CHANGELOG updated;
site bundle regenerated (site/js/manuals/{en,tr}.js, site/js/detectors.js).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
isGitHubStatelessBody required the contiguous token run before a JWT to BEGIN
with "ghs_". When a base64url char is glued directly in front (e.g.
"xghs_APPID_eyJ...eyJ...sig" with no delimiter), the run was "xghs_APPID_" so
the JWT was not recognised as a ghs_ body and was reported again — while the
unanchored github-oauth-token pattern still matched "ghs_..." mid-string,
double-reporting the same secret.
Use bytes.Contains instead of HasPrefix. Wherever "ghs_" appears in the run the
run has no dots, so it is glued onto the JWT and forms a "ghs_...eyJ.eyJ.sig"
shape the github detector captures in full (its segment floors {8,} are <= this
detector's {10,}); suppressing can therefore only drop a duplicate, never a
secret. Realistic delimiters (=, ", space, newline, :, /) are not token bytes,
so this only tightens a contrived edge.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
GH-2: tools/site-build extracts each detector's regex from the AST and only emits a detector with a single regexp.MustCompile(`literal`); a concatenated / const / fmt.Sprintf pattern silently vanishes from the web playground while the registry count test still passes. Add TestDetectorsJS_CoversEveryRegisteredDetector (in the golden-count test that already blank-imports every detector) to pin the bundle to detector.All() minus the documented skips (generic). Verified it fails with an actionable message when an entry is dropped. This converts the previous "keep it one raw-string literal" code comment into an enforced invariant.

GH-5: add an explicit GitHub /user 403 verifier case. A live stateless installation token authenticates as an app installation, so /user answers 403, which is neither active (200) nor inactive (401) and maps to verify-error, documenting that it is never mislabelled active or "invalid or revoked".

Both are test-only; no production behavior changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
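
The GH-2 guard boils down to a set difference between the registry and the generated bundle. The sketch below illustrates that idea only; `missingFromBundle` is a hypothetical helper, and the real test reads its inputs from detector.All() and site/js/detectors.js rather than from literals.

```go
package main

import (
	"fmt"
	"sort"
)

// missingFromBundle returns the registered detector IDs that are absent
// from the bundle, ignoring documented skips. Hypothetical helper.
func missingFromBundle(registered, bundled []string, skips map[string]bool) []string {
	inBundle := make(map[string]bool, len(bundled))
	for _, id := range bundled {
		inBundle[id] = true
	}
	var missing []string
	for _, id := range registered {
		if !skips[id] && !inBundle[id] {
			missing = append(missing, id)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	registered := []string{"github-token", "github-oauth-token", "jwt", "generic"}
	bundled := []string{"github-token", "jwt"} // github-oauth-token was dropped
	skips := map[string]bool{"generic": true}
	fmt.Println(missingFromBundle(registered, bundled, skips)) // [github-oauth-token]
}
```

A test built on this shape can fail with the list of missing IDs, which gives the actionable message the commit describes.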
Sorry @cemililik, your pull request is larger than the review limit of 500000 diff characters
Code Review
This pull request enhances the github-oauth-token detector to support GitHub's new stateless (JWT-format) ghs_ installation tokens. The changes include updating the regex pattern to capture these tokens in full and modifying the jwt detector to suppress duplicate findings when a JWT is identified as part of a GitHub stateless token. Additionally, the PR includes comprehensive tests for these changes, updates documentation, and introduces a new test to ensure the web playground bundle remains synchronized with the detector registry. I have no feedback to provide as there were no review comments to assess.
Actionable comments posted: 3
🧹 Nitpick comments (1)
internal/verifier/github/github_oauth_verifier_test.go (1)
100-124: Use a table-driven case for this new verifier scenario.
This adds another single-case function in a `_test.go` file; please express it as a table entry (or fold it into a shared status-mapping table) to stay compliant and keep scenarios easier to extend.
As per coding guidelines, "Use table-driven tests in test files".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/verifier/github/github_oauth_verifier_test.go` around lines 100 - 124, Convert the single-case test TestOAuthVerify_Forbidden_ReturnsVerifyError into a table-driven test by creating a test table (slice of structs) with a descriptive name field and entries for the forbidden scenario (and any existing scenarios) and iterate with t.Run for each case; inside each case construct the httptest server (as currently done), instantiate OAuthVerifier (apiURL, httpClient), build the detector.RawFinding (using oauthDetectorID and the same Raw/Redacted values), call v.Verify(ctx, raw) and assert expected result.Status and result.Message for that case; update references to TestOAuthVerify_Forbidden_ReturnsVerifyError to the new table-driven test function name and ensure each case is self-contained so tests remain isolated.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/detector/github/github_oauth_test.go`:
- Around line 100-225: The PR fails the repository-wide detector test-coverage
gate because several detector packages are below 95% (internal/detector, gcp,
generic, custom, privatekey, snowflake, stripe, testutil); add unit tests that
exercise each package's exported detector Scan implementations and helper
functions to raise coverage: create table-driven tests (mirroring patterns in
internal/detector/github_oauth_test.go) that call detectors'
Scan(context.Background(), []byte(...)) for expected matches and non-matches,
assert finding counts and fields (use symbols like OAuthDetector.Scan,
Token.Scan, and any package-specific detector types such as GCPDetector or
StripeDetector), and add tests for utility code in testutil to cover edge cases
and error paths; aim to hit the uncovered branches (negative cases, boundary
inputs, and redaction logic) so each listed package reaches ≥95% coverage and
the gate passes.
In `@internal/detector/jwt/jwt.go`:
- Around line 66-99: The detector packages overall miss the 95% coverage gate;
add focused unit tests to exercise uncovered branches in the detector packages
(e.g., custom, gcp, generic, heroku, privatekey, snowflake, stripe and testutil)
so overall detector coverage rises above 95%. Specifically, create table-driven
tests that exercise positive and negative detection paths and edge cases
(including token boundary and separator behavior similar to
jwt.isGitHubStatelessBody and jwt.isTokenByte), add mocks or sample inputs for
cloud/provider-specific detectors (GCP, Heroku, Stripe, Snowflake, private keys,
custom patterns), and include tests for testutil helpers so they are executed;
run `go test ./internal/detector/... -coverprofile` to verify coverage and
iterate until the detector package coverage meets the 95% threshold.
In `@internal/detector/registry_count_test.go`:
- Around line 199-202: The raw error returned from os.Getwd should be wrapped
with context before returning; update the failing branch that calls os.Getwd
(the block returning "", err) to return a wrapped error using
fmt.Errorf("getting working directory: %w", err) (and add a fmt import if
missing) so the function (the helper that calls os.Getwd in
registry_count_test.go) preserves call-site context per the repo error-wrapping
rule.
📒 Files selected for processing (13)
- CHANGELOG.md
- README.md
- docs/user-manuals/en/detectors/detector-catalog.md
- docs/user-manuals/tr/detectors/detector-catalog.md
- internal/detector/github/github_oauth.go
- internal/detector/github/github_oauth_test.go
- internal/detector/jwt/jwt.go
- internal/detector/jwt/jwt_test.go
- internal/detector/registry_count_test.go
- internal/verifier/github/github_oauth_verifier_test.go
- site/js/detectors.js
- site/js/manuals/en.js
- site/js/manuals/tr.js
// fakeStatelessToken builds an obviously-fake GitHub stateless installation
// token of the ghs_APPID_<jwt> form (header.payload.signature). It is assembled
// from parts at runtime so the source file never contains a contiguous,
// real-looking token literal that secret push-protection could flag.
func fakeStatelessToken(headerTail string) string {
	const appID = "12345678"
	header := "eyJ" + headerTail
	payload := "eyJ" + strings.Repeat("Gh1Ij2Kl", 30)
	signature := strings.Repeat("Mn3Op4Qr", 12)
	return "ghs_" + appID + "_" + header + "." + payload + "." + signature
}

// TestOAuthDetector_Scan_StatelessToken_CapturesWholeToken proves the new
// ghs_APPID_<jwt> stateless installation tokens are captured in full by a single
// github-oauth-token finding (the pre-2026 behaviour truncated them at the first
// dot or missed them entirely when a base64url '-' appeared early).
func TestOAuthDetector_Scan_StatelessToken_CapturesWholeToken(t *testing.T) {
	tests := []struct {
		name  string
		token string
	}{
		{
			name:  "long alphanumeric header segment",
			token: fakeStatelessToken(strings.Repeat("Ab9Cd0Ef", 5)),
		},
		{
			name:  "base64url dash early in header",
			token: fakeStatelessToken("Ab-Cd0Ef9Gh"),
		},
		{
			name:  "base64url underscore in header",
			token: fakeStatelessToken("Ab_Cd0Ef9Gh_Ij"),
		},
		{
			name:  "short app id",
			token: "ghs_42_" + "eyJ" + strings.Repeat("Ab9Cd0Ef", 4) + "." + "eyJ" + strings.Repeat("Gh1Ij2Kl", 20) + "." + strings.Repeat("Mn3Op4Qr", 10),
		},
	}

	d := &OAuthDetector{}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			findings := d.Scan(context.Background(), []byte(tt.token))
			require.Len(t, findings, 1, "stateless token must yield exactly one finding")

			f := findings[0]
			assert.Equal(t, "github-oauth-token", f.DetectorID)
			// The whole token is captured, not just the header segment.
			assert.Equal(t, tt.token, string(f.Raw), "must capture the entire token")
			assert.Greater(t, len(f.Raw), 100, "stateless tokens are long")

			// Redaction stays safe for a long token: only the last four
			// characters are ever revealed.
			assert.Equal(t, "****"+tt.token[len(tt.token)-4:], f.Redacted)
			assert.Len(t, f.Redacted, len("****")+4)
			assert.NotContains(t, f.Redacted, tt.token[:len(tt.token)-4],
				"redaction must not expose the token body")
		})
	}
}

// TestOAuthDetector_Scan_NoOverCapture guards the greedy branches against eating
// surrounding context: opaque tokens must not start consuming dots, and a
// stateless token must stop at its third (signature) segment.
func TestOAuthDetector_Scan_NoOverCapture(t *testing.T) {
	suffix40 := strings.Repeat("Abc1D678", 5)
	stateless := fakeStatelessToken(strings.Repeat("Ab9Cd0Ef", 5))

	tests := []struct {
		name  string
		input string
		want  string // expected single captured match
	}{
		{
			name:  "opaque gho_ followed by dotted domain",
			input: "gho_" + suffix40 + ".example.com",
			want:  "gho_" + suffix40,
		},
		{
			name:  "stateless token at end of a sentence",
			input: "leaked token: " + stateless + ". Please rotate it.",
			want:  stateless,
		},
		{
			name:  "stateless token followed by a fourth dotted segment",
			input: stateless + "." + strings.Repeat("Qq11Ww22", 8),
			want:  stateless,
		},
	}

	d := &OAuthDetector{}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			findings := d.Scan(context.Background(), []byte(tt.input))
			require.Len(t, findings, 1)
			assert.Equal(t, tt.want, string(findings[0].Raw))
		})
	}
}

// TestOAuthDetector_Scan_LegacyOpaqueUnchanged confirms the legacy opaque shapes
// (including legacy opaque ghs_) are still captured whole and unchanged.
func TestOAuthDetector_Scan_LegacyOpaqueUnchanged(t *testing.T) {
	suffix40 := strings.Repeat("Abc1D678", 5)
	for _, prefix := range []string{"gho_", "ghu_", "ghr_", "ghs_"} {
		t.Run(prefix, func(t *testing.T) {
			token := prefix + suffix40
			findings := (&OAuthDetector{}).Scan(context.Background(), []byte(token))
			require.Len(t, findings, 1)
			assert.Equal(t, token, string(findings[0].Raw))
		})
	}
}

// TestGitHubDetectors_StatelessNoPrefixOverlap ensures a stateless ghs_ token is
// still claimed by exactly one of the two GitHub detectors (never the ghp_
// personal-access-token detector).
func TestGitHubDetectors_StatelessNoPrefixOverlap(t *testing.T) {
	token := []byte(fakeStatelessToken(strings.Repeat("Ab9Cd0Ef", 5)))

	tokenFindings := (&Token{}).Scan(context.Background(), token)
	oauthFindings := (&OAuthDetector{}).Scan(context.Background(), token)

	assert.Empty(t, tokenFindings, "ghp_ detector must not claim a ghs_ token")
	require.Len(t, oauthFindings, 1, "oauth detector must claim the ghs_ token")
}
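
The redaction assertions in these tests pin a simple rule: a fixed mask followed by the last four characters. A minimal sketch of such a rule is shown here; `redact` is a hypothetical helper, and its handling of inputs of four characters or fewer is an assumption that the tests above do not cover.

```go
package main

import "fmt"

// redact masks everything except the trailing four characters.
// Hypothetical helper; the detector's real redaction code is not shown
// in this PR excerpt.
func redact(secret string) string {
	const keep = 4
	if len(secret) <= keep {
		// Assumption: very short inputs are masked entirely.
		return "****"
	}
	return "****" + secret[len(secret)-keep:]
}

func main() {
	// Fake token assembled from placeholder segments.
	fmt.Println(redact("ghs_12345678_eyJAb9Cd0Ef.eyJGh1Ij2Kl.Mn3Op4Qr")) // ****p4Qr
}
```

Because the mask has a fixed width, the redacted form also hides the token's length.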
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check test coverage for the github detector package
cd internal/detector/github
go test -cover -coverprofile=coverage.out
go tool cover -func=coverage.out | grep total
Repository: HodeTech/Leakwatch
Length of output: 474
🏁 Script executed:
#!/bin/bash
set -euo pipefail
pkgs=$(go list ./internal/detector/...)
echo "Detector packages:"
echo "$pkgs"
fail=0
for pkg in $pkgs; do
echo "== $pkg =="
tmpfile="/tmp/cover_${pkg//\//_}.out"
go test -cover -coverprofile="$tmpfile" "$pkg" >/tmp/cover_stdout.txt 2>/tmp/cover_stderr.txt || {
echo "go test failed for $pkg"
cat /tmp/cover_stderr.txt
fail=1
continue
}
total=$(go tool cover -func="$tmpfile" | awk '/total:/{print $3}')
echo "total: $total"
# total is like 100.0%
pct=${total%%%}
if awk "BEGIN{exit !($pct < 95)}"; then
echo "Coverage below 95% for $pkg: $total"
fail=1
fi
done
if [ "$fail" -ne 0 ]; then
echo "One or more detector packages are below 95% coverage."
exit 1
fi
echo "All detector packages meet 95%+ coverage."
Repository: HodeTech/Leakwatch
Length of output: 9032
Test coverage gate for detector packages fails (95% not met overall)
internal/detector/github is at 100%, but multiple detector packages are below the 95% minimum (e.g., internal/detector 81.2%, gcp 88.7%, generic 83.3%, custom 92.9%, privatekey 93.8%, snowflake 92.3%, stripe 92.3%, and testutil 0.0%), so the detector coverage requirement is not satisfied.
// isGitHubStatelessBody reports whether the JWT beginning at start is the body
// of a GitHub stateless installation token (ghs_APPID_<jwt>). RE2 has no
// lookbehind, so it walks back over the contiguous token run (base64url plus the
// ghs_/app-ID separators) immediately preceding the match and checks whether
// that run contains the literal "ghs_".
//
// Contains rather than HasPrefix: the run may carry leading base64url bytes with
// no delimiter (e.g. "xghs_APPID_"). Wherever "ghs_" appears in the run, the run
// has no dots (dots are not token bytes) so it is glued straight onto this JWT,
// forming a "ghs_...eyJ.eyJ.sig" shape that the github-oauth-token detector
// captures in full; its per-segment floors ({8,}) are at or below this
// detector's ({10,}). Suppressing here therefore only removes a duplicate of a
// secret the github detector already reports; it can never drop one. (This
// assumes the github-oauth-token detector is active, which it is by default.)
func isGitHubStatelessBody(data []byte, start int) bool {
	i := start
	for i > 0 && isTokenByte(data[i-1]) {
		i--
	}
	return bytes.Contains(data[i:start], ghsPrefix)
}

// isTokenByte reports whether b is part of a contiguous token run: a base64url
// character or one of the separators ('_', '-') that appear in a ghs_ token.
func isTokenByte(b byte) bool {
	switch {
	case b >= 'a' && b <= 'z', b >= 'A' && b <= 'Z', b >= '0' && b <= '9':
		return true
	case b == '_', b == '-':
		return true
	default:
		return false
	}
}
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check test coverage for the jwt detector package
cd internal/detector/jwt
go test -cover -coverprofile=coverage.out
go tool cover -func=coverage.out | grep total
Repository: HodeTech/Leakwatch
Length of output: 267
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# List all detector packages
packages=$(go list ./internal/detector/... | tr '\n' ' ')
echo "Detector packages:"
echo "$packages"
for pkg in $(go list ./internal/detector/...); do
echo "== $pkg =="
# Create a temp coverprofile per package
tmpfile=$(mktemp)
# Run tests with coverage
go test -cover -coverprofile="$tmpfile" "$pkg" >/dev/null
# Extract total statement coverage
total=$(go tool cover -func="$tmpfile" | awk '/total:/ {print $3}')
echo "statement_coverage: $total"
rm -f "$tmpfile"
done
Repository: HodeTech/Leakwatch
Length of output: 9124
Coverage gate: detector packages are below the 95% minimum
- github.com/HodeTech/leakwatch/internal/detector overall statement coverage is 81.2%
- Multiple detector packages are under 95% (e.g., custom 92.9%, gcp 88.7%, generic 83.3%, heroku 92.9%, privatekey 93.8%, snowflake 92.3%, stripe 92.3%, testutil 0.0%)
- internal/detector/jwt itself is 100%, but the stated requirement is not satisfied across detector packages.
dir, err := os.Getwd()
if err != nil {
	return "", err
}
Wrap the returned os.Getwd error with context.
Line 201 returns a raw error, which drops call-site context and violates the repo’s error-wrapping rule.
Suggested patch
 func repoRoot() (string, error) {
 	dir, err := os.Getwd()
 	if err != nil {
-		return "", err
+		return "", fmt.Errorf("get working directory: %w", err)
 	}
As per coding guidelines, "Wrap every error with fmt.Errorf("context: %w", err) before returning".
Summary
Detects GitHub's new stateless (JWT-format) `ghs_` installation tokens (rolled out from April 2026 to the Actions `GITHUB_TOKEN` and App installation tokens). The new format is `ghs_APPID_<jwt>`: a `ghs_`-prefixed JWT of ~520 chars containing exactly two dots; segments are base64url (`A-Za-z0-9`, `_`, `-`).

The old pattern `gh[orus]_[A-Za-z0-9_]{36,}` had no `.` in its body class, so a new token was either truncated at the first dot (its JWT body falling to the generic `jwt` detector → one secret, two wrong findings) or missed entirely when a base64url `-` appeared before the body reached 36 chars (only flagged as a generic JWT). Both failure modes were reproduced before the change and proven fixed after.

What changed
- `github-oauth-token` detector (`github_oauth.go`): ordered alternation of a stateless branch `ghs_[A-Za-z0-9_-]{8,}(?:\.[A-Za-z0-9_-]{8,}){2}` (listed first so it wins over the opaque branch at a `ghs_` start) plus the unchanged opaque branch `gh[orus]_[A-Za-z0-9_]{36,}`. The whole token is captured as a single finding. `gho_`/`ghu_`/`ghr_`/legacy `ghs_` are unchanged and never start eating dots; `ghp_` stays with the separate `github-token` detector. Kept as one raw-string literal so the `tools/site-build` AST extractor can still read it.
- `jwt` detector (`jwt.go`): suppresses a JWT that is the body of a `ghs_` token (walks back over the contiguous token run and checks it contains `ghs_`; RE2 has no lookbehind), so the secret is reported exactly once by the more specific detector. Sound because the GitHub stateless floors (`{8,}`) are ≤ the JWT floors (`{10,}`), so whenever this fires the GitHub detector also matches the whole token; suppression can only drop a duplicate, never a secret.
- Docs: README detector table, en/tr detector catalogs, and CHANGELOG updated; site bundle regenerated (`site/js/manuals/{en,tr}.js`, `site/js/detectors.js`).

Design decisions
- `ghu_` left opaque on purpose: GitHub has signalled a later `ghu_` format change but has not published it; speculating risks false matches. The code documents exactly where to extend (`gh[su]_…`).
- Verification is unaffected: a bogus token → `/user` 401 → inactive; a live installation token → 403 → verify-error (not a false active/revoked). Sending the whole token is strictly more correct than the old truncated fragment.

Review follow-ups (in this PR)
After a multi-angle review, three findings were actioned (the rest accepted with rationale: inherent benign over-capture, a rare FP class already tighter than GitHub's own regex, and the documented `ghu_` extension point):
- `fix(jwt)`: suppression made robust to a base64url char glued before `ghs_` (`Contains` instead of `HasPrefix`); removes a contrived double-report.
- `test(detector)`: a guard pins the generated `detectors.js` to the live registry, so a future non-AST-extractable pattern fails CI instead of silently vanishing from the web playground (the previous protection was only a code comment). Verified it fails on a simulated drop.
- `test`: an explicit GitHub `/user` 403 case documenting the installation-token verify-error behavior.

Verification
`gofumpt -l .` empty · `go vet ./...` · `go build ./...` · `go test -race ./...` (no failures/races) · `golangci-lint run ./... --config .golangci.yml` → 0 issues · github(det)/jwt(det)/github(verifier) 100% coverage · `tools/site-build` regeneration leaves the tree clean.

Notes
- … `main`, no shared concerns.
- … `ghu_` once GitHub publishes that format (Doc & code cleanup: align with v1.5.0, fix SonarCloud BLOCKERs, raise dbconn coverage #6).

🤖 Generated with Claude Code