Merged
2 changes: 1 addition & 1 deletion SCORING.md
@@ -150,7 +150,7 @@ Not all warnings represent the same degree of degradation. A warning on `llms-tx
| **0.50** | Genuine functional degradation | `llms-txt-exists`, `llms-txt-size`, `rendering-strategy`, `markdown-url-support`, `page-size-markdown`, `page-size-html`, `content-start-position`, `tabbed-content-serialization`, `section-header-quality`, `cache-header-hygiene`, `auth-gate-detection`, `auth-alternative-access` |
| **0.25** | Actively steering agents to a worse path | `llms-txt-links-markdown` (markdown exists but llms.txt links to HTML; agents don't discover .md variants on their own) |

-Checks that only have pass/fail (no warn state): `http-status-codes`, `markdown-code-fence-validity`.
+`markdown-code-fence-validity` only has pass/fail (no warn state). `http-status-codes` is normally pass/fail but warns when every sampled response is indeterminate (HTTP 202 from CDN cache-miss/build, or 5xx) so the check couldn't measure bad-URL handling.

## Score caps

2 changes: 1 addition & 1 deletion docs/agent-score-calculation.md
@@ -141,7 +141,7 @@ A warning is not a binary "half credit." Different warnings represent different
| **0.50** | Genuine functional degradation | `llms-txt-exists`, `llms-txt-size`, `rendering-strategy`, `markdown-url-support`, `page-size-markdown`, `page-size-html`, `content-start-position`, `tabbed-content-serialization`, `section-header-quality`, `cache-header-hygiene`, `auth-gate-detection`, `auth-alternative-access` |
| **0.25** | Actively steering agents to a worse path | `llms-txt-links-markdown` (markdown exists but llms.txt links to HTML) |

-Two checks have no warn state and are strictly pass/fail: `http-status-codes` and `markdown-code-fence-validity`.
+`markdown-code-fence-validity` is strictly pass/fail. `http-status-codes` is normally pass/fail but emits a warn when every sampled response is indeterminate (HTTP 202 during CDN cache-miss/build, or 5xx) so we couldn't measure bad-URL handling.

## Score caps

9 changes: 6 additions & 3 deletions scoring-reference.md
@@ -131,9 +131,12 @@ Each check has a specific warn coefficient rather than a uniform default.
| `auth-gate-detection` | 0.50 | Partial gating. Some docs accessible, some invisible to agents. |
| `auth-alternative-access` | 0.50 | Partial alternative access. Covers some gated content but not all. |

-Checks without a warn state (`http-status-codes`,
-`markdown-code-fence-validity`) don't appear in this table. Their spec
-definitions only have pass and fail levels.
+`markdown-code-fence-validity` doesn't appear in this table because its
+spec definition only has pass and fail levels. `http-status-codes` is
+normally pass/fail too, but emits a warn when every sampled response is
+indeterminate (HTTP 202 during CDN cache-miss/build, or 5xx); in that
+case scoring falls back to the default warn coefficient of 0.5 because
+the check couldn't measure bad-URL handling.

This replaces the worst-case aggregation for scoring purposes. A site where
3/50 pages exceed the size limit scores ~94% of the check's weight, not 0%.
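
The proportional scoring described in this hunk can be illustrated with a short worked example. This is a minimal sketch, not the project's implementation: `StatusCounts` and `proportionalCredit` are hypothetical names, and treating a top-level warn with no per-page data as a single warned item is an assumption based on the `proportion` of 0.5 and `tested` of 1 asserted in the scoring tests of this PR.

```typescript
// Hypothetical helper for illustration; not part of the repository.
interface StatusCounts {
  pass: number;
  warn: number;
  fail: number;
}

// Fraction of a check's weight earned: passes count fully, warns count at
// the warn coefficient, fails count as zero.
function proportionalCredit(
  counts: StatusCounts,
  warnCoefficient: number,
): number | undefined {
  const tested = counts.pass + counts.warn + counts.fail;
  if (tested === 0) return undefined; // nothing measurable
  return (counts.pass + counts.warn * warnCoefficient) / tested;
}

// 3 of 50 pages exceed the size limit: 47 / 50 of the check's weight.
const sizeCheck = proportionalCredit({ pass: 47, warn: 0, fail: 3 }, 0.5);

// Assumption: an all-indeterminate http-status-codes run is scored as one
// warned item at the default coefficient of 0.5.
const allIndeterminate = proportionalCredit({ pass: 0, warn: 1, fail: 0 }, 0.5);
```

Under these assumptions `sizeCheck` is about 0.94, matching the "~94%" figure in the text, and `allIndeterminate` is 0.5.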
49 changes: 42 additions & 7 deletions src/checks/url-stability/http-status-codes.ts
@@ -7,10 +7,11 @@ interface StatusCodeResult {
   url: string;
   testUrl: string;
   status: number | null;
-  classification: 'correct-error' | 'soft-404' | 'fetch-error';
+  classification: 'correct-error' | 'soft-404' | 'indeterminate' | 'fetch-error';
   redirected?: boolean;
   finalUrl?: string;
   bodyHint?: string;
+  indeterminateReason?: string;
   error?: string;
 }

@@ -44,10 +45,31 @@ async function check(ctx: CheckContext): Promise<CheckResult> {
       const redirected = response.redirected || response.url !== testUrl;
       const finalUrl = redirected ? response.url : undefined;

-      if (status >= 400) {
+      if (status >= 400 && status < 500) {
         return { url, testUrl, status, classification: 'correct-error', redirected, finalUrl };
       }

+      // 202 Accepted: per RFC 7231, the request is being processed but not
+      // complete. Vercel/Next.js ISR returns this during cache-miss/build
+      // for fresh URLs — it's a CDN behavior, not site-level error handling.
+      // 5xx: server failure tells us nothing about how the site handles
+      // bad URLs. Both are excluded from the soft-404 tally.
+      if (status === 202 || status >= 500) {
+        const reason =
+          status === 202
+            ? 'HTTP 202 (CDN still processing — not a site response)'
+            : `HTTP ${status} (server error — bad-URL handling unknown)`;
+        return {
+          url,
+          testUrl,
+          status,
+          classification: 'indeterminate',
+          redirected,
+          finalUrl,
+          indeterminateReason: reason,
+        };
+      }
+
       // Status 200 (or other 2xx/3xx) — possible soft 404
       let bodyHint: string | undefined;
       try {
@@ -86,6 +108,8 @@ async function check(ctx: CheckContext): Promise<CheckResult> {
   const fetchErrors = results.filter((r) => r.classification === 'fetch-error').length;
   const soft404s = results.filter((r) => r.classification === 'soft-404');
   const correctErrors = results.filter((r) => r.classification === 'correct-error');
+  const indeterminate = results.filter((r) => r.classification === 'indeterminate');
+  const determinate = correctErrors.length + soft404s.length;

   if (tested.length === 0) {
     return {
@@ -104,15 +128,25 @@ async function check(ctx: CheckContext): Promise<CheckResult> {
     };
   }

-  const status = soft404s.length > 0 ? 'fail' : 'pass';
   const pageLabel = sampled ? 'sampled pages' : 'pages';
-  const suffix = fetchErrors > 0 ? `; ${fetchErrors} failed to fetch` : '';
+  const fetchSuffix = fetchErrors > 0 ? `; ${fetchErrors} failed to fetch` : '';
+  const indetSuffix =
+    indeterminate.length > 0 ? `; ${indeterminate.length} indeterminate (HTTP 202/5xx)` : '';
+  const suffix = `${fetchSuffix}${indetSuffix}`;

+  let status: 'pass' | 'warn' | 'fail';
   let message: string;
-  if (status === 'pass') {
-    message = `All ${tested.length} ${pageLabel} return proper error codes for bad URLs${suffix}`;
+  if (determinate === 0) {
+    // Every response was indeterminate (e.g. all 202 or 5xx). We can't say
+    // whether the site handles bad URLs correctly.
+    status = 'warn';
+    message = `Could not determine bad-URL handling: all ${indeterminate.length} ${pageLabel} returned indeterminate responses${fetchSuffix}`;
+  } else if (soft404s.length > 0) {
+    status = 'fail';
+    message = `${soft404s.length} of ${determinate} ${pageLabel} return 200 for non-existent URLs (soft 404)${suffix}`;
   } else {
-    message = `${soft404s.length} of ${tested.length} ${pageLabel} return 200 for non-existent URLs (soft 404)${suffix}`;
+    status = 'pass';
+    message = `All ${determinate} ${pageLabel} return proper error codes for bad URLs${suffix}`;
   }

   return {
@@ -126,6 +160,7 @@ async function check(ctx: CheckContext): Promise<CheckResult> {
       sampled,
       soft404Count: soft404s.length,
       correctErrorCount: correctErrors.length,
+      indeterminateCount: indeterminate.length,
       fetchErrors,
       pageResults: results,
       discoveryWarnings: warnings,
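
The classification and status rules introduced in this file can be condensed into a standalone sketch. The names `classifyStatus` and `overallStatus` are hypothetical and do not exist in the repository. The sketch assumes every remaining 2xx/3xx response counts as a soft 404, and it leaves out fetch errors, redirects, body hints, and the early return for an empty sample.

```typescript
// Illustrative sketch only; names below are hypothetical, not the project's API.
type Classification = 'correct-error' | 'soft-404' | 'indeterminate';
type CheckStatus = 'pass' | 'warn' | 'fail';

// Map the status returned for a deliberately bad URL to a classification.
// Assumption: every remaining 2xx/3xx response counts as a soft 404.
function classifyStatus(status: number): Classification {
  if (status >= 400 && status < 500) return 'correct-error';
  if (status === 202 || status >= 500) return 'indeterminate';
  return 'soft-404';
}

// Derive the top-level status from the determinate responses only.
function overallStatus(statuses: number[]): CheckStatus {
  const classes = statuses.map(classifyStatus);
  const soft404s = classes.filter((c) => c === 'soft-404').length;
  const correctErrors = classes.filter((c) => c === 'correct-error').length;
  if (soft404s + correctErrors === 0) return 'warn'; // nothing measurable
  return soft404s > 0 ? 'fail' : 'pass';
}

const mixed = overallStatus([404, 202]); // one proper error, one excluded
const unmeasurable = overallStatus([202, 503]); // every response excluded
const softFailure = overallStatus([404, 200, 202]); // a 200 for a bad URL
```

With these inputs `mixed` is `'pass'`, `unmeasurable` is `'warn'`, and `softFailure` is `'fail'`, which mirrors the cases exercised by the unit tests added in this PR.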
5 changes: 5 additions & 0 deletions src/cli/formatters/text.ts
@@ -272,6 +272,7 @@ const DETAIL_FORMATTERS: Record<string, DetailFormatter> = {
           classification: string;
           status?: number | null;
           bodyHint?: string;
+          indeterminateReason?: string;
           error?: string;
         }>
       | undefined;
@@ -280,6 +281,10 @@ const DETAIL_FORMATTERS: Record<string, DetailFormatter> = {
       .filter((p) => p.classification !== 'correct-error')
       .map((p) => {
         if (p.error) return formatDetailLine('fail', p.testUrl ?? p.url, p.error);
+        if (p.classification === 'indeterminate') {
+          const info = p.indeterminateReason ?? `HTTP ${p.status} (indeterminate)`;
+          return formatDetailLine('warn', p.testUrl ?? p.url, info);
+        }
         const info = p.bodyHint
           ? `HTTP ${p.status} (${p.bodyHint})`
           : `HTTP ${p.status} instead of 404`;
4 changes: 4 additions & 0 deletions src/scoring/proportions.ts
@@ -292,6 +292,10 @@ function httpStatusCodesExtractor(
     }
   });

+  // All responses indeterminate (or fetch errors): exclude from scoring.
+  // We can't say whether the site handles bad URLs correctly.
+  if (items.every((i) => i.status === 'skip')) return undefined;
+
   return countByStatus(items, weight.warnCoefficient);
 }
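
The skip-then-fall-back behaviour added here can be sketched in isolation. The types and function names below are hypothetical stand-ins, not the real `httpStatusCodesExtractor` or `countByStatus`. Mapping `fetch-error` to `skip` is an assumption taken from the comment in this hunk, since the extractor's mapping code is not visible in the diff.

```typescript
// Hypothetical stand-ins for illustration; not the repository's types.
type ItemStatus = 'pass' | 'warn' | 'fail' | 'skip';

interface PageResult {
  classification: string;
}

interface Proportion {
  proportion: number;
  tested: number;
}

// Assumption: anything that is neither a proper error nor a soft 404
// (indeterminate responses, fetch errors) is excluded from scoring.
function toItemStatus(classification: string): ItemStatus {
  switch (classification) {
    case 'correct-error':
      return 'pass';
    case 'soft-404':
      return 'fail';
    default:
      return 'skip';
  }
}

// Returns undefined when nothing was measurable, so that the caller can
// fall back to the check's top-level status instead of scoring a hard 0.
function extractProportion(
  pages: PageResult[],
  warnCoefficient: number,
): Proportion | undefined {
  const items = pages.map((p) => toItemStatus(p.classification));
  if (items.every((s) => s === 'skip')) return undefined;

  const counted = items.filter((s) => s !== 'skip');
  const credit = counted.reduce(
    (sum, s) => sum + (s === 'pass' ? 1 : s === 'warn' ? warnCoefficient : 0),
    0,
  );
  return { proportion: credit / counted.length, tested: counted.length };
}

const partial = extractProportion(
  [
    { classification: 'correct-error' },
    { classification: 'soft-404' },
    { classification: 'indeterminate' },
  ],
  0.5,
);
const nothingMeasured = extractProportion(
  [{ classification: 'indeterminate' }, { classification: 'indeterminate' }],
  0.5,
);
```

In this sketch `partial` has a proportion of 0.5 over 2 tested items, and `nothingMeasured` is `undefined`. Returning `undefined` rather than a zero proportion is what lets the top-level `warn` status decide the credit.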

60 changes: 60 additions & 0 deletions test/unit/checks/http-status-codes.test.ts
@@ -204,6 +204,66 @@ describe('http-status-codes', () => {
     expect(result.message).toContain('1 failed to fetch');
   });

+  it('classifies HTTP 202 as indeterminate, not soft-404', async () => {
+    server.use(
+      http.get(
+        'http://test.local/docs/page1-afdocs-nonexistent-8f3a',
+        () => new HttpResponse(null, { status: 202 }),
+      ),
+      http.get(
+        'http://test.local/docs/page2-afdocs-nonexistent-8f3a',
+        () => new HttpResponse('Not Found', { status: 404 }),
+      ),
+    );
+
+    const content = `# Docs\n## Links\n- [Page 1](http://test.local/docs/page1): First\n- [Page 2](http://test.local/docs/page2): Second\n`;
+    const result = await check.run(makeCtx(content));
+    expect(result.status).toBe('pass');
+    expect(result.details?.soft404Count).toBe(0);
+    expect(result.details?.correctErrorCount).toBe(1);
+    expect(result.details?.indeterminateCount).toBe(1);
+    const pageResults = result.details?.pageResults as Array<{
+      classification: string;
+      indeterminateReason?: string;
+    }>;
+    const indet = pageResults.find((p) => p.classification === 'indeterminate');
+    expect(indet?.indeterminateReason).toContain('202');
+  });
+
+  it('classifies 5xx as indeterminate', async () => {
+    server.use(
+      http.get(
+        'http://test.local/docs/page1-afdocs-nonexistent-8f3a',
+        () => new HttpResponse('Server Error', { status: 503 }),
+      ),
+    );
+
+    const content = `# Docs\n## Links\n- [Page 1](http://test.local/docs/page1): First\n`;
+    const result = await check.run(makeCtx(content));
+    const pageResults = result.details?.pageResults as Array<{ classification: string }>;
+    expect(pageResults[0].classification).toBe('indeterminate');
+    expect(result.details?.indeterminateCount).toBe(1);
+  });
+
+  it('warns when all responses are indeterminate', async () => {
+    server.use(
+      http.get(
+        'http://test.local/docs/page1-afdocs-nonexistent-8f3a',
+        () => new HttpResponse(null, { status: 202 }),
+      ),
+      http.get(
+        'http://test.local/docs/page2-afdocs-nonexistent-8f3a',
+        () => new HttpResponse('Server Error', { status: 503 }),
+      ),
+    );
+
+    const content = `# Docs\n## Links\n- [Page 1](http://test.local/docs/page1): First\n- [Page 2](http://test.local/docs/page2): Second\n`;
+    const result = await check.run(makeCtx(content));
+    expect(result.status).toBe('warn');
+    expect(result.message).toContain('Could not determine');
+    expect(result.details?.indeterminateCount).toBe(2);
+  });
+
   it('strips fragments from test URLs', async () => {
     server.use(
       http.get(
33 changes: 33 additions & 0 deletions test/unit/scoring/proportions.test.ts
@@ -120,6 +120,39 @@ describe('proportions', () => {
       expect(result!.proportion).toBeCloseTo(0.667, 2);
       expect(result!.tested).toBe(3);
     });
+
+    it('skips indeterminate items from the proportion', () => {
+      const result = getCheckProportion(
+        makeResult('http-status-codes', 'fail', {
+          pageResults: [
+            { url: '/a', classification: 'correct-error', status: 404 },
+            { url: '/b', classification: 'soft-404', status: 200 },
+            { url: '/c', classification: 'indeterminate', status: 202 },
+          ],
+        }),
+        makeWeight(7),
+      );
+      // 1 pass, 1 fail, 1 skipped = 1/2
+      expect(result!.proportion).toBeCloseTo(0.5, 2);
+      expect(result!.tested).toBe(2);
+    });
+
+    it('falls back to top-level warn status when all items are indeterminate', () => {
+      // Extractor returns undefined for all-indeterminate, so scoring falls
+      // back to the top-level 'warn' status. The site gets warn-coefficient
+      // credit for the check rather than a hard 0 (we couldn't measure).
+      const result = getCheckProportion(
+        makeResult('http-status-codes', 'warn', {
+          pageResults: [
+            { url: '/a', classification: 'indeterminate', status: 202 },
+            { url: '/b', classification: 'indeterminate', status: 503 },
+          ],
+        }),
+        makeWeight(7),
+      );
+      expect(result!.proportion).toBe(0.5);
+      expect(result!.tested).toBe(1);
+    });
   });

   describe('markdown-url-support', () => {