feat(scripts): hcg-policy-smoke.sh — stealth-profile status canary (Phase E)#224
Merged
Extends the §1.5 operator pre-check with four exact-status canaries that
pin the deny *status code*: internal+stealth routes (sdp-status-get,
cartridge-load-post) must return exactly 404, and authenticated routes
without a stealth_profile (status-get, cartridges-list-get) must return
exactly 403. The existing `deny` pattern accepts any 4xx, which would
mask a `:stealth_profiles` runtime-config misconfiguration that demoted
internal routes to bare 403 — leaking capability existence to untrusted
callers, the exact threat the sdp-status-get rule narrative calls out
("not confirmable from outside"). probe() gains a three-digit literal
pattern (`[1-5][0-9][0-9]`) for the new assertion; existing deny /
allow_or_upstream patterns are unchanged.
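To make the masking problem concrete, here is a minimal sketch of how an exact-status clause can sit next to the loose `deny` clause. `status_matches` is a hypothetical helper written for illustration; it is not the real `probe()` from `scripts/hcg-policy-smoke.sh`, and the `allow_or_upstream` clause is left out because its accepted range is not stated here.

```shell
#!/usr/bin/env bash
# Sketch only: status_matches is a hypothetical stand-in for the matching
# step inside probe(); the real script may be structured differently.
# Usage: status_matches EXPECT ACTUAL
#   EXPECT is "deny" or a three-digit literal such as 404.
status_matches() {
  local expect="$1" actual="$2"
  case "$expect" in
    deny)
      # Loose pattern: any 4xx counts as a deny.
      [[ "$actual" == 4[0-9][0-9] ]]
      ;;
    [1-5][0-9][0-9])
      # Exact pattern: the status must equal the literal.
      [[ "$actual" == "$expect" ]]
      ;;
    *)
      echo "unknown expectation: $expect" >&2
      return 64
      ;;
  esac
}

# A stealth route demoted to bare 403 still satisfies "deny",
# but fails the exact 404 assertion.
status_matches deny 403 && echo "deny accepts 403"
status_matches 404 403 || echo "exact 404 rejects 403"
```

The exact clause is what turns a silent 404 → 403 demotion into a visible failure.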
Runbook §1.5 acceptance text updated to describe the new canary class
alongside the existing verb and path canaries; version 0.6 → 0.7;
date 2026-06-14 → 2026-06-15.
Refs hyperpolymath/standards#91
Refs hyperpolymath/standards#100
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
🔍 Hypatia Security Scan
Findings: 215 issues detected
View findings
[
{
"reason": "Stale AI session file -- delete",
"type": "stale",
"file": "GEMINI.md",
"action": "delete",
"rule_module": "root_hygiene",
"severity": "medium"
},
{
"reason": "Issue in scorecard-enforcer.yml",
"type": "missing_timeout_minutes",
"file": "scorecard-enforcer.yml",
"action": "flag",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "Issue in scorecard-enforcer.yml",
"type": "scorecard_publish_with_run_step",
"file": "scorecard-enforcer.yml",
"action": "split_scorecard_publish_job",
"rule_module": "workflow_audit",
"severity": "high"
},
{
"reason": "Issue in instant-sync.yml",
"type": "secret_action_without_presence_gate",
"file": "instant-sync.yml",
"action": "peter-evans/repository-dispatch",
"rule_module": "workflow_audit",
"severity": "high"
},
{
"reason": "Issue in codeql.yml",
"type": "codeql_missing_actions_language",
"file": "codeql.yml",
"action": "flag",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/academic-workflow-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/ephapax-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/bofig-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/fireflag-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/sanctify-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
}
]
Powered by Hypatia Neurosymbolic CI/CD Intelligence
4 tasks
hyperpolymath added a commit that referenced this pull request on Jun 16, 2026
Phase E §3 traffic-shift sign-off has four success criteria (rollout
runbook §3.1): p99 latency within Phase D baseline × 1.5, gateway-origin
5xx ≤ baseline, no circuit-breaker trips, no X-Trust-Level mismatches
in BoJ access logs. The runbook §4 names the signals at the human level
but stops short of giving the on-call a query they can paste. This spec
fills that gap.
Maps each signal in runbook §4.1 (six gateway-side) and §4.2 (three
BoJ-side) to:
1. The telemetry event emitted by the gateway (audit §5 / §1.6).
2. The Prometheus metric exported by `telemetry_metrics_prometheus_core`
against the canonical `telemetry_metrics/0` declaration in
`http-capability-gateway/lib/http_capability_gateway/application.ex`
lines 259-296 (request lifecycle, policy lookup, access decision,
backend forward, error, minikaran anomaly).
3. A PromQL query template the operator can paste verbatim.
4. An alert threshold anchored to a canonical source — the rollback
trigger value from runbook §5.1, the perf contract tolerance
ratio, or the load-profile §2 SLO budget. Where the absolute
number depends on Phase D-4 baseline collection, the spec names
the formula and the lookup site (`bench/baseline.json` after
`_status` flips to `active`) instead of inventing a value.
§5 closes the loop: every runbook §5.1 rollback trigger is cross-
referenced to the PromQL alert rule in this spec. Three follow-up
gaps are surfaced honestly rather than papered over:
- Circuit-breaker state has no dedicated telemetry event today;
the 503 spike + log inspection is the interim signal until a
`[:http_capability_gateway, :circuit_breaker, :state_change]`
event lands. Spec calls this out as a post-Phase-E gateway
follow-up, not a Phase E blocker.
- Policy reload has no telemetry event either; logged via
`Logger.info("Policy compiled successfully", …)`. Treat as a
log-based alert until the event is added.
- VeriSimDB write failures are cast-only (audit §4); no metric
surface today. Phase E §1.3 already lists VeriSimDB integration
status as an `!OWNER:` confirmation; this spec confirms the
metric-side consequence.
Edits the rollout runbook §4 lead-in and Appendix B cross-reference list
to point at the spec, so an operator landing in §4 can find the queries
without grep. No changes to the §4 signal list itself or to any §1/§3/
§5/§6 procedure — the runbook stays normative; the spec is its
declarative half.
Does not invent percentile budgets, does not commit a dashboard URL,
does not pre-empt the §6.4 Trustfile flip. Per single-lane HCG channel
discipline (rolled in PR #38 of http-capability-gateway, PR #168/#173/
#224 of boj-server): joint-close is owner-only; this PR refs but does
not close standards#100.
Refs hyperpolymath/standards#91
Refs hyperpolymath/standards#100
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
hyperpolymath added a commit that referenced this pull request on Jun 16, 2026
## Summary
Phase E §3 traffic-shift sign-off has four success criteria (rollout
runbook §3.1). The runbook §4 names the signals that gate sign-off at
the human level — p99 latency, throughput, circuit-breaker state, 5xx
rate, policy reload counter, trust-level decision distribution, plus
three BoJ-side signals — but stops short of giving the on-call a query
they can paste. This PR adds
`docs/integration/gateway-observability-spec.md` to fill that gap.
For every runbook §4.1 + §4.2 signal, the spec wires:
1. The **telemetry event** emitted by the gateway (audit document §5 /
§1.6).
2. The **Prometheus metric** the `telemetry_metrics_prometheus_core`
reporter exports for that event, anchored to the canonical declaration
in `http-capability-gateway/lib/http_capability_gateway/application.ex`
`telemetry_metrics/0` (lines 259–296).
3. A **PromQL query template** the operator pastes verbatim into a
dashboard panel or alert rule.
4. An **alert threshold** anchored to a canonical source — the rollback
runbook §5.1 trigger value, the perf contract tolerance ratio, or the
load-profile §2 SLO budget. Absolute-µs values that depend on Phase D-4
baseline collection are left as `${BASELINE_*}` placeholders with the
lookup site (`bench/baseline.json` after `_status` flips to `active`)
named — not invented numbers.
§5 closes the loop with a table mapping every runbook §5.1 rollback
trigger to the PromQL alert rule in this spec, plus three follow-up gaps
surfaced honestly:
- **Circuit-breaker state** — no dedicated telemetry event today; the
503 spike + log inspection is the interim signal. Spec flags
`[:http_capability_gateway, :circuit_breaker, :state_change]` as a
post-Phase-E gateway follow-up.
- **Policy reload counter** — logged via `Logger.info`, not telemetry.
Treat as log-based until the event is added.
- **VeriSimDB write failures** — cast-only (audit §4), no metric surface
today. Phase E §1.3 already lists VeriSimDB integration as an `!OWNER:`
confirmation; the metric-side consequence is recorded here.
Edits the rollout runbook §4 lead-in and Appendix B cross-reference list
to point at the spec, so an operator landing in §4 can find the queries
without grep. No changes to the §4 signal list itself or to any
§1/§3/§5/§6 procedure — the runbook stays normative; the spec is its
declarative half.
## What this PR does NOT do
- Does not invent absolute percentile budgets (those live in
`bench/baseline.json` after D-4).
- Does not commit a dashboard URL (the `!OWNER:` rows in runbook §4.3
stand).
- Does not pre-empt the §6.4 Trustfile flip (`tier_2_gateway.status`
stays `PENDING`).
- Per single-lane HCG channel discipline (pattern set in
http-capability-gateway PRs #14, #22, #26, #30, #38 and boj-server PRs
#168, #173, #224): joint-close is owner-only. This PR **refs but does
not close** `standards#100`.
## Test plan
- [ ] Review `docs/integration/gateway-observability-spec.md` end-to-end
— verify the cited line numbers in
`http-capability-gateway/lib/http_capability_gateway/application.ex`
(259–296) and `lib/http_capability_gateway/gateway.ex` (228–232, 411)
match the current gateway main (commit `46116cf` at PR open time).
- [ ] Verify the metric naming convention
(`http_capability_gateway_<event>_<measurement>_<suffix>` with `_total`
on counters and `_bucket`/`_count`/`_sum` on distributions) matches
`TelemetryMetricsPrometheus.Core.scrape()` output against a locally
running gateway. If the naming convention diverges from this spec, the
spec is wrong — open a fix-up PR.
- [ ] Verify the rollout runbook §4 back-link renders correctly in
GitHub's markdown viewer.
- [ ] Verify no Hypatia / governance / spdx gates fire on the new file
(matches the existing `gateway-load-profile.md` MPL-2.0 + SPDX header
shape).
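As a rough illustration of the naming convention in the test plan above, the sketch below composes a metric name and a p99 query from it. The helper names, the `request_stop` event and the `duration` measurement are placeholders chosen for the example; the real names come from the `telemetry_metrics/0` declaration and should be checked against actual scrape output.

```shell
#!/usr/bin/env bash
# Sketch only: helper names, event and measurement are hypothetical.

# metric_name EVENT MEASUREMENT SUFFIX
# Follows http_capability_gateway_<event>_<measurement>_<suffix>.
metric_name() {
  printf 'http_capability_gateway_%s_%s_%s\n' "$1" "$2" "$3"
}

# p99_query EVENT MEASUREMENT [WINDOW]
# Builds a standard PromQL p99 over the distribution's _bucket series.
p99_query() {
  local bucket
  bucket="$(metric_name "$1" "$2" bucket)"
  printf 'histogram_quantile(0.99, sum by (le) (rate(%s[%s])))\n' \
    "$bucket" "${3:-5m}"
}

p99_query request_stop duration
```

If the scrape output uses different names, only the two arguments change; the query shape stays the same.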
Refs hyperpolymath/standards#91
Refs hyperpolymath/standards#100
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---
_Generated by [Claude
Code](https://claude.ai/code/session_01MkuwduQrd7anDHzGNZhvGj)_
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
hyperpolymath added a commit that referenced this pull request on Jun 19, 2026
The HCG live policy header records a manual re-verification stamp
("Re-verified 2026-05-28 against `BojRest.Router`") confirming that
every wired BoJ HTTP route is covered by a Verb Governance Spec rule.
The ADR's largest declared risk is "policy lagging the surface" — a
new wired route landing without a matching policy rule would default-
deny in production (an outage on a route that should be live),
silently exposing the §1.5 manual-check failure-mode this stamp is
meant to gate.
This script makes the same check machine-checkable:
- Extract `(verb, path-template)` tuples from
`elixir/lib/boj_rest/router.ex` (Plug.Router `get "/..."` /
`post "/..."` lines).
- Extract `(verb, path-pattern, ...)` tuples from
`config/gateway-policy-boj.yaml` (the live policy promoted in
Phase E §1.5).
- For each wired route, concretise its `:name`-style placeholders
with a known probe segment and assert at least one policy rule
matches (literal equality for non-regex paths; ERE `grep -E`
against the concrete URL for `^...` regex paths).
- Exit 0 on no drift, 1 on drift, 64 on bad usage.
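The matching step described in the list above can be sketched as follows. This is a simplified illustration, not the script itself: the function names are hypothetical, rules are passed as plain `"VERB PATH"` strings instead of being parsed from the YAML, and the probe segment is assumed to be the literal `probe`.

```shell
#!/usr/bin/env bash
# Sketch only: function names and the rule representation are hypothetical.
PROBE_SEGMENT="probe"

# concretise TEMPLATE: replace :name-style placeholders with the probe segment.
concretise() {
  sed -E "s#:[A-Za-z_][A-Za-z0-9_]*#${PROBE_SEGMENT}#g" <<<"$1"
}

# rule_covers POLICY_PATH CONCRETE_URL
# Paths starting with ^ are treated as ERE patterns; others must match literally.
rule_covers() {
  local pattern="$1" url="$2"
  case "$pattern" in
    '^'*) grep -Eq -- "$pattern" <<<"$url" ;;
    *)    [[ "$pattern" == "$url" ]] ;;
  esac
}

# route_is_covered VERB TEMPLATE RULE...   (each RULE is "VERB PATH")
route_is_covered() {
  local verb="$1" url rule
  url="$(concretise "$2")"
  shift 2
  for rule in "$@"; do
    if [[ "${rule%% *}" == "$verb" ]] && rule_covers "${rule#* }" "$url"; then
      return 0
    fi
  done
  return 1
}
```

A caller would loop over the wired routes, collect every route for which `route_is_covered` fails, and exit 1 if that list is non-empty.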
Sits alongside `scripts/hcg-policy-smoke.sh`: the smoke script runs
*against a live gateway* to confirm the policy enforces as declared
(deny/allow paths, stealth-status canaries); this script runs *against
the source files* to confirm the policy still covers the wired
surface. Together they bracket the two halves of the §1.5 pre-
rollout verification — surface→policy coverage (this) and policy→
gateway enforcement (smoke).
Hardens the §1.5 pre-rollout posture without changing any policy
content, runbook procedure, or gateway code. The runbook can adopt
this script as the §1.5 surface-drift check in a follow-up PR; this
PR lands the artefact only.
Per single-lane HCG channel discipline (pattern set in
http-capability-gateway PRs #14, #22, #26, #30, #38 and boj-server
PRs #168, #173, #224, #226): joint-close is owner-only. This commit
refs but does not close standards#100.
Refs hyperpolymath/standards#91
Refs hyperpolymath/standards#100
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
6 tasks
hyperpolymath added a commit that referenced this pull request on Jun 19, 2026
## Summary

Adds `scripts/hcg-surface-drift-check.sh`: a static, source-only audit that asserts every wired `BojRest.Router` route is covered by at least one rule in the HCG live Verb Governance Spec (`config/gateway-policy-boj.yaml`).

The ADR's largest declared risk is **"policy lagging the surface"**: a wired route landing without a matching policy rule would default-deny in production (an outage on a route that should be live). The live policy header carries a manual re-verification stamp (`Re-verified 2026-05-28 against BojRest.Router`) which is the only check today that catches this; the §1.5 prerequisite checklist relies on the same manual procedure. This script makes the check machine-checkable without changing any policy content, runbook procedure, or gateway code.

### What the script does

1. Extracts `(verb, path-template)` tuples from `elixir/lib/boj_rest/router.ex` (Plug.Router `get "/…"` / `post "/…"` etc.).
2. Extracts `(verb, path-pattern)` tuples from `config/gateway-policy-boj.yaml`.
3. For each wired route, concretises `:name`-style placeholders with a known probe segment and asserts at least one policy rule matches (literal equality for non-regex paths; ERE `grep -E` match against the concrete URL for `^…` regex paths).
4. Exit 0 on no drift, 1 on drift detected, 64 on bad usage.

### Bracket-style relationship with `hcg-policy-smoke.sh`

- **Smoke script** (`hcg-policy-smoke.sh`): runs against a *live gateway* to confirm the policy enforces as declared (deny/allow paths, stealth-status canaries, default-deny no-match canary).
- **Drift check** (this PR): runs against the *source files* to confirm the policy still covers the wired surface.

Together they cover both halves of the §1.5 pre-rollout verification: surface→policy coverage (drift) and policy→gateway enforcement (smoke).

### What this PR does NOT do

- Does **not** modify the rollout runbook §1.5. Adoption as the §1.5 surface-drift check is a separate, owner-driven PR; this PR lands the artefact only so the runbook update is a one-line wiring change.
- Does **not** wire the script into CI. Boj-server's CI discipline (`docs/wikis/CI-and-Required-Checks.adoc` / `.claude/CLAUDE.md`) requires path-filtered required checks to use the "always-trigger + changes job" pattern; a CI wiring PR should follow that pattern. Out of scope here.
- Does **not** modify any policy file. Today's surface and today's policy are in agreement (the 2026-05-28 re-verification still holds; the script run on this commit's working tree returns OK on the seven wired routes: `/.well-known/boj-node-pubkey`, `/health`, `/menu`, `/cartridges`, `/cartridge/:name`, `/cartridge/:name/invoke`, `/cartridge/:name/sse`).
- Does **not** pre-empt the §6.4 Trustfile flip (`tier_2_gateway.status` stays `PENDING`).
- Per single-lane HCG channel discipline (pattern set in `http-capability-gateway` PRs #14, #22, #26, #30, #38 and `boj-server` PRs #168, #173, #224, #226): joint-close is owner-only. **This PR refs but does not close `standards#100`.**

### Channel state note

This session could not read `hyperpolymath/standards#91` / `#100` (the session's repository scope is restricted to `http-capability-gateway` and `boj-server`), so the brief's instructed status comment on `standards#91` could not be posted. State was reconstructed from the canonical sources in this repo (ADR-0004, the integration plan, the audit, the rollout runbook, the live policy, and the merged-PR commit history) plus the current `main` of both in-scope repos.

The analysis: Phase A/B/C/D are closed (artefacts merged, runbook §1.2 and the Phase-D status note in the runbook header confirm); Phase E (`standards#100`) is the only open phase; all remaining §1 checklist items are owner-driven (`!OWNER:` placeholders, D-4 rebaseline workflow_dispatch, cerro-torre `.ctp` signing, the §6.4 Trustfile flip). This PR advances Phase E §1.5 ("Gateway-side prerequisites") by converting one manual sub-check into an executable artefact, without crossing into any owner-input territory.

## Test plan

- [ ] Run the script on this branch's working tree: `bash scripts/hcg-surface-drift-check.sh`. Expect exit 0, "OK: every wired router route is covered by at least one policy rule." with the wired-count = 7 and the policy-count matching the rule total in `config/gateway-policy-boj.yaml`.
- [ ] Run `bash scripts/hcg-surface-drift-check.sh -v`. Expect the same exit 0 plus a `Matched:` block listing each of the seven wired routes against its policy rule (literal `/health` → literal rule; `/cartridge/:name/invoke` → `^/cartridge/[A-Za-z0-9_.-]+/invoke$` regex; etc.).
- [ ] Synthetic drift test: temporarily add a wired route to the router (e.g. `get "/__drift_test__" do …`) without a policy rule; re-run and expect exit 1 with the route listed under `DRIFT:`. Revert before merge.
- [ ] Confirm `shellcheck scripts/hcg-surface-drift-check.sh` is clean (matches `scripts/hcg-policy-smoke.sh`'s shellcheck posture).
- [ ] Confirm SPDX header + Owner copyright match the canonical estate format (matches `scripts/hcg-policy-smoke.sh`'s header shape).
- [ ] Verify no Hypatia / governance / spdx gates fire on the new script file.

Refs hyperpolymath/standards#91
Refs hyperpolymath/standards#100

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---
_Generated by [Claude Code](https://claude.ai/code/session_01JuztZ5kyDr3AqGtUsBipDL)_

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
5 tasks
hyperpolymath added a commit that referenced this pull request on Jun 20, 2026
…229)

## Summary

Wires `scripts/hcg-surface-drift-check.sh` (landed in boj-server#228, merged 2026-06-19) into GitHub Actions, so the surface⊆policy invariant the ADR calls its largest declared risk is re-proven on every PR rather than relying on the manual re-verification stamp in `config/gateway-policy-boj.yaml`'s header.

PR #228 explicitly flagged this CI wiring as the follow-up step: "a CI wiring PR should follow [the always-trigger + changes-job] pattern. Out of scope here." This is that follow-up; the script, the router, and the policy are unchanged.

## What lands

A single new file: `.github/workflows/hcg-surface-drift.yml`. The workflow follows the boj-server "always-trigger + changes-job" pattern documented in `docs/wikis/CI-and-Required-Checks.adoc` and `.claude/CLAUDE.md` §"CI / Required Status Checks":

- **No `on.*.paths`**: the check is always created. A path-filtered required workflow that never fires is the failure mode that stranded #213/#215 until #216 fixed it; this gate is built to never re-introduce it, regardless of whether it later joins `required_status_checks`.
- **Lightweight `changes` job** recomputes relevance via `git diff origin/<base>...HEAD` against the four paths this gate cares about: router (`elixir/lib/boj_rest/router.ex`), live policy (`config/gateway-policy-boj.yaml`), the drift script (`scripts/hcg-surface-drift-check.sh`), and the workflow file itself. Fail-safe to `run=true` on any diff failure.
- **Heavy `check` job** is `needs: changes` + `if: needs.changes.outputs.run == 'true'`. A skipped `if:` reports SUCCESS to any future required-context list, so unrelated PRs never pay for it and can never be blocked by it.
- **Pinned action**, **timeout-minutes**, **concurrency group**, **`permissions: contents: read`**, **SPDX header**: matches the canonical pattern in `.github/workflows/abi-drift.yml`.

The `check` job invokes the script with `bash scripts/hcg-surface-drift-check.sh -v` (matching the test plan in #228) so it works regardless of the script's file mode; #228 committed the script as 0644.

## What this PR does NOT do

- **Does NOT** modify the runbook §1.5 ("Gateway-side prerequisites"). Adoption of the CI gate into the §1.5 checklist is a one-line owner-driven runbook update; the PR #228 deliberate boundary stays in place.
- **Does NOT** add the new check to `.github/settings.yml`'s `required_status_checks` list (currently `hypatia-scan` + `codeql`). Promotion to required is a settings change for the owner to make once the gate has run green on a few PRs.
- **Does NOT** modify the live policy, the example policy, the router, the script, or any other Phase E artefact. The change is wholly within `.github/workflows/`.
- **Does NOT** pre-empt the §6.4 Trustfile flip (`tier_2_gateway.status` stays `PENDING`), the staging soak (§3.3), or cerro-torre `.ctp` signing, all of which remain owner-driven per the channel doctrine reaffirmed in #207 / #224.
- Per the single-lane HCG channel discipline (pattern set in `http-capability-gateway` PRs #14, #22, #26, #30, #38 and `boj-server` PRs #168, #173, #224, #226, #228): joint-close is owner-only. **This PR refs but does not close `standards#100`.**

## Channel state note

This session could not read `hyperpolymath/standards#91` / `#100` (the session's MCP repo scope is restricted to `http-capability-gateway` and `boj-server`), so the brief's instructed status comment on `standards#91` could not be posted. State was reconstructed from the canonical sources in this repo (ADR-0004, the integration plan, the audit, the rollout runbook, the live policy, `docs/wikis/CI-and-Required-Checks.adoc`) plus the merged-PR history of both in-scope repos.

Analysis: Phase A/B/C/D closed; Phase E (`standards#100`) is the only open phase; #228 (2026-06-19) is the most recent advance and explicitly named this CI wiring as the next step.

## Test plan

- [ ] **Required**: the `changes` job runs and emits `run=true` (because `.github/workflows/hcg-surface-drift.yml` matches the path regex), so the `check` job is gated through, not skipped, on this PR.
- [ ] **Required**: the `check` job runs `bash scripts/hcg-surface-drift-check.sh -v` and exits 0 with the OK message. Current `main` (64a70c5) has 7 wired routes, 28 policy rules, no drift; locally re-verified on this branch.
- [ ] **Synthetic skip**: on a follow-up PR that touches none of the four watched paths, `changes.outputs.run` is `false` and `check` reports `skipped` (which counts as success for any required-context list).
- [ ] **Synthetic drift**: a temporary PR adding `get "/__drift_test__"` to `elixir/lib/boj_rest/router.ex` without a matching policy rule fires `run=true`, `check` exits 1 with the route listed under `DRIFT:`, and the PR is blocked from merge if/when this gate is promoted to required.
- [ ] No `actionlint` / Hypatia / SPDX gate fires on the new workflow file.

Refs hyperpolymath/standards#91
Refs hyperpolymath/standards#100

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---
_Generated by [Claude Code](https://claude.ai/code/session_019cKmxx6AkNjzhXT6ZoxGfx)_

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
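The relevance decision made by the `changes` job can be sketched in shell as below. This is an illustration under assumptions, not the workflow's actual step: the function names and the exact regex are hypothetical, and only the four watched paths and the fail-safe behaviour are taken from the description above.

```shell
#!/usr/bin/env bash
# Sketch only: function names and WATCHED_REGEX are hypothetical; the four
# paths are the ones the PR description lists.
WATCHED_REGEX='^(elixir/lib/boj_rest/router\.ex|config/gateway-policy-boj\.yaml|scripts/hcg-surface-drift-check\.sh|\.github/workflows/hcg-surface-drift\.yml)$'

# classify_changes: reads changed file names on stdin, prints run=true|false.
classify_changes() {
  if grep -Eq "$WATCHED_REGEX"; then
    echo "run=true"
  else
    echo "run=false"
  fi
}

# decide_run BASE_REF: diff against the base branch and classify the result.
# Any git failure (missing ref, shallow clone, no git) fails safe to run=true.
decide_run() {
  local base="$1" changed
  if ! changed="$(git diff --name-only "origin/${base}...HEAD" 2>/dev/null)"; then
    echo "run=true"
    return 0
  fi
  classify_changes <<<"$changed"
}
```

Inside a GitHub Actions step, the result could be exported with `decide_run "$GITHUB_BASE_REF" >> "$GITHUB_OUTPUT"`, which is what makes `needs.changes.outputs.run` available to the `check` job once the job declares that output.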
7 tasks
hyperpolymath added a commit that referenced this pull request on Jun 21, 2026
## Summary

Adds `scripts/hcg-spec-coverage-check.sh`: a static, source-only audit that asserts every HTTP route declared in `docs/specification/openapi.yaml` is covered by at least one rule in the HCG live Verb Governance Spec (`config/gateway-policy-boj.yaml`).

Companion / complement to PR #228's `hcg-surface-drift-check.sh`. The two scripts bracket the contract §8 declared-surface invariant from both directions:

| Script | Invariant | Catches |
|---|---|---|
| `hcg-surface-drift-check.sh` (#228) | wired (router.ex) ⊆ policy | policy lag behind wiring |
| `hcg-spec-coverage-check.sh` (this PR) | declared (openapi.yaml) ⊆ policy | policy lag behind the spec |

Contract §8 (`docs/integration/http-capability-gateway-boj-contract.md`) is explicit: "the Verb Governance Spec governs the **declared** surface (openapi.yaml), not only the currently-wired subset. Declared-but-unimplemented routes are still classified in the policy so that when the gnosis handler grows them they are governed from day one rather than silently exposed." The live policy header carries the cross-check statement (*"Surface source: docs/specification/openapi.yaml, cross-checked against elixir/lib/boj_rest/router.ex"*); PR #228 made the router half machine-checkable, this PR makes the openapi half machine-checkable. Together they make the entire §1.5 re-verification stamp executable.

Without this check the risk is concrete: someone adds a new path to `openapi.yaml` without a corresponding policy rule. The surface-drift check does not catch it (the route is not yet wired in `router.ex`). The day the route is wired, the surface-drift gate fires, but by then the operator has to either (a) ship the wiring with a default-deny in production for a route that should be live or (b) hold the wiring PR until the policy catches up. Catching the gap at spec-edit time avoids both, with no procedural cost above running the existing CI gate.

### What the script does

1. Extracts `(verb, path-template)` tuples from the `paths:` section of `docs/specification/openapi.yaml`: path entries at exactly 2-space indent, HTTP operations (get/post/put/delete/patch/head/options) at exactly 4-space indent under each path. Other keys at 4-space indent (parameters/summary/description/tags/...) are metadata, not operations, and are skipped.
2. Extracts `(verb, path-pattern)` tuples from `config/gateway-policy-boj.yaml` using the identical extraction block that `hcg-surface-drift-check.sh` uses, so the two scripts cannot drift in how they read the policy.
3. For each declared route, concretises `{name}`-style placeholders with a known probe segment (`probe`, shared with the smoke + surface-drift scripts so a future regex tightening fails all three in lock-step) and asserts at least one policy rule covers it: literal equality for non-regex paths; ERE `grep -E` match against the concrete URL for `^…` regex paths. The declared verb must be in the policy rule's verb list.
4. Exit `0` on no gap, `1` on gap detected, `64` on bad usage.

### What this PR does NOT do

- Does **not** modify the rollout runbook §1.5 or the contract §8. Adoption as the §1.5 declared-surface check is a separate, owner-driven PR; this PR lands the artefact only so the runbook update is a one-line wiring change. Matches the #228-then-runbook split.
- Does **not** wire the script into CI. Boj-server's CI discipline (`docs/wikis/CI-and-Required-Checks.adoc` / `.claude/CLAUDE.md`) requires path-filtered required checks to use the "always-trigger + changes job" pattern; a CI wiring PR should follow that pattern, matching the #228 → #229 split. Out of scope here.
- Does **not** modify the openapi.yaml or the policy. On this branch the script reports OK against today's surface: every one of the 26 `(verb, path)` pairs declared in openapi.yaml has a matching rule among the 28 `(verb, path)` rules in the live policy. The 2-rule surplus is the policy's coverage of routes the openapi.yaml does not declare (notably `/.well-known/boj-node-pubkey`, which the router wires but the spec does not yet enumerate); the script intentionally does not penalise that direction (see the script's `Limitations` header).
- Does **not** pre-empt the §6.4 Trustfile flip (`tier_2_gateway.status` stays `PENDING`).
- Per single-lane HCG channel discipline (pattern set in `http-capability-gateway` PRs #10, #11, #12, #14, #22, #26, #30, #38 and `boj-server` PRs #78, #90, #106, #168, #173, #207, #208, #210, #215, #222, #224, #226, #228, #229): joint-close is owner-only. **This PR refs but does not close `standards#100`.**

### Channel state note

This session could not read `hyperpolymath/standards#91` / `#100` (the session's repository scope is restricted to `http-capability-gateway` and `boj-server`), so the brief's instructed status comment on `standards#91` could not be posted. State was reconstructed from the canonical sources in this repo (ADR-0004, the integration plan, the audit, the rollout runbook, the live policy, the openapi spec, and the merged-PR commit history) plus the current `main` of both in-scope repos.

The analysis: Phase A/B/C/D are closed (artefacts merged, runbook §1.2 and the Phase-D status note in the runbook header confirm); Phase E (`standards#100`) is the only open phase; all remaining §1 checklist items are owner-driven (`!OWNER:` placeholders, D-4 rebaseline `workflow_dispatch`, cerro-torre `.ctp` signing, the §6.4 Trustfile flip). This PR advances Phase E §1.5 ("Gateway-side prerequisites") by converting one half of the declared-surface invariant into an executable artefact, mirroring exactly the script-first split of #228.

## Test plan

- [ ] Run the script on this branch's working tree: `bash scripts/hcg-spec-coverage-check.sh`. Expect exit `0`, "OK: every openapi-declared route is covered by at least one policy rule." with `Declared (openapi) routes: 26` and `Policy (verb,path) rules: 28`.
- [ ] Run `bash scripts/hcg-spec-coverage-check.sh -v`. Expect the same exit `0` plus a `Matched:` block listing each of the 26 declared routes against its policy rule (literal `/health` → literal rule; `/cartridge/{name}/invoke` → `^/cartridge/[A-Za-z0-9_.-]+/invoke$` regex; `/grpc/{service}/{method}` → two-segment regex; `/umoja/peers` matches both `GET` and `POST` rules; etc.).
- [ ] Synthetic gap test: build a temporary openapi.yaml containing a single declared path with no policy rule and run `OPENAPI_FILE=... bash scripts/hcg-spec-coverage-check.sh`. Expect exit `1` with the route listed under `GAP:`. (Verified locally on this branch.)
- [ ] Confirm `shellcheck scripts/hcg-spec-coverage-check.sh` produces only the same `SC1001` info note that `scripts/hcg-surface-drift-check.sh` produces today (the `\^` escape inside a `case` pattern is intentional and matches the sibling script's posture exactly).
- [ ] Confirm SPDX header + Owner copyright match the canonical estate format (matches `scripts/hcg-surface-drift-check.sh`'s header shape).
- [ ] Verify `scripts/check-shebang-first.sh` is still green with the new file present.
- [ ] Verify no Hypatia / governance / spdx gates fire on the new script file.

Refs hyperpolymath/standards#91
Refs hyperpolymath/standards#100

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---
_Generated by [Claude Code](https://claude.ai/code/session_013VLPKSTEMFnPYQdx6rD91b)_

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
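The indent-based extraction described in step 1 above can be sketched with a short `awk` program. This is a simplified illustration that assumes unquoted path keys and space-only indentation; the function names are hypothetical and the real script's extraction block may differ.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical function names; assumes unquoted path keys
# and space (not tab) indentation in the openapi file.

# extract_openapi_routes: reads YAML on stdin, prints "VERB PATH" lines.
extract_openapi_routes() {
  awk '
    /^paths:[ \t]*$/ { in_paths = 1; next }      # enter the paths: section
    in_paths && /^[^ \t#]/ { in_paths = 0 }      # next top-level key ends it
    in_paths && /^  [^ \t#]/ {                   # path entry at exactly 2 spaces
      path = $1; sub(/:$/, "", path); next
    }
    # HTTP operation at exactly 4 spaces; other 4-space keys are metadata.
    in_paths && /^    (get|post|put|delete|patch|head|options):/ {
      verb = $1; sub(/:$/, "", verb); print toupper(verb), path
    }
  '
}

# concretise_braces TEMPLATE: replace {name}-style placeholders with "probe".
concretise_braces() {
  sed -E 's#\{[^}]+\}#probe#g' <<<"$1"
}
```

Each extracted pair would then be concretised and matched against the policy rules in the same way the surface-drift check matches wired routes.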
## Summary

Extends the §1.5 operator pre-check (`scripts/hcg-policy-smoke.sh`) with four exact-status stealth-profile canaries that pin the deny status code: internal+stealth routes (`sdp-status-get`, `cartridge-load-post`) must return exactly 404 (capability existence hidden), and authenticated routes without a `stealth_profile` (`status-get`, `cartridges-list-get`) must return exactly 403 (capability exists, caller lacks credentials). `probe()` gains a three-digit literal pattern (`[1-5][0-9][0-9]`) to support the assertion; existing `deny` and `allow_or_upstream` patterns are unchanged.

## Why

The existing `deny` pattern accepts any 4xx. A misconfiguration where `:stealth_profiles` is not populated at runtime (the runtime config that `application.ex` `configure_stealth/1` sets from `policy.stealth.enabled: true`) would silently demote internal+stealth routes from 404 to bare 403. The deny probes would still pass, but the security property would be broken: untrusted callers could now distinguish "route exists, you can't access it" (403) from "route does not exist" (404), leaking capability existence. This is the exact threat the `sdp-status-get` rule narrative calls out ("not confirmable from outside").

The two code paths fixed by this canary are both in `http-capability-gateway/lib/http_capability_gateway/gateway.ex`:

- `handle_denial/3` returns `stealth_profiles["default"][trust_str]` for rules with `stealth_profile: "default"` and bare 403 otherwise.
- … `stealth_profile` field or that nils-out the application env would surface here.

The pre-check now catches both.

## Changes

- `scripts/hcg-policy-smoke.sh`: `probe()` gains a `[1-5][0-9][0-9]` exact-status case clause.
- `docs/integration/hcg-tier2-rollout-runbook.md`: §1.5 acceptance prose updated to describe the new canary class; version 0.6 → 0.7; date 2026-06-14 → 2026-06-15.

## Test plan

- `bash -n scripts/hcg-policy-smoke.sh`: script parses cleanly.
- `bash scripts/hcg-policy-smoke.sh --help`: usage still rendered correctly (the new probes do not change the option surface).
- `scripts/hcg-policy-smoke.sh --gateway-url <staging>` reports four new `PASS` lines (stealth-canary: GET /sdp/status (internal+stealth → 404), …) alongside the existing 25 deny probes, six verb canaries, and one path canary. A `FAIL` on the 404-expected probes specifically means stealth is not armed; investigate `:stealth_profiles` before flipping the §1.5 checkbox.

## Phase E channel notes

- `hyperpolymath/standards#91` (parent) → Phase E (#100) is the active sub-issue.
- … `.ctp` signing. Each Phase E sub-task therefore lands as a `Refs`-only advance; this PR deliberately does not `Closes #100`.

Refs hyperpolymath/standards#91
Refs hyperpolymath/standards#100

🤖 Generated with Claude Code
Generated by Claude Code