From 392e00fa3d21792d5deb0c59ebb4949617132b7a Mon Sep 17 00:00:00 2001
From: Voyvodka
Date: Mon, 11 May 2026 11:05:15 +0300
Subject: [PATCH] docs(portal): add API reference and architecture sections (v0.2.0 audit doc-drift)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Closes the documentation half of the v0.2.0 portal audit. Round 1 (#101) was
the security fix, Round 2 (#102) was the test coverage; this PR is the doc
drift the same audit surfaced.

docs/API.md §3.8 — Portal API (Customer-Facing JWT):
- HS256 JWT contract: algorithm pin, signing key per-app, lifetime cap,
  clock skew, token size cap, required + optional claims.
- Capability table: endpoints:read|write|test, attempts:read.
- Per-app CORS rules: no wildcards, https-only, RFC 6454 case-insensitive
  matching, preflight semantics.
- Rate limit: shares send-by-appid partition; cross-tenant lookups return
  404 (never 403, which would leak existence).
- All 10 portal routes documented with request/response shape.
- 5 dashboard portal-admin routes documented.
- Portal-specific error code table.
- End-to-end probe with jose (Node.js mint) + cURL.

docs/ARCHITECTURE.md §4.3 — Portal Token Authentication:
- Per-application secrets stored on Application (PortalSigningKey,
  AllowedPortalOriginsJson, PortalRotatedAt).
- Pipeline ordering with the three invariants it encodes (ApiKeyAuth
  bypass, PortalToken-before-PortalCors, both-before-RateLimiter).
- PortalLookupCache: TTL, instant local invalidation, atomic CTS swap.
- Cross-tenant isolation via 2-arg GetByIdAsync.
- JWT validator defense-in-depth (HS256 pin, 8 KiB token cap,
  MapInboundClaims=false, lifetime cap, opaque error bodies).
---
 CHANGELOG.md         |   4 +
 docs/API.md          | 195 +++++++++++++++++++++++++++++++++++++++++++
 docs/ARCHITECTURE.md |  46 ++++++++++
 3 files changed, 245 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 2bb42f8..56c3919 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,10 @@ and this project follows [Semantic Versioning](https://semver.org/spec/v2.0.0.ht
 
 ## [Unreleased]
 
+### Documentation
+- **`docs/API.md` §3.8 — Portal API (Customer-Facing JWT).** Replaces the doc-drift gap surfaced by the v0.2.0 audit (where the portal API surface existed in code and CHANGELOG but was absent from `docs/API.md`). Covers HS256 JWT contract, capability scopes, per-app CORS, rate-limit partition reuse, cross-tenant `404` semantics, every route under `/api/v1/portal/*`, the dashboard portal-admin routes, the portal-specific error code table, and an end-to-end Node.js (`jose`) + cURL probe.
+- **`docs/ARCHITECTURE.md` §4.3 — Portal Token Authentication.** Documents the load-bearing middleware ordering (`ApiKeyAuth` → `PortalTokenAuth` → `PortalCors` → `RateLimiter`), the three invariants it encodes, the `PortalLookupCache` TTL + atomic-CTS-swap behaviour, and the JWT validator's defense-in-depth choices (HS256 pin, 8 KiB token cap, `MapInboundClaims=false`, lifetime cap, opaque error bodies).
+
 ### Added
 - **Portal stack test coverage — 23 new tests closing the v0.2.0 audit gaps.** New `PortalCorsMiddlewareTests` (7 facts) covers preflight allow / reject / case-insensitive match / subdomain-spoofing reject / missing-Origin pass-through and the real-request CORS-header echo path — previously zero coverage. New `PortalLookupCacheTests` (5 facts) pins TTL hit, invalidation forces DB reload, portal-disabled returns null, and concurrent-Invalidate doesn't double-dispose (regression for the audit fix). New `PortalOriginsAllowlistE2ETests` (7 facts, Testcontainers) runs `ApplicationRepository.AnyAllowsPortalOriginAsync` against real PostgreSQL JSONB. `PortalEndpointsControllerTests` gains four cross-tenant guards (`DELETE`, `/enable`, `/disable`, `/test`, `/attempts` against another tenant's endpoint id all return `404 PORTAL_NOT_FOUND`) and a defense-in-depth test for empty-capabilities tokens (the absence of any `capabilities` claim must surface as `403 PORTAL_INSUFFICIENT_CAPABILITY`, not silent full access).
diff --git a/docs/API.md b/docs/API.md
index ad423ab..7325cba 100644
--- a/docs/API.md
+++ b/docs/API.md
@@ -866,6 +866,201 @@ The dashboard treats `EndpointHealthChanged` as a cache-invalidation signal —
 
 ---
 
+### 3.8 Portal API (Customer-Facing JWT) — v0.2.0
+
+The portal API is a narrowed mirror of the public endpoint surface, scoped to a single application via a short-lived JWT minted by the host SaaS. It powers the embeddable `` React component (`@webhookengine/endpoint-manager` on npm). The engine **only verifies** these tokens — it never mints them. Per-application signing key, allowed CORS origins, and capability set are managed by the operator from the dashboard.
+
+For host-side integration (token mint, CSS theming, sample app), see `docs/PORTAL.md`. This section is the wire reference.
+
+#### Authentication
+
+Every request to `/api/v1/portal/*` (except `OPTIONS` preflight) requires a Bearer JWT in `Authorization: Bearer `.
+
+- **Algorithm:** HS256 only. `alg=none` and HS384/HS512 are rejected with `PORTAL_AUTH_INVALID_SIGNATURE`.
+- **Signing key:** per-application `PortalSigningKey` (32 bytes minimum). Generated at portal-enable time; never returned by the engine after creation. Rotated via the dashboard rotate action.
+- **Lifetime cap:** `exp - nbf <= 15 minutes` (configurable via `WebhookEngine:PortalAuth:MaxLifetimeMinutes`). Tokens with longer requested lifetimes are rejected as `PORTAL_AUTH_LIFETIME_TOO_LONG` even when currently valid.
+- **Clock skew:** ±30 s (`PortalAuth:ClockSkewSeconds`).
+- **Token size cap:** 8 KiB (`PortalAuth:MaxTokenSizeBytes`). Larger payloads are rejected before parsing.
+- **Required claims:** `appId` (UUID — selects the signing key), `nbf`, `exp`. `sub` is recommended, `iat` is optional. Repeated `capabilities` claims grant scope (see below).
+
+#### Capabilities
+
+Tokens are scoped by repeated `capabilities` claims (colon-delimited wire format). Missing capability → `403 PORTAL_INSUFFICIENT_CAPABILITY`. **Absence of any `capabilities` claim grants nothing**, not full access.
+
+| Capability | Grants |
+|---|---|
+| `endpoints:read` | `GET /endpoints`, `GET /endpoints/{id}`, `GET /event-types` |
+| `endpoints:write` | `POST /endpoints`, `PUT /endpoints/{id}`, `DELETE /endpoints/{id}`, `/enable`, `/disable` |
+| `endpoints:test` | `POST /endpoints/{id}/test` (highest-risk — fires real outbound HTTP) |
+| `attempts:read` | `GET /endpoints/{id}/attempts` |
+
+#### CORS
+
+Per-application allowed origins are stored on `Application.AllowedPortalOriginsJson` and managed via `PUT /api/v1/dashboard/applications/{appId}/portal/origins`.
+
+- Wildcards are **not** supported — host SaaS must enumerate exact origins.
+- HTTPS-only outside Development. Up to 50 origins per app, 256 chars each.
+- Origin matching is RFC 6454 case-insensitive on scheme + host.
+- `OPTIONS` preflight returns `204` with `Access-Control-Allow-Origin: `, `Allow-Methods`, `Allow-Headers: Authorization, Content-Type`, `Max-Age: 600`. A disallowed origin returns `403` with no CORS headers (so the browser correctly surfaces a CORS error).
+
+#### Rate limiting
+
+Portal routes share the public API's `send-by-appid` token-bucket partition. The portal token's `appId` flows into the limiter via `HttpContext.Items["AppId"]`. A 429 carries the standard `Retry-After` header.
+
+#### Cross-tenant isolation
+
+Every resource lookup is scoped via the 2-arg `GetByIdAsync(appId, endpointId)` repository method. A token for tenant A asking for tenant B's endpoint id receives **`404 PORTAL_NOT_FOUND`** (never 403 — that would leak the existence of resources owned by other apps).
+
+#### Routes
+
+##### List Endpoints
+```
+GET /api/v1/portal/endpoints
+  ?status=active|degraded|failed|disabled
+  &page=1
+  &pageSize=20
+```
+
+Response shape strips `secretOverride` (returns `hasSecretOverride: bool` instead) and full custom-header values (returns `customHeaderNames: string[]`).
+
+##### Get Endpoint
+```
+GET /api/v1/portal/endpoints/{endpointId}
+```
+
+Strips `transformExpression`, `transformEnabled`, `transformValidatedAt`, `allowedIpsJson` — these are admin-only fields.
+
+##### Create Endpoint
+```
+POST /api/v1/portal/endpoints
+```
+```json
+{
+  "url": "https://api.acme.example/webhooks/orders",
+  "description": "Order lifecycle events",
+  "filterEventTypes": ["uuid-of-event-type"],
+  "customHeaders": { "X-Source": "webhookengine" },
+  "metadata": { "team": "growth" },
+  "secretOverride": "whsec_AbCdEf01234567890aBcDeF0123456789"
+}
+```
+
+`url` must pass the SSRF-hardened URL policy (HTTPS, public DNS, no private/loopback IPs at validate-time and at connect-time). `secretOverride` requires the `whsec_` prefix and ≥32 chars — typing a weak password is rejected with `422 PORTAL_VALIDATION_FAILED`. `transformExpression` / `allowedIpsJson` are not exposed; if smuggled into the body, model binding drops them silently.
+
+##### Update Endpoint
+```
+PUT /api/v1/portal/endpoints/{endpointId}
+```
+
+Partial replace — every field is optional, only non-null fields are applied. `filterEventTypes`, when provided, replaces the full list (clear by sending `[]`). At least one field must be present.
+
+##### Delete Endpoint
+```
+DELETE /api/v1/portal/endpoints/{endpointId}
+```
+
+Returns `204 No Content` on success.
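As a rough illustration of the Update Endpoint semantics described above (only non-null fields are applied, `[]` clears `filterEventTypes`, at least one field is required), a host-side client might assemble the `PUT` body as sketched below. The helper name `buildUpdateBody` and the `UPDATABLE_FIELDS` list are hypothetical; the list assumes the update route accepts the same fields as the create body, which this reference does not state explicitly.

```javascript
// Hypothetical client-side helper for PUT /api/v1/portal/endpoints/{endpointId}.
// Assumption: the update route accepts the same fields as the create body.
const UPDATABLE_FIELDS = [
  'url',
  'description',
  'filterEventTypes',
  'customHeaders',
  'metadata',
  'secretOverride',
];

function buildUpdateBody(changes) {
  const body = {};
  for (const field of UPDATABLE_FIELDS) {
    const value = changes[field];
    // null/undefined means "leave unchanged"; an empty array is a real value
    // (it clears filterEventTypes), so it must be kept.
    if (value !== null && value !== undefined) {
      body[field] = value;
    }
  }
  // The route requires at least one field, so fail early on the client.
  if (Object.keys(body).length === 0) {
    throw new Error('At least one field must be present');
  }
  return body;
}
```

Passing `{ description: 'New text', url: null, filterEventTypes: [] }` yields a body with only `description` and an empty `filterEventTypes`, which under the rules above would update the description and clear the event-type filter while leaving the URL untouched.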
+
+##### Enable / Disable Endpoint
+```
+POST /api/v1/portal/endpoints/{endpointId}/enable
+POST /api/v1/portal/endpoints/{endpointId}/disable
+```
+
+Returns `200 OK` with the updated endpoint detail.
+
+##### Send Test Webhook
+```
+POST /api/v1/portal/endpoints/{endpointId}/test
+```
+```json
+{
+  "eventType": "order.created",
+  "payload": { "orderId": "ord_abc123" }
+}
+```
+
+Fires a real outbound HTTP POST through the engine's `webhook-delivery` HttpClient (HMAC-signed, SSRF-checked). Returns the request preview, response status, latency, and body. **Does not** affect endpoint health or retention; the dispatch never enters the persistent queue.
+
+##### List Attempts for an Endpoint
+```
+GET /api/v1/portal/endpoints/{endpointId}/attempts
+  ?page=1
+  &pageSize=20
+```
+
+Most-recent-first delivery attempts for the endpoint. `attempts:read` capability required.
+
+##### List Event Types
+```
+GET /api/v1/portal/event-types
+  ?page=1
+  &pageSize=100
+```
+
+Read-only dropdown source for the embedded UI. Archived event types are excluded; their lifecycle is admin-only.
+
+#### Dashboard portal-admin routes
+
+These cookie-authenticated dashboard routes manage the portal grant per application:
+
+```
+GET  /api/v1/dashboard/applications/{appId}/portal
+POST /api/v1/dashboard/applications/{appId}/portal/enable
+POST /api/v1/dashboard/applications/{appId}/portal/rotate
+POST /api/v1/dashboard/applications/{appId}/portal/disable
+PUT  /api/v1/dashboard/applications/{appId}/portal/origins
+```
+
+`enable` and `rotate` return the new `portalSigningKey` **once** — capture it on the host SaaS (it's never returned again). `disable` clears the signing key (in-flight tokens are rejected within `PortalAuth:LookupCacheTtlSeconds` on remote nodes; instantly on the local node via the lookup-cache invalidation hook). Audit log records every mutating action with the signing key redacted to `portalEnabled: bool`.
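Looking back at the portal routes and the Capabilities table above, a host application could pre-check which capability a call needs before minting a token or issuing the request. The sketch below is one possible client-side mirror of that table, not the engine's own authorization code; `ROUTE_CAPABILITIES`, `requiredCapability`, and `tokenAllows` are illustrative names that do not appear in the engine.

```javascript
// Hypothetical host-side mirror of the Capabilities table (names are illustrative).
const PORTAL_PREFIX = '/api/v1/portal';

const ROUTE_CAPABILITIES = [
  { method: 'GET', pattern: /^\/endpoints$/, capability: 'endpoints:read' },
  { method: 'GET', pattern: /^\/endpoints\/[^\/]+$/, capability: 'endpoints:read' },
  { method: 'GET', pattern: /^\/event-types$/, capability: 'endpoints:read' },
  { method: 'POST', pattern: /^\/endpoints$/, capability: 'endpoints:write' },
  { method: 'PUT', pattern: /^\/endpoints\/[^\/]+$/, capability: 'endpoints:write' },
  { method: 'DELETE', pattern: /^\/endpoints\/[^\/]+$/, capability: 'endpoints:write' },
  { method: 'POST', pattern: /^\/endpoints\/[^\/]+\/(enable|disable)$/, capability: 'endpoints:write' },
  { method: 'POST', pattern: /^\/endpoints\/[^\/]+\/test$/, capability: 'endpoints:test' },
  { method: 'GET', pattern: /^\/endpoints\/[^\/]+\/attempts$/, capability: 'attempts:read' },
];

// Returns the capability string a portal route needs, or null for unknown routes.
function requiredCapability(method, path) {
  if (!path.startsWith(PORTAL_PREFIX)) return null;
  // Drop the prefix and any query string before matching.
  const rest = path.slice(PORTAL_PREFIX.length).split('?')[0];
  const hit = ROUTE_CAPABILITIES.find(
    (route) => route.method === method.toUpperCase() && route.pattern.test(rest)
  );
  return hit ? hit.capability : null;
}

// A missing or empty capability list grants nothing, matching the documented rule.
function tokenAllows(capabilities, method, path) {
  const needed = requiredCapability(method, path);
  return needed !== null && Array.isArray(capabilities) && capabilities.includes(needed);
}
```

A check like this only saves a round trip on the host side. The engine remains the authority and would still answer `403 PORTAL_INSUFFICIENT_CAPABILITY` for a token that lacks the capability.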
+
+#### Error codes (portal-specific)
+
+| Code | HTTP | Meaning |
+|---|---|---|
+| `PORTAL_AUTH_REQUIRED` | 401 | Missing or malformed `Authorization: Bearer` header. |
+| `PORTAL_AUTH_INVALID_TOKEN` | 401 | JWT is malformed, oversized (>8 KiB), or fails post-parse validation. |
+| `PORTAL_AUTH_INVALID_SIGNATURE` | 401 | Wrong key, wrong algorithm, or `alg=none`. |
+| `PORTAL_AUTH_TOKEN_EXPIRED` | 401 | `exp` is in the past beyond clock skew. |
+| `PORTAL_AUTH_LIFETIME_TOO_LONG` | 401 | `exp - nbf` exceeds `MaxLifetimeMinutes`. |
+| `PORTAL_NOT_ENABLED` | 401 | App exists but `PortalSigningKey` is null. |
+| `PORTAL_INSUFFICIENT_CAPABILITY` | 403 | Token lacks the capability required by the route. |
+| `PORTAL_NOT_FOUND` | 404 | Endpoint/event-type not found in this tenant's scope. |
+| `PORTAL_VALIDATION_FAILED` | 422 | Request body failed FluentValidation. |
+
+#### cURL — end-to-end probe
+
+Mint a token on the host SaaS (Node.js example below) then call the portal:
+
+```js
+// Server-side (host SaaS), Node.js + jose
+import { SignJWT } from 'jose';
+const secret = new TextEncoder().encode(process.env.PORTAL_SIGNING_KEY);
+const token = await new SignJWT({
+  appId: '00000000-0000-0000-0000-000000000001',
+  capabilities: ['endpoints:read', 'endpoints:write', 'endpoints:test', 'attempts:read'],
+})
+  .setProtectedHeader({ alg: 'HS256' })
+  .setNotBefore('0s')
+  .setExpirationTime('10m')
+  .sign(secret);
+```
+
+```bash
+# List endpoints
+curl https://hooks.example.com/api/v1/portal/endpoints \
+  -H "Authorization: Bearer $TOKEN" \
+  -H "Origin: https://app.acme.example"
+
+# Send a test webhook
+curl -X POST https://hooks.example.com/api/v1/portal/endpoints/{id}/test \
+  -H "Authorization: Bearer $TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{"eventType":"order.created","payload":{"orderId":"ord_abc"}}'
+```
+
+---
+
 ## 5. Webhook Headers Sent to Endpoints
 
 Every webhook delivery includes these standard headers:
diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md
index 33f61f3..0a5c63a 100644
--- a/docs/ARCHITECTURE.md
+++ b/docs/ARCHITECTURE.md
@@ -424,6 +424,52 @@ Stored: SHA256 hash in database (never stored in plaintext)
 Lookup: prefix (whe_app1a2b3_) used for fast lookup, hash compared for verification
 ```
 
+### 4.3 Portal Token Authentication (v0.2.0)
+
+Customer-facing routes under `/api/v1/portal/*` are authenticated by short-lived HS256 JWTs minted by the host SaaS, **not** by an API key. The engine never mints these tokens — it only verifies them. See `docs/API.md` §3.8 for the wire contract and `docs/PORTAL.md` for host integration.
+
+**Per-application secrets stored on `Application`:**
+- `PortalSigningKey` — HS256 secret (32-byte random). Generated at portal-enable; never returned after creation. Rotated via the dashboard rotate action.
+- `AllowedPortalOriginsJson` — JSONB array of exact CORS origins (no wildcards, https-only outside Development, max 50 / 256 chars).
+- `PortalRotatedAt` — surfaced as "last rotated at" in the operator UI.
+
+**Pipeline ordering** (load-bearing, in `Program.cs` middleware section):
+
+```
+SecurityHeaders
+  → MetricsAuth
+  → RequestLogging
+  → ExceptionHandling
+  → ApiKeyAuth        (skips /api/v1/portal/*)
+  → PortalTokenAuth   (validates JWT, populates HttpContext.Items)
+  → PortalCors        (per-app CORS using populated lookup)
+  → RateLimiter       (send-by-appid partition; portal AppId flows in)
+  → Authentication
+  → Authorization
+```
+
+Three invariants this ordering encodes:
+
+1. **`ApiKeyAuthMiddleware` deliberately bypasses portal paths** — those routes use a different auth scheme. Without the bypass, every portal request would 401 before reaching the JWT validator.
+2. **`PortalTokenAuthMiddleware` runs before `PortalCorsMiddleware`** for non-`OPTIONS` requests, because CORS reads the validated `PortalAppLookup` from `HttpContext.Items`. `OPTIONS` preflight has no token (browsers don't send one), so the CORS middleware runs its own bounded `AnyAllowsPortalOriginAsync` query against `ApplicationRepository` — checking whether **any** portal-enabled app permits the origin.
+3. **Both portal middlewares run before the rate limiter** so that the JWT-derived `AppId` is in `HttpContext.Items["AppId"]` when the limiter resolves its partition. Portal traffic shares the public API's `send-by-appid` token bucket — a leaked token can't outrun the per-tenant budget.
+
+**`PortalLookupCache`** (Infrastructure layer, `IMemoryCache`-backed):
+
+- Holds the per-app `(PortalSigningKey, AllowedOrigins)` tuple to avoid a database round-trip per request.
+- TTL: 60 s (`PortalAuth:LookupCacheTtlSeconds`).
+- Mutating dashboard actions (`enable` / `rotate` / `disable` / origins update) call `PortalLookupCache.InvalidateApplication(appId)` synchronously, so on the local node a key rotation takes effect within milliseconds rather than within the cache TTL. Multi-replica deployments are still bounded by the TTL on remote nodes.
+- The static per-app `CancellationTokenSource` is atomically swapped on every `Set` (via `AddOrUpdate`); the previous source is cancelled and disposed in the same step, so a `Set` racing an `Invalidate` cannot bind a fresh cache entry to a disposed token.
+
+**Cross-tenant isolation:** every controller action goes through the 2-arg `EndpointRepository.GetByIdAsync(appId, endpointId)` (and similar for event types / messages). A token for tenant A asking for tenant B's endpoint id receives `404 PORTAL_NOT_FOUND` — never `403`, which would leak the existence of other tenants' resources.
+
+**Defense-in-depth on the JWT validator:**
+- HS256 algorithm pinned via `ValidAlgorithms = [HmacSha256]`. `alg=none`, HS384, HS512 all rejected.
+- `MaximumTokenSizeInBytes = 8 KiB` (default 250 KiB) — defeats DoS amplification.
+- `MapInboundClaims = false` — we read raw JWT claim keys (`appId`, `capabilities`); the .NET URI mapping is pure overhead and a small attack surface.
+- Hard cap on `exp - nbf` (default 15 min) regardless of what the host minted, so a leaked token's blast radius is bounded.
+- Every error response uses the same opaque message body — never echoes the inner exception (which could leak signing-key length or which validation step failed).
+
 ---
 
 ## 5. Scalability Path