feat(observability): emit structured idempotency events on all 4 claimDelivery outcomes#242
Conversation
…mDelivery outcomes claimDelivery had four terminal outcomes but only one emitted a queryable event, and the claim-won happy path was unlogged, so dedup-hit-rate and fail-open-rate were not queryable. Add src/webhook/idempotency-log-fields.ts (strict z.discriminatedUnion, mirrors retry-log-fields) and emit a uniform event on all four outcomes: idempotency.claimed (new), duplicate_skipped (replaces the dedup-skip literal), and failed_open with reason unavailable/error. Logging-only: returns, SET command, TTL, and fail-open semantics unchanged. The emit-site test now parses each captured emit through the schema to prove emit/schema agreement. Closes #232 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_015v2bspPF1ZTTG9ZZrbBgA1
|
Warning Review limit reached
More reviews will be available in 42 minutes and 52 seconds. Learn how PR review limits work. Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file). ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits. 🚦 How do rate limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (5)
📝 WalkthroughWalkthroughAdds structured idempotency log events to the ChangesStructured idempotency log events
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/operate/observability.md`:
- Line 116: The documentation for `claimDelivery` states it returns `true`
exactly once per `deliveryId`, but this claim does not account for fail-open
behavior. When Valkey is unavailable or erroring, the `idempotency.failed_open`
path can also return `true` for redeliveries, violating strict exactly-once
semantics. Reword the sentence describing `claimDelivery` to clarify that the
exactly-once guarantee applies only to the healthy Valkey claim path, and
explicitly note that fail-open scenarios allow multiple `true` returns for the
same `deliveryId` across redeliveries.
In `@src/webhook/idempotency-log-fields.ts`:
- Around line 52-57: The current schema using z.strictObject with optional err
field allows err to be present for both reason values ("unavailable" and
"error"), but the documented contract specifies that err should only exist when
reason is "error". Refactor the IDEMPOTENCY_LOG_EVENTS.failedOpen schema to use
z.discriminatedUnion or separate branches based on the reason field so that err
is required/present only for reason: "error" and explicitly forbidden for
reason: "unavailable". Additionally, add test cases that verify the schema
rejects payloads containing reason: "unavailable" with an err field present.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 2f896fdd-3905-4038-b3ac-f4eb71bc0688
📒 Files selected for processing (5)
docs/operate/observability.mdsrc/webhook/idempotency-log-fields.tssrc/webhook/idempotency.tstest/webhook/idempotency-log-fields.test.tstest/webhook/idempotency.test.ts
There was a problem hiding this comment.
Pull request overview
Adds a structured, queryable idempotency.* event family for the webhook claimDelivery idempotency gate, closing the observability gap where only the legacy duplicate-skip path was tagged.
Changes:
- Introduces a Zod-backed schema + canonical event constants for
claimDeliverylog fields (idempotency.*). - Emits structured
eventfields for all terminal outcomes inclaimDelivery(claimed, duplicate_skipped, failed_open w/ reason). - Adds schema drift tests and updates idempotency tests to record and validate emitted log fields; documents the new event family.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| src/webhook/idempotency.ts | Emits structured idempotency.* events across claim outcomes. |
| src/webhook/idempotency-log-fields.ts | Defines canonical event strings and a strict Zod schema for emitted log fields. |
| test/webhook/idempotency.test.ts | Switches to a recording logger and adds assertions around emitted events. |
| test/webhook/idempotency-log-fields.test.ts | Adds acceptance/rejection tests to pin the schema + event literals. |
| docs/operate/observability.md | Documents the new idempotency event family, fields, and levels. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Address review: enforce err-only-on-error in the failed_open schema via separate reason branches (rejecting err on the unavailable path) and clarify that claimDelivery's exactly-once guarantee holds only on the healthy Valkey path, degrading to at-least-once when it fails open. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_015v2bspPF1ZTTG9ZZrbBgA1
|
Both review findings addressed in cad2db6:
Gates green: 18 schema/emit tests pass, typecheck/lint/format/docs:build clean. |
… every emit Address review: idempotency.claimed fires once per non-duplicate delivery, so emit it at debug (issue #232 volume policy, mirroring github.api.request) rather than info. The recording-logger test now parses every captured emit through IdempotencyLogFieldsSchema at capture time, so any emit/schema drift surfaces immediately, not only on lines a test asserts on. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_015v2bspPF1ZTTG9ZZrbBgA1
…trict schemas Bundles 12 area:observability issues into one change. Each family ships a Zod-strict *-log-fields.ts module with a co-located drift-prevention test, following the #242/#244 pattern: structured_output, circuit, digest, github.api.slow, github.app.token.mint, http, scheduler.scan, workflow.run, workspace, agent.tool, daemon.connection, k8s.spawn. Behavioral changes beyond logging: #236 routes 6 mint sites through a new mintInstallationToken helper (exact cache_hit via scoped hook.before); #218 removes unused onConnected/onDisconnected WsClientOptions callbacks; #223 replaces octokit hook.after/hook.error with one hook.wrap timing closure (new GITHUB_API_SLOW_REQUEST_MS env, default 3000). Security: errSerializer now strips octokit event/payload/signature carriers so the raw webhook body + HMAC signature cannot leak through err: log lines (regression-tested). Docs: 12 observability.md sections + alerts, configuration.md env var, daemon-fleet.md k8s.spawn diagnostics. Closes #216 #217 #218 #223 #228 #233 #234 #235 #236 #237 #243 #247 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Wjb51MuzTGGiJ2ggGX2DhH
What changed and why
claimDeliveryhad four terminal outcomes but only one emitted a queryable structuredeventfield (the legacydedup-skip). The claim-won happy path was completely silent, so the dedup hit-rate denominator and the fail-open rate were not queryable from structured logs. Tracked as issue #232.Behaviour is unchanged — logging-only. The SET-NX-EX command, TTL, return values, control flow, and fail-open semantics are byte-for-byte identical.
Changes
src/webhook/idempotency-log-fields.ts(new):z.discriminatedUnion("event", …)schema over the three-event family, mirroringsrc/utils/retry-log-fields.ts. Not a runtime validator on the hot path; it is the drift-prevention contract the co-located test parses each captured emit through.deliveryIdstays camelCase (the established repo-wide child-logger ID binding); new metric fields use snake_case.src/webhook/idempotency.ts:claimDeliverynow emits a uniformeventon all 4 outcomes:idempotency.claimed— info, new (was unlogged); SET-NX won, caller proceeds.idempotency.duplicate_skipped— info; replaces the legacy"dedup-skip"literal.idempotency.failed_openreason: "unavailable"— warn; Valkey unconfigured/disconnected (no SET issued).idempotency.failed_openreason: "error"+err— warn; the SET threw.test/webhook/idempotency-log-fields.test.ts(new): strict-schema drift tests (accept well-formed events; reject unknown literals, extra fields, invalid/missingreason, emptydeliveryId).test/webhook/idempotency.test.ts: upgraded to a recording logger; each captured emit is parsed throughIdempotencyLogFieldsSchemato prove emit/schema agreement (closing the drift hole a loose string check leaves open). Return-value + SET-arg assertions preserved.docs/operate/observability.md: new## Idempotency log fieldssection (events, levels, fields).Flow
flowchart TD CDEntry["claimDelivery called"]:::gate CheckValkey{"Valkey available<br/>and healthy?"}:::gate DoSet["SET key NX EX 259200"]:::gate ErrThrown{"SET threw?"}:::gate ResOK{"result is OK?"}:::gate WarnUnavail["warn idempotency.failed_open<br/>reason unavailable, return true"]:::ev InfoClaimed["info idempotency.claimed<br/>return true"]:::ev InfoSkipped["info idempotency.duplicate_skipped<br/>return false"]:::ev WarnError["warn idempotency.failed_open<br/>reason error plus err, return true"]:::ev CDEntry --> CheckValkey CheckValkey -- no --> WarnUnavail CheckValkey -- yes --> DoSet DoSet --> ErrThrown ErrThrown -- yes --> WarnError ErrThrown -- no --> ResOK ResOK -- yes --> InfoClaimed ResOK -- no --> InfoSkipped classDef ev fill:#196f3d,color:#ffffff classDef gate fill:#2c3e50,color:#ffffffTest plan
test/webhook/idempotency-log-fields.test.ts— strict-schema drift (extra field / unknown event / bad reason / empty deliveryId rejected)test/webhook/idempotency.test.ts— all 4 outcomes asserted; each emit parsed through the schema; return-value + SET-arg checks keptdedup-skipliteral retired (no remaining reference insrc/)docs:buildcleanbun test: stash-baseline identical (1063 pass, 0 new failures); remaining failures are pre-existing local DB/Valkey-absent suites. Isolated CI runner is authoritative.Closes #232
Summary by CodeRabbit
Documentation
Tests