test(2275): add 429-retried allowlist contract + fix CHANGELOG citations [skip-runtime-e2e]#44
Merged
Conversation
…ons [skip-runtime-e2e] Addresses two HIGH findings from the hostile review of PR #42 + two MEDIUM asks. HIGH 1 — CHANGELOG line-citation drift. PR #42 cited `src/client.rs:517-525` for the retry-allowlist. After the PR merged, the actual allowlist block sits at lines 560-565 (further shifted by this PR's rustdoc addition). Citation now points to the correct range. HIGH 2 — No positive coverage that 429 IS retried. The new `test_401_not_retried_issue_2275` only locked one direction of the allowlist. A future refactor dropping `*status != 429` from `execute_with_retry` would silently make every 4xx terminal — breaking the rate-limit retry contract — and the existing tests wouldn't catch it (`test_retry_logic` uses 500, not 429). New `test_429_is_retried_allowlist_contract` brackets the other side via `.expect(3)` against `max_attempts=3`. MEDIUM 1 — Rustdoc on `execute_with_retry` describing the retry contract: which statuses retry (5xx + 429 + transport errors), which are terminal (401 + everything else 4xx outside the allowlist), with a cross-reference to issue #2275 and CHANGELOG.md. MEDIUM 2 — Clarifying comment above the retry-allowlist explaining that 402/403 are returned as `Ok(client_resp)` from `execute_request` (line 586) — they never reach `execute_with_retry` as `Err`, so the `*status != 402` / `*status != 403` clauses are functionally dead BUT kept as intent-preserving defense-in-depth for any future refactor. Defensive / documentation only. No SDK code paths modified. No retry semantics changed. No version bump — under `[Unreleased]` in CHANGELOG. Mutation test on the new 429 test: temporarily deleted the `*status != 429` clause from `execute_with_retry`; the test failed with "Expected range of matching incoming requests: == 3, Number of matched incoming requests: 1" (wiremock's `.expect(3)` panicked on Drop because the SDK only called the mock once instead of retrying twice more). Restored the clause; test passes. Confirms the assertion isn't tautological. Signed-off-by: Saurabh Jain <saurabh.jain@getaxonflow.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Follow-up PR addressing two HIGH findings + two MEDIUM-priority asks from the hostile review of PR #42 (the regression-test PR for issue #2275).
Defensive / documentation only. No SDK code paths modified. No retry semantics changed. No version bump — under
[Unreleased]in CHANGELOG.HIGH findings fixed
HIGH 1 — CHANGELOG line-citation drift (visible on crates.io / docs.rs)
PR #42's
[Unreleased]entry citedsrc/client.rs:517-525for the retry-allowlist. The actual allowlist block sits at lines 560-565 in currentmainpost-merge (further shifted by this PR's rustdoc addition). Lines 517-525 in currentmainare part of thecheck_statushelper +execute_with_retrysignature — not the allowlist at all. The cited code snippet was correct; just the line numbers were stale. The drift ships to crates.io docs.Fix: updated CHANGELOG entry's
src/client.rs:517-525→src/client.rs:560-565.HIGH 2 — No positive coverage that 429 IS retried (allowlist not regression-locked)
The new
test_401_not_retried_issue_2275only asserts401 → terminal. There's no test for the other direction — that 429 still triggers retry (since 429 is in the allowlist{429, 402, 403}). A future refactor that drops*status != 429fromexecute_with_retrywould make every 4xx terminal, silently breaking the rate-limit retry contract, and the existing tests wouldn't catch it (thetest_retry_logictest uses 500, not 429).Fix: added
test_429_is_retried_allowlist_contractintests/integration_test.rs(next totest_401_not_retried_issue_2275, NOT a separate file). Mounts wiremock with.respond_with(ResponseTemplate::new(429)).expect(3), configures the client withmax_attempts=3+initial_delay=1ms, callsproxy_llm_call, assertsresult.is_err(), asserts (implicitly via.expect(3)) that the mock was called exactly 3 times.MEDIUM-priority asks addressed
MEDIUM 1 — Rustdoc on
execute_with_retrysrc/client.rs:532(post-rustdoc) — added///block describing the retry contract:ApiErrorvariants ofAxonFlowError){429, 402, 403}allowlist (enumerated)execute_requestandCHANGELOG.mdfor historyMEDIUM 2 — Allowlist clarity on 402/403
src/client.rs:586returns 402 + 403 asOk(client_resp)fromexecute_request— they're SUCCESS responses with policy/quota data. They never propagate asErrtoexecute_with_retry, so the*status != 402and*status != 403clauses on lines 563-564 are functionally dead.Fix: kept the dead clauses (defense-in-depth + reads as intent for future hackers who don't realize 402/403 are success) and added a comment block above the allowlist explaining the situation. Don't delete — would be a behavior change if someone ever refactors
execute_requestto error on 402/403.Mutation test — confirms the new 429 test isn't tautological
Per the brief's instruction, temporarily deleted the
*status != 429clause fromexecute_with_retry:Re-ran the test:
The mock was called exactly 1 time (immediate-terminal under the mutation) instead of 3 (full retry exhaustion under correct code). Wiremock's
.expect(3)panicked on Drop. Restored the clause; test passes.Result: assertion is independent of the SUT — failing for the right reason (no retry happened) under the mutation; passing for the right reason (3 retries happened) under correct code.
Hostile R3 self-review
.expect(3)is checked at MockServer Drop, which happens at the end of the#[tokio::test]function scope. Theassert!(result.is_err(), ...)assert before drop won't mask a mock-call-count violation (the panic on Drop fires regardless).test_retry_logic— that test uses 500 +expect(2), doesn't probe 429. The new test fills the gap by mounting 429 +expect(3). Two complementary tests; no overlap, no contradiction.test_401_not_retried_issue_2275— that test uses 401 +expect(1). Together with the new test (429 +expect(3)), the pair brackets the allowlist boundary: out → terminal, in → retried. Either direction breaking is now caught.grep -n "async fn execute_with_retry" src/client.rsreturns 532;grep -n "if \*status >= 400"returns 560;grep -n "if status.is_success() || status.as_u16() == 402"returns 586. All three numbers cited in the CHANGELOG + in-code comment match the actual source.cargo fmt --all -- --checkclean,cargo clippy --all-targets --all-features -- -D warningsclean.Skip-runtime-e2e justification
This PR is defensive / documentation only:
tests/integration_test.rs(useswiremock, not the live community-stack agent)execute_with_retryNo new feature surface, no wire contract changes, no SDK code paths modified, no retry semantics changed. The retry contract itself is unchanged from PR #42; this PR only adds a complementary test + documentation around behavior that already shipped.
runtime-e2e/x-client-id/(the onlyruntime-e2e/directory in this repo) is the wire-level companion totests/x_client_id_header_test.rsfor the v9 identity headers — completely orthogonal to the retry-contract surface this PR documents. No reason to mount the live agent to verify a mock-based retry-allowlist regression test.Issue references