use nextest per-test retries for network-sensitive tests #140

Draft
PaulLaux wants to merge 3 commits into sync-zcash-v4.2.0-merge-pk1 from nextest-1


PaulLaux commented Jun 4, 2026


Replace the split cargo-test steps and workflow-level retry loop with a single 'cargo nextest run' using a new ci-basic profile:

  • network-sensitive tests (zebra-network lib, zebrad acceptance) run serially via a test group and get per-test retries (3 attempts, 30s pause, matching the old loop)
  • everything else runs in parallel with no retries, so deterministic failures fail fast instead of re-running a whole test group
  • non-network tests in those targets no longer run twice
  • doctests run in a separate step (nextest does not run them)

cargo-nextest is installed via taiki-e/install-action pinned by SHA, consistent with the existing checkout hardening.
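
For readers unfamiliar with nextest profiles, the layout described above might look roughly like the sketch below. This is not the file from this PR: the group name `network`, the filterset expressions, and the file location `.config/nextest.toml` are assumptions for illustration. In nextest, `count` is the number of retries after the first run, so "3 attempts" is read here as `count = 2`; if the old loop meant 3 retries, the value would be 3.

```toml
# .config/nextest.toml (illustrative sketch, not the PR's actual config)

[test-groups]
# Tests assigned to this group run one at a time.
network = { max-threads = 1 }

[profile.ci-basic]
# Default for every test not matched by an override: no retries,
# so deterministic failures are reported immediately.
retries = 0

[[profile.ci-basic.overrides]]
# Assumed filterset: the zebra-network lib tests plus the zebrad
# integration test binary named "acceptance".
filter = '(package(zebra-network) & kind(lib)) | (package(zebrad) & binary(acceptance))'
test-group = 'network'
# Fixed 30s pause between attempts; 2 retries = up to 3 attempts.
retries = { backoff = "fixed", count = 2, delay = "30s" }
```

With a profile like this, the main step would be `cargo nextest run --profile ci-basic`, and doctests would still need a separate `cargo test --doc` step.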

PaulLaux added 3 commits June 4, 2026 16:35
…tests

Replace the split cargo-test steps and workflow-level retry loop with a
single 'cargo nextest run' using a new ci-basic profile:

- network-sensitive tests (zebra-network lib, zebrad acceptance) run
  serially via a test group and get per-test retries (3 attempts, 30s
  pause, matching the old loop)
- everything else runs in parallel with no retries, so deterministic
  failures fail fast instead of re-running a whole test group
- non-network tests in those targets no longer run twice
- doctests run in a separate step (nextest does not run them)

cargo-nextest is installed via taiki-e/install-action pinned by SHA,
consistent with the existing checkout hardening.
…limits

Log analysis of the first nextest run showed the serial 'network' group
was the entire critical path (~34 of 35 test minutes, with all parallel
tests done by minute 5), and every run downloads 594 crates and builds
cold.

- only serialize/retry tests that actually touch the network: the
  zebra-network socket tests live in the peer_set::initialize and
  isolated modules (35 tests, 5.1 min); the other 140 lib tests
  (10.5 min) are in-process and now run in parallel without retries
- cache cargo registry and build artifacts (Swatinem/rust-cache, pinned,
  keyed per nu7 matrix leg)
- terminate hung tests after 10 min so they are named and retried
  instead of the step-level timeout killing the whole run
- keep going up to 5 failures so one red run reports all of them
- list skipped tests in the final summary for visibility
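
A rough sketch of how the narrowed network group and the new limits could be expressed in a nextest profile follows. The regex filters, the 60s period, and the retry count are assumptions chosen to match the description above, not values taken from this commit; any override for the zebrad acceptance tests is omitted here.

```toml
# Illustrative sketch only; filters and values are assumptions.

[test-groups]
network = { max-threads = 1 }

[profile.ci-basic]
# A test is flagged slow after each 60s period and terminated after
# 10 periods (10 minutes), so a hung test is named in the report
# and goes through the normal retry path.
slow-timeout = { period = "60s", terminate-after = 10 }
# Include skipped tests in the end-of-run summary.
final-status-level = "skip"

[[profile.ci-basic.overrides]]
# Only the socket tests in the two named modules are serialized and
# retried; the remaining lib tests fall back to the profile defaults.
filter = 'package(zebra-network) & kind(lib) & (test(/^peer_set::initialize::/) | test(/^isolated::/))'
test-group = 'network'
retries = { backoff = "fixed", count = 2, delay = "30s" }
```

The "up to 5 failures" limit is not shown in the sketch. One way to get it, assuming a nextest version that supports the flag, is to pass `--max-fail 5` to `cargo nextest run`.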