Problem
The macOS-primary CI job intermittently fails with a single un-attributed hook timeout:
(fail) (unnamed) [~5020ms]
^ a beforeEach/afterEach hook timed out for this test.
1 fail
Ran ~1169 tests across 96 files.
It is not an assertion failure — a beforeEach/afterEach hook exceeds bun's ~5s default hook timeout on the slower macOS runner. The failing test's own new code passes; the timeout lands on a pre-existing, FS-heavy test (most likely the install-lifecycle suite that shells out / creates temp dirs).
Evidence (recurring, blocks merges)
Why it matters
Every flake forces a manual rerun and erodes the "CI-green = mergeable" signal. With parallel PR waves it blocks the merge train.
Fix options
- Identify the specific slow hook (instrument
beforeEach/afterEach durations on macOS; suspect the install/uninstall lifecycle tests) and speed it up (avoid real FS work in the hook, reuse a fixture).
- Or raise the hook timeout for that suite on macOS (
test(..., { timeout }) / a per-suite hook-timeout) so the borderline runner variance doesn't trip it.
- Prefer (1); (2) is the cheap stopgap.
Acceptance
- macOS CI passes 5×/5× on rerun for an unrelated no-op PR.
- The slow hook is named + remediated or its timeout justified in a comment.
risk:low — CI reliability, no product behavior change.
Problem
The macOS-primary CI job intermittently fails with a single un-attributed hook timeout:
It is not an assertion failure — a
beforeEach/afterEachhook exceeds bun's ~5s default hook timeout on the slower macOS runner. The failing test's own new code passes; the timeout lands on a pre-existing, FS-heavy test (most likely the install-lifecycle suite that shells out / creates temp dirs).Evidence (recurring, blocks merges)
Why it matters
Every flake forces a manual rerun and erodes the "CI-green = mergeable" signal. With parallel PR waves it blocks the merge train.
Fix options
beforeEach/afterEachdurations on macOS; suspect the install/uninstall lifecycle tests) and speed it up (avoid real FS work in the hook, reuse a fixture).test(..., { timeout })/ a per-suite hook-timeout) so the borderline runner variance doesn't trip it.Acceptance
risk:low — CI reliability, no product behavior change.