
Implement preloads #435

Open
troglodyne wants to merge 13 commits into Test-More:2.0 from troglodyne:preload

Conversation

@troglodyne
Contributor

In this case it should look a lot like preloads used to look on the old yath (still using goto::file, etc.), but it at least now accounts for the new multi-collector model.

This one was mostly done via LLM, not just tests, though it at least looked right to me after reviewing the change.

Interestingly, we avoid the MSWin32 issue here for obvious reasons. Possibly a lot of the issues here could have been solved by using MCE as our concurrency model, but that would be a major architectural change, and I'm not even sure we need to reach for something that powerful for what we're doing here.

@troglodyne force-pushed the preload branch 2 times, most recently from fb7b378 to ef4c342 on April 26, 2026 at 14:20
exodist and others added 13 commits May 5, 2026 17:49
Stub resource class for throttling test launches against the user's
soft pipe-count limit. Spec from feedback:

* Throttle new assignments when (soft_limit - in_use) <= headroom.
* Emit a one-shot user-facing warning the first time usage crosses
  warn_threshold * soft_limit; never warn again for the lifetime of
  this resource instance, even if usage briefly drops and climbs.
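The two rules above can be sketched in plain Perl. The names here (`check_assign`, the hash keys on `$res`) are illustrative stand-ins, not the stub's real API:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sketch of the throttle + one-shot warning spec described above.
sub check_assign {
    my ($self, $in_use) = @_;

    # Throttle: refuse new assignments when headroom is exhausted,
    # i.e. when (soft_limit - in_use) <= headroom.
    my $available = ($self->{soft_limit} - $in_use) > $self->{headroom};

    # One-shot warning: fires the first time usage crosses the
    # threshold, then never again for this resource instance.
    if (!$self->{_warned}
        && $in_use >= $self->{warn_threshold} * $self->{soft_limit}) {
        warn "pipe usage nearing soft limit ($in_use/$self->{soft_limit})\n";
        $self->{_warned} = 1;    # sticky, even if usage drops and climbs
    }

    return $available;
}

my $res = { soft_limit => 100, headroom => 10, warn_threshold => 0.8, _warned => 0 };
check_assign($res, 85);    # crosses 80%: warns once
check_assign($res, 70);    # drops back below threshold...
check_assign($res, 85);    # ...no second warning
print "warned=$res->{_warned}\n";    # prints "warned=1"
```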

Why a separate resource from UnixLimits:

Per-process pipe count is not directly an rlimit on most platforms.
Linux exposes it via /proc/sys/fs/pipe-user-pages-{soft,hard}
(system-wide, not rlimit-style); the BSDs / macOS impose it
indirectly via RLIMIT_NOFILE plus per-pipe FD cost. The harness
opens many pipes per concurrent test (collector stdout/stderr
capture, EventEmitter sync, IPC client socketpairs, ...) and a
pipe-count cap can fire well before RLIMIT_NOFILE does. Tracking
pipes specifically deserves its own sampler.

Stub mirrors the shape of Resource/UnixLimits.pm: HashBase fields
for headroom / warn_threshold / poll_interval, a sticky _warned
flag for the one-shot guarantee, available/assign/release croak
"not implemented yet", status() returns the configured knobs plus
the warned flag for monitoring.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Instead of running yath test live, capturing non-deterministic output,
and sorting job lines to paper over ordering differences, commit a
pre-generated run.yath archive and an expected_output.txt golden file.

The test now replays only the committed archive. Replay is deterministic
(events come out in archive order), so no sorting normalization is
needed. The clean_output helper gains absolute-path stripping so the
golden file is portable across machines.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The previous path normalization only matched the diag line
shape:

    s{ at \S+/(t/\S+) line }{ at $1 line }g

That works for the one place where a recorded archive's
absolute path leaks through into the rendered output today
(the `(  DIAG  )  job N    at <abs>/t/... line N` line, where
Test2's `like()` failure prints the file from caller's
__FILE__ at runtime). It does not cover other contexts where
a future renderer change might surface an absolute path -- the
failure summary table cells, status lines, etc. -- forcing us
to extend the regex case-by-case every time the renderer
grows a new path-bearing line type.

Generalize: strip the absolute-path prefix from any
repo-rooted `t/...` reference anywhere in the output. The
pattern walks one or more `/segment` components followed by
`/(t/...)`, replaces with just the captured `t/...`, and stops
at whitespace or `|` so table-cell paths normalise too without
crossing cell boundaries.
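A standalone version of that generalized substitution might look like the following; `clean_paths` is a hypothetical wrapper, and the exact pattern in the test helper may differ in detail:

```perl
use strict;
use warnings;

# Strip the absolute-path prefix (one or more /segment components)
# in front of any repo-rooted t/... reference, stopping at
# whitespace or '|' so table cells normalise without crossing
# cell boundaries.
sub clean_paths {
    my ($out) = @_;
    $out =~ s{(?:/[^/\s|]+)+/(t/[^\s|]*)}{$1}g;
    return $out;
}

print clean_paths('(  DIAG  )  job 4    at /home/ci/repo/t/foo.t line 9'), "\n";
print clean_paths('| /home/ci/repo/t/bar.t | FAILED |'), "\n";
```

The first line normalises to `... at t/foo.t line 9` and the second to `| t/bar.t | FAILED |`, covering both the diag shape and a table cell.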

Verified against the original failure shape (diag line),
table-cell paths, and status-line paths -- all six test
inputs normalise correctly. 10/10 back-to-back local runs
pass. CI for the current branch SHA (3ba86fc) is green;
this change is forward-looking robustness, not a fix for an
active failure.

Co-Authored-By: Claude <noreply@anthropic.com>
Pull in the streaming-deadlock prereqs:
  Atomic::Pipe   0.026  (write_blocking fix; required for non-blocking Outbox)
  IPC::Manager   0.000034 (Role::Outbox + non-blocking service loop +
                          fork-safe DESTROY + ConnectionUnix)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the unbounded blocking-pipe writes that were deadlocking
the test command on a slow harness. The fix has four coordinated
pieces -- collector outbox, harness idle protocol, test-command
drain, and Outbox-aware test stubs -- that all have to land
together for the system to keep working.

Test2::Harness2::Collector
  - _ipc_client switches the client to send_blocking=0 at
    construction; events flow through try_send_message and queue on
    EAGAIN.
  - The collection loop drains the outbox at the top of each
    iteration and includes any backlogged writable handles in the
    IO::Select wait so the loop wakes the moment the kernel makes
    room.
  - _exit_mirroring_child drains for up to 5s before _exit so events
    queued during the run are not dropped on shutdown.
  - _send_to keeps its retry-on-peer-disappear loop, now driven by
    try_send_message instead of send_message.
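The "queue on EAGAIN, drain later" pattern above can be illustrated with a plain non-blocking pipe. `try_send` and `@outbox` are generic stand-ins, not the real Outbox API (which frames messages via Atomic::Pipe rather than raw syswrite):

```perl
use strict;
use warnings;
use Errno qw(EAGAIN EWOULDBLOCK);
use Fcntl qw(F_GETFL F_SETFL O_NONBLOCK);

pipe(my $r, my $w) or die "pipe: $!";
my $flags = fcntl($w, F_GETFL, 0) or die "fcntl: $!";
fcntl($w, F_SETFL, $flags | O_NONBLOCK) or die "fcntl: $!";

my @outbox;
sub try_send {
    my ($fh, $msg) = @_;
    my $n = syswrite($fh, $msg);
    if (!defined $n) {
        die "write: $!" unless $! == EAGAIN || $! == EWOULDBLOCK;
        push @outbox, $msg;    # kernel buffer full: queue, don't block
        return 0;
    }
    push @outbox, substr($msg, $n) if $n < length $msg;  # unsent tail
    return 1;
}

try_send($w, "event 1\n");     # small write fits the pipe buffer
print "backlog=", scalar(@outbox), "\n";    # prints "backlog=0"
```

The collector's drain step then retries `@outbox` entries whenever `IO::Select` reports the handle writable, which is why the backlogged handles are added to the select wait.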

Test2::Harness2 (control plane)
  - request_handler_has_pending_messages returns
    {ok, idle, pending, running, queued}; the current request itself
    is excluded from the pending count because the response is
    queued AFTER the handler returns.

Test2::Harness2::Spawn
  - has_pending_messages: single non-blocking idle check.
  - wait_until_idle($timeout): poll until idle or deadline (default
    30 s, 0 = unbounded). Peer-gone errors count as idle.
  - finish() drains via wait_until_idle(30) before sending the
    finish request so non-blocking sends made during the run are
    not dropped.
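The wait_until_idle contract described above amounts to a poll-with-deadline loop. A minimal sketch, where `$is_idle` stands in for the real has_pending_messages round trip (on which a peer-gone error would also count as idle):

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep time);

sub wait_until_idle {
    my ($is_idle, $timeout) = @_;
    $timeout = 30 unless defined $timeout;               # default 30 s
    my $deadline = $timeout ? time() + $timeout : undef; # 0 = unbounded
    while (1) {
        return 1 if $is_idle->();
        return 0 if $deadline && time() >= $deadline;
        sleep 0.05;    # don't hot-spin between probes
    }
}

my $calls = 0;
my $ok = wait_until_idle(sub { ++$calls >= 3 }, 5);
print "ok=$ok calls=$calls\n";    # prints "ok=1 calls=3"
```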

App::Yath2::Command::test
  - Drains the harness via wait_until_idle before issuing finish.

t/AI test stubs
  - Collector and replay/run unit-test fakes that stub
    IPC::Manager::Client extend their fake clients with the Outbox
    API (try_send_message, pending_sends, drain_pending,
    have_writable_handles, set_send_blocking) so the new
    collector-side code paths exercise correctly under the stubs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Centralizes the harness's IPC defaults in Test2::Harness2::Util::IPC
and applies them at every call site:

  ipc_default_protocol     -> IPC::Manager::Client::ConnectionUnix
  ipc_default_serializer   -> ['JSON::Zstd', level => 3,
                               dictionary => share/other/zstd.dict]
  ipc_zstd_dict_path       -> File::ShareDir::dist_file lookup
                              (undef when uninstalled)
  ipc_default_spawn_args   -> kwargs for ipcm_spawn
  ipc_default_connect_args -> (listen => 0) for non-services

ConnectionUnix builds a per-peer SOCK_STREAM driver with an optional
listen socket. Services keep listen=1 (Role::Service's client builder
is unchanged) so peers can reach them; the collector and the
parent-side Test2::Harness2::Spawn handle now connect with listen=0,
since they only ever send upward.

Test2::Harness2::Spawn pre-builds its IPC client with listen=0 and
hands it to IPC::Manager::Service::Handle->new via the +client slot,
bypassing the Handle's lazy listening builder.

JSON::Zstd's dictionary is resolved via File::ShareDir::dist_file
('Test2-Harness2', 'other/zstd.dict'); both endpoints must land on
the same dictionary path with identical content, which the shared
install location guarantees. When the share file is unavailable
(uninstalled checkouts without ShareDir staging) the serializer
falls back to dictless compression rather than failing.
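The lookup-with-fallback might be sketched as below. File::ShareDir::dist_file croaks when the share file is not staged, so an eval guard yields the undef that triggers the dictless path; the eval shape is illustrative, the real helper may differ:

```perl
use strict;
use warnings;

# Resolve the installed zstd dictionary; return undef (dictless
# fallback) for uninstalled checkouts without ShareDir staging.
sub ipc_zstd_dict_path {
    my $path = eval {
        require File::ShareDir;
        File::ShareDir::dist_file('Test2-Harness2', 'other/zstd.dict');
    };
    return defined($path) && -e $path ? $path : undef;
}

printf "dict: %s\n", ipc_zstd_dict_path() // '(none, dictless fallback)';
```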

App::Yath2::Options::IPC's --ipc-protocol default also flips from
MessageFiles to ConnectionUnix so the option layer agrees with the
underlying default.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Atomic::Pipe 0.026 added transparent zstd compression for the
write_message / write_burst / get_line_burst_or_data paths in mixed
data mode; plain print writes are intentionally left untouched so a
pipe can still double as STDOUT/STDERR for an unaware (e.g.
non-perl) downstream reader.

Adds atomic_pipe_compression_args() and apply_atomic_pipe_compression()
helpers in Test2::Harness2::Util::IPC -- the constructor-time form
plumbs through Atomic::Pipe->pair, while the post-construction form
is needed by the from_fh / from_fd wrappers that do not accept
compression kwargs at construction. Both sides resolve the
install-shipped dictionary via File::ShareDir
(share/other/zstd.dict) and fall back to dictless compression when
the share file is unavailable, which keeps the wire symmetric for
uninstalled checkouts where every endpoint resolves the same way.

Applied at:
  - Test2::Harness2::Collector::_launch_child (out/err pair)
  - Test2::Harness2::Collector::_interpose (out/err pair)
  - Test2::Harness2::Collector::_wrap_handle (from_fh reader)
  - Test2::Harness2::Util::EventEmitter::_as_atomic_pipe (from_fh
    writer that promotes STDOUT/STDERR)

The collector's reader-side and the emitter's writer-side now use
the same level + dictionary, so framed event JSON compresses on
write_message and decodes back to plaintext via
get_line_burst_or_data. STDOUT/STDERR text from the test child's
own print calls passes through uncompressed as before.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… logger

Atomic::Pipe's keep_compressed mode exposes the on-wire compressed
bytes alongside the decompressed payload from
get_line_burst_or_data. Those bytes compress the same plaintext
(bare JSON, no inter-record newline) the JSONL.zst logger would
produce, so the harness can carry the frame from the collector
through the parser and write it to disk verbatim instead of
recompressing the same JSON twice.

Newline policy:
  - Atomic::Pipe message frames self-delimit. Producers
    (EventEmitter, JSONL.zst writer, Util::File::JSONL::Zstd::write)
    no longer inject a trailing "\n" inside the compressed
    plaintext.
  - The plain (non-compressed) JSONL writer keeps "\n" between
    records since plain jsonl needs it as a separator.
  - The yath extract command is the canonical place that
    materializes compressed jsonl as plaintext jsonl; it strips
    any trailing newlines from each frame's payload and joins
    with exactly one "\n", appending a trailing "\n" so the file
    is canonical jsonl. Single-frame .json.zst snapshots pass
    through unchanged.
  - Util::Zstd::Reader walks frames via zstd_frame_size in both
    dict and no-dict modes; readline returns the next frame's
    decoded payload (no newline assumed). The compression-form
    unit tests strip leading/trailing whitespace before
    comparing, since JSON parsers ignore it.
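The extract-side canonicalization in the policy above is a pure string operation. A sketch, where `frames_to_jsonl` is a hypothetical helper name:

```perl
use strict;
use warnings;

# Strip any trailing newlines from each frame's decoded payload,
# join with exactly one "\n", and append a trailing "\n" so the
# result is canonical jsonl regardless of what each producer wrote.
sub frames_to_jsonl {
    my (@payloads) = @_;
    return join("\n", map { my $p = $_; $p =~ s/\n+\z//; $p } @payloads) . "\n";
}

print frames_to_jsonl(qq({"a":1}\n), qq({"b":2}), qq({"c":3}\n\n));
```

Mixed inputs (with, without, and with doubled trailing newlines) all collapse to three clean jsonl records.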

Carrying the frame:
  - Test2::Harness2::Event gains a public compressed_form slot, a
    clear_compressed_form helper that also drops the +json cache,
    and TO_JSON now strips compressed_form so the binary bytes do
    not leak into the JSON encoding.
  - Collector::_read_handle reads the optional `compressed => $raw`
    pair from get_line_burst_or_data and threads $raw through
    _ingest_item / _flush_buffer into IOParser::parse_io's new
    `compressed` named param, which stashes it on the resulting
    Event.

Auditor invalidation:
  - Test2::Harness2::Collector::Auditor::Test calls a new
    _drop_compressed_cache helper at every site that mutates the
    audited event (subtest_start no_render flag, subtest_end
    rebuild, the close-by accumulation path, and the head of
    _subtest_process which mutates facet_data extensively). The
    helper deletes both compressed_form and the json cache and
    accepts plain hashref events, since the existing unit tests
    drive audit_event with hashrefs.

Logger fast path:
  - Test2::Harness2::Collector::Logger::JSONL::log_event checks
    $event->compressed_form when writing into a Util::Zstd::Writer
    and forwards the bytes to a new print_raw_frame method that
    appends them via syswrite without recompressing. Without
    compressed_form (e.g. harness-internal events that never went
    through a pipe, or events the auditor cleared) the fallback
    path runs as before.
  - Util::Zstd::Writer factors the syswrite into _emit_frame and
    exposes print_raw_frame for callers that already hold a
    compressed frame.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
render_parent used `$sf->{harness} ||= $f->{harness}` to propagate the
parent's harness facet onto each child event. Each child already carries
its own harness hash (with event_id only), so the `||=` no-ops and
job_id never reaches the child. Result: subtest-internal events render
with "RUNNER" attribution instead of the owning job.

Replace with explicit per-key `//=` merge of job_id and run_id from
the parent harness onto the child.
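The difference is easy to demonstrate with simplified stand-in hashes (the real facet structures carry more keys):

```perl
use strict;
use warnings;

my $parent = { harness => { job_id => 42, run_id => 'r1', event_id => 'p' } };
my $child  = { harness => { event_id => 'c' } };    # own hash, no job_id

my $sf = $child->{harness};
$sf ||= $parent->{harness};    # hash is truthy: assignment never fires
print "after ||=: job_id=", $sf->{job_id} // 'MISSING', "\n";

# Per-key //= fills only the fields the child lacks.
$sf->{$_} //= $parent->{harness}{$_} for qw(job_id run_id);
print "after //=: job_id=$sf->{job_id} run_id=$sf->{run_id}\n";
```

The first print shows `MISSING` (the `||=` no-op), the second shows the parent's job_id and run_id propagated while the child keeps its own event_id.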

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When an integration test runs under an outer yath, the outer worker
sets TMPDIR to a per-worker subdirectory like /tmp/yath-XXXXXXXX/tmp.
Inner yath's IPC::Manager builds its unix-socket route under TMPDIR;
sun_path is capped at 108 bytes on Linux (104 on the BSDs), leaving no
room for the 42-byte hashed peer-id under such a deep route. The test
child crashes with
"Cannot map peer id ... exceeds available budget", the inner harness
shuts down, and the renderer fails with "peer 'harness' went away".

Reset TMPDIR=/tmp in the spawned child only, so inner yath gets a
short route while the parent test process keeps the worker TMPDIR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The harness_run_end aggregate previously summed per-collector
times() deltas as the run's CPU figures. Two problems:

  1. The collector's _START_TIMES baseline was captured in init(),
     before _spawn_collector() forks the collector child. After
     fork the child's tms_utime/tms_stime/tms_cutime/tms_cstime
     all reset to 0, so the end - start delta ran negative.

  2. Even with the baseline reset, the aggregate represented
     "collector loop CPU + watched-child CPU" rather than just
     the watched test process. For users wanting a summary of
     test-job CPU, that mixes collector overhead in with usr/sys.

Reset _START_TIMES at the top of _run_collector so it's captured
in the post-fork process. Then add a separate per-job pair of
fields covering only the watched child:

  child_times   [usr, sys, cusr, csys] delta from pre-fork to
                post-waitpid in the parent of the watched child.
                Captures the whole child tree's CPU via cuser/
                csystem, plus collector-loop CPU between fork
                and reap (kept simple; the full delta is fine).
  child_wall    seconds elapsed from pre-fork to post-waitpid.

Reported on harness_process_exit by every collector (including
the harness interpose path). Test-job collectors propagate it
through harness_job_exit -> test_job_completed -> RunService
results so the aggregator picks it up.

The harness_run_end facet now exposes:

  wall_time            run service start -> last completed_at
                       (unchanged semantics, relabeled in summary)
  cumulative_job_time  sum of child_wall over completed jobs
                       (would-be-serial wall time)
  cpu_times/cpu_total  sum of child_times over completed jobs
  cpu_usage            cpu_total / wall_time * 100 (cores worth;
                       can exceed 100% with parallelism)
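The child_times/child_wall capture reduces to snapshotting times() before fork and after waitpid in the parent, where the cuser/csystem deltas cover the reaped child tree's CPU. A self-contained sketch (not the harness's actual code path):

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

my @t0 = times();             # (usr, sys, cusr, csys) baseline, pre-fork
my $w0 = time();

my $pid = fork() // die "fork: $!";
if (!$pid) {                  # watched child: burn a little CPU
    my $x = 0; $x += $_ for 1 .. 200_000;
    exit 0;
}
waitpid($pid, 0);             # child CPU folds into cusr/csys on reap

my @t1 = times();
my @child_times = map { $t1[$_] - $t0[$_] } 0 .. 3;
my $child_wall  = time() - $w0;

# cpu_usage as in the summary: total CPU over wall time, in percent.
my $cpu_total = 0; $cpu_total += $_ for @child_times;
printf "wall=%.3fs cpu=%.3fs usage=%.0f%%\n",
    $child_wall, $cpu_total, $child_wall ? 100 * $cpu_total / $child_wall : 0;
```

Run pre-fork in the collector child (per problem 1 above), the baseline starts from that process's own zeroed counters, so the delta can no longer go negative.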

Renderer/Summary now formats Run Wall Time, Cumulative Job Time,
CPU Time, and the per-component (usr/sys/cusr/csys) values
through Test2::Util::Times::render_duration, re-exported via
Test2::Harness2::Util. Output:

    Run Wall Time: 11.5985s
  Aggregate Job Stats:
      Cumulative Job Time: 56.8815s
                 CPU Time: 11.9900s (usr: ... | sys: ... | cusr: ... | csys: ...)
                CPU Usage: 103%

t/AI/unit/Harness2/Role/Service.t: stub client() on T::Svc::Fake
so run_on_start does not warn about a missing IPC bus.

t/Yath/integration/replay.t: extend strip patterns to cover the
new Run Wall Time / Aggregate Job Stats lines and use ".*$" for
robustness against future duration-format changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a second yath replay scenario that replays a pre-committed archive
generated from a fixture exercising all common renderer event types
(PASS, FAIL, PLAN, NOTE, DIAG, TODO, SKIP, ! PASS !, REASON, FAILED).
The replay runs with -v to enable verbose output, then checks a %seen
hash to assert every expected tag appears at least once. No golden
comparison -- ordering is irrelevant; only presence matters.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
In this case it should look a lot like preloads used to
look on the old yath (still using goto::file, etc.),
but it at least now accounts for the new multi-collector
model.

This one was mostly done via LLM, not just tests, though
it at least looked right to me after reviewing the change.

Interestingly, we avoid the MSWin32 issue here for obvious
reasons. Possibly a lot of the issues here could have been
solved by using MCE as our concurrency model, but that
would be a major architectural change, and I'm not even
sure we need to reach for something that powerful for
what we're doing here.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>