
Implement preloads #435

Open
troglodyne wants to merge 13 commits into Test-More:2.0 from troglodyne:preload

Conversation

@troglodyne
Contributor

In this case it should look a lot like preloads used to look on the old yath (still using goto::file, etc.), but it at least now accounts for the new multi-collector model.

This one was mostly done via LLM, not just tests, though it at least looked right to me after reviewing the change.

Interestingly, we avoid the MSWin32 issue here for obvious reasons. Possibly a lot of the issues here could have been solved by using MCE as our concurrency model, but that would be a major architectural change, and I'm not even sure we need to reach for something that powerful for what we're doing here.

@troglodyne force-pushed the preload branch 2 times, most recently from fb7b378 to ef4c342 on April 26, 2026 at 14:20
exodist and others added 13 commits May 5, 2026 17:49
Stub resource class for throttling test launches against the user's
soft pipe-count limit. Spec from feedback:

* Throttle new assignments when (soft_limit - in_use) <= headroom.
* Emit a one-shot user-facing warning the first time usage crosses
  warn_threshold * soft_limit; never warn again for the lifetime of
  this resource instance, even if usage briefly drops and climbs.
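The two rules above can be sketched in plain Perl. The names here (`check_assign`, the hash keys on `$res`) are illustrative stand-ins, not the stub's real API:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sketch of the throttle + one-shot warning spec described above.
sub check_assign {
    my ($self, $in_use) = @_;

    # Throttle: refuse new assignments when headroom is exhausted,
    # i.e. when (soft_limit - in_use) <= headroom.
    my $available = ($self->{soft_limit} - $in_use) > $self->{headroom};

    # One-shot warning: fires the first time usage crosses the
    # threshold, then never again for this resource instance.
    if (!$self->{_warned}
        && $in_use >= $self->{warn_threshold} * $self->{soft_limit}) {
        warn "pipe usage nearing soft limit ($in_use/$self->{soft_limit})\n";
        $self->{_warned} = 1;    # sticky, even if usage drops and climbs
    }

    return $available;
}

my $res = { soft_limit => 100, headroom => 10, warn_threshold => 0.8, _warned => 0 };
check_assign($res, 85);    # crosses 80%: warns once
check_assign($res, 70);    # drops back below threshold...
check_assign($res, 85);    # ...no second warning
print "warned=$res->{_warned}\n";    # prints "warned=1"
```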

Why a separate resource from UnixLimits:

Per-process pipe count is not directly an rlimit on most platforms.
Linux exposes it via /proc/sys/fs/pipe-user-pages-{soft,hard}
(system-wide, not rlimit-style); the BSDs / macOS impose it
indirectly via RLIMIT_NOFILE plus per-pipe FD cost. The harness
opens many pipes per concurrent test (collector stdout/stderr
capture, EventEmitter sync, IPC client socketpairs, ...) and a
pipe-count cap can fire well before RLIMIT_NOFILE does. Tracking
pipes specifically deserves its own sampler.

Stub mirrors the shape of Resource/UnixLimits.pm: HashBase fields
for headroom / warn_threshold / poll_interval, a sticky _warned
flag for the one-shot guarantee, available/assign/release croak
"not implemented yet", status() returns the configured knobs plus
the warned flag for monitoring.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Instead of running yath test live, capturing non-deterministic output,
and sorting job lines to paper over ordering differences, commit a
pre-generated run.yath archive and an expected_output.txt golden file.

The test now replays only the committed archive. Replay is deterministic
(events come out in archive order), so no sorting normalization is
needed. The clean_output helper gains absolute-path stripping so the
golden file is portable across machines.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The previous path normalization only matched the diag line
shape:

    s{ at \S+/(t/\S+) line }{ at $1 line }g

That works for the one place where a recorded archive's
absolute path leaks through into the rendered output today
(the `(  DIAG  )  job N    at <abs>/t/... line N` line, where
Test2's `like()` failure prints the file from caller's
__FILE__ at runtime). It does not cover other contexts where
a future renderer change might surface an absolute path -- the
failure summary table cells, status lines, etc. -- forcing us
to extend the regex case-by-case every time the renderer
grows a new path-bearing line type.

Generalize: strip the absolute-path prefix from any
repo-rooted `t/...` reference anywhere in the output. The
pattern walks one or more `/segment` components followed by
`/(t/...)`, replaces with just the captured `t/...`, and stops
at whitespace or `|` so table-cell paths normalise too without
crossing cell boundaries.
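A standalone version of that generalized substitution might look like the following; `clean_paths` is a hypothetical wrapper, and the exact pattern in the test helper may differ in detail:

```perl
use strict;
use warnings;

# Strip the absolute-path prefix (one or more /segment components)
# in front of any repo-rooted t/... reference, stopping at
# whitespace or '|' so table cells normalise without crossing
# cell boundaries.
sub clean_paths {
    my ($out) = @_;
    $out =~ s{(?:/[^/\s|]+)+/(t/[^\s|]*)}{$1}g;
    return $out;
}

print clean_paths('(  DIAG  )  job 4    at /home/ci/repo/t/foo.t line 9'), "\n";
print clean_paths('| /home/ci/repo/t/bar.t | FAILED |'), "\n";
```

The first line normalises to `... at t/foo.t line 9` and the second to `| t/bar.t | FAILED |`, covering both the diag shape and a table cell.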

Verified against the original failure shape (diag line),
table-cell paths, and status-line paths -- all six test
inputs normalise correctly. 10/10 back-to-back local runs
pass. CI for the current branch SHA (3ba86fc) is green;
this change is forward-looking robustness, not a fix for an
active failure.

Co-Authored-By: Claude <noreply@anthropic.com>
Pull in the streaming-deadlock prereqs:
  Atomic::Pipe   0.026  (write_blocking fix; required for non-blocking Outbox)
  IPC::Manager   0.000034 (Role::Outbox + non-blocking service loop +
                          fork-safe DESTROY + ConnectionUnix)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the unbounded blocking-pipe writes that were deadlocking
the test command on a slow harness. The fix has four coordinated
pieces -- collector outbox, harness idle protocol, test-command
drain, and Outbox-aware test stubs -- that all have to land
together for the system to keep working.

Test2::Harness2::Collector
  - _ipc_client switches the client to send_blocking=0 at
    construction; events flow through try_send_message and queue on
    EAGAIN.
  - The collection loop drains the outbox at the top of each
    iteration and includes any backlogged writable handles in the
    IO::Select wait so the loop wakes the moment the kernel makes
    room.
  - _exit_mirroring_child drains for up to 5s before _exit so events
    queued during the run are not dropped on shutdown.
  - _send_to keeps its retry-on-peer-disappear loop, now driven by
    try_send_message instead of send_message.
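The "queue on EAGAIN, drain later" pattern above can be illustrated with a plain non-blocking pipe. `try_send` and `@outbox` are generic stand-ins, not the real Outbox API (which frames messages via Atomic::Pipe rather than raw syswrite):

```perl
use strict;
use warnings;
use Errno qw(EAGAIN EWOULDBLOCK);
use Fcntl qw(F_GETFL F_SETFL O_NONBLOCK);

pipe(my $r, my $w) or die "pipe: $!";
my $flags = fcntl($w, F_GETFL, 0) or die "fcntl: $!";
fcntl($w, F_SETFL, $flags | O_NONBLOCK) or die "fcntl: $!";

my @outbox;
sub try_send {
    my ($fh, $msg) = @_;
    my $n = syswrite($fh, $msg);
    if (!defined $n) {
        die "write: $!" unless $! == EAGAIN || $! == EWOULDBLOCK;
        push @outbox, $msg;    # kernel buffer full: queue, don't block
        return 0;
    }
    push @outbox, substr($msg, $n) if $n < length $msg;  # unsent tail
    return 1;
}

try_send($w, "event 1\n");     # small write fits the pipe buffer
print "backlog=", scalar(@outbox), "\n";    # prints "backlog=0"
```

The collector's drain step then retries `@outbox` entries whenever `IO::Select` reports the handle writable, which is why the backlogged handles are added to the select wait.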

Test2::Harness2 (control plane)
  - request_handler_has_pending_messages returns
    {ok, idle, pending, running, queued}; the current request itself
    is excluded from the pending count because the response is
    queued AFTER the handler returns.

Test2::Harness2::Spawn
  - has_pending_messages: single non-blocking idle check.
  - wait_until_idle($timeout): poll until idle or deadline (default
    30 s, 0 = unbounded). Peer-gone errors count as idle.
  - finish() drains via wait_until_idle(30) before sending the
    finish request so non-blocking sends made during the run are
    not dropped.
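The wait_until_idle contract described above amounts to a poll-with-deadline loop. A minimal sketch, where `$is_idle` stands in for the real has_pending_messages round trip (on which a peer-gone error would also count as idle):

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep time);

sub wait_until_idle {
    my ($is_idle, $timeout) = @_;
    $timeout = 30 unless defined $timeout;               # default 30 s
    my $deadline = $timeout ? time() + $timeout : undef; # 0 = unbounded
    while (1) {
        return 1 if $is_idle->();
        return 0 if $deadline && time() >= $deadline;
        sleep 0.05;    # don't hot-spin between probes
    }
}

my $calls = 0;
my $ok = wait_until_idle(sub { ++$calls >= 3 }, 5);
print "ok=$ok calls=$calls\n";    # prints "ok=1 calls=3"
```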

App::Yath2::Command::test
  - Drains the harness via wait_until_idle before issuing finish.

t/AI test stubs
  - Collector and replay/run unit-test fakes that stub
    IPC::Manager::Client extend their fake clients with the Outbox
    API (try_send_message, pending_sends, drain_pending,
    have_writable_handles, set_send_blocking) so the new
    collector-side code paths exercise correctly under the stubs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Centralizes the harness's IPC defaults in Test2::Harness2::Util::IPC
and applies them at every call site:

  ipc_default_protocol     -> IPC::Manager::Client::ConnectionUnix
  ipc_default_serializer   -> ['JSON::Zstd', level => 3,
                               dictionary => share/other/zstd.dict]
  ipc_zstd_dict_path       -> File::ShareDir::dist_file lookup
                              (undef when uninstalled)
  ipc_default_spawn_args   -> kwargs for ipcm_spawn
  ipc_default_connect_args -> (listen => 0) for non-services

ConnectionUnix builds a per-peer SOCK_STREAM driver with an optional
listen socket. Services keep listen=1 (Role::Service's client builder
is unchanged) so peers can reach them; the collector and the
parent-side Test2::Harness2::Spawn handle now connect with listen=0,
since they only ever send upward.

Test2::Harness2::Spawn pre-builds its IPC client with listen=0 and
hands it to IPC::Manager::Service::Handle->new via the +client slot,
bypassing the Handle's lazy listening builder.

JSON::Zstd's dictionary is resolved via File::ShareDir::dist_file
('Test2-Harness2', 'other/zstd.dict'); both endpoints must land on
the same dictionary path with identical content, which the shared
install location guarantees. When the share file is unavailable
(uninstalled checkouts without ShareDir staging) the serializer
falls back to dictless compression rather than failing.
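The lookup-with-fallback might be sketched as below. File::ShareDir::dist_file croaks when the share file is not staged, so an eval guard yields the undef that triggers the dictless path; the eval shape is illustrative, the real helper may differ:

```perl
use strict;
use warnings;

# Resolve the installed zstd dictionary; return undef (dictless
# fallback) for uninstalled checkouts without ShareDir staging.
sub ipc_zstd_dict_path {
    my $path = eval {
        require File::ShareDir;
        File::ShareDir::dist_file('Test2-Harness2', 'other/zstd.dict');
    };
    return defined($path) && -e $path ? $path : undef;
}

printf "dict: %s\n", ipc_zstd_dict_path() // '(none, dictless fallback)';
```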

App::Yath2::Options::IPC's --ipc-protocol default also flips from
MessageFiles to ConnectionUnix so the option layer agrees with the
underlying default.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Atomic::Pipe 0.026 added transparent zstd compression for the
write_message / write_burst / get_line_burst_or_data paths in mixed
data mode; plain print writes are intentionally left untouched so a
pipe can still double as STDOUT/STDERR for an unaware (e.g.
non-perl) downstream reader.

Adds atomic_pipe_compression_args() and apply_atomic_pipe_compression()
helpers in Test2::Harness2::Util::IPC -- the constructor-time form
plumbs through Atomic::Pipe->pair, while the post-construction form
is needed by the from_fh / from_fd wrappers that do not accept
compression kwargs at construction. Both sides resolve the
install-shipped dictionary via File::ShareDir
(share/other/zstd.dict) and fall back to dictless compression when
the share file is unavailable, which keeps the wire symmetric for
uninstalled checkouts where every endpoint resolves the same way.

Applied at:
  - Test2::Harness2::Collector::_launch_child (out/err pair)
  - Test2::Harness2::Collector::_interpose (out/err pair)
  - Test2::Harness2::Collector::_wrap_handle (from_fh reader)
  - Test2::Harness2::Util::EventEmitter::_as_atomic_pipe (from_fh
    writer that promotes STDOUT/STDERR)

The collector's reader-side and the emitter's writer-side now use
the same level + dictionary, so framed event JSON compresses on
write_message and decodes back to plaintext via
get_line_burst_or_data. STDOUT/STDERR text from the test child's
own print calls passes through uncompressed as before.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… logger

Atomic::Pipe's keep_compressed mode exposes the on-wire compressed
bytes alongside the decompressed payload from
get_line_burst_or_data. Those bytes compress the same plaintext
(bare JSON, no inter-record newline) the JSONL.zst logger would
produce, so the harness can carry the frame from the collector
through the parser and write it to disk verbatim instead of
recompressing the same JSON twice.

Newline policy:
  - Atomic::Pipe message frames self-delimit. Producers
    (EventEmitter, JSONL.zst writer, Util::File::JSONL::Zstd::write)
    no longer inject a trailing "\n" inside the compressed
    plaintext.
  - The plain (non-compressed) JSONL writer keeps "\n" between
    records since plain jsonl needs it as a separator.
  - The yath extract command is the canonical place that
    materializes compressed jsonl as plaintext jsonl; it strips
    any trailing newlines from each frame's payload and joins
    with exactly one "\n", appending a trailing "\n" so the file
    is canonical jsonl. Single-frame .json.zst snapshots pass
    through unchanged.
  - Util::Zstd::Reader walks frames via zstd_frame_size in both
    dict and no-dict modes; readline returns the next frame's
    decoded payload (no newline assumed). The compression-form
    unit tests strip leading/trailing whitespace before
    comparing, since JSON parsers ignore it.
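The extract-side canonicalization in the policy above is a pure string operation. A sketch, where `frames_to_jsonl` is a hypothetical helper name:

```perl
use strict;
use warnings;

# Strip any trailing newlines from each frame's decoded payload,
# join with exactly one "\n", and append a trailing "\n" so the
# result is canonical jsonl regardless of what each producer wrote.
sub frames_to_jsonl {
    my (@payloads) = @_;
    return join("\n", map { my $p = $_; $p =~ s/\n+\z//; $p } @payloads) . "\n";
}

print frames_to_jsonl(qq({"a":1}\n), qq({"b":2}), qq({"c":3}\n\n));
```

Mixed inputs (with, without, and with doubled trailing newlines) all collapse to three clean jsonl records.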

Carrying the frame:
  - Test2::Harness2::Event gains a public compressed_form slot, a
    clear_compressed_form helper that also drops the +json cache,
    and TO_JSON now strips compressed_form so the binary bytes do
    not leak into the JSON encoding.
  - Collector::_read_handle reads the optional `compressed => $raw`
    pair from get_line_burst_or_data and threads $raw through
    _ingest_item / _flush_buffer into IOParser::parse_io's new
    `compressed` named param, which stashes it on the resulting
    Event.

Auditor invalidation:
  - Test2::Harness2::Collector::Auditor::Test calls a new
    _drop_compressed_cache helper at every site that mutates the
    audited event (subtest_start no_render flag, subtest_end
    rebuild, the close-by accumulation path, and the head of
    _subtest_process which mutates facet_data extensively). The
    helper deletes both compressed_form and the json cache and
    accepts plain hashref events, since the existing unit tests
    drive audit_event with hashrefs.

Logger fast path:
  - Test2::Harness2::Collector::Logger::JSONL::log_event checks
    $event->compressed_form when writing into a Util::Zstd::Writer
    and forwards the bytes to a new print_raw_frame method that
    appends them via syswrite without recompressing. Without
    compressed_form (e.g. harness-internal events that never went
    through a pipe, or events the auditor cleared) the fallback
    path runs as before.
  - Util::Zstd::Writer factors the syswrite into _emit_frame and
    exposes print_raw_frame for callers that already hold a
    compressed frame.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
render_parent used `$sf->{harness} ||= $f->{harness}` to propagate the
parent's harness facet onto each child event. Each child already carries
its own harness hash (with event_id only), so the `||=` no-ops and
job_id never reaches the child. Result: subtest-internal events render
with "RUNNER" attribution instead of the owning job.

Replace with explicit per-key `//=` merge of job_id and run_id from
the parent harness onto the child.
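The difference is easy to demonstrate with simplified stand-in hashes (the real facet structures carry more keys):

```perl
use strict;
use warnings;

my $parent = { harness => { job_id => 42, run_id => 'r1', event_id => 'p' } };
my $child  = { harness => { event_id => 'c' } };    # own hash, no job_id

my $sf = $child->{harness};
$sf ||= $parent->{harness};    # hash is truthy: assignment never fires
print "after ||=: job_id=", $sf->{job_id} // 'MISSING', "\n";

# Per-key //= fills only the fields the child lacks.
$sf->{$_} //= $parent->{harness}{$_} for qw(job_id run_id);
print "after //=: job_id=$sf->{job_id} run_id=$sf->{run_id}\n";
```

The first print shows `MISSING` (the `||=` no-op), the second shows the parent's job_id and run_id propagated while the child keeps its own event_id.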

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When an integration test runs under an outer yath, the outer worker
sets TMPDIR to a per-worker subdirectory like /tmp/yath-XXXXXXXX/tmp.
Inner yath's IPC::Manager builds its unix-socket route under TMPDIR;
sun_path is capped at 108 bytes on Linux (104 on the BSDs), leaving no
room for the 42-byte hashed peer-id under such a deep route. The test
child crashes with
"Cannot map peer id ... exceeds available budget", the inner harness
shuts down, and the renderer fails with "peer 'harness' went away".

Reset TMPDIR=/tmp in the spawned child only, so inner yath gets a
short route while the parent test process keeps the worker TMPDIR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The harness_run_end aggregate previously summed per-collector
times() deltas as the run's CPU figures. Two problems:

  1. The collector's _START_TIMES baseline was captured in init(),
     before _spawn_collector() forks the collector child. After
     fork the child's tms_utime/tms_stime/tms_cutime/tms_cstime
     all reset to 0, so the end - start delta ran negative.

  2. Even with the baseline reset, the aggregate represented
     "collector loop CPU + watched-child CPU" rather than just
     the watched test process. For users wanting a summary of
     test-job CPU, that mixes collector overhead in with usr/sys.

Reset _START_TIMES at the top of _run_collector so it's captured
in the post-fork process. Then add a separate per-job pair of
fields covering only the watched child:

  child_times   [usr, sys, cusr, csys] delta from pre-fork to
                post-waitpid in the parent of the watched child.
                Captures the whole child tree's CPU via cuser/
                csystem, plus collector-loop CPU between fork
                and reap (kept simple; the full delta is fine).
  child_wall    seconds elapsed from pre-fork to post-waitpid.

Reported on harness_process_exit by every collector (including
the harness interpose path). Test-job collectors propagate it
through harness_job_exit -> test_job_completed -> RunService
results so the aggregator picks it up.

The harness_run_end facet now exposes:

  wall_time            run service start -> last completed_at
                       (unchanged semantics, relabeled in summary)
  cumulative_job_time  sum of child_wall over completed jobs
                       (would-be-serial wall time)
  cpu_times/cpu_total  sum of child_times over completed jobs
  cpu_usage            cpu_total / wall_time * 100 (cores worth;
                       can exceed 100% with parallelism)
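The child_times/child_wall capture reduces to snapshotting times() before fork and after waitpid in the parent, where the cuser/csystem deltas cover the reaped child tree's CPU. A self-contained sketch (not the harness's actual code path):

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

my @t0 = times();             # (usr, sys, cusr, csys) baseline, pre-fork
my $w0 = time();

my $pid = fork() // die "fork: $!";
if (!$pid) {                  # watched child: burn a little CPU
    my $x = 0; $x += $_ for 1 .. 200_000;
    exit 0;
}
waitpid($pid, 0);             # child CPU folds into cusr/csys on reap

my @t1 = times();
my @child_times = map { $t1[$_] - $t0[$_] } 0 .. 3;
my $child_wall  = time() - $w0;

# cpu_usage as in the summary: total CPU over wall time, in percent.
my $cpu_total = 0; $cpu_total += $_ for @child_times;
printf "wall=%.3fs cpu=%.3fs usage=%.0f%%\n",
    $child_wall, $cpu_total, $child_wall ? 100 * $cpu_total / $child_wall : 0;
```

Run pre-fork in the collector child (per problem 1 above), the baseline starts from that process's own zeroed counters, so the delta can no longer go negative.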

Renderer/Summary now formats Run Wall Time, Cumulative Job Time,
CPU Time, and the per-component (usr/sys/cusr/csys) values
through Test2::Util::Times::render_duration, re-exported via
Test2::Harness2::Util. Output:

    Run Wall Time: 11.5985s
  Aggregate Job Stats:
      Cumulative Job Time: 56.8815s
                 CPU Time: 11.9900s (usr: ... | sys: ... | cusr: ... | csys: ...)
                CPU Usage: 103%

t/AI/unit/Harness2/Role/Service.t: stub client() on T::Svc::Fake
so run_on_start does not warn about a missing IPC bus.

t/Yath/integration/replay.t: extend strip patterns to cover the
new Run Wall Time / Aggregate Job Stats lines and use ".*$" for
robustness against future duration-format changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a second yath replay scenario that replays a pre-committed archive
generated from a fixture exercising all common renderer event types
(PASS, FAIL, PLAN, NOTE, DIAG, TODO, SKIP, ! PASS !, REASON, FAILED).
The replay runs with -v to enable verbose output, then checks a %seen
hash to assert every expected tag appears at least once. No golden
comparison -- ordering is irrelevant; only presence matters.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
In this case it should look a lot like preloads used to
look on the old yath (still using goto::file, etc.),
but it at least now accounts for the new multi-collector
model.

This one was mostly done via LLM, not just tests, though
it at least looked right to me after reviewing the change.

Interestingly, we avoid the MSWin32 issue here for obvious
reasons. Possibly a lot of the issues here could have been
solved by using MCE as our concurrency model, but that
would be a major architectural change, and I'm not even
sure we need to reach for something that powerful for
what we're doing here.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>