fix(collector): make trace failures self-explanatory in the JSON envelope#5
Merged
Merged
Conversation
…lope
When a trace run streamed to the dashboard, every failure mode was lossy in
exactly the channel an agent consumes (--json + exit code), so a real
debugging session spent minutes chasing the wrong cause. Each fix puts the
signal where the agent looks.
- Collector route: separate fault domains. A dead/misconfigured store now
returns 503 STORE_UNAVAILABLE (internal error logged, not leaked) instead
of collapsing into the same 400 as a malformed envelope. Every response
carries a machine `code` from the shared vocabulary (INGEST / INGEST_INVALID
/ INGEST_NO_SESSION / STORE), so a caller never has to string-match. A
Postgres auth failure is no longer indistinguishable from a schema error.
- Emit failures surface. Collector.emit returns a rich EmitResult
({ok,status,body,error}) and logs rejections at error with the reason,
instead of logging a 400 at info and swallowing a bare false. The run folds
any rejected emit into the printed/--json envelope as an EMIT diagnostic.
- 0-hit runs explain themselves. A bound-but-never-hit breakpoint now emits a
BP_BOUND_UNHIT diagnostic into the envelope (the JSON counterpart of the
renderer's "no breakpoints hit" line), so a --json reader can tell "trigger
didn't run" from "wrong line" without grepping server logs.
- --concise help states it trims the PRINTED envelope only and does NOT affect
--emit (the collector always gets the full envelope) — closing the false
lead that toggling it changes the wire payload.
- serve prints the exact --emit URL to copy; fix the stale :4747 compose
comment (host port is 14747).
Adds Code.BP_BOUND_UNHIT, re-exports Code from trace-cli/server for the route,
and covers the new emit-result shape + code in output.test.js. CLI + UI
typecheck clean; 41/41 tests pass (1 DB round-trip skipped).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR improves end-to-end debuggability of --emit/collector failures and “0-hit” dynamic runs for --json consumers by introducing shared machine-readable codes and surfacing emit/ingest failure reasons in structured outputs instead of losing them in human logs.
Changes:
- Adds collector ingest fault-domain separation (4xx envelope issues vs 503 store issues) and returns shared
Codevalues in ingest responses. - Makes
Collector.emitreturn a structured result and updates CLI flow to flush emits before rendering and emit anEMITdiagnostic on failures. - Introduces
BP_BOUND_UNHITto explain “breakpoints bound but never hit” in JSON diagnostics; updates serve output and compose comment for port clarity.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| ui/app/v1/traces/route.ts | Adds structured ingest error codes and distinguishes client envelope errors from store outages (503). |
| src/collector/Collector.ts | Changes emit() to return a rich result and logs HTTP rejections as errors with collector-provided details. |
| src/cli/Cli.ts | Flushes pending emits before printing output and surfaces emit failures as diagnostics; clarifies --concise help text. |
| src/cli/commands/DynamicCommand.ts | Adds BP_BOUND_UNHIT warning when breakpoints bind but no events fire. |
| src/shared/codes.ts | Adds BP_BOUND_UNHIT and collector/emit-related codes to the shared code vocabulary. |
| src/server.ts | Re-exports Code so the Next.js collector route can reuse the shared vocabulary. |
| test/output.test.js | Extends tests to cover new code and Collector.emit structured failure behavior. |
| docker-compose.yml | Fixes a stale compose comment to reference the correct exposed host port. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+40
to
+42
| const collectorUrl = `http://localhost:${port}`; | ||
| log.info("starting dashboard", { url: collectorUrl, host }); | ||
| log.info("stream traces here", { hint: `trace dynamic … --emit ${collectorUrl} (or: export TRACE_COLLECTOR_URL=${collectorUrl})` }); |
Comment on lines
+124
to
128
| const emitToCollector = collector | ||
| ? (envelope: unknown) => { emitChain = emitChain.then(async () => { const result = await Collector.emit(collector, envelope); if (!result.ok) emitFailures.push(result); }); } | ||
| : undefined; | ||
|
|
||
| const { trace } = await this.#dynamic.run({ |
Comment on lines
+37
to
+41
| // A rejection is a real failure — log it at error with the collector's reason (e.g. "invalid envelope", | ||
| // "trace store unavailable"), not at info. The caller folds this into the envelope's diagnostics. | ||
| const body = await response.text().catch(() => ""); | ||
| log.error("emit rejected", { code: Code.EMIT, endpoint, status: response.status, body: body.slice(0, 500) }); | ||
| return { ok: false, status: response.status, body }; |
burrows99
added a commit
that referenced
this pull request
Jun 19, 2026
…ccurately (#6) * fix(collector): bound emit failure memory and word network failures accurately Follow-up to the emit-feedback diagnostics (#5), addressing two Copilot review notes: - Cli.ts kept every failed EmitResult in an array but only ever read the count and the last one. On a run that emits per onProgress, repeated failures grew that array unbounded. Replace it with a counter plus the most recent failure. - The emit diagnostic said the collector "rejected" the emits even when the POST never reached it (connection refused/timeout/DNS — no HTTP status). Word an HTTP rejection and a delivery failure distinctly. - Collector.emit returned the full rejection body, which a caller retains; a large error page could bloat memory. Cap the stored body at 10k chars (the log line still truncates to 500 independently). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(trace): make empty/aborted runs fail loudly instead of silently "running" (#7) A chrome run that captured nothing and recorded nothing was reported as success (ok:true, exit 0) and could sit on the dashboard's "running" badge forever — the agent had no signal it was broken. Root cause: the failure signals lived in stderr logs or dashboard-only state, never in the JSON envelope the agent reads. This closes those gaps and the matching violations of the logger's own two-channel contract (codes.ts: the stderr log and the envelope diagnostic for one event should carry the SAME code). - Terminal envelope on abort. If a run throws (attach failed, engine crashed, recording threw), DynamicCommand now emits a terminal envelope via onProgress — no `running` flag, ok:false, an ENGINE_FATAL diagnostic — so the collector resolves the session instead of leaving its initial "running" partial orphaned. Cli flushes that emit before exiting non-zero. - Recording outcomes are now envelope diagnostics, not just stderr. RECORD_EMPTY (no frames → empty video) and RECORD (render/upload threw) were log-only, so "no video" was invisible to a --json reader. Both now push a warn diagnostic carrying the same code as the log. - Upload failure no longer masquerades as a clean local save. S3ArtifactStore swallows failures and returns null, which #record could not tell apart from "no S3 configured". It now distinguishes the two and emits an UPLOAD diagnostic when a configured upload fails (link missing, local copy kept). - ENGINE_FATAL is now logged as well as diagnosed, so it appears in both channels per the contract. Adds test/dynamic-diagnostics.test.js (fake-tracer unit tests for the abort terminal envelope, captured-fatal, and clean-empty cases). typecheck + build clean; 44/45 tests pass (1 DB round-trip skipped). Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(collector): bound the rejection body at the stream, add emit-diagnostic tests Addresses the two Copilot review notes on the PR: - Collector.emit buffered the whole rejection body via response.text() before slicing, so an oversized error page still caused a transient memory spike and download latency even though the stored body was capped. Read only up to MAX_BODY_CHARS off the stream, then cancel it — bounding memory and latency at the source. - Extract the emit-failure diagnostic wording into an exported emitFailureMessage() helper and cover it with focused tests: HTTP status → "rejected", no status (network) → "failed" (the regression guard so a POST that never landed isn't reported as a rejection), plus the no-body / missing-error / oversized-body edge cases. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(collector): flush the TextDecoder so a split multi-byte char isn't dropped readCappedText decoded chunks with { stream: true } but never did a final decode() flush. A response ending on a multi-byte UTF-8 sequence split across the last chunk boundary would drop/garble that trailing character in the stored body and logged reason. Flush after the read loop. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
burrows99
added a commit
that referenced
this pull request
Jun 19, 2026
…#8) Keep the prose docs in step with the behavior changes from the collector/trace fixes (#5–#7). - README: the docker-compose section showed the container ports (:4747 · :5432 · :9000) instead of the host-published ports the compose file actually maps. Correct to :14747 · :65432 · :19000 (console :19001), and the S3_ENDPOINT example to :19000, so the README matches docker-compose.yml. The native-serve example keeps :5432 (your own Postgres) — that one was right. - skills/trace/SKILL.md: add a "Reading the result" section. The skill is deliberately minimal (the binary is the source of truth), but it gave zero guidance on interpreting a trace — the gap behind an agent treating an empty or perpetually-"running" trace as success. Names the three stable fields that decide whether a run did what was asked (ok · diagnostics[].code · meta.running) and the rule that ok:true + events:[] + a warn diagnostic means "ran fine, saw nothing — re-aim," not success. Field-level (no drift-prone code list).
burrows99
added a commit
that referenced
this pull request
Jun 19, 2026
The DX-gaps analysis enumerated six ways the CLI misled an agent; all six have since shipped. Keep the analysis verbatim as the rationale, but add a status banner + table and a per-gap resolution note pointing to where each was closed (and flagging the few sub-items still open: dashboard age-out of stale "running" (#6b), done/upload decoupling (#6c), and lockfile-based port auto-discovery (#5)). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Why
Minted from a real debugging session: the trace itself worked, but every failure after it was lossy in exactly the channel an agent consumes (
--json+ exit code). The model spent ~7 minutes chasing a wrong cause. The root pattern: the CLI knew the answer but put it where the agent doesn't look (human-render lines,infologs, or one undifferentiated400). Each fix moves the signal into the envelope.The session's smoking gun: a real
--emitfailed withpassword authentication failed for user "trace"— a Postgres failure — returned as 400, the same code as a malformed envelope. The model probed with a junk payload, got a different 400 ("invalid envelope"), fused the two, and concluded--concisewas trimming required fields (it isn't — see #3).Changes
1. Collector route — separate fault domains, add machine codes, stop leaking internals (
ui/app/v1/traces/route.ts)4xx(caller's fault). Store/DB failure →503 STORE_UNAVAILABLE, real cause logged server-side, not returned.codefrom the shared vocabulary (INGEST/INGEST_INVALID/INGEST_NO_SESSION/STORE). No more string-matching; infra outage ≠ bad envelope.Codefromtrace-cli/serverso the route reuses the one vocabulary instead of drifting literals.2. Emit failures surface instead of vanishing (
src/collector/Collector.ts,src/cli/Cli.ts)Collector.emitreturns a richEmitResult {ok,status,body,error}and logs rejections at error with the reason — previously a 400 logged atinfoand the barefalsewas swallowed by.catch(() => false).--jsonenvelope as anEMITdiagnostic. A run whose every emit 400'd no longer prints "success."3.
--conciseno longer reads like it shapes the wire (src/cli/Cli.ts)--emit(the collector always receives the fulltrace.toJSON()). Closes the false lead the session chased.4. 0-hit runs explain themselves in JSON (
src/shared/codes.ts,src/cli/commands/DynamicCommand.ts)BP_BOUND_UNHITdiagnostic when a breakpoint binds but never fires — the JSON counterpart of the renderer's "no breakpoints hit" line. A--jsonreader can now tell "trigger didn't run" from "wrong line" without grepping server logs.5. Port clarity (
src/cli/commands/ServeCommand.ts,docker-compose.yml)trace serveprints the exact--emitURL to copy (auto-discovery only probes the well-known ports). Fixed the stale:4747compose comment (host port is14747).Tests
test/output.test.js: assertsCode.BP_BOUND_UNHIT, and thatCollector.emitresolves to a structured result (never throws, never a bare bool) on a refused POST.tsc --noEmitclean · UItsc --noEmitclean ·npm test→ 41/41 pass (1 DB round-trip skipped).Base
Branched off
master(which already carries the14747host-port mappings — fix #5 corrects the lone stale:4747compose comment they left behind).🤖 Generated with Claude Code