docs: sync README host ports + teach the skill to read the result envelope#8
Merged
Merged
Conversation
Keep the prose docs in step with the behavior changes from the collector/trace fixes (#5–#7). - README: the docker-compose section showed the container ports (:4747 · :5432 · :9000) instead of the host-published ports the compose file actually maps. Correct to :14747 · :65432 · :19000 (console :19001), and the S3_ENDPOINT example to :19000, so the README matches docker-compose.yml. The native-serve example keeps :5432 (your own Postgres) — that one was right. - skills/trace/SKILL.md: add a "Reading the result" section. The skill is deliberately minimal (the binary is the source of truth), but it gave zero guidance on interpreting a trace — the gap behind an agent treating an empty or perpetually-"running" trace as success. Names the three stable fields that decide whether a run did what was asked (ok · diagnostics[].code · meta.running) and the rule that ok:true + events:[] + a warn diagnostic means "ran fine, saw nothing — re-aim," not success. Field-level (no drift-prone code list).
There was a problem hiding this comment.
Pull request overview
Updates documentation to match current docker-compose.yml host port mappings and adds agent-facing guidance in the trace skill for correctly interpreting the trace result envelope (beyond just exit code), reducing false “success” conclusions on empty/partial runs.
Changes:
- Sync Docker Compose quick-start port comments and
S3_ENDPOINTexample inREADME.mdto match host-published ports. - Add a “Reading the result” section to
skills/trace/SKILL.mdexplaining how to interpretok,diagnostics[].code, andmeta.running.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| README.md | Updates documented host ports and S3_ENDPOINT to match Compose mappings. |
| skills/trace/SKILL.md | Adds guidance for agents on interpreting the result envelope (ok, diagnostics, meta.running). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| - **`diagnostics[]`** — each carries a stable `code` (the same vocabulary the stderr logs use, so you can join the two). **`warn`-level codes do *not* flip `ok`** — so a run can be `ok:true` and still have observed or recorded nothing: a breakpoint that *bound but never hit*, a Chrome run with no frames, or a recording/upload that failed. **Read `diagnostics` before concluding a trace succeeded.** | ||
| - **`meta.running`** — `true` marks a *partial*, mid-run envelope (streamed to the collector while the run is in flight); it is absent on the finished trace. A dashboard session stuck on `running` means the run never finished (it aborted, or the final envelope never arrived) — not that it's still working. | ||
|
|
||
| Rule of thumb: `ok:true` **+ `events: []` + a `warn` diagnostic** = the trace ran fine but saw nothing (wrong line, branch not taken, or the trigger never fired) — re-aim the breakpoint/trigger, don't report success. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Keeps the prose docs in step with the behavior shipped in #5–#7. Two doc surfaces were stale/incomplete.
README — host ports contradicted
docker-compose.ymlThe docker-compose quick-start showed the container ports instead of the host-published ports the compose file maps:
:4747:14747:5432:65432:9000:19000(+ console:19001)Also fixed the
S3_ENDPOINTexample to:19000. The native-serve example keeps:5432(your own Postgres) — that one was already correct.skills/trace/SKILL.md — no result-interpretation guidance
The skill is deliberately minimal (the binary is the source of truth via
manifest/schema), but it gave zero guidance on reading a trace — the exact gap behind an agent treating an empty or perpetually-runningtrace as success.Added a short "Reading the result" section naming the three stable fields that decide whether a run did what was asked:
ok—falseonly on error-level diagnostics; exit code mirrors it.diagnostics[].code—warncodes do not flipok, sook:truecan still mean "observed/recorded nothing" (breakpoint bound-but-unhit, no frames, failed upload). Read diagnostics before concluding success.meta.running—trueis a partial mid-run envelope; a dashboard session stuck onrunningmeans the run never finished, not that it's still working.With the rule of thumb:
ok:true+events:[]+ awarndiagnostic = ran fine, saw nothing → re-aim, don't report success. Field-level only (no drift-prone code list), consistent with the skill's "read it from the binary" philosophy.Docs-only; no code change.
🤖 Generated with Claude Code