Merged
49 changes: 47 additions & 2 deletions .claude/skills/dikw-web-verify-frontend/SKILL.md
@@ -92,6 +92,50 @@ no-UI-framework, dark reader contrast, graph filters/legend/no-bloom, the
markdown HTML allow-list, and the surface contracts. Items it marks "e2e: …" are
already gated — for those, re-run that spec instead of eyeballing.

## Step 2.5 — Measured perf + a11y (Chrome DevTools MCP)

Turn the **eyeballed** a11y / contrast / perf items of the rubric into a *measured*
pass against numbers, not vibes. Use the **`chrome-devtools-mcp`** plugin (already
installed — `lighthouse_audit`, `performance_start_trace` / `performance_stop_trace`,
`performance_analyze_insight`; skills `chrome-devtools-mcp:a11y-debugging` and
`debug-optimize-lcp`). This is the verification step the Chrome MCP interaction pass
(Step 1) can't give you. **Run it for the route(s) the diff touched**; skip a route the
change can't affect.

Two different tools — don't conflate them: **`lighthouse_audit` excludes performance**
(its tool reference directs perf to the trace tools), so a11y comes from Lighthouse and
Web Vitals come from a performance trace.

1. Open the changed route in a Chrome DevTools MCP page at
`http://127.0.0.1:4321/#<route>` (reuse the running dev server).
2. **Accessibility (+ best-practices) → `lighthouse_audit`.** Run it with the
**accessibility** and **best-practices** categories (**not** `performance` — the tool
excludes it). The `a11y-debugging` skill walks specific failures (semantic HTML, ARIA
labels, focus order, tap-target size, contrast ratios).
3. **Web Vitals → a performance trace.** `performance_start_trace` (reload = true so the
load is captured) → exercise the route → `performance_stop_trace`; read CLS + LCP from
the trace, and use `performance_analyze_insight` on the LCP/CLS insight for detail.
(The `debug-optimize-lcp` skill covers this flow.)
4. Score against this rubric (the budget is a floor, not a target):
- **Accessibility ≥ 0.9**, and **no new violation** vs `main` for the route — from the
Lighthouse pass. Treat a dropped score as a fail; fix the contrast / label / role and
re-audit. This backs the rubric's "contrast ≥ 4.5:1 body / 3:1 headings" with a number.
- **CLS ≤ 0.1** — from the trace. Already gated by `tests/e2e/perf.spec.ts` on the
primary routes; here it's a cross-check on the *changed* route, and the trace shows
*which* element shifted so a regression is fixable, not just flagged.
- **LCP** — from the trace; a **soft** budget: record it and flag a clear regression vs
`main`, but it's runner-dependent (annotated, not hard-gated, in `perf.spec.ts`).
5. **Pixi `#graph` caveat (same root cause as Step 1's gotcha):** a background
DevTools MCP tab can stall `requestAnimationFrame`, so a performance trace of `#graph`
may capture a canvas that never animated. Trace graph perf only in a foreground page,
or skip the trace there and rely on `graph.spec.ts` for its render contract. The
Lighthouse accessibility audit (DOM-based) is unaffected — the node overlay exposes
stable button targets, so run a11y on `#graph` normally.
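
The scoring in item 4 can be sketched as a small pure function. This is a minimal sketch under stated assumptions: the `RouteMeasurement` shape is hypothetical (numbers a reviewer copies out of the Lighthouse report and the trace, not the MCP tools' output format), and the 20% LCP tolerance is an illustrative stand-in for "a clear regression", which the skill does not pin to a number.

```typescript
// Hypothetical input shape: values copied by hand from the Lighthouse report
// and the performance trace. It is NOT the output format of the MCP tools.
interface RouteMeasurement {
  a11yScore: number; // Lighthouse accessibility score, 0..1
  baselineA11yScore?: number; // same route on main, if recorded
  newViolations: string[]; // audits failing on the branch but not on main
  cls: number; // Cumulative Layout Shift read from the trace
  lcpMs: number; // Largest Contentful Paint read from the trace, in ms
  baselineLcpMs?: number; // LCP for the same route on main, if recorded
}

type Verdict = "pass" | "fail" | "warn";

interface RubricResult {
  accessibility: Verdict;
  cls: Verdict;
  lcp: Verdict;
  ok: boolean; // true only when both hard checks pass
}

const A11Y_FLOOR = 0.9;
const CLS_CEILING = 0.1;
// Assumed tolerance for "a clear regression"; not a number from the skill.
const LCP_REGRESSION_FACTOR = 1.2;

function scoreRoute(m: RouteMeasurement): RubricResult {
  // A score below the floor, a new violation, or a drop vs main all fail.
  const dropped =
    m.baselineA11yScore !== undefined && m.a11yScore < m.baselineA11yScore;
  const accessibility: Verdict =
    m.a11yScore >= A11Y_FLOOR && m.newViolations.length === 0 && !dropped
      ? "pass"
      : "fail";

  const cls: Verdict = m.cls <= CLS_CEILING ? "pass" : "fail";

  // LCP is a soft budget: a regression is flagged as "warn", never "fail".
  const regressed =
    m.baselineLcpMs !== undefined &&
    m.lcpMs > m.baselineLcpMs * LCP_REGRESSION_FACTOR;
  const lcp: Verdict = regressed ? "warn" : "pass";

  return { accessibility, cls, lcp, ok: accessibility === "pass" && cls === "pass" };
}

// Example: a route that clears both hard checks but regressed on LCP.
const example = scoreRoute({
  a11yScore: 0.96,
  newViolations: [],
  cls: 0.03,
  lcpMs: 2600,
  baselineLcpMs: 1800,
});
console.log(example);
```

Keeping LCP out of `ok` mirrors the rubric: only accessibility and CLS are hard checks, while an LCP regression is recorded for the reviewer to judge.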

These are **measured-locally** checks, not new CI gates (Lighthouse + trace timing is
runner-dependent — the same reason `perf.spec.ts` gates only CLS). A ❌ here feeds Step
4's loop like any other finding.

## Step 3 — (if the change touches core data shape) smoke the live contract

If the change reads a different `/v1` field/shape, the mocked e2e suite can't
@@ -102,5 +146,6 @@ Skip when the change is purely presentational.
## Step 4 — Close the loop

Any ❌ → fix the source, re-run the affected gate, re-verify the route. Only
-report the UI change done once behavior + rubric pass clean in both themes. This
-skill is Step 5 of the `dikw-web-delivery-workflow`.
+report the UI change done once behavior (Step 1), the rubric (Step 2), and the
+measured perf + a11y pass (Step 2.5) are clean in both themes. This skill is Step 5
+of the `dikw-web-delivery-workflow`.
15 changes: 15 additions & 0 deletions CHANGELOG.md
@@ -9,6 +9,21 @@ file format introduced in `[0.0.1.0]` was dropped.

## [Unreleased]

## [0.8.8] - 2026-06-29

### Changed

- **`dikw-web-verify-frontend` gains a measured perf + a11y pass (Step 2.5).** The
frontend-verify skill previously eyeballed the `docs/ui-checklist.md` a11y / contrast
/ perf items. It now uses the already-installed `chrome-devtools-mcp`: `lighthouse_audit`
for **accessibility + best-practices** (the tool excludes performance), plus a
`performance_start_trace`/`performance_stop_trace` for **Web Vitals**, scored against a
rubric: **a11y ≥ 0.9** with no new violation, **CLS ≤ 0.1** (cross-checking the
`perf.spec.ts` gate), and **LCP** recorded as a soft budget. This turns the qualitative
items into numbers. Locally measured, not a new CI gate (Lighthouse + trace timing is
runner-dependent).
The `#graph` Pixi route audits a11y normally but skips the background-tab perf trace.
See `docs/adr/0005-delivery-loop-hardening.md`.

## [0.8.7] - 2026-06-29

### Added
2 changes: 1 addition & 1 deletion CLAUDE.md
@@ -70,7 +70,7 @@ End-to-end loop from request to landed PR. Run autonomously for behavior changes
- 3.1 Run `/codex:review --background` for an independent review pass.
- 3.2 Evaluate the findings, decide which are valid, and fix.
4. **Final pass.** Run `/code-review`, scored against `docs/review-rubric.md` (the project-specific principles), and resolve every finding before continuing.
-5. **Verify in the browser.** For UI changes, invoke the `dikw-web-verify-frontend` skill: navigate the changed routes via Chrome MCP, confirm a clean runtime console on real data, exercise the affected interactions, and run the `docs/ui-checklist.md` rubric in light + dark — confirm the change actually rendered as intended, not just that unit tests pass.
+5. **Verify in the browser.** For UI changes, invoke the `dikw-web-verify-frontend` skill: navigate the changed routes via Chrome MCP, confirm a clean runtime console on real data, exercise the affected interactions, run the `docs/ui-checklist.md` rubric in light + dark, and run the **measured perf + a11y pass** (Step 2.5 — Chrome DevTools MCP: `lighthouse_audit` for accessibility ≥ 0.9 with no new violation, and a `performance_start_trace`/`stop_trace` for CLS ≤ 0.1 + LCP) for the changed route — confirm the change actually rendered as intended, not just that unit tests pass.
6. **Update markdown docs.** Walk `CLAUDE.md`, `README.md`, and the relevant `docs/*.md` against the diff; any contract, behavior, command, or doc index that drifted must be updated in the same change. Don't leave docs to "catch up later".
7. **Create the PR.** Branch with a descriptive name, commit with `<type>(<scope>): <subject>` matching the project's existing convention (see recent `git log`), push, then `gh pr create`. CI auto-runs lint + format:check + typecheck + coverage + build + e2e + bundle budget + the `gate-integrity` reward-hacking gate (`check:gate`) + security scans (npm audit, gitleaks, Trivy, CodeQL). Bump `package.json.version` manually (standard 3-digit SemVer) when the change warrants it, and add an entry to `CHANGELOG.md` under the matching version heading. On merge to `main`, CI's `release` job auto-cuts a GitHub Release tagged `dikw-web-v<version>` from `package.json.version` (idempotent — only a version bump creates a new tag; notes come from the matching CHANGELOG section via `scripts/changelog-notes.mjs`), so a deliberate version bump is what publishes a release.
8. **Monitor CI and PR comments; resolve as they surface, then merge.** After pushing, actively watch both signals — don't passively wait, and don't batch resolution to merge time.
27 changes: 27 additions & 0 deletions docs/adr/0005-delivery-loop-hardening.md
@@ -86,3 +86,30 @@ state protocol, container `--network none` isolation, per-token cost accounting
out of scope. dikw-web's loop is interactive Claude Code with worktree isolation
already available for background jobs and no destructive operations; that machinery
would add complexity without proportional value here.

---

## Item 2 — trustworthy green signal (flaky e2e)

Resolved outside this effort: the flaky `graph.spec.ts > renders a nonblank Pixi graph
canvas` was root-cause-fixed in PR #140 (attach the Pixi canvas only inside the
effect's `active` guard; the spec gates on `data-render-count >= 1`). `main`
deliberately keeps `retries: 2` as a *general* backstop for timing-sensitive specs
(no longer Pixi-specific), so the gate's `e2e-retries-raised` check guards that
decision without forcing it to 1.

## Item 3 — measured perf + a11y in `verify-frontend`

The `dikw-web-verify-frontend` skill verified real-browser behavior + a clean console,
then eyeballed the `docs/ui-checklist.md` a11y/contrast/perf items. Following Delba
Oliveira's feedback-loops note ("many checks have criteria Claude can measure against:
a performance budget, an accessibility checklist"), **Step 2.5** now uses the
already-installed `chrome-devtools-mcp` against the changed route: `lighthouse_audit` for
**accessibility + best-practices** (the tool deliberately excludes performance), plus a
`performance_start_trace`/`performance_stop_trace` for **Web Vitals**. Scored against a rubric: **a11y ≥
0.9** with no new violation (Lighthouse), **CLS ≤ 0.1** (cross-checking the
`perf.spec.ts` gate) and **LCP** as a soft budget (both from the trace). Kept a
**locally-measured** step, not a new CI gate — Lighthouse + trace timing is
runner-dependent, the same reason `perf.spec.ts` hard-gates only CLS. The `#graph` Pixi
route audits a11y normally but skips the perf trace in a background tab (stalled
`requestAnimationFrame`), mirroring the skill's existing Chrome-MCP caveat.
19 changes: 19 additions & 0 deletions docs/ui-checklist.md
@@ -63,9 +63,28 @@ no e2e are the ones the manual pass exists for.
- [ ] **Contrast.** Normal article text ≥ 4.5:1; large headings ≥ 3:1;
metadata/control text ≥ 3:1 against their background. _e2e: `theme.spec.ts`
computes these — re-run it for reader changes rather than eyeballing._
_measured: also covered by the `lighthouse_audit` accessibility category in
`dikw-web-verify-frontend` Step 2.5 (DevTools MCP), which scores contrast +
labels + roles for the changed route._
- [ ] **No console errors** in either theme (the e2e console gate covers mocked
flows; the manual pass covers real-data rendering). See `tests/e2e/harness.ts`.
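
For reference, the ratios in the **Contrast** item follow the WCAG 2.x contrast formula. The sketch below computes it for two `#rrggbb` colours; it is an illustration of the formula, not the code `theme.spec.ts` or Lighthouse runs, and the function names and the `"body"` / `"large"` labels are hypothetical.

```typescript
// Convert one 8-bit sRGB channel to linear light, as in the WCAG 2.x
// relative-luminance definition. (WCAG's text writes the threshold as
// 0.03928; for 8-bit channels both thresholds give the same result.)
function linearize(channel8bit: number): number {
  const s = channel8bit / 255;
  return s <= 0.04045 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
}

// Relative luminance of a colour written as "#rrggbb" (leading # optional).
function relativeLuminance(hex: string): number {
  const digits = hex.startsWith("#") ? hex.slice(1) : hex;
  if (!/^[0-9a-f]{6}$/i.test(digits)) {
    throw new Error(`expected #rrggbb, got "${hex}"`);
  }
  const n = parseInt(digits, 16);
  const r = linearize((n >> 16) & 0xff);
  const g = linearize((n >> 8) & 0xff);
  const b = linearize(n & 0xff);
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio: (lighter + 0.05) / (darker + 0.05), ranging from 1 to 21.
function contrastRatio(foreground: string, background: string): number {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

// Thresholds from the checklist: 4.5:1 for normal text, 3:1 for large
// headings and metadata/control text.
function meetsContrast(
  foreground: string,
  background: string,
  kind: "body" | "large",
): boolean {
  return contrastRatio(foreground, background) >= (kind === "body" ? 4.5 : 3);
}

// Example: mid-grey text on a white background.
console.log(contrastRatio("#767676", "#ffffff").toFixed(2));
console.log(meetsContrast("#767676", "#ffffff", "body"));
```

This only helps when reading a failing audit; the spec and the Lighthouse pass remain the source of truth for the rendered colours.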

## Measured perf + a11y (DevTools MCP)

> These back the eyeballed items above with numbers. Run via
> `dikw-web-verify-frontend` **Step 2.5** for the route the diff touched, with two
> `chrome-devtools-mcp` tools: `lighthouse_audit` for **a11y** (it excludes performance)
> and a `performance_start_trace`/`performance_stop_trace` for **Web Vitals**. Measured locally, not
> a CI gate (Lighthouse + trace timing is runner-dependent — the same reason
> `perf.spec.ts` gates only CLS).

- [ ] **Accessibility score ≥ 0.9** for the changed route, with **no new violation**
vs `main` (`lighthouse_audit`). A dropped score is a fail — fix and re-audit.
- [ ] **CLS ≤ 0.1** on the changed route (performance trace). _e2e: `perf.spec.ts` gates
this on the primary routes; the trace here shows *which* element shifted._
- [ ] **LCP** recorded from the trace; flag a clear regression vs `main` (soft budget —
annotated, not hard-gated, in `perf.spec.ts`).
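
The "no new violation vs `main`" check above amounts to a set difference over the failing audits of two runs. A minimal sketch, assuming the failing audit ids for each run have already been collected into string arrays by hand (the function name and input shape are hypothetical, not something `lighthouse_audit` returns in this form):

```typescript
// Return the audit ids that fail on the branch but did not fail on main.
// Duplicates are collapsed and the result is sorted for stable reporting.
function findNewViolations(
  mainFailing: readonly string[],
  branchFailing: readonly string[],
): string[] {
  const known = new Set(mainFailing);
  const unique = Array.from(new Set(branchFailing));
  return unique.filter((id) => !known.has(id)).sort();
}

// Example: one failure already existed on main, one was introduced.
const introduced = findNewViolations(
  ["color-contrast"],
  ["color-contrast", "button-name", "button-name"],
);
console.log(introduced);
```

An empty result satisfies the "no new violation" half of the first item; the score floor is checked separately.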

## Graph (`#graph`)

> Do **not** verify the Pixi canvas through Chrome MCP — a background MCP tab
2 changes: 1 addition & 1 deletion package.json
@@ -1,6 +1,6 @@
{
"name": "dikw-web",
-"version": "0.8.7",
+"version": "0.8.8",
"private": true,
"type": "module",
"engines": {