From 769cffb2360b978ec922c6b56649fef3b8952083 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?An=C4=B1lcan=20=C3=87ak=C4=B1r?= Date: Thu, 25 Jun 2026 02:38:37 +0300 Subject: [PATCH 1/2] fix: downgrade doctor enrichers check to INFO and surface find multi-match diagnostic Enrichers are opt-in, so zero registered is a valid state rather than a warning; dusk:doctor now emits INFO instead of WARN (#9). dusk:find detects when a semantics label matches more than one node and returns a diagnostic so agents can disambiguate with --text/--contains or a q-handle, while still resolving the first match (#15, #16). Adds skill and doc guidance on e-ref staleness versus q-handles. --- CHANGELOG.md | 6 + doc/commands/dusk-find.md | 88 ++++++++++++- lib/src/commands/dusk_doctor_command.dart | 5 +- lib/src/extensions/ext_find.dart | 118 ++++++++++++------ skills/fluttersdk-dusk/SKILL.md | 19 ++- .../references/actionability-and-refs.md | 39 ++++++ .../commands/dusk_doctor_command_test.dart | 14 +-- test/src/extensions/ext_find_test.dart | 98 +++++++++++++++ 8 files changed, 327 insertions(+), 60 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index dbd951f..40de1c1 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -10,12 +10,18 @@ This project follows [Semantic Versioning 2.0.0](https://semver.org/spec/v2.0.0. ### Added +- **`ext.dusk.find` now surfaces a `matchCount` field and an ambiguity `diagnostic` in its success response when `--semanticsLabel` or `--text` matches more than one Semantics node.** Previously the handler silently returned the first match, so `--semanticsLabel "Password"` over-matched the email field on forms where both `` nodes shared the same label. The response now includes `matchCount: N` on every match; when `N > 1` a `diagnostic` key carries a human-readable hint (`label 'X' matched N nodes; refine with --text/--contains or use a q-handle`). Single-match and no-match behaviour is unchanged (backward-compatible). 
Touches `lib/src/extensions/ext_find.dart`; covered by `test/src/extensions/ext_find_test.dart`. + - **`ext.dusk.snap` now surfaces captured non-fatal render/build FlutterErrors in a `renderErrors` block, and `dusk:snap` prints a `⚠ N render error(s)` banner to stderr while stdout stays the pure snapshot.** A widget that throws at build time (a `ParentDataWidget` misuse such as `flex-1`/`Expanded` placed under a `Semantics`/`WAnchor` instead of directly inside a Flex, or an overflow) can render partially and stay invisible in the semantics snapshot, so an action against it silently no-ops with no signal to the agent. The snapshot payload now carries `renderErrors: {count, recent: [{type, message}], hint}` (populated from the existing `FlutterError.onError` capture buffer, omitted entirely when clean), so a broken screen is impossible to miss without separately calling `ext.dusk.exceptions`. Touches `lib/src/extensions/ext_snapshot.dart`, `lib/src/commands/dusk_snap_command.dart`; covered by `test/src/extensions/ext_snapshot_render_errors_test.dart`. ### Changed - **`ext.dusk.navigate` now tries the consumer navigate adapter (`DuskPlugin.navigateAdapter`, e.g. `MagicRoute.to`) BEFORE `Navigator.pushNamed`.** On a Router-only stack (go_router / auto_route) `Navigator.onGenerateRoute` is null, so `Navigator.pushNamed` raised an asynchronous "no corresponding route" `FlutterError` on every navigate. Because the failure was async, the handler's try/catch could not suppress it, and it landed in the FlutterError buffer, now doubly visible via the new `renderErrors` snapshot block as a false positive. Adapter-first dispatch routes through the app's own router public API (the correct path for these apps) and skips the throwing `Navigator.pushNamed` entirely; it remains the fallback for apps with no registered adapter. Touches `lib/src/extensions/ext_navigation.dart`. 
+### Fixed + +- **`dusk:doctor` check 3 (snapshot enrichers) now emits INFO when no enrichers are registered, instead of WARN.** Enrichers are opt-in; zero is a valid state, not a problem. The WARN reading alongside "integration wired" (check 5) created a false contradiction. Touches `lib/src/commands/dusk_doctor_command.dart`; test case updated in `test/src/commands/dusk_doctor_command_test.dart`. + --- ## [0.0.8] - 2026-06-17 diff --git a/doc/commands/dusk-find.md b/doc/commands/dusk-find.md index ba880b6..2b5856c 100644 --- a/doc/commands/dusk-find.md +++ b/doc/commands/dusk-find.md @@ -62,21 +62,36 @@ The CLI guards an empty params map (`Provide at least one of --text / --contains **Success envelope (illustrative):** +Single match: + ```json { "ref": "q1", - "matchCount": 1, - "rect": [120, 400, 240, 48], - "role": "button", - "label": "Sign in" + "matched": true, + "matchCount": 1 } ``` -`matchCount > 1` indicates the predicate is ambiguous: the handle still resolves to the first match, but the agent should narrow with an extra predicate (typically `--key`) before acting. +Multi-match (ambiguous predicate): + +```json +{ + "ref": "q1", + "matched": true, + "matchCount": 2, + "diagnostic": "label 'Password' matched 2 nodes; refine with --text/--contains or use a q-handle" +} +``` + +`matchCount > 1` means the predicate is ambiguous: the handle still resolves to the FIRST match (backward-compatible), but the agent should narrow with an additional predicate before acting. Common disambiguation strategies: + +- Add `--key=` when the widget carries a `ValueKey`. +- Add `--text=` when the accessibility label and the visible text differ. +- Use `--contains=` when only part of the label is unique. **Error envelope:** -The VM Service handler propagates errors as `ServiceExtensionResponse.error(extensionError, message)`. The CLI surfaces them via `ArtisanContext.callExtension` and exits non-zero. Common messages include `No widget matched predicates: {...}`. 
+The VM Service handler propagates errors as `ServiceExtensionResponse.error(extensionError, message)`. The CLI surfaces them via `ArtisanContext.callExtension` and exits non-zero. Common messages include `No widget matched predicates: {}`. --- @@ -154,6 +169,67 @@ The two predicates AND together; useful when the screen has multiple "Save" butt --- + +## e-ref staleness and when to prefer q-handles + +`e` tokens minted by `dusk:snap` are frozen to the Semantics node that was +live at snap time. They become defunct the moment the node leaves the tree, which +happens on any route push, list rebuild, or conditional widget swap. The +`RefRegistry` that backs `e` tokens does NOT re-resolve; calling an action +with a stale `e` returns a `defunct (element no longer mounted)` failure. + +`q` handles minted by `dusk:find` store the predicate set instead of the +node, and re-walk the live tree on every action call. They survive navigations, +hot-reloads, and full widget rebuilds as long as the predicate still matches +something in the tree. + +**When to reach for `dusk:find` / `q` instead of using the `e` from a +snap:** + +- The page might rebuild between snap and action (e.g. Settings pages with + dynamic sections, lists driven by async data). +- The agent will retry an action (gate failure, transient loading state). +- The flow spans more than one navigation hop; an `e` from the previous + screen is always stale after the route change. +- The agent holds a ref across a hot-reload. + +The `RefRegistry` is intentionally frozen for `e` (it is a FIFO token store, +not a live observer). There is no mechanism to refresh a stale `e` in place; +the design intent is that `dusk:snap` re-mints the ref after every page change. +For rebuild-prone pages, prefer `dusk:find` / `dusk:observe` from the start. 
+ +--- + + +## Avoiding `--semanticsLabel` over-match + +`--semanticsLabel` performs an exact case-sensitive match against +`SemanticsNode.label` and returns the FIRST node in tree order. When two or +more nodes carry the same label (e.g. two `TextField` widgets both labelled +`Password` on a sign-up form, or a list of repeated row controls), the handle +resolves to the first node in tree order, which may not be the intended target. + +The `matchCount` field in the response tells the agent how many nodes matched. +A `diagnostic` key appears when `matchCount > 1`, e.g.: + +``` +label 'Password' matched 2 nodes; refine with --text/--contains or use a q-handle +``` + +**Disambiguation strategies (most to least precise):** + +1. Add `--key=` when the widget carries a `ValueKey`. This is the + most precise predicate and survives label changes. +2. Combine `--semanticsLabel=Password --text=Confirm` when the second node has + distinct visible text (some widgets expose both a label and a text value). +3. Use `--contains=` when only part of the label is unique + across the matching nodes. +4. Use `dusk:observe` with a narrow `intent` and inspect the returned candidate + list; each candidate includes role, bounds, and enricher fields that let the + agent identify the correct target before minting the handle. 
+ +--- + ## See also diff --git a/lib/src/commands/dusk_doctor_command.dart b/lib/src/commands/dusk_doctor_command.dart index 38b0416..7e8ec59 100644 --- a/lib/src/commands/dusk_doctor_command.dart +++ b/lib/src/commands/dusk_doctor_command.dart @@ -234,10 +234,7 @@ class DuskDoctorCommand extends ArtisanCommand { const String label = 'snapshot enrichers'; final int count = enrichersProbe(); if (count == 0) { - ctx.output.warning( - '$label: no enrichers registered; install Magic + Wind integrations ' - 'for richer snapshots', - ); + ctx.output.info('$label: enrichers are opt-in; none registered'); return; } ctx.output.success('$label: enrichers registered: $count'); diff --git a/lib/src/extensions/ext_find.dart b/lib/src/extensions/ext_find.dart index 7bc6eac..d08afbb 100644 --- a/lib/src/extensions/ext_find.dart +++ b/lib/src/extensions/ext_find.dart @@ -34,8 +34,11 @@ void registerFindExtension() { /// live Semantics + Element tree once to verify the predicates resolve to /// a node, then mints a `q` handle backed by the stored predicate set. /// -/// On first match returns `{"ref": "q", "matched": true}`. When no node -/// matches, returns `{"ref": null, "matched": false}` — no handle is minted. +/// On first match returns `{"ref": "q", "matched": true, "matchCount": N}`. +/// When `matchCount > 1`, an additional `diagnostic` key carries a +/// human-readable hint so agents know to disambiguate with `--text`, +/// `--contains`, or a widget `--key`. When no node matches, returns +/// `{"ref": null, "matched": false}` — no handle is minted. /// /// The handle is opaque from the agent's perspective: passing it back to /// `ext.dusk.tap` etc. triggers a fresh tree walk at that moment, so a @@ -78,7 +81,8 @@ Future extDuskFindHandler( // NOT store the resolved RefEntry — the handle re-executes the // walk on every action call so the agent gets the latest rect / // element after intermediate rebuilds. - final RefEntry? entry = resolveQuery(query); + final (RefEntry? 
entry, int matchCount, String? diagnostic) = + resolveQueryWithCount(query); if (entry == null) { return developer.ServiceExtensionResponse.result( jsonEncode({ @@ -92,12 +96,17 @@ Future extDuskFindHandler( // (no groupId scope; action handlers rebuild RefEntry on call). final String token = RefRegistry.registerQuery(query); - return developer.ServiceExtensionResponse.result( - jsonEncode({ - 'ref': token, - 'matched': true, - }), - ); + final Map payload = { + 'ref': token, + 'matched': true, + 'matchCount': matchCount, + }; + // 4. Surface ambiguity diagnostic when more than one node matched. + if (diagnostic != null) { + payload['diagnostic'] = diagnostic; + } + + return developer.ServiceExtensionResponse.result(jsonEncode(payload)); } catch (e, stackTrace) { developer.log( '[fluttersdk_dusk] ext.dusk.find error: $e\n$stackTrace', @@ -134,35 +143,64 @@ Future extDuskFindHandler( /// When multiple predicates are set they all must match the same node / /// element (intersection). RefEntry? resolveQuery(DuskQuery query) { + return resolveQueryWithCount(query).$1; +} + +/// Variant of [resolveQuery] that also returns the total number of Semantics +/// nodes that matched the label predicate and an optional ambiguity +/// diagnostic. +/// +/// Returns a record `(entry, matchCount, diagnostic)`: +/// - `entry` — first match, or `null` when nothing matched. +/// - `matchCount` — number of nodes that matched (1 for a unique match, 0 +/// when no match; only meaningful for the `semanticsLabel` / `text` +/// Semantics-walk paths; key and text-only Element paths always report 1). +/// - `diagnostic` — non-null only when `matchCount > 1`; a message suitable +/// for surfacing to an agent, e.g. `label 'Password' matched 2 nodes; +/// refine with --text/--contains or use a q-handle`. +/// +/// Single-match and no-match behaviour is identical to [resolveQuery]; +/// callers that do not need ambiguity detection may use [resolveQuery] +/// directly. +(RefEntry?, int, String?) 
resolveQueryWithCount(DuskQuery query) { // 1. Key-based match: Element tree walk. Cheapest, most specific. if (query.keyValue != null) { final Element? element = _findElementByKey(query.keyValue!); - if (element == null) return null; - if (!_elementMatchesOtherPredicates(element, query)) return null; - return _entryFromElement(element); + if (element == null) return (null, 0, null); + if (!_elementMatchesOtherPredicates(element, query)) return (null, 0, null); + return (_entryFromElement(element), 1, null); } // 2. Semantics-label match: walk the Semantics tree first because it // surfaces merged accessibility labels (Button "Submit" with no // Text descendant still resolves). if (query.semanticsLabel != null) { - final SemanticsNode? node = - _findSemanticsNodeByLabel(query.semanticsLabel!); - if (node == null) return null; - return _entryFromSemanticsNode(node); + final (SemanticsNode? node, int count) = + _findSemanticsNodeByLabelWithCount(query.semanticsLabel!); + if (node == null) return (null, 0, null); + final String? diagnostic = count > 1 + ? "label '${query.semanticsLabel}' matched $count nodes; " + 'refine with --text/--contains or use a q-handle' + : null; + return (_entryFromSemanticsNode(node), count, diagnostic); } // 3. text-only match: Semantics-label first (covers labelled widgets // where the visible text is the accessibility label), then Element- // tree Text widget fallback. if (query.text != null) { - final SemanticsNode? node = _findSemanticsNodeByLabel(query.text!); + final (SemanticsNode? node, int count) = + _findSemanticsNodeByLabelWithCount(query.text!); if (node != null) { - return _entryFromSemanticsNode(node); + final String? diagnostic = count > 1 + ? "label '${query.text}' matched $count nodes; " + 'refine with --semanticsLabel/--contains or use a q-handle' + : null; + return (_entryFromSemanticsNode(node), count, diagnostic); } final Element? 
element = _findElementByTextData(query.text!); - if (element == null) return null; - return _entryFromElement(element); + if (element == null) return (null, 0, null); + return (_entryFromElement(element), 1, null); } // 4. containsText match: substring search across Semantics labels then @@ -172,14 +210,14 @@ RefEntry? resolveQuery(DuskQuery query) { final SemanticsNode? node = _findSemanticsNodeByLabelContains(query.containsText!); if (node != null) { - return _entryFromSemanticsNode(node); + return (_entryFromSemanticsNode(node), 1, null); } final Element? element = _findElementByTextContains(query.containsText!); - if (element == null) return null; - return _entryFromElement(element); + if (element == null) return (null, 0, null); + return (_entryFromElement(element), 1, null); } - return null; + return (null, 0, null); } // --------------------------------------------------------------------------- @@ -289,41 +327,41 @@ SemanticsNode? _findSemanticsNodeByLabelContains(String needle) { return found; } -/// Walks the live Semantics tree and returns the first node whose [label] -/// equals [needle]. +/// Walks the live Semantics tree and counts ALL nodes whose [label] equals +/// [needle], returning the first match alongside the total count. /// /// Production-bound widget trees expose their semantics owner via /// `RendererBinding.instance.rootPipelineOwner.semanticsOwner`. The Flutter -/// test harness, however, mounts the widget tree under a CHILD pipeline -/// owner attached to the test view (see `ext_snapshot_dispatcher_test.dart` -/// docs for the rationale). We walk the root owner first, then every child -/// owner registered under it, so this helper works in BOTH environments. -SemanticsNode? _findSemanticsNodeByLabel(String needle) { +/// test harness mounts the widget tree under a CHILD pipeline owner attached +/// to the test view, so the walk covers the root owner and all child owners. 
+/// +/// The walk never stops early after finding the first node, so the returned +/// count reflects ALL matches in the tree. When `count > 1` the caller +/// should surface an ambiguity diagnostic to the agent. +(SemanticsNode?, int) _findSemanticsNodeByLabelWithCount(String needle) { SemanticsNode? found; + int count = 0; void visit(SemanticsNode node) { - if (found != null) return; if (node.label == needle) { - found = node; - return; + count += 1; + found ??= node; } node.visitChildren((SemanticsNode child) { visit(child); - return found == null; + // Always continue walking to collect the full count. + return true; }); } void visitOwner(PipelineOwner owner) { - if (found != null) return; final SemanticsNode? root = owner.semanticsOwner?.rootSemanticsNode; if (root != null) visit(root); - owner.visitChildren((PipelineOwner child) { - if (found == null) visitOwner(child); - }); + owner.visitChildren(visitOwner); } visitOwner(RendererBinding.instance.rootPipelineOwner); - return found; + return (found, count); } /// Cross-checks an Element-tree match against the supplied query's diff --git a/skills/fluttersdk-dusk/SKILL.md b/skills/fluttersdk-dusk/SKILL.md index 146c188..6f4a759 100644 --- a/skills/fluttersdk-dusk/SKILL.md +++ b/skills/fluttersdk-dusk/SKILL.md @@ -1,11 +1,11 @@ --- name: fluttersdk-dusk description: "fluttersdk_dusk: E2E driver for Flutter apps that lets an LLM agent see (snap, observe, screenshot) and act (tap, type, drag, scroll, navigate) on a running Flutter app via 33 MCP tools (`dusk_*`) and 34 matching CLI commands (`./bin/fsa dusk:*`). Snapshots emit a YAML Semantics tree with stable `[ref=eN]` tokens; `dusk_find` and `dusk_observe` mint re-resolvable `q` query handles. Every gesture passes a 6-step actionability gate with substring-parseable failure reasons (`not enabled`, `zero rect`, `off-viewport`, `not stable`, `obscured by`, `defunct`). 
TRIGGER when: any `dusk_*` MCP tool call, any `dusk:*` CLI command, `./bin/fsa` invocation, the user asks the agent to drive / inspect / test / debug a running Flutter app, the user mentions snap / observe / actionability / ref / eN / qN, or the conversation touches end-to-end testing of a Flutter UI. DO NOT TRIGGER when: only authoring `flutter_test` widget tests, only reading telescope ring buffers without driving the UI (use fluttersdk-telescope), or only modifying Dart source without running it." -version: 0.0.8 +version: 0.0.9 when_to_use: "Any task where the agent drives or inspects a running Flutter app via dusk: calling `dusk_*` MCP tools in a loop (snap, tap, type, screenshot, hot_reload_and_snap), invoking `./bin/fsa dusk:` from a shell, recovering from an actionability failure, choosing between `e` and `q` ref tokens, waiting for text or network idle, navigating routes, or filling a form." --- - + # fluttersdk_dusk @@ -189,6 +189,21 @@ Default: snap returns `e`; use them inline. Switch to `dusk_find` / `dusk_observe` and `q` the moment the agent enters a retry or multi-step flow against the same target. +**e-ref staleness on rebuild-prone pages.** `e` tokens are frozen to +the Semantics node at snap time. The `RefRegistry` backing them does NOT +re-resolve; on pages that rebuild (Settings, lists driven by async data, +any page with conditional sections), use `dusk_find` from the start +instead of snapping and then regretting the stale `defunct` failure. + +**`--semanticsLabel` over-match.** `dusk_find { semanticsLabel: "X" }` +exact-matches against the accessibility label and silently picks the FIRST +node. On forms with repeated labels ("Password" and "Confirm Password" both +labelled "Password"), the handle points at the wrong field. Check the +`matchCount` field in the response; when `> 1`, read the `diagnostic` key +and add a second predicate (`--key`, `--text`, or `--contains`) before +acting. 
Full disambiguation table: `references/actionability-and-refs.md` +section "semanticsLabel exact-match and over-match". + ## 5. Quick install + doctor (when dusk is missing) If `./bin/fsa dusk:snap` returns "VM Service URI absent" (or any diff --git a/skills/fluttersdk-dusk/references/actionability-and-refs.md b/skills/fluttersdk-dusk/references/actionability-and-refs.md index e7642c4..14271df 100644 --- a/skills/fluttersdk-dusk/references/actionability-and-refs.md +++ b/skills/fluttersdk-dusk/references/actionability-and-refs.md @@ -166,6 +166,13 @@ predicates. - Become defunct when the node unmounts (the widget leaves the tree). After hot-reload, navigation, or a list rebuild, expect every not-refreshed `e` to fail with `"defunct"`. +- **The `RefRegistry` backing `e` tokens is FROZEN (a FIFO token + store, not a live observer).** There is no way to refresh a stale + `e` in place; the design intent is that `dusk_snap` re-mints + the ref after every page change. For pages that rebuild frequently + (Settings, lists driven by async data, any page with conditional + sections), prefer `dusk_find` / `dusk_observe` from the start to + get a `q` handle that re-resolves on every action. When to use them: immediately after the snap that minted them, when the UI is static, when the action is one-shot. @@ -228,3 +235,35 @@ the action is safe but the gate fails: Both flags are per-call. There is no global way to disable the gate; that is intentional. Disable per call, not per session. + +## `--semanticsLabel` exact-match and over-match + +`dusk_find { semanticsLabel: "Password" }` performs a case-sensitive exact +match against `SemanticsNode.label` and silently resolves to the FIRST node +in tree order. On forms where multiple fields share the same label (e.g. a +Password field and a Confirm Password field both labelled "Password", or a +list of rows each containing a "Delete" button), the handle points at the +wrong target. 
+ +The `matchCount` key in the success response tells the agent how many nodes +matched. When `matchCount > 1` the response also carries a `diagnostic` key: + +```json +{ + "ref": "q3", + "matched": true, + "matchCount": 2, + "diagnostic": "label 'Password' matched 2 nodes; refine with --text/--contains or use a q-handle" +} +``` + +The handle still resolves (backward-compatible), so existing scripts that +do not read `diagnostic` are unaffected. New agent code should read +`matchCount` and, when `> 1`, apply one of these disambiguation strategies: + +| Strategy | When to use | +|---|---| +| Add `--key=` | The widget carries a `ValueKey`; most precise, survives label changes. | +| Add `--text=` | The accessibility label and visible text differ; combines with `semanticsLabel` as an AND predicate. | +| Use `--contains=` | Only part of the label is unique; e.g. `--contains="Confirm"` to single out "Confirm Password". | +| Use `dusk_observe` | When no single predicate is unique; observe returns candidates with bounds and enricher fields the agent can reason about before minting a handle. 
| diff --git a/test/src/commands/dusk_doctor_command_test.dart b/test/src/commands/dusk_doctor_command_test.dart index 98acc79..e547d32 100644 --- a/test/src/commands/dusk_doctor_command_test.dart +++ b/test/src/commands/dusk_doctor_command_test.dart @@ -202,7 +202,8 @@ void main() { expect(output.content, contains('enrichers registered: 2')); }); - test('WARN when DuskPlugin.enrichers is empty', () async { + test('INFO when DuskPlugin.enrichers is empty (enrichers are opt-in)', + () async { DuskDoctorCommand.enrichersProbe = () => 0; final output = BufferedOutput(); @@ -212,10 +213,7 @@ void main() { expect(exit, equals(0)); expect( output.content, - contains( - 'no enrichers registered; install Magic + Wind integrations for ' - 'richer snapshots', - ), + contains('enrichers are opt-in; none registered'), ); }); @@ -413,10 +411,10 @@ Future main() async { test( 'exit code is 0 when every check passes (defaults: empty enrichers ' - 'flip to WARN, but WARN never fails)', () async { + 'emit INFO, no errors)', () async { // With default seams (empty enrichers, no Chrome, no DUSK_DISABLE, - // semantics on, no main.dart) the only ERROR-class check (#4) passes, - // so exit code is 0 even with multiple WARN / INFO rows below. + // semantics on, no main.dart) all checks pass with no ERROR rows, + // so exit code is 0. 
final output = BufferedOutput(); final exit = await DuskDoctorCommand() .handle(ArtisanContext.bare(MapInput(const {}), output)); diff --git a/test/src/extensions/ext_find_test.dart b/test/src/extensions/ext_find_test.dart index 21740d3..0209cf2 100644 --- a/test/src/extensions/ext_find_test.dart +++ b/test/src/extensions/ext_find_test.dart @@ -526,6 +526,104 @@ void main() { ); }); + group('multi-match semanticsLabel diagnostic', () { + setUp(RefRegistry.resetForTesting); + + testWidgets( + '(f) two nodes sharing a semanticsLabel produce a multi-match diagnostic', + (WidgetTester tester) async { + tester.view.physicalSize = const Size(800, 600); + tester.view.devicePixelRatio = 1.0; + addTearDown(tester.view.resetPhysicalSize); + addTearDown(tester.view.resetDevicePixelRatio); + + // Two distinct semantics nodes with the same label — models the + // "Password" over-match scenario from REPORT #15. + await tester.pumpWidget( + MaterialApp( + home: Scaffold( + body: Column( + children: [ + Semantics( + label: 'Password', + textField: true, + container: true, + child: const SizedBox(width: 200, height: 50), + ), + Semantics( + label: 'Password', + textField: true, + container: true, + child: const SizedBox(width: 200, height: 50), + ), + ], + ), + ), + ), + ); + await tester.pump(); + + final response = await extDuskFindHandler( + 'ext.dusk.find', + {'semanticsLabel': 'Password'}, + ); + + // Still resolves (backward-compatible); a q-handle is minted. + expect(response.result, isNotNull); + final Map decoded = + jsonDecode(response.result!) as Map; + expect(decoded['matched'], isTrue); + expect(decoded['ref'], startsWith('q')); + + // Multi-match diagnostic is present. + expect(decoded['matchCount'], equals(2)); + expect( + decoded['diagnostic'] as String? ?? 
'', + contains("label 'Password' matched 2 nodes"), + ); + }, + ); + + testWidgets( + '(f) single-match semanticsLabel carries no multi-match diagnostic', + (WidgetTester tester) async { + tester.view.physicalSize = const Size(800, 600); + tester.view.devicePixelRatio = 1.0; + addTearDown(tester.view.resetPhysicalSize); + addTearDown(tester.view.resetDevicePixelRatio); + + await tester.pumpWidget( + MaterialApp( + home: Scaffold( + body: Center( + child: Semantics( + label: 'UniqueLabel', + button: true, + container: true, + child: const SizedBox(width: 100, height: 100), + ), + ), + ), + ), + ); + await tester.pump(); + + final response = await extDuskFindHandler( + 'ext.dusk.find', + {'semanticsLabel': 'UniqueLabel'}, + ); + + expect(response.result, isNotNull); + final Map decoded = + jsonDecode(response.result!) as Map; + expect(decoded['matched'], isTrue); + expect(decoded['matchCount'], equals(1)); + // No diagnostic key present on single match. + expect(decoded.containsKey('diagnostic'), isFalse); + }, + ); + }); + group('RefRegistry query store', () { setUp(RefRegistry.resetForTesting); From c2d43ce6024aa554ae67f1cf615ecb2b66658123 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?An=C4=B1lcan=20=C3=87ak=C4=B1r?= Date: Thu, 25 Jun 2026 03:02:42 +0300 Subject: [PATCH 2/2] fix(extensions): correct multi-match diagnostic wording and add missing tearDown - Replace 'refine with --text/--contains or use a q-handle' with 'refine with --key, --text, or --contains' across all locations: ext_find.dart (both diagnostic builders), doc/commands/dusk-find.md (both example blocks), skills refs/actionability-and-refs.md (diagnostic example), and CHANGELOG.md ([Unreleased] entry). The old wording was self-contradictory: dusk:find already returns a q-handle, so 'use a q-handle' confused agents. The new wording focuses on predicate refinement and surfaces --key first (the most precise disambiguator). 
- Reword resolveQueryWithCount docstring to accurately describe matchCount semantics: 1 on a single match, 0 on no match (the old text claimed key/text-only paths 'always report 1', which contradicted the (null, 0, null) return on no-match). - Remove 'silently' from SKILL.md and actionability-and-refs.md: dusk:find no longer silently picks the first node on ambiguity; it now surfaces matchCount and diagnostic. - Add tearDown(RefRegistry.resetForTesting) to the multi-match test group in ext_find_test.dart to prevent q/e token state from leaking into later tests (repo test guideline requires both setUp and tearDown when RefRegistry state is touched). --- CHANGELOG.md | 2 +- doc/commands/dusk-find.md | 4 ++-- lib/src/extensions/ext_find.dart | 13 +++++++------ skills/fluttersdk-dusk/SKILL.md | 15 ++++++++------- .../references/actionability-and-refs.md | 13 +++++++------ test/src/extensions/ext_find_test.dart | 1 + 6 files changed, 26 insertions(+), 22 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 40de1c1..b110c66 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -10,7 +10,7 @@ This project follows [Semantic Versioning 2.0.0](https://semver.org/spec/v2.0.0. ### Added -- **`ext.dusk.find` now surfaces a `matchCount` field and an ambiguity `diagnostic` in its success response when `--semanticsLabel` or `--text` matches more than one Semantics node.** Previously the handler silently returned the first match, so `--semanticsLabel "Password"` over-matched the email field on forms where both `` nodes shared the same label. The response now includes `matchCount: N` on every match; when `N > 1` a `diagnostic` key carries a human-readable hint (`label 'X' matched N nodes; refine with --text/--contains or use a q-handle`). Single-match and no-match behaviour is unchanged (backward-compatible). Touches `lib/src/extensions/ext_find.dart`; covered by `test/src/extensions/ext_find_test.dart`. 
+- **`ext.dusk.find` now surfaces a `matchCount` field and an ambiguity `diagnostic` in its success response when `--semanticsLabel` or `--text` matches more than one Semantics node.** Previously the handler silently returned the first match, so `--semanticsLabel "Password"` over-matched the email field on forms where both `` nodes shared the same label. The response now includes `matchCount: N` on every match; when `N > 1` a `diagnostic` key carries a human-readable hint (`label 'X' matched N nodes; refine with --key, --text, or --contains`). Single-match and no-match behaviour is unchanged (backward-compatible). Touches `lib/src/extensions/ext_find.dart`; covered by `test/src/extensions/ext_find_test.dart`. - **`ext.dusk.snap` now surfaces captured non-fatal render/build FlutterErrors in a `renderErrors` block, and `dusk:snap` prints a `⚠ N render error(s)` banner to stderr while stdout stays the pure snapshot.** A widget that throws at build time (a `ParentDataWidget` misuse such as `flex-1`/`Expanded` placed under a `Semantics`/`WAnchor` instead of directly inside a Flex, or an overflow) can render partially and stay invisible in the semantics snapshot, so an action against it silently no-ops with no signal to the agent. The snapshot payload now carries `renderErrors: {count, recent: [{type, message}], hint}` (populated from the existing `FlutterError.onError` capture buffer, omitted entirely when clean), so a broken screen is impossible to miss without separately calling `ext.dusk.exceptions`. Touches `lib/src/extensions/ext_snapshot.dart`, `lib/src/commands/dusk_snap_command.dart`; covered by `test/src/extensions/ext_snapshot_render_errors_test.dart`. 
diff --git a/doc/commands/dusk-find.md b/doc/commands/dusk-find.md
index 2b5856c..4adf359 100644
--- a/doc/commands/dusk-find.md
+++ b/doc/commands/dusk-find.md
@@ -79,7 +79,7 @@ Multi-match (ambiguous predicate):
   "ref": "q1",
   "matched": true,
   "matchCount": 2,
-  "diagnostic": "label 'Password' matched 2 nodes; refine with --text/--contains or use a q-handle"
+  "diagnostic": "label 'Password' matched 2 nodes; refine with --key, --text, or --contains"
 }
 ```
 
@@ -213,7 +213,7 @@ The `matchCount` field in the response tells the agent how many nodes matched.
 A `diagnostic` key appears when `matchCount > 1`, e.g.:
 
 ```
-label 'Password' matched 2 nodes; refine with --text/--contains or use a q-handle
+label 'Password' matched 2 nodes; refine with --key, --text, or --contains
 ```
 
 **Disambiguation strategies (most to least precise):**
diff --git a/lib/src/extensions/ext_find.dart b/lib/src/extensions/ext_find.dart
index d08afbb..4f5e55f 100644
--- a/lib/src/extensions/ext_find.dart
+++ b/lib/src/extensions/ext_find.dart
@@ -152,12 +152,13 @@ RefEntry? resolveQuery(DuskQuery query) {
 ///
 /// Returns a record `(entry, matchCount, diagnostic)`:
 /// - `entry` — first match, or `null` when nothing matched.
-/// - `matchCount` — number of nodes that matched (1 for a unique match, 0
-///   when no match; only meaningful for the `semanticsLabel` / `text`
-///   Semantics-walk paths; key and text-only Element paths always report 1).
+/// - `matchCount` — number of nodes that matched (1 on a single match, 0
+///   when nothing matches; only meaningful for the `semanticsLabel` / `text`
+///   Semantics-walk paths; key-based and text-only Element paths return 1 on
+///   a match, 0 on no match).
 /// - `diagnostic` — non-null only when `matchCount > 1`; a message suitable
 ///   for surfacing to an agent, e.g. `label 'Password' matched 2 nodes;
-///   refine with --text/--contains or use a q-handle`.
+///   refine with --key, --text, or --contains`.
 ///
 /// Single-match and no-match behaviour is identical to [resolveQuery];
 /// callers that do not need ambiguity detection may use [resolveQuery]
@@ -180,7 +181,7 @@ RefEntry? resolveQuery(DuskQuery query) {
     if (node == null) return (null, 0, null);
     final String? diagnostic = count > 1
         ? "label '${query.semanticsLabel}' matched $count nodes; "
-            'refine with --text/--contains or use a q-handle'
+            'refine with --key, --text, or --contains'
         : null;
     return (_entryFromSemanticsNode(node), count, diagnostic);
   }
@@ -194,7 +195,7 @@ RefEntry? resolveQuery(DuskQuery query) {
     if (node != null) {
       final String? diagnostic = count > 1
           ? "label '${query.text}' matched $count nodes; "
-              'refine with --semanticsLabel/--contains or use a q-handle'
+              'refine with --key, --text, or --contains'
           : null;
       return (_entryFromSemanticsNode(node), count, diagnostic);
     }
diff --git a/skills/fluttersdk-dusk/SKILL.md b/skills/fluttersdk-dusk/SKILL.md
index 6f4a759..06dedf2 100644
--- a/skills/fluttersdk-dusk/SKILL.md
+++ b/skills/fluttersdk-dusk/SKILL.md
@@ -196,13 +196,14 @@ any page with conditional sections), use `dusk_find` from the start instead of
 snapping and then regretting the stale `defunct` failure.
 
 **`--semanticsLabel` over-match.** `dusk_find { semanticsLabel: "X" }`
-exact-matches against the accessibility label and silently picks the FIRST
-node. On forms with repeated labels ("Password" and "Confirm Password" both
-labelled "Password"), the handle points at the wrong field. Check the
-`matchCount` field in the response; when `> 1`, read the `diagnostic` key
-and add a second predicate (`--key`, `--text`, or `--contains`) before
-acting. Full disambiguation table: `references/actionability-and-refs.md`
-section "semanticsLabel exact-match and over-match".
+exact-matches against the accessibility label and resolves to the FIRST node;
+ambiguity is now surfaced via `matchCount` and `diagnostic` in the response.
+On forms with repeated labels ("Password" and "Confirm Password" both labelled
+"Password"), the handle points at the first match. Check the `matchCount`
+field in the response; when `> 1`, read the `diagnostic` key and add a second
+predicate (`--key`, `--text`, or `--contains`) before acting. Full
+disambiguation table: `references/actionability-and-refs.md` section
+"semanticsLabel exact-match and over-match".
 
 ## 5. Quick install + doctor (when dusk is missing)
 
diff --git a/skills/fluttersdk-dusk/references/actionability-and-refs.md b/skills/fluttersdk-dusk/references/actionability-and-refs.md
index 14271df..9665b15 100644
--- a/skills/fluttersdk-dusk/references/actionability-and-refs.md
+++ b/skills/fluttersdk-dusk/references/actionability-and-refs.md
@@ -239,11 +239,12 @@ that is intentional. Disable per call, not per session.
 ## `--semanticsLabel` exact-match and over-match
 
 `dusk_find { semanticsLabel: "Password" }` performs a case-sensitive exact
-match against `SemanticsNode.label` and silently resolves to the FIRST node
-in tree order. On forms where multiple fields share the same label (e.g. a
-Password field and a Confirm Password field both labelled "Password", or a
-list of rows each containing a "Delete" button), the handle points at the
-wrong target.
+match against `SemanticsNode.label` and resolves to the FIRST node in tree
+order; when more than one node matches, ambiguity is surfaced via `matchCount`
+and `diagnostic` in the response. On forms where multiple fields share the
+same label (e.g. a Password field and a Confirm Password field both labelled
+"Password", or a list of rows each containing a "Delete" button), the handle
+points at the first match.
 
 The `matchCount` key in the success response tells the agent how many nodes
 matched. When `matchCount > 1` the response also carries a `diagnostic` key:
@@ -253,7 +254,7 @@ matched. When `matchCount > 1` the response also carries a `diagnostic` key:
   "ref": "q3",
   "matched": true,
   "matchCount": 2,
-  "diagnostic": "label 'Password' matched 2 nodes; refine with --text/--contains or use a q-handle"
+  "diagnostic": "label 'Password' matched 2 nodes; refine with --key, --text, or --contains"
 }
 ```
 
diff --git a/test/src/extensions/ext_find_test.dart b/test/src/extensions/ext_find_test.dart
index 0209cf2..f1e7453 100644
--- a/test/src/extensions/ext_find_test.dart
+++ b/test/src/extensions/ext_find_test.dart
@@ -528,6 +528,7 @@ void main() {
 
   group('multi-match semanticsLabel diagnostic', () {
     setUp(RefRegistry.resetForTesting);
+    tearDown(RefRegistry.resetForTesting);
 
     testWidgets(
       '(f) two nodes sharing a semanticsLabel produce a multi-match diagnostic',