Add FILL_FORM action, snapshot escalation early-exit, and URL-only verify optimization
- Add FILL_FORM planner action for multi-field forms (login, signup, checkout)
that fills all fields deterministically from snapshot in a single step
instead of one TYPE action per field (eliminates per-field LLM calls)
- Add early-exit in snapshot escalation when element count is unchanged
across iterations, preventing unnecessary limit escalation
- Skip scroll-after-escalation when page has fewer elements than limitBase
(nothing below the fold)
- Optimize verifyStepOutcome to use getCurrentUrl() instead of full
snapshot when all predicates are URL-only (url_contains, url_equals,
url_matches), dramatically reducing snapshot count on form pages
- Add 3 new tests for snapshot escalation fixes
- Update planner prompt with FILL_FORM action, examples, and rules
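
The URL-only verify check and the two escalation guards described above can be sketched roughly as follows. This is an illustration only: the commit does not show the implementation, so the `Predicate` shape and every function name here (`isUrlOnlyVerify`, `checkUrlPredicate`, `shouldStopEscalation`, `shouldScrollAfterEscalation`) are assumptions. Only the predicate names `url_contains`, `url_equals`, `url_matches` and the `limitBase` parameter come from the commit message.

```typescript
// Hypothetical predicate shape; the real type is not shown in this commit.
type Predicate = { predicate: string; args: string[] };

// Predicate kinds the commit message lists as answerable from the URL alone.
const URL_ONLY = new Set(["url_contains", "url_equals", "url_matches"]);

// True when every predicate can be checked without taking a snapshot.
// An empty list returns false: there is nothing to verify by URL.
function isUrlOnlyVerify(predicates: Predicate[]): boolean {
  return predicates.length > 0 && predicates.every((p) => URL_ONLY.has(p.predicate));
}

// Evaluate one URL predicate against the current URL string.
function checkUrlPredicate(p: Predicate, url: string): boolean {
  const arg = p.args[0] ?? "";
  switch (p.predicate) {
    case "url_contains":
      return url.includes(arg);
    case "url_equals":
      return url === arg;
    case "url_matches":
      return new RegExp(arg).test(url);
    default:
      return false; // not a URL predicate; caller should use a snapshot instead
  }
}

// Early exit: a larger limit that returns the same element count found nothing new.
function shouldStopEscalation(prevCount: number | null, currCount: number): boolean {
  return prevCount !== null && currCount === prevCount;
}

// Scroll only when the page filled the base limit; fewer elements than
// limitBase suggests there is nothing below the fold.
function shouldScrollAfterEscalation(elementCount: number, limitBase: number): boolean {
  return elementCount >= limitBase;
}
```

Under these assumptions, a verify step would call `isUrlOnlyVerify` first and fall back to a full snapshot only when it returns false.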
src/agents/planner-executor/prompts.ts (18 additions & 4 deletions)
--- a/src/agents/planner-executor/prompts.ts
+++ b/src/agents/planner-executor/prompts.ts
@@ -59,7 +59,8 @@ export function buildStepwisePlannerPrompt(
 Actions:
 - NAVIGATE: Go directly to a URL when the next destination is known. Set "target" to the URL.
 - CLICK: Click an element. Set "intent" to describe the SPECIFIC element (include label, placeholder, or nearby text, e.g. "email textbox", "display name field", "Next button", NOT just "textbox" or "button"). Set "input" to EXACT text from elements list.
-- TYPE: Type text into a form field (not a search box). Set "input" to the VALUE from the goal. Set "intent" to describe the field (e.g., "email field", "name field").
+- FILL_FORM: Fill ALL visible form fields and submit. Use for login, signup, checkout, or any multi-field form. Set "fields" to an array of {label, value} pairs. Set "submitText" to the submit button text. Set "verify" to check navigation after submit.
+- TYPE: Type text into a SINGLE form field. Prefer FILL_FORM for forms with multiple fields.
 - TYPE_AND_SUBMIT: Type text into a search box and submit. Set "input" to the SEARCH QUERY from the goal (NOT the element label).
 - SCROLL: Scroll page. Set "direction" to "up" or "down".
 - WAIT: Wait for content to appear when a follow-up verification is needed.
@@ -71,8 +72,16 @@ WHEN TO USE DONE:
 - "Add to Cart" task: DONE only AFTER clicking the Add to Cart button
 - "Search and click product" task: DONE only AFTER clicking a product link
 - "Search only" task: DONE after search results appear
+- "Log in" task: DONE only AFTER the page navigates away from /login
 - If goal has multiple steps, complete ALL steps before returning DONE

+CRITICAL RULE FOR FILL_FORM (PREFERRED for login/signup/checkout):
+- Use FILL_FORM when the goal provides values for 2+ form fields (e.g. "username: X, password: Y")
+- "fields" is an array of {label, value} where label matches the field's visible text/placeholder
+- "submitText" is the text on the submit button (e.g. "Sign in", "Log in", "Submit", "Next")
+- The system will find and fill each field by matching label to element text/role
+- This is MUCH faster than TYPE one field at a time
+
 CRITICAL RULE FOR TYPE_AND_SUBMIT:
 - "input" must be the SEARCH QUERY you want to type (e.g., "wireless headphones")
 - "input" is NOT the element label (e.g., NOT "Search Amazon")
{"action":"CLICK","intent":"add to cart button","input":"Add to Cart","verify":[]}
@@ -102,11 +113,11 @@ RULES:
 5. Include "verify" when you know a simple URL or element predicate that proves success; otherwise use []
 6. Include planner metadata when useful: "target", "required", "stop_if_true", "optional_substeps", "heuristic_hints"
 7. "heuristic_hints" entries may use snake_case fields: "intent_pattern", "text_patterns", "role_filter", "attribute_patterns", "priority"
-8. Output ONLY JSON - no <think> tags, no markdown, no prose
-9. Do NOT output <think> or any reasoning
+8. Output ONLY JSON - no 时光网 tags, no markdown, no prose
+9. Do NOT output 时光网 or any reasoning
 10. Do NOT return DONE until ALL parts of the goal are complete
 11. Never copy example URLs from these instructions. Only NAVIGATE to a URL from the user's task, the current page, or a visible element.
-12. For multi-step forms: TYPE into each field (action: TYPE) BEFORE clicking Next. Never click Next without filling required fields first.
+12. PREFER FILL_FORM for login/signup/checkout forms with 2+ fields. Do NOT use multiple TYPE actions when FILL_FORM can do it in one step.
 13. "intent" must be SPECIFIC: describe the element with its label or context (e.g., "email field", "plan dropdown", "Next button on step 2")
 14. Treat history results "success", "skipped", and "vision_fallback" as already satisfied. Do not repeat those steps; choose the next incomplete part of the goal.`;
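
To make the FILL_FORM contract in the prompt concrete, here is a minimal sketch of how an executor might resolve such an action against a snapshot. The field names `fields`, `label`, `value`, `submitText`, and `verify` come from the prompt text. Everything else is an assumption for illustration: the `SnapshotElement` shape, the role names, the matching rule (case-insensitive substring for fields, exact text for the button), and the helpers `matchField` and `resolveFillForm`. The executor side is not part of this diff.

```typescript
// Shapes inferred from the prompt text; the real executor types may differ.
interface FormField { label: string; value: string }
interface FillFormAction {
  action: "FILL_FORM";
  fields: FormField[];
  submitText: string;
  verify: unknown[]; // predicate shape is not shown in this diff
}
// Hypothetical snapshot element.
interface SnapshotElement { id: number; role: string; text: string }
type Step = { op: "type" | "click"; elementId: number; text?: string };

const norm = (s: string): string => s.trim().toLowerCase();

// First textbox whose text contains the label, ignoring case.
function matchField(label: string, elements: SnapshotElement[]): SnapshotElement | undefined {
  const wanted = norm(label);
  return elements.find((el) => el.role === "textbox" && norm(el.text).includes(wanted));
}

// Turn one FILL_FORM action into ordered low-level steps with no further LLM calls.
// Returns null when a field or the submit button cannot be matched, so a
// caller could fall back to per-field TYPE actions.
function resolveFillForm(action: FillFormAction, elements: SnapshotElement[]): Step[] | null {
  const steps: Step[] = [];
  for (const field of action.fields) {
    const el = matchField(field.label, elements);
    if (!el) return null;
    steps.push({ op: "type", elementId: el.id, text: field.value });
  }
  const submit = elements.find(
    (el) => el.role === "button" && norm(el.text) === norm(action.submitText),
  );
  if (!submit) return null;
  steps.push({ op: "click", elementId: submit.id });
  return steps;
}

// Example planner output for a login form, with a matching snapshot.
const exampleAction: FillFormAction = {
  action: "FILL_FORM",
  fields: [
    { label: "Username", value: "alice" },
    { label: "Password", value: "s3cret" },
  ],
  submitText: "Sign in",
  verify: [],
};
const exampleElements: SnapshotElement[] = [
  { id: 1, role: "textbox", text: "Username" },
  { id: 2, role: "textbox", text: "Password" },
  { id: 3, role: "button", text: "Sign in" },
];
const exampleSteps = resolveFillForm(exampleAction, exampleElements);
```

Returning null on any miss keeps the operation all-or-nothing, which is one way to avoid submitting a half-filled form; the actual implementation may handle partial matches differently.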