PDF support in scrape_tool — extract content from PDF pages directly; specify individual pages with pages=[1,5,10]
OAuth 2.0 + PKCE authentication — built-in OAuth flow for sites that require it
WebMCP integration — agents can discover and call custom tools exposed by websites via the WebMCP protocol
Loop detection — LoopGuard detects page cycles and repeated failed retries, with prompt rules to break out automatically
keep_alive + disconnect() — keep the browser alive across agent runs and disconnect explicitly when done
within_viewport parameter on get_state — pass within_viewport=False to get all interactive elements across the entire DOM regardless of scroll position
Scroll position hints — browser state now includes scroll percentage and position hints for the agent
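For readers unfamiliar with the PKCE part of the OAuth 2.0 item above, the following is a minimal sketch of how a client derives a code verifier and its S256 code challenge as described in RFC 7636. It does not show this project's built-in OAuth flow; the function names `make_pkce_pair` and `challenge_for` are hypothetical and exist only for illustration.

```python
import base64
import hashlib
import secrets


def challenge_for(verifier: str) -> str:
    """Derive the S256 code challenge for a given code verifier (RFC 7636)."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    # base64url without '=' padding, as the RFC requires.
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_pkce_pair(num_bytes: int = 32) -> tuple[str, str]:
    """Return a fresh (code_verifier, code_challenge) pair.

    Hypothetical helper: 32 random bytes encode to a 43-character verifier,
    the minimum length the RFC allows.
    """
    raw = secrets.token_bytes(num_bytes)
    verifier = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return verifier, challenge_for(verifier)


if __name__ == "__main__":
    verifier, challenge = make_pkce_pair()
    # The challenge goes in the authorization request (code_challenge_method=S256);
    # the verifier is sent later with the token request.
    print(len(verifier), len(challenge))
```

The client keeps the verifier private until the token exchange, so an intercepted authorization code is useless without it.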
Improvements
Unified semantic tree — DOMNode replaces separate TreeNode/TreeNodeData types; tree is now built from real DOM parent-child traversal instead of XPath reconstruction
Richer semantic tree output — shows id/class in CSS selector notation, and role when it differs from tag
Improved textual element detection — additional tags and correct inline text extraction
DOM capture timing — logs state_capture_ms and screenshot_capture_ms for performance visibility
Multiple performance optimizations across the agent loop
Migrated to uv package manager
Removed Playwright dependency — fully CDP-native via bundled src/cdp/ module
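To illustrate the semantic tree items above (real parent-child traversal, id/class shown in CSS selector notation, role shown only when it differs from the tag), here is a rough sketch. The `SketchNode` class, its fields, and the `[role=...]` output format are assumptions made for this example; they are not the project's actual `DOMNode` definition or output.

```python
from dataclasses import dataclass, field


@dataclass
class SketchNode:
    """Hypothetical stand-in for a DOM node; not the library's DOMNode."""
    tag: str
    id: str = ""
    classes: list[str] = field(default_factory=list)
    role: str = ""
    children: list["SketchNode"] = field(default_factory=list)


def label(node: SketchNode) -> str:
    """Format a node as tag#id.class1.class2, adding the role only if it differs from the tag."""
    text = node.tag
    if node.id:
        text += f"#{node.id}"
    text += "".join(f".{name}" for name in node.classes)
    if node.role and node.role != node.tag:
        text += f" [role={node.role}]"
    return text


def render(node: SketchNode, depth: int = 0) -> list[str]:
    """Walk the tree through its child links and return one indented line per node."""
    lines = ["  " * depth + label(node)]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


if __name__ == "__main__":
    tree = SketchNode("div", id="main", classes=["app"], children=[
        SketchNode("a", classes=["btn", "primary"], role="button"),
        SketchNode("button", role="button"),
    ])
    print("\n".join(render(tree)))
```

Omitting a role that merely repeats the tag keeps each line short, which matters when the tree is passed to a model as text.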
Bug Fixes
Fixed PDF text extraction (switched to get_text('html') + markdownify)
Fixed done_tool over-condensing the final output
Fixed bounding boxes disappearing when page is scrolled
Fixed viewport element filtering to correctly account for scroll offset
Fixed scroll position key names in DOM viewport filtering
Fixed sub-frame/worker crash handling in CrashWatchdog
Fixed 10 s _wait_for_page timeouts by tracking navigation state
Fixed browser stability and agent crash handling
Fixed Gemini tool-calling when thought signature is absent
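The viewport filtering fixes above come down to comparing element positions against the viewport after subtracting the scroll offset. The sketch below shows that idea under one assumption: element boxes are stored in document coordinates. The `Box` type and the `in_viewport` and `filter_visible` helpers are hypothetical and are not taken from the project's source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Element bounding box in document coordinates (assumed for this sketch)."""
    x: float
    y: float
    width: float
    height: float


def in_viewport(box: Box, scroll_x: float, scroll_y: float,
                viewport_w: float, viewport_h: float) -> bool:
    """Return True if any part of the box overlaps the visible viewport."""
    # Convert document coordinates to viewport coordinates.
    left = box.x - scroll_x
    top = box.y - scroll_y
    return (left < viewport_w and left + box.width > 0
            and top < viewport_h and top + box.height > 0)


def filter_visible(boxes: list[Box], scroll_x: float, scroll_y: float,
                   viewport_w: float, viewport_h: float) -> list[Box]:
    """Keep only the boxes that are at least partly visible at this scroll position."""
    return [b for b in boxes
            if in_viewport(b, scroll_x, scroll_y, viewport_w, viewport_h)]


if __name__ == "__main__":
    boxes = [Box(0, 100, 50, 20), Box(0, 1500, 50, 20)]
    print(filter_visible(boxes, 0, 0, 1280, 800))
    print(filter_visible(boxes, 0, 1000, 1280, 800))
```

Forgetting the subtraction makes the filter behave as if the page were always scrolled to the top, so elements further down the page are dropped after scrolling.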