v0.2.3-alpha

@polymit-hq polymit-hq released this 15 May 06:52

Release Notes — v0.2.3-alpha

This release focuses on hardening the Phantom Engine's reliability during complex navigation and interaction tasks. We have addressed critical memory bottlenecks in the QuickJS execution layer, synchronized the interaction engine with actual browser side effects (scrolling and link traversal), and refined the selective serialization pipeline to ensure high-fidelity context retention for agentic workflows.

Core Engine Improvements

Dynamic Memory Scaling (js_out_of_memory)
Previously, the QuickJS runtime was capped at a hardcoded 50 MB heap limit, which caused panics on complex websites such as Wikipedia. In session.rs, we have replaced this cap with a dynamic max_heap_bytes parameter. The default budget has been increased to 256 MB, allowing the engine to handle dense DOM structures without session failure.
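
As a rough illustration of the heap budget described above, here is a minimal sketch. Only the max_heap_bytes name and the 256 MB default come from these notes; the SessionConfig type, the constant, and the constructor are hypothetical, and the sketch assumes "MB" means 1024 * 1024 bytes.

```rust
/// Hypothetical session configuration. Only `max_heap_bytes` and the
/// 256 MB default are taken from the release notes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SessionConfig {
    /// Upper bound for the JS heap, in bytes.
    pub max_heap_bytes: usize,
}

/// Assumed default budget: 256 MB, interpreted as 256 * 1024 * 1024 bytes.
pub const DEFAULT_MAX_HEAP_BYTES: usize = 256 * 1024 * 1024;

impl Default for SessionConfig {
    fn default() -> Self {
        Self { max_heap_bytes: DEFAULT_MAX_HEAP_BYTES }
    }
}

impl SessionConfig {
    /// Build a configuration with a caller-chosen heap budget.
    pub fn with_max_heap_bytes(max_heap_bytes: usize) -> Self {
        Self { max_heap_bytes }
    }
}
```

A custom session would then choose between SessionConfig::default() and an explicit budget such as SessionConfig::with_max_heap_bytes(64 * 1024 * 1024) for constrained environments.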

Document Dimension Tracking
The layout pipeline in pipeline.rs has been upgraded to calculate absolute document boundaries (total_width and total_height). This enables the engine to accurately clamp scroll offsets and provide agents with reliable metadata about the full scrollable area of a webpage.
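
The idea can be sketched as follows, assuming absolutely positioned layout boxes. The LayoutBox type and both function names are hypothetical and are not taken from pipeline.rs; only total_width and total_height are names from these notes.

```rust
/// Hypothetical layout box in absolute document coordinates.
#[derive(Debug, Clone, Copy)]
pub struct LayoutBox {
    pub x: f32,
    pub y: f32,
    pub width: f32,
    pub height: f32,
}

/// Returns (total_width, total_height): the farthest right and bottom edges
/// of any box, never smaller than the viewport itself.
pub fn document_size(boxes: &[LayoutBox], viewport_w: f32, viewport_h: f32) -> (f32, f32) {
    let mut total_width = viewport_w;
    let mut total_height = viewport_h;
    for b in boxes {
        total_width = total_width.max(b.x + b.width);
        total_height = total_height.max(b.y + b.height);
    }
    (total_width, total_height)
}

/// Clamp a scroll offset to the range [0, total - viewport].
pub fn clamp_scroll(offset: f32, total: f32, viewport: f32) -> f32 {
    // A document shorter than the viewport cannot scroll at all.
    let max_scroll = (total - viewport).max(0.0);
    offset.clamp(0.0, max_scroll)
}
```

With a 720px viewport and a 3000px document, for example, the largest valid vertical offset in this sketch is 2280px.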

Bug Fixes

Silent Navigation Failures on Clicks
The browser_click tool previously dispatched synthetic JS events but failed to trigger browser-level side effects. In click.rs, we have implemented a navigation bridge: the engine now detects if a click targets an HTMLAnchorElement (<a>) and manually triggers the Rust navigation pipeline. This ensures the engine state actually moves to the new URL upon interaction.
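
A minimal sketch of the decision the bridge has to make is shown below. The ClickOutcome enum and click_outcome function are hypothetical, and skipping fragment-only links is an assumption of this sketch, not something the notes specify. Resolving relative URLs against the current page is left out.

```rust
/// Hypothetical result of handling a click.
#[derive(Debug, PartialEq)]
pub enum ClickOutcome {
    /// The target is a link: hand this href to the navigation pipeline.
    Navigate(String),
    /// Not a link: only the synthetic JS events are dispatched.
    DispatchOnly,
}

/// Decide whether a click on an element should trigger navigation.
/// Assumption: fragment-only hrefs ("#section") stay on the current page.
pub fn click_outcome(tag_name: &str, href: Option<&str>) -> ClickOutcome {
    let is_anchor = tag_name.eq_ignore_ascii_case("a");
    match href.map(str::trim) {
        Some(h) if is_anchor && !h.is_empty() && !h.starts_with('#') => {
            ClickOutcome::Navigate(h.to_string())
        }
        _ => ClickOutcome::DispatchOnly,
    }
}
```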

Off-Screen Interaction Blindness (Auto-Scroll)
Interactive tools like browser_click and browser_press_key were previously "blind" to viewport constraints, delivering events to coordinates that were clipped or off-screen. The interaction layer is now viewport-aware; the engine will automatically adjust the scroll_y position to bring a target element into view before firing interaction events.
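
One way to compute the adjusted offset is sketched here. The function name and its parameters are hypothetical; the policy of scrolling by the smallest amount that makes the element visible is an assumption, since the notes only say the target is brought into view.

```rust
/// Return the scroll_y that brings a target element into view, moving the
/// viewport as little as possible. All values are in document pixels.
pub fn scroll_y_to_reveal(
    scroll_y: f32,
    viewport_h: f32,
    target_top: f32,
    target_h: f32,
    total_height: f32,
) -> f32 {
    let target_bottom = target_top + target_h;
    let desired = if target_top < scroll_y {
        // Target starts above the viewport: align its top edge.
        target_top
    } else if target_bottom > scroll_y + viewport_h {
        // Target ends below the viewport: align its bottom edge, but never
        // scroll past its top (covers elements taller than the viewport).
        (target_bottom - viewport_h).min(target_top)
    } else {
        // Already fully visible.
        scroll_y
    };
    let max_scroll = (total_height - viewport_h).max(0.0);
    desired.clamp(0.0, max_scroll)
}
```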

PageDown and Keyboard Inactivity
Synthetic keyboard events failed to trigger default browser scrolling behaviors. In press_key.rs, we have implemented manual viewport mutation for navigation keys (PageDown, PageUp, Home, End, Space, and Arrow keys). Pressing PageDown now correctly increments the scroll_y offset in the engine state, ensuring subsequent scene graph snapshots reflect the new view.
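
The key handling can be sketched roughly as below. The Viewport struct, the function name, the key strings, the 40px arrow-key step, and the choice to scroll a full viewport height per page key are all assumptions made for illustration; press_key.rs may use different values.

```rust
/// Hypothetical vertical scroll state.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Viewport {
    pub scroll_y: f32,
    pub height: f32,
    pub total_height: f32,
}

/// Assumed step for arrow keys, in pixels.
const LINE_STEP: f32 = 40.0;

/// Apply the default scrolling behavior of a navigation key.
/// Keys that do not scroll leave the offset unchanged.
pub fn apply_scroll_key(vp: Viewport, key: &str) -> Viewport {
    let max_scroll = (vp.total_height - vp.height).max(0.0);
    let target = match key {
        "PageDown" | "Space" | " " => vp.scroll_y + vp.height,
        "PageUp" => vp.scroll_y - vp.height,
        "ArrowDown" => vp.scroll_y + LINE_STEP,
        "ArrowUp" => vp.scroll_y - LINE_STEP,
        "Home" => 0.0,
        "End" => max_scroll,
        _ => vp.scroll_y,
    };
    // Keep the offset inside the scrollable range.
    Viewport { scroll_y: target.clamp(0.0, max_scroll), ..vp }
}
```

Returning a new Viewport instead of mutating in place keeps the sketch easy to test; the engine itself is described as persisting scroll state across calls.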

Selective Mode Over-Culling (node_count: 0)
The Selective serialization mode previously applied overly aggressive viewport clipping and relevance thresholding, often returning empty scene graphs. We have introduced a 2000px vertical "lookahead" buffer and lowered the relevance threshold to 0.2 with a baseline score for any element containing visible text. This ensures critical context (like headers and site navigation) is preserved even with specific task hints.
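
The filtering rule can be sketched as follows. The 2000px buffer and the 0.2 threshold come from these notes. The Candidate type, the function name, applying the buffer below the viewport only, and setting the text baseline equal to the threshold are assumptions of this sketch.

```rust
/// Extra vertical range below the viewport that is still serialized.
pub const LOOKAHEAD_PX: f32 = 2000.0;
/// Minimum relevance score for a node to be kept.
pub const RELEVANCE_THRESHOLD: f32 = 0.2;
/// Assumed baseline score for nodes with visible text.
const TEXT_BASELINE: f32 = 0.2;

/// Hypothetical serialization candidate in document coordinates.
#[derive(Debug, Clone, Copy)]
pub struct Candidate {
    pub top: f32,
    pub bottom: f32,
    pub relevance: f32,
    pub has_visible_text: bool,
}

/// Decide whether a node survives Selective-mode culling.
pub fn keep_in_selective(c: &Candidate, scroll_y: f32, viewport_h: f32) -> bool {
    // Buffered clipping: the visible range is extended downward by the lookahead.
    let in_range = c.bottom >= scroll_y && c.top <= scroll_y + viewport_h + LOOKAHEAD_PX;
    // Visible text lifts the score to at least the baseline.
    let score = if c.has_visible_text {
        c.relevance.max(TEXT_BASELINE)
    } else {
        c.relevance
    };
    in_range && score >= RELEVANCE_THRESHOLD
}
```

Under these assumptions, a low-scoring header 2500px down the page is kept for a 720px viewport at scroll_y = 0, while the same node at 3000px is culled.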

Affected Files

  • phantom-core/src/pipeline.rs — implemented document dimension calculation and scroll_x/y support.
  • phantom-core/src/dom/node.rs — centralized is_interactive() and is_landmark() classification.
  • phantom-mcp/src/engine.rs — added scroll state persistence and update_scroll mutation.
  • phantom-mcp/src/tools/click.rs — implemented auto-scroll and hyperlink navigation bridge.
  • phantom-mcp/src/tools/press_key.rs — implemented keyboard-driven viewport mutation.
  • phantom-serializer/src/serializer.rs — added buffered viewport clipping in Selective mode.
  • phantom-serializer/src/selective.rs — refined relevance heuristics for better context retention.

Upgrade Notes

This release involves changes to core DOM and Session structures. All custom Tier 1 session implementations must be updated to support the new dynamic memory API. Standard MCP integrations will automatically benefit from improved navigation stability and more descriptive selective scene graphs.