flatten-mode bridge + CDP zettel #34
Open
mhat wants to merge 3 commits into
…ession, browser-bridge)
Four new modules under skills/browsing/lib/ implement the flatten-mode CDP
substrate that page-action commands will ride in the next commit:
- browser-session.js: one CDP WebSocket per Chrome process, lazy-connected
on first send()/onEvent() call. Discovers /devtools/browser/<id> via
/json/version. Owns root-session pendingRequests; page-session-tagged
responses fall through to event listeners. Exposes a `sendRaw(json)`
escape hatch for page sessions to send sessionId-enveloped messages
without colliding with the root id-counter.
- cdp-router.js: dispatches inbound browser-WS messages by sessionId.
Per-session command responses go to that session's pendingRequests;
per-session events go to that session's listeners. Sessionless events
go to root listeners. Sessionless command responses fall through to
browser-session.js — we do NOT add a parallel rootPending map here
(single source of truth for root-session correlation).
- page-session.js: per-page CDP session over the browser-WS, attached
via Target.attachToTarget({flatten:true}). Independent id-counter per
session. send() builds a sessionId-enveloped JSON payload and pushes
it through browser.sendRaw; the cdp-router correlates the response.
enableDomain is idempotent so multiple callers (navigation auto-capture,
console-logging) can coexist on the same session. No retry, reconnect,
or fallback inside send — the deliberate one-shot contract that
retires the pre-flatten per-page WS pool's silent single-use fallback.
- browser-bridge.js: subscribes Target.setDiscoverTargets, tracks the
live target set, and exposes targets.list/onCreated/onDestroyed/waitForNew
plus createBrowserContext/disposeBrowserContext for atomic per-test
isolation.
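The dispatch rules described for cdp-router.js can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: `createRouter`, `register`, `dispatch`, `onRootResponse`, and `onRootEvent` are hypothetical names chosen for the example. It assumes a CDP response carries a numeric `id` and an event does not.

```javascript
// Hypothetical sketch of sessionId-based dispatch. Names are illustrative
// and are not taken from the real cdp-router.js.
function createRouter({ onRootResponse, onRootEvent }) {
  // sessionId -> { pending: Map<id, {resolve, reject}>, listeners: Set<fn> }
  const sessions = new Map();

  function register(sessionId) {
    const entry = { pending: new Map(), listeners: new Set() };
    sessions.set(sessionId, entry);
    return entry;
  }

  function dispatch(msg) {
    const isResponse = typeof msg.id === 'number';
    if (!msg.sessionId) {
      // Sessionless traffic belongs to the root session. Responses are handed
      // back to the browser session, which keeps the only root pending map.
      return isResponse ? onRootResponse(msg) : onRootEvent(msg);
    }
    const entry = sessions.get(msg.sessionId);
    if (!entry) return; // unknown or already-detached session: drop
    if (isResponse) {
      const waiter = entry.pending.get(msg.id);
      if (!waiter) return;
      entry.pending.delete(msg.id);
      if (msg.error) waiter.reject(new Error(msg.error.message));
      else waiter.resolve(msg.result);
    } else {
      for (const listener of entry.listeners) listener(msg);
    }
  }

  return { register, dispatch };
}

// Usage: root id 1 and page-session id 1 do not collide.
const seen = [];
const router = createRouter({
  onRootResponse: (m) => seen.push(['root-response', m.id]),
  onRootEvent: (m) => seen.push(['root-event', m.method]),
});
const page = router.register('S1');
page.listeners.add((m) => seen.push(['S1-event', m.method]));
page.pending.set(1, { resolve: (r) => seen.push(['S1-result', r.ok]), reject() {} });

router.dispatch({ id: 1, result: {} });
router.dispatch({ method: 'Target.targetCreated', params: {} });
router.dispatch({ id: 1, sessionId: 'S1', result: { ok: true } });
router.dispatch({ method: 'Page.loadEventFired', sessionId: 'S1', params: {} });
```

Keeping root correlation out of the router means a sessionless response has exactly one place to be resolved, which is the point of the "no parallel rootPending map" note above.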
Tests at test/lib/{browser-session,cdp-router,page-session,browser-bridge}.test.mjs
cover the dispatch logic via mock browser-sessions. The bridge primitives
are not yet wired into createSession() — that comes in the orchestrator
commit, after the action libs are migrated to consume page sessions.
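As a rough picture of what a mock browser-session test might exercise, the sketch below builds sessionId-enveloped payloads with a per-session id-counter and pushes them through `sendRaw`. Only `sendRaw` is named in this commit message; `createPageSession` and the shape of the mock are assumptions made for the example.

```javascript
// Hypothetical page-session sketch. `browser.sendRaw` comes from the commit
// message; createPageSession and its internals are assumed for illustration.
function createPageSession(browser, sessionId) {
  let nextId = 0;            // independent of the root id-counter
  const pending = new Map(); // id -> { resolve, reject }

  function send(method, params = {}) {
    const id = ++nextId;
    return new Promise((resolve, reject) => {
      pending.set(id, { resolve, reject });
      try {
        // One shot: no retry, reconnect, or fallback if sendRaw throws.
        browser.sendRaw(JSON.stringify({ id, sessionId, method, params }));
      } catch (err) {
        pending.delete(id);
        reject(err);
      }
    });
  }

  return { sessionId, send, pending };
}

// Mock browser session: records outbound frames instead of using a socket.
const sent = [];
const mockBrowser = { sendRaw: (json) => sent.push(JSON.parse(json)) };

const sessionA = createPageSession(mockBrowser, 'S1');
const sessionB = createPageSession(mockBrowser, 'S2');
sessionA.send('Page.enable');
sessionA.send('Runtime.evaluate', { expression: '1 + 1' });
sessionB.send('Page.enable');

// Each session counts from 1 on its own; the sessionId disambiguates them.
const envelopeIds = sent.map((m) => `${m.sessionId}:${m.id}`);
```

In this sketch the promises stay pending because nothing delivers a response; a fuller test would feed matching responses back through the router.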
Co-Authored-By: Mahit (Bob 1b32d32d/Opus 4.7) <noreply@anthropic.com>
… WS pool
Page-action commands now ride a per-target CDP page session over the
single browser-WS, with sessionId routing handled by the cdp-router.
The per-page WebSocket pool in lib/cdp-connection.js — and its silent
"Pooled connection failed, using single-use" fallback — is gone.
Action-lib signature change. Every action lib now takes a
`getPageSession` resolver instead of `(resolveWsUrl, sendCdpCommand)`.
The orchestrator's resolver accepts the legacy shapes (numeric tab
index, ws:// URL, numeric string) AND a pre-attached pageSession,
routing through page-session.send() for actual CDP traffic. 12 libs
migrated: mouse, keyboard-input, evaluation, screenshot, navigation,
extraction, file-upload, select-option, viewport, cookies, capture,
console-logging.
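One way to picture the resolver's input handling is a small classifier over the four accepted shapes. This is a sketch under assumptions: `classifyTarget` is a hypothetical helper, the duck-typing check for a pre-attached page session (an object with a `send` function) is a guess, and the URL pattern relies on Chrome's usual `/devtools/page/<targetId>` WebSocket path. How the orchestrator actually resolves each shape is not shown in this PR text.

```javascript
// Hypothetical classifier for the shapes a getPageSession resolver accepts.
function classifyTarget(input) {
  // Pre-attached page session: assumed here to be any object exposing send().
  if (input && typeof input === 'object' && typeof input.send === 'function') {
    return { kind: 'pageSession', pageSession: input };
  }
  // Numeric tab index.
  if (typeof input === 'number' && Number.isInteger(input) && input >= 0) {
    return { kind: 'tabIndex', index: input };
  }
  // Numeric string, treated as a tab index.
  if (typeof input === 'string' && /^\d+$/.test(input)) {
    return { kind: 'tabIndex', index: Number(input) };
  }
  // Legacy ws:// URL: pull the target id out of the page path.
  if (typeof input === 'string' && /^wss?:\/\//.test(input)) {
    const match = input.match(/\/devtools\/page\/([^/?#]+)/);
    if (!match) throw new Error(`Not a page WebSocket URL: ${input}`);
    return { kind: 'targetId', targetId: match[1] };
  }
  throw new TypeError(`Unsupported tab reference: ${String(input)}`);
}

const byIndex = classifyTarget(2);
const byString = classifyTarget('3');
const byUrl = classifyTarget('ws://127.0.0.1:9222/devtools/page/ABC123');
const fakeSession = { send: async () => ({}) };
const byObject = classifyTarget(fakeSession);
```

A resolver built on top of this would then attach (or reuse) a page session for the resulting tab index or target id and return a pre-attached session unchanged.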
Structural changes elsewhere:
- skills/browsing/lib/tabs.js: tab handles returned by getTabs() and
newTab() carry a lazy `getPageSession()` thunk that memoizes per
targetId. closeTab() detaches the cached session before issuing the
HTTP close. Orchestrator wires the attacher via setPageSessionAttacher
to avoid a construction-order cycle.
- skills/browsing/lib/navigation.js: Page.loadEventFired now arrives
through pageSession.waitForEvent — no second WebSocket per navigation.
Auto-capture's Runtime.consoleAPICalled stream rides the same page
session. The 30s hard cap and "listener-ready-before-navigate"
ordering are preserved (Page.enable + waitForEvent are registered
before Page.navigate).
- skills/browsing/lib/console-logging.js: enableConsoleLogging registers
a page-session event listener instead of opening its own WebSocket.
state.consoleMessages is keyed by ps.sessionId. Returns {close()} so a
caller can stop capturing without detaching the whole page session.
- skills/browsing/lib/session-state.js: drop the now-unused
connectionPool Map. consoleMessages comment updated to reflect the
sessionId keying.
- skills/browsing/chrome-ws-lib.js: orchestrator wires _ensureBridge /
_closeBridge, defines the getPageSession resolver, exposes the
bridge surface (targets, createBrowserContext, attachPageSession),
and drops closePooledConnection / closeAllConnections. Bridge
open is lazy on first targets/context/page-session access — the
remote-Chrome path (where startChrome is skipped) and the local
path use one code path.
- skills/browsing/lib/chrome-process.js: killChrome now accepts a
closeBridge callback and runs it before tearing down Chrome, so the
browser-WS and any attached page sessions clean up before the
process goes.
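The "listener-ready-before-navigate" ordering from the navigation.js item can be sketched as below. The method names `enableDomain`, `waitForEvent`, and `send` appear in this PR; their exact signatures (in particular a timeout argument to `waitForEvent`) are assumptions, and `navigate` plus the fake session are hypothetical.

```javascript
const LOAD_TIMEOUT_MS = 30_000; // the 30s hard cap mentioned above

// Hypothetical navigate helper; argument shapes are assumed for illustration.
async function navigate(pageSession, url) {
  await pageSession.enableDomain('Page');
  // Register the waiter first so a fast load event cannot slip past.
  const loaded = pageSession.waitForEvent('Page.loadEventFired', LOAD_TIMEOUT_MS);
  await pageSession.send('Page.navigate', { url });
  await loaded;
}

// Fake page session that only records the order of calls.
const order = [];
const fakePageSession = {
  enableDomain: async (domain) => { order.push(`enable:${domain}`); },
  waitForEvent: (name) => { order.push(`wait:${name}`); return Promise.resolve({}); },
  send: async (method) => { order.push(`send:${method}`); return {}; },
};

const navigation = navigate(fakePageSession, 'https://example.com/');
```

If the waiter were registered after `Page.navigate`, a page that loads quickly could fire `Page.loadEventFired` before anyone is listening, and the call would sit until the timeout.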
Test changes:
- test/lib/_helpers.mjs: add makePageSessionSpy + makeGetPageSession.
The pageSession spy records send() calls in `.calls` and supports
onEvent/waitForEvent/enableDomain/detach plus a `.deliver(msg)`
hatch for tests that simulate inbound events.
- 12 action-lib tests migrated to use makePageSessionSpy. The spy
records the same shape as the legacy makeCdpSpy, so test logic
survives largely intact — only setup wiring and the call records
changed.
- console-logging tests gain coverage of the new event-stream path
(deliver() to verify the listener captures into state.consoleMessages
by sessionId).
- test/session-isolation.test.mjs: replace the closeAllConnections
identity check with bridge-surface identity checks
(attachPageSession / createBrowserContext / targets).
- test/lib/cdp-connection.test.mjs: deleted (the module is gone).
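A spy along the lines described for `makePageSessionSpy` might look roughly like this. It is a reduced sketch, not the helper in test/lib/_helpers.mjs: it covers only `send`, `onEvent`, and `deliver`, the `responses` option is invented for the example, and treating `state.consoleMessages` as a Map keyed by sessionId is an assumption.

```javascript
// Hypothetical, reduced version of a page-session spy for tests.
function makePageSessionSpy({ sessionId = 'spy-session', responses = {} } = {}) {
  const listeners = new Set();
  const spy = {
    sessionId,
    calls: [], // one { method, params } record per send()
    async send(method, params = {}) {
      spy.calls.push({ method, params });
      return responses[method] ?? {};
    },
    onEvent(listener) {
      listeners.add(listener);
      return () => listeners.delete(listener); // unsubscribe handle
    },
    // Test hatch: simulate an inbound event on this session.
    deliver(msg) {
      for (const listener of [...listeners]) listener(msg);
    },
  };
  return spy;
}

// Usage: capture console events into a store keyed by sessionId.
const state = { consoleMessages: new Map() }; // Map shape is assumed
const ps = makePageSessionSpy({ sessionId: 'S1' });
const stop = ps.onEvent((msg) => {
  if (msg.method !== 'Runtime.consoleAPICalled') return;
  const bucket = state.consoleMessages.get(ps.sessionId) ?? [];
  bucket.push(msg.params);
  state.consoleMessages.set(ps.sessionId, bucket);
});

ps.send('Runtime.enable');
ps.deliver({ method: 'Runtime.consoleAPICalled', params: { type: 'log', args: [] } });
stop();
ps.deliver({ method: 'Runtime.consoleAPICalled', params: { type: 'warn', args: [] } });
```

The second `deliver` is ignored because the listener was removed, which mirrors stopping capture without detaching the page session.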
All 171 tests pass, including the 8 smoke tests that run against a
live Chrome.
Co-Authored-By: Mahit (Bob 1b32d32d/Opus 4.7) <noreply@anthropic.com>
17 atomic cards on Chrome DevTools Protocol topics most relevant to this library and its consumers: flatten mode + sessionId envelope, one-WS-many-sessions, per-session id counters, target lifecycle, isolated worlds, navigation race, Page.loadEventFired semantics, headless modes, Puppeteer/BiDi context, autoAttach vs discoverTargets, etc. Each card is one claim, in-house wording, linked to siblings, with sources. Framed as guidance for someone extending superpowers-chrome, not as a tutorial.
Co-Authored-By: Yarrow (Bob yarrow-c/Opus 4.7) <noreply@anthropic.com>
0a2a458 to b6f10ed
Two pieces, kept together because the second explains the first.
The flatten-mode bridge (commits 1–2)
"Add browser-WS bridge primitives" + "Migrate page actions onto flatten-mode page sessions; retire per-page WS pool", both authored by Mahit.
Four new modules under skills/browsing/lib/:
- browser-session.js: one CDP WebSocket per Chrome process. Owns root-session pendingRequests; per-session responses fall through to the router.
- cdp-router.js: dispatches inbound messages by sessionId. Root-session correlation stays in browser-session.js, which remains the single source of truth for it.
- page-session.js: per-page CDP session over the browser-WS, attached via Target.attachToTarget({flatten:true}). Independent id-counter per session.
- browser-bridge.js: subscribes Target.setDiscoverTargets, tracks the live target set, exposes targets.list/onCreated/onDestroyed/waitForNew plus createBrowserContext/disposeBrowserContext.
Twelve action libs (mouse, keyboard-input, evaluation, screenshot, navigation, extraction, file-upload, select-option, viewport, cookies, capture, console-logging) migrated from the (resolveWsUrl, sendCdpCommand) signature to a getPageSession resolver. The per-page WebSocket pool in lib/cdp-connection.js, along with its silent "Pooled connection failed, using single-use" fallback, is gone.
Why this is the right direction
The per-page WS pool had three real problems:
- A silent fallback: when pool logic broke, code dropped to single-use connections, so "buggy" and "working" looked identical from outside.
- N reconnect/cleanup policies and N opportunities to leak or half-close.
- A capability ceiling: Target.setAutoAttach delivers child sessions on the parent's socket via flatten, which doesn't fit a per-page-WS model at all.
The bridge collapses all three: one socket with one lifecycle, sessionId routing that autoAttach lands on for free, no fallback path because there's nothing to fall back to.
The CDP zettel (commit 3)
17 short markdown cards under docs/cdp/, plus an INDEX.md. Each card is one claim, in our own words, with links to sibling cards and source URLs. Topics include:
Framed as guidance for someone extending this library, not as a tutorial. A durable reference for the next round of capabilities (autoAttach, isolated worlds, Fetch/Network) that will build on the bridge.