63 changes: 63 additions & 0 deletions docs/cdp/INDEX.md
@@ -0,0 +1,63 @@
# CDP zettel for superpowers-chrome

Atomic notes on Chrome DevTools Protocol topics most relevant to a library that drives Chrome over CDP for automation and agent use. Each card makes one claim, in its own words, with sources and links to related cards.

Written 2026-05-11 by Yarrow (Bob). Form follows the `taking-smart-notes` skill; cards live here (per Matt's request) rather than in a `notes/zettel/` slipbox.

## Reading order for a newcomer

If you're new to CDP and want to learn it in order:

1. [flatten-mode-and-sessionid-envelope](flatten-mode-and-sessionid-envelope.md) — the protocol convention everything else builds on
2. [one-ws-many-sessions-architecture](one-ws-many-sessions-architecture.md) — the client architecture that uses it
3. [per-session-message-id-counters](per-session-message-id-counters.md) — the correctness invariant you must preserve
4. [target-domain-target-types](target-domain-target-types.md) — what's out there to attach to
5. [target-autoattach-vs-discovertargets](target-autoattach-vs-discovertargets.md) — how to capture children safely
6. [runtime-evaluate-three-modes](runtime-evaluate-three-modes.md) — the workhorse command and its trap modes
7. [navigation-listener-ordering-race](navigation-listener-ordering-race.md) — the event-ordering invariant
8. [page-loadeventfired-is-not-ready](page-loadeventfired-is-not-ready.md) — what "page loaded" really means

## Index — all cards

### Session protocol & architecture

- **[flatten-mode-and-sessionid-envelope](flatten-mode-and-sessionid-envelope.md)** — Flatten mode makes sessionId a message-envelope field, not a connection property; it's the modern default.
- **[one-ws-many-sessions-architecture](one-ws-many-sessions-architecture.md)** — One browser-level WebSocket multiplexing N flatten-mode sessions is the contemporary transport pattern.
- **[per-session-message-id-counters](per-session-message-id-counters.md)** — Each CDP session has its own id counter; collapsing id space silently breaks correlation.

### Targets & lifecycle

- **[target-domain-target-types](target-domain-target-types.md)** — CDP targets are not just pages: workers, iframes, browser, "tab" are all distinct with their own session shape.
- **[target-autoattach-vs-discovertargets](target-autoattach-vs-discovertargets.md)** — `setAutoAttach` captures children safely; `setDiscoverTargets` is observation only.
- **[browser-context-for-test-isolation](browser-context-for-test-isolation.md)** — `disposeBrowserContext` is atomic; per-test cookie-scrubbing is incomplete by construction.
- **[target-attached-without-detach-leaks](target-attached-without-detach-leaks.md)** — Forgetting `detachFromTarget` leaks sessionIds and subscriptions until process exit.

### Page interaction primitives

- **[runtime-evaluate-three-modes](runtime-evaluate-three-modes.md)** — `Runtime.evaluate`'s three orthogonal modes (returnByValue, awaitPromise, exceptionDetails) and when each matters.
- **[isolated-worlds-and-execution-contexts](isolated-worlds-and-execution-contexts.md)** — Isolated worlds share the DOM but not the JS heap; the defence against hostile-page monkey-patching.
- **[navigation-listener-ordering-race](navigation-listener-ordering-race.md)** — Register the load-event listener before issuing `Page.navigate`, or fast pages will fire it before you're listening.
- **[page-loadeventfired-is-not-ready](page-loadeventfired-is-not-ready.md)** — `Page.loadEventFired` is `window.onload`, not "page is ready to interact with."

### Network instrumentation

- **[network-vs-fetch-domains](network-vs-fetch-domains.md)** — Network observes; Fetch intercepts. `Network.requestIntercepted` is deprecated.

### Process, transport, and modes

- **[chrome-process-lifecycle-traps](chrome-process-lifecycle-traps.md)** — Four lifecycle traps any Chrome-spawning library spends complexity on.
- **[cdp-pipe-vs-websocket-transport](cdp-pipe-vs-websocket-transport.md)** — `--remote-debugging-pipe` is structurally safer than `--remote-debugging-port`; Chrome 136 added a `--user-data-dir` requirement.
- **[headless-new-vs-shell](headless-new-vs-shell.md)** — Chrome 132 removed `--headless=old`; `chrome-headless-shell` is now a separate download.

### Ecosystem context

- **[puppeteer-as-cdp-reference-implementation](puppeteer-as-cdp-reference-implementation.md)** — When the docs are ambiguous, Puppeteer is the canonical implementation to read.
- **[webdriver-bidi-vs-cdp-trajectory](webdriver-bidi-vs-cdp-trajectory.md)** — CDP stays Chrome-specific debugging; BiDi is the cross-browser standard, with non-trivial overlap during the transition.

## Suggested cluster reads

- **"I'm extending the browser-WS bridge"**: flatten-mode + one-ws-many-sessions + per-session-message-id-counters + target-attached-without-detach-leaks + puppeteer-as-cdp-reference-implementation.
- **"I'm adding request interception"**: network-vs-fetch + target-autoattach + navigation-listener-ordering-race.
- **"I'm hardening for hostile pages"**: isolated-worlds + runtime-evaluate-three-modes + browser-context-for-test-isolation.
- **"I'm shipping in a container or sandboxed env"**: headless-new-vs-shell + cdp-pipe-vs-websocket-transport + chrome-process-lifecycle-traps.
- **"I'm thinking about cross-browser someday"**: webdriver-bidi-vs-cdp-trajectory + puppeteer-as-cdp-reference-implementation.
17 changes: 17 additions & 0 deletions docs/cdp/browser-context-for-test-isolation.md
@@ -0,0 +1,17 @@
# Target.createBrowserContext is the right unit of test isolation; per-test cookie scrubbing is the wrong one

A Chrome BrowserContext is like an incognito profile, but scoped to a programmatic lifetime: created via `Target.createBrowserContext`, populated with pages via `Target.createTarget({url, browserContextId})`, and disposed atomically via `Target.disposeBrowserContext`. Disposing tears down cookies, localStorage, sessionStorage, IndexedDB, cache storage, service worker registrations, and any pages still open inside the context, all in one call, with no race between deletions.
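The create, populate, dispose lifecycle can be sketched as a small wrapper. This is a minimal illustration, not the library's implementation: `send(method, params)` is a hypothetical helper that issues one browser-level CDP command and resolves with its result, and `withBrowserContext` is a name invented here.

```javascript
// Run `fn` inside a throwaway BrowserContext and always dispose it afterwards.
// `send` is an assumed helper: send(method, params) -> Promise<result object>.
async function withBrowserContext(send, fn) {
  const { browserContextId } = await send("Target.createBrowserContext", {});
  try {
    // Opens a page that lives inside this context only.
    const createPage = async (url = "about:blank") => {
      const { targetId } = await send("Target.createTarget", { url, browserContextId });
      return targetId;
    };
    return await fn({ browserContextId, createPage });
  } finally {
    // Runs on success and on failure: one call discards the context's
    // storage and closes any pages still open inside it.
    await send("Target.disposeBrowserContext", { browserContextId });
  }
}
```

Putting the dispose call in `finally` means a test that throws still leaves no context behind, which is the property the hand-rolled cleanup routines struggle to provide.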

The contrast that justifies preferring it: a hand-rolled "reset state between tests" routine that calls `Network.clearBrowserCookies`, `Storage.clearDataForOrigin`, and friends is always incomplete. It misses storage types that were added after the routine was written (e.g. service worker registrations), it has ordering problems (clearing cookies after a redirect already fired), and it cannot atomically guarantee that no in-flight network call from the previous test mutates state for the next one. `disposeBrowserContext` rules out that kind of leak by construction: the context's pages and their renderers are torn down together with its storage.

The cost is real: BrowserContexts are not free. Each one is roughly a fresh incognito session, with new disk allocations, new HTTP connection pools, and a new service-worker registration scope. For high-volume parallel testing, the right pattern is "one context per worker, recycle every N tests", not "one context per test." For agent-driven flows where isolation is the *whole point* (a fresh session for each agent run), one context per run is correct and the cost is amortized.

## For superpowers-chrome
The library exposes `createBrowserContext({proxyServer?})` via the bridge, returning `{browserContextId, createPage, dispose}`. An advanced consumer building per-run isolation should create a context at session start, build all pages inside it, and call `dispose` at teardown. The library does not currently force this — pages created via `newTab()` go into Chrome's default context. A consumer that wants strict isolation needs to use the bridge's `createBrowserContext` API directly and skip the convenience `newTab`.

See also: [target-domain-target-types](target-domain-target-types.md), [chrome-process-lifecycle-traps](chrome-process-lifecycle-traps.md)

Sources:
- CDP Target domain (createBrowserContext/disposeBrowserContext): https://chromedevtools.github.io/devtools-protocol/tot/Target/
- `superpowers-chrome/skills/browsing/lib/browser-bridge.js` (createBrowserContext implementation)
- gauntlet commit cda4f03 (BrowserContext-based per-test isolation rationale)
21 changes: 21 additions & 0 deletions docs/cdp/cdp-pipe-vs-websocket-transport.md
@@ -0,0 +1,21 @@
# --remote-debugging-pipe carries CDP over inherited file descriptors; --remote-debugging-port exposes it on the network

CDP is a JSON-RPC protocol; how the bytes move is a transport detail with security and operational consequences. Chrome supports two transports for it.

**`--remote-debugging-port=N`** is the familiar one. Chrome binds a TCP socket on localhost:N (or a configured interface), exposes the HTTP discovery endpoints (`/json/version`, `/json/list`), and accepts WebSocket upgrades for `/devtools/browser/<id>` and `/devtools/page/<targetId>`. Anyone with network access to that port — including, historically, malicious local processes — can drive the browser. This is the transport every "remote debugging in Chrome" article and library defaults to.

**`--remote-debugging-pipe`** does the same JSON-RPC over inherited file descriptors: Chrome reads CDP messages from FD 3, writes responses and events to FD 4. There is no network socket at all. Only the parent process that launched Chrome (with those FDs set up via `spawn`/`posix_spawn` options) can talk to it. The benefits: no port to leak or collide, no localhost-attack surface, works in sandboxed environments (gVisor, Firecracker) that block runtime TCP bind, lower per-message overhead (no TCP/WS framing).
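A pipe client needs its own message framing, because there is no WebSocket layer to delimit messages. The sketch below assumes Chrome's default pipe encoding, in which each JSON message is terminated by a single NUL byte; the function names are invented for illustration.

```javascript
// Encode one outbound CDP message for the pipe: JSON text followed by a NUL byte.
function encodePipeMessage(message) {
  return Buffer.from(JSON.stringify(message) + "\0", "utf8");
}

// Build a chunk handler that reassembles NUL-terminated messages from a byte
// stream. Chunks may split a message anywhere, so bytes are buffered until a
// terminator arrives, and only then decoded as UTF-8 and parsed.
function createPipeDecoder(onMessage) {
  let pending = Buffer.alloc(0);
  return function onChunk(chunk) {
    pending = Buffer.concat([pending, chunk]);
    let end;
    while ((end = pending.indexOf(0)) !== -1) {
      const text = pending.subarray(0, end).toString("utf8");
      pending = pending.subarray(end + 1);
      if (text.length > 0) onMessage(JSON.parse(text));
    }
  };
}
```

With Node's `child_process.spawn`, a `stdio` array carrying `"pipe"` in positions 3 and 4 would expose the child's FD 3 as the writable `child.stdio[3]` and FD 4 as the readable `child.stdio[4]`, so `encodePipeMessage` output would go to the former and `createPipeDecoder` would consume `data` events from the latter.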

A 2026-relevant change: from Chrome 136, both `--remote-debugging-port` and `--remote-debugging-pipe` are *ignored* if you're targeting the default Chrome user-data-dir. You must also pass `--user-data-dir` pointing at a non-default directory. This is a response to cookie-theft malware that was reading from the default profile via the debugging interface. Legitimate automation almost always passed its own profile anyway, so the user impact should be small, but the diagnostic "the debugging port never opens when I add `--remote-debugging-port=9222`" gets a new explanation.
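A launcher can make the requirement explicit by refusing to build a debugging command line without a profile directory. This is a hedged sketch: `buildRemoteDebuggingArgs` is a hypothetical helper, and it cannot tell whether the directory passed in happens to be Chrome's default profile.

```javascript
// Build the debugging-related Chrome flags, insisting on an explicit profile dir.
// Omitting `port` selects the pipe transport; passing it selects the port transport.
function buildRemoteDebuggingArgs({ userDataDir, port }) {
  if (!userDataDir) {
    throw new Error("remote debugging needs an explicit --user-data-dir (Chrome 136+)");
  }
  const transport =
    port === undefined ? "--remote-debugging-pipe" : `--remote-debugging-port=${port}`;
  return [transport, `--user-data-dir=${userDataDir}`];
}
```

Failing at argument-building time surfaces the problem in the library's own error message instead of as a browser that starts without a debugging endpoint.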

For a CDP library: pipe transport is structurally safer, requires a different I/O loop (NUL-delimited JSON over FDs instead of WebSocket framing), and is what Playwright uses by default when it spawns Chromium itself; Puppeteer supports it through the opt-in `pipe: true` launch option. Most libraries that connect to a separately-launched Chrome are stuck on the port transport because pipe requires inheriting FDs from the launch.

## For superpowers-chrome
The library uses the port transport (WebSocket) and always launches Chrome with its own `--user-data-dir`, so the Chrome 136 change is already handled. Adding pipe-transport support would be a substantial change — the WebSocket plumbing in `lib/websocket-client.js` would need a pipe analogue, and the library would lose the ability to attach to an already-running Chrome (the typical workflow for some consumers). A useful intermediate: document the security-posture difference so consumers running in untrusted environments know to consider pipe-launched alternatives.

See also: [chrome-process-lifecycle-traps](chrome-process-lifecycle-traps.md), [headless-new-vs-shell](headless-new-vs-shell.md)

Sources:
- Chrome blog, "Changes to remote debugging switches to improve security" (Chrome 136 user-data-dir requirement): https://developer.chrome.com/blog/remote-debugging-port
- chromedp issue on pipe transport: https://github.com/chromedp/chromedp/issues/1607
- Puppeteer launcher (pipe transport for spawned Chrome via the `pipe` launch option): https://github.com/puppeteer/puppeteer/tree/main/packages/puppeteer-core/src/node
21 changes: 21 additions & 0 deletions docs/cdp/chrome-process-lifecycle-traps.md
@@ -0,0 +1,21 @@
# Spawning and reconnecting to Chrome from a library is mostly fighting four lifecycle traps

A CDP automation library that owns the Chrome process spends a surprising fraction of its complexity on lifecycle, not on protocol. Four traps recur:

**1. User-data-dir collisions.** Two Chrome processes pointed at the same `--user-data-dir` will fight over the profile lock and one will exit immediately. If your library reuses a profile across runs (to keep cookies/extensions), you must either serialize launches or use one profile per concurrent run. The fix-by-construction is per-session profile directories. The fix-by-coordination is a meta-file written into the profile dir that records the active PID and port, checked on launch.

**2. Port-binding race.** If you ask for a specific `--remote-debugging-port`, two parallel launches will collide. If you ask Chrome to pick (`--remote-debugging-port=0`), Chrome writes the chosen port to `<userDataDir>/DevToolsActivePort`, but you have to poll that file because Chrome creates it asynchronously. The cleaner fix is to find a free port *before* spawn (bind, get the port, close, pass it) and accept the small race window where another process can steal it.

**3. Zombie processes.** Chrome forks itself extensively: one browser process, one renderer per site instance, GPU process, network service, utility processes. If you kill only the parent, children survive on some platforms (macOS especially) until the OS reaps them, often hanging onto the user-data-dir lock. Either kill the process group (signal the negative PID on Unix, which only works if Chrome was started as its own process-group leader), use `Browser.close` via CDP first (graceful shutdown), or both.

**4. Reconnecting to a Chrome that died.** Across restarts of your library/MCP server, the Chrome you launched may still be alive (graceful) or may have crashed (you have a stale port number in your meta file). The typical pattern is: read the meta file, probe the port (HTTP GET to `/json/version`), if it responds reconnect, else clear the meta and launch fresh. Probing must be fast and tolerant; a slow probe blocks library startup, and a strict probe (e.g. requiring a specific version field) breaks across Chrome updates.
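The reconnect-or-launch decision from trap 4 can be sketched as follows. Everything here is illustrative: `readMeta`, `clearMeta`, and `launch` are hypothetical callbacks supplied by the caller, and `probePort` assumes a Node.js release recent enough to provide the global `fetch` and `AbortSignal.timeout`.

```javascript
// Probe a recorded port. Any successful HTTP answer from /json/version counts
// as alive; the body is deliberately not inspected, so the probe stays
// tolerant across Chrome updates.
async function probePort(port, timeoutMs = 1000) {
  try {
    const res = await fetch(`http://127.0.0.1:${port}/json/version`, {
      signal: AbortSignal.timeout(timeoutMs),
    });
    return res.ok;
  } catch {
    return false; // refused, timed out, or otherwise unreachable
  }
}

// Decide between reconnecting to a recorded Chrome and launching a fresh one.
async function reconnectOrLaunch({ readMeta, clearMeta, launch, probe = probePort }) {
  const meta = await readMeta(); // assumed shape: { port, pid } or null
  if (meta && (await probe(meta.port))) {
    return { port: meta.port, reused: true };
  }
  if (meta) await clearMeta(); // stale record left behind by a dead Chrome
  const { port } = await launch();
  return { port, reused: false };
}
```

Injecting `probe` keeps the decision logic testable without a running browser, and the short default timeout keeps a dead port from stalling startup.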

## For superpowers-chrome
`lib/chrome-process.js` and `lib/chrome-launcher-helpers.js` handle all four: per-profile meta.json with `{port, pid}`, `findAvailablePort` for dynamic allocation, `isPortAlive` probe with PID matching for reconnection, graceful shutdown via `/json/close` then SIGTERM, port-based PID fallback for `killChrome` when the library doesn't own the process. The shape is right; the parts most worth reviewing periodically are the probe timeouts (currently 15s startup poll) and the killing strategy on Linux/Windows where process-group handling diverges.

See also: [cdp-pipe-vs-websocket-transport](cdp-pipe-vs-websocket-transport.md), [headless-new-vs-shell](headless-new-vs-shell.md), [browser-context-for-test-isolation](browser-context-for-test-isolation.md)

Sources:
- `superpowers-chrome/skills/browsing/lib/chrome-process.js` (the in-tree implementation)
- Chrome's `DevToolsActivePort` file behavior: https://chromium.googlesource.com/chromium/src/+/main/content/browser/devtools/
- Chrome blog on `--user-data-dir` requirement from Chrome 136: https://developer.chrome.com/blog/remote-debugging-port
17 changes: 17 additions & 0 deletions docs/cdp/flatten-mode-and-sessionid-envelope.md
@@ -0,0 +1,17 @@
# Flatten mode makes sessionId a message-envelope field, not a connection property

In CDP's legacy ("non-flat") session protocol, attaching to a child target produced a *nested* session: you sent `Target.sendMessageToTarget` containing a serialized inner message, and Chrome replied with `Target.receivedMessageFromTarget` containing a serialized inner reply. Each layer of attachment added a wrapper. Flatten mode — turned on by passing `flatten: true` to `Target.attachToTarget` (and `Target.setAutoAttach`) — collapses that. The same WebSocket carries top-level messages tagged with a `sessionId` field alongside the usual `id` / `method` / `params`. The official docs say: *"We plan to make this the default, deprecate non-flattened mode, and eventually retire it."* Puppeteer, Playwright, and every modern CDP client default to `flatten: true`.
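The difference between the two envelopes is easiest to see side by side. The helpers below are illustrative only (their names are invented here); they build the wire objects for the same inner command in each mode.

```javascript
// Legacy (non-flat): the inner command travels as a JSON *string* inside a
// Target.sendMessageToTarget call, which carries its own outer id.
function wrapLegacy(outerId, sessionId, inner) {
  return {
    id: outerId,
    method: "Target.sendMessageToTarget",
    params: { sessionId, message: JSON.stringify(inner) },
  };
}

// Flat: the same command stays a top-level message and gains a sessionId field.
function wrapFlat(sessionId, inner) {
  return { ...inner, sessionId };
}
```

In the legacy form the receiver has to parse twice (the outer message, then the `message` string), and replies arrive wrapped the same way; in the flat form one parse is enough.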

The practical shape: one outbound message looks like `{"id":7,"sessionId":"<sid>","method":"Page.navigate","params":{...}}`. A reply or event with `sessionId` set belongs to that page session; one without `sessionId` is a root (browser-level) message. The router on your side keys on `sessionId` to dispatch.
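A router keyed on `sessionId` might look like the sketch below. It is a minimal illustration under the assumption that every inbound frame is one JSON message; `createRouter` and its method names are hypothetical.

```javascript
// Dispatch inbound flatten-mode messages: no sessionId means browser-level
// (root) traffic, otherwise the message goes to that session's handler.
function createRouter() {
  const sessions = new Map(); // sessionId -> handler(message)
  let rootHandler = () => {};
  return {
    onRoot(handler) { rootHandler = handler; },
    onSession(sessionId, handler) { sessions.set(sessionId, handler); },
    dropSession(sessionId) { sessions.delete(sessionId); },
    dispatch(rawText) {
      const msg = JSON.parse(rawText);
      if (msg.sessionId === undefined) {
        rootHandler(msg);
        return "root";
      }
      const handler = sessions.get(msg.sessionId);
      if (!handler) return "unknown-session"; // e.g. the session was already detached
      handler(msg);
      return "session";
    },
  };
}
```

Returning a tag for unknown sessions, instead of throwing, lets the caller decide whether late events for a detached session are worth logging.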

The reason this matters more than it sounds: flatten mode is what makes a *single* browser-level WebSocket viable as the transport for an arbitrary number of pages, workers, and out-of-process iframes. Without it, every page attachment doubled the wire envelope and demanded a custom unwrap on every message. With it, sessionId becomes a routing label on otherwise normal CDP traffic.

## For superpowers-chrome
The library opens exactly one CDP WebSocket per Chrome process (against `/devtools/browser/<id>`) and obtains a `sessionId` for each page via `Target.attachToTarget({targetId, flatten: true})`. Page action commands ride that envelope. An advanced consumer wanting to attach to additional targets (OOPIFs, service workers, popup windows) should attach with `flatten: true` for the same reason — there is no good argument to opt into the legacy nested protocol in 2026.

See also: [one-ws-many-sessions-architecture](one-ws-many-sessions-architecture.md), [per-session-message-id-counters](per-session-message-id-counters.md), [target-autoattach-vs-discovertargets](target-autoattach-vs-discovertargets.md), [puppeteer-as-cdp-reference-implementation](puppeteer-as-cdp-reference-implementation.md)

Sources:
- Chrome DevTools Protocol — Target domain: https://chromedevtools.github.io/devtools-protocol/tot/Target/
- Andrey Lushnikov, "Getting Started With Chrome DevTools Protocol": https://github.com/aslushnikov/getting-started-with-cdp
- Puppeteer `Connection.ts` (uses `flatten: true`): https://github.com/puppeteer/puppeteer/blob/main/packages/puppeteer-core/src/cdp/Connection.ts