jeffdhooton · Apr 7, 2026 · Apr 4, 2026 · Apr 5, 2026 · Apr 6, 2026
diff --git a/.gitignore b/.gitignore
@@ -1,5 +1,6 @@
 .env
 node_modules/
+dist/
 browse/dist/
 design/dist/
 bin/gstack-global-discover

diff --git a/CHANGELOG.md b/CHANGELOG.md
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -95,7 +95,8 @@ gstack/
 ├── cso/             # /cso skill (OWASP Top 10 + STRIDE security audit)
 ├── design-consultation/ # /design-consultation skill (design system from scratch)
 ├── design-shotgun/  # /design-shotgun skill (visual design exploration)
-├── connect-chrome/  # /connect-chrome skill (headed Chrome with side panel)
+├── open-gstack-browser/  # /open-gstack-browser skill (launch GStack Browser)
+├── connect-chrome/  # symlink → open-gstack-browser (backwards compat)
 ├── design/          # Design binary CLI (GPT Image API)
 │   ├── src/         # CLI + commands (generate, variants, compare, serve, etc.)
 │   ├── test/        # Integration tests
@@ -167,6 +168,14 @@ When you need to interact with a browser (QA, dogfooding, cookie setup), use the
 `mcp__claude-in-chrome__*` tools — they are slow, unreliable, and not what this
 project uses.
 
+**Sidebar architecture:** Before modifying `sidepanel.js`, `background.js`,
+`content.js`, `sidebar-agent.ts`, or sidebar-related server endpoints, read
+`docs/designs/SIDEBAR_MESSAGE_FLOW.md`. It documents the full initialization
+timeline, message flow, auth token chain, tab concurrency model, and known
+failure modes. The sidebar spans 5 files across 2 codebases (extension + server)
+with non-obvious ordering dependencies. The doc exists to prevent the kind of
+silent failures that come from not understanding the cross-component flow.
+
 ## Vendored symlink awareness
 
 When developing gstack, `.claude/skills/gstack` may be a symlink back to this

diff --git a/README.md b/README.md
diff --git a/SKILL.md b/SKILL.md
@@ -631,6 +631,9 @@ $B css ".button" "background-color"
 ## Snapshot System
 
 The snapshot is your primary tool for understanding and interacting with pages.
+`$B` is the browse binary (resolved from `$_ROOT/.claude/skills/gstack/browse/dist/browse` or `~/.claude/skills/gstack/browse/dist/browse`).
+
+**Syntax:** `$B snapshot [flags]`
 
 ```
 -i        --interactive           Interactive elements only (buttons, links, inputs) with @e refs
@@ -646,6 +649,12 @@ The snapshot is your primary tool for understanding and interacting with pages.
 All flags can be combined freely. `-o` only applies when `-a` is also used.
 Example: `$B snapshot -i -a -C -o /tmp/annotated.png`
 
+**Flag details:**
+- `-d <N>`: depth 0 = root element only, 1 = root + direct children, etc. Default: unlimited. Works with all other flags including `-i`.
+- `-s <sel>`: any valid CSS selector (`#main`, `.content`, `nav > ul`, `[data-testid="hero"]`). Scopes the tree to that subtree.
+- `-D`: outputs a unified diff (lines prefixed with `+`/`-`/` `) comparing the current snapshot against the previous one. First call stores the baseline and returns the full tree. Baseline persists across navigations until the next `-D` call resets it.
+- `-a`: saves an annotated screenshot (PNG) with red overlay boxes and @ref labels drawn on each interactive element. The screenshot is a separate output from the text tree — both are produced when `-a` is used.
+
 **Ref numbering:** @e refs are assigned sequentially (@e1, @e2, ...) in tree order.
 @c refs from `-C` are numbered separately (@c1, @c2, ...).
 

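The `-D` baseline behaviour described in the flag details can be sketched roughly as follows. This is an illustrative model, not the browse binary's code: `SnapshotDiffer` and `diffLines` are hypothetical names, the diff is a plain LCS line diff, and it omits the `@@` hunk headers a full unified diff would carry.

```typescript
// Hypothetical sketch of the -D semantics; not taken from the browse source.
class SnapshotDiffer {
  private baseline: string[] | null = null;

  // First call: store the baseline and return the full tree.
  // Later calls: return a line diff against the previous snapshot.
  // Every call replaces the stored baseline with the current snapshot.
  next(snapshot: string): string {
    const current = snapshot.split("\n");
    const previous = this.baseline;
    this.baseline = current;
    if (previous === null) return snapshot;
    return diffLines(previous, current).join("\n");
  }
}

// Minimal LCS-based line diff: " " unchanged, "-" removed, "+" added.
function diffLines(a: string[], b: string[]): string[] {
  // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
  const lcs: number[][] = Array.from({ length: a.length + 1 }, () =>
    new Array<number>(b.length + 1).fill(0),
  );
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] =
        a[i] === b[j]
          ? lcs[i + 1][j + 1] + 1
          : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }
  const out: string[] = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      out.push(" " + a[i]);
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      out.push("-" + a[i]);
      i++;
    } else {
      out.push("+" + b[j]);
      j++;
    }
  }
  while (i < a.length) out.push("-" + a[i++]);
  while (j < b.length) out.push("+" + b[j++]);
  return out;
}
```

Because the baseline lives in the differ rather than in the page, it survives navigations in this model, which matches the documented behaviour that only the next `-D` call resets it.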
diff --git a/TODOS.md b/TODOS.md
@@ -199,16 +199,22 @@ Sidebar agent writes structured messages to `.context/sidebar-inbox/`. Workspace
 **Priority:** P3
 **Depends on:** Headed mode (shipped)
 
-### Sidebar agent needs Write tool + better error visibility
+### Sidebar agent needs Write tool + better error visibility — SHIPPED
 
 **What:** Two issues with the sidebar agent (`sidebar-agent.ts`): (1) `--allowedTools` is hardcoded to `Bash,Read,Glob,Grep`, missing `Write`. Claude can't create files (like CSVs) when asked. (2) When Claude errors or returns empty, the sidebar UI shows nothing, just a green dot. No error message, no "I tried but failed", nothing.
 
-**Why:** Users ask "write this to a CSV" and the sidebar silently can't. Then they think it's broken. The UI needs to surface errors visibly, and Claude needs the tools to actually do what's asked.
+**Completed:** v0.15.4.0 (2026-04-04). Write tool added to allowedTools. 40+ empty catch blocks replaced with `[gstack sidebar]`, `[gstack bg]`, `[browse]`, `[sidebar-agent]` prefixed console logging across all 4 files (sidepanel.js, background.js, server.ts, sidebar-agent.ts). Error placeholder text now shows in red. Auth token stale-refresh bug fixed.
 
-**Context:** `sidebar-agent.ts:163` hardcodes `--allowedTools`. The event relay (`handleStreamEvent`) handles `agent_done` and `agent_error` but the extension's sidepanel.js may not be rendering error states. The sidebar should show "Error: ..." or "Claude finished but produced no output" instead of staying on the green dot forever.
+### Sidebar direct API calls (eliminate claude -p startup tax)
 
-**Effort:** S (human: ~2h / CC: ~10min)
-**Priority:** P1
+**What:** Each sidebar message spawns a fresh `claude -p` process (~2-3s cold start overhead). For "click @e24" that's absurd. Direct Anthropic API calls would be sub-second.
+
+**Why:** The `claude -p` startup cost is: process spawn (~100ms) + CLI init (~500ms-1s) + API connection (~200ms) + first token. Model routing (Sonnet for actions) helps but doesn't fix the CLI overhead.
+
+**Context:** `server.ts:spawnClaude()` builds args and writes to queue file. `sidebar-agent.ts:askClaude()` spawns `claude -p`. Replace with direct `fetch('https://api.anthropic.com/...')` with tool use. Requires `ANTHROPIC_API_KEY` accessible to the browse server.
+
+**Effort:** M (human: ~1 week / CC: ~30min)
+**Priority:** P2
 **Depends on:** None
 
 ### Chrome Web Store publishing
@@ -846,6 +852,31 @@ Shipped in v0.6.5. TemplateContext in gen-skill-docs.ts bakes skill name into pr
 **Effort:** M (human: ~3 days / CC: ~2 hours)
 **Priority:** P3
 
+## GStack Browser
+
+### Anti-bot stealth: Playwright CDP patches (rebrowser-style)
+
+**What:** Write a postinstall script that patches Playwright's CDP layer to suppress `Runtime.enable` and use `addBinding` for context ID discovery, the same approach as rebrowser-patches. This eliminates the `navigator.webdriver` flag, `cdc_` markers, and other CDP artifacts that sites like Google use to detect automation.
+
+**Why:** Our current stealth patches (UA override, navigator.webdriver=false, fake plugins) work on most sites but Google still triggers captchas. The real detection is at the CDP protocol level. rebrowser-patches proved the approach works but their patches target Playwright 1.52.0 and don't apply to our 1.58.2. We need our own patcher using string matching instead of line-number diffs. 6 files, ~200 lines of patches total.
+
+**Context:** Full analysis of rebrowser-patches source: patches 6 files in `playwright-core/lib/server/` (crConnection.js, crDevTools.js, crPage.js, crServiceWorker.js, frames.js, page.js). Key technique: suppress `Runtime.enable` (the main CDP detection vector), use `Runtime.addBinding` + `CustomEvent` trick to discover execution context IDs without it. Our extension communicates via Chrome extension APIs, not CDP Runtime, so it should be unaffected. Write E2E tests that verify: (1) extension still loads and connects, (2) Google.com loads without captcha, (3) sidebar chat still works.
+
+**Effort:** L (human: ~2 weeks / CC: ~3 hours)
+**Priority:** P1
+**Depends on:** None
+
+### Chromium fork (long-term alternative to CDP patches)
+
+**What:** Maintain a Chromium fork where anti-bot stealth, GStack Browser branding, and native sidebar support live in the source code, not as runtime monkey-patches.
+
+**Why:** The CDP patches are brittle. They break on every Playwright upgrade and target compiled JS with fragile string matching. A proper fork means: (1) stealth is permanent, not patched, (2) branding is native (no plist hacking at launch), (3) native sidebar replaces the extension (Phase 4 of V0 roadmap), (4) custom protocols (gstack://) for internal pages. Companies like Brave, Arc, and Vivaldi maintain Chromium forks with small teams. With CC, the rebase-on-upstream maintenance could be largely automated.
+
+**Context:** Trigger criteria from V0 design doc: fork when extension side panel becomes the bottleneck, when anti-bot patches need to live deeper than CDP, or when native UI integration (sidebar, status bar) can't be done via extension. The Chromium build takes ~4 hours on a 32-core machine and produces ~50GB of build artifacts. CI would need dedicated build infra. See `docs/designs/GSTACK_BROWSER_V0.md` Phase 5 for full analysis.
+
+**Effort:** XL (human: ~1 quarter / CC: ~2-3 weeks of focused work)
+**Priority:** P2
+**Depends on:** CDP patches proving the value of anti-bot stealth first
 ## Completed
 
 ### CI eval pipeline (v0.9.9)

diff --git a/VERSION b/VERSION
@@ -1 +1 @@
-0.21.1
+0.21.2
diff --git a/bin/gstack-learnings-search b/bin/gstack-learnings-search
@@ -43,13 +43,14 @@ if [ ${#FILES[@]} -eq 0 ]; then
 fi
 
 # Process all files through bun for JSON parsing, decay, dedup, filtering
-cat "${FILES[@]}" 2>/dev/null | bun -e "
+GSTACK_SEARCH_TYPE="$TYPE" GSTACK_SEARCH_QUERY="$QUERY" GSTACK_SEARCH_LIMIT="$LIMIT" GSTACK_SEARCH_SLUG="$SLUG" GSTACK_SEARCH_CROSS="$CROSS_PROJECT" \
+cat "${FILES[@]}" 2>/dev/null | GSTACK_SEARCH_TYPE="$TYPE" GSTACK_SEARCH_QUERY="$QUERY" GSTACK_SEARCH_LIMIT="$LIMIT" GSTACK_SEARCH_SLUG="$SLUG" GSTACK_SEARCH_CROSS="$CROSS_PROJECT" bun -e "
 const lines = (await Bun.stdin.text()).trim().split('\n').filter(Boolean);
 const now = Date.now();
-const type = '${TYPE}';
-const query = '${QUERY}'.toLowerCase();
-const limit = ${LIMIT};
-const slug = '${SLUG}';
+const type = process.env.GSTACK_SEARCH_TYPE || '';
+const query = (process.env.GSTACK_SEARCH_QUERY || '').toLowerCase();
+const limit = parseInt(process.env.GSTACK_SEARCH_LIMIT || '10', 10);
+const slug = process.env.GSTACK_SEARCH_SLUG || '';
 
 const entries = [];
 for (const line of lines) {
@@ -67,7 +68,7 @@ for (const line of lines) {
 
     // Determine if this is from the current project or cross-project
     // Cross-project entries are tagged for display
-    e._crossProject = !line.includes(slug) && '${CROSS_PROJECT}' === 'true';
+    e._crossProject = !line.includes(slug) && process.env.GSTACK_SEARCH_CROSS === 'true';
 
     entries.push(e);
   } catch {}

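The hunk above stops pasting shell variables into the JavaScript source and reads them from the environment instead. A minimal sketch of why that matters, assuming a query that contains a single quote: `interpolatedSource`, `parses`, and `readSearchParams` are hypothetical helpers written for illustration, and `Env` stands in for `process.env`.

```typescript
// Old approach (illustrative reconstruction): the raw value becomes part of the JS source.
function interpolatedSource(query: string): string {
  return "const query = '" + query + "'.toLowerCase();";
}

// True when the string parses as a JavaScript function body.
function parses(source: string): boolean {
  try {
    new Function(source); // parses only, the body is never executed
    return true;
  } catch {
    return false;
  }
}

// New approach: values stay data and are read from an environment-like map.
type Env = Record<string, string | undefined>;

function readSearchParams(env: Env) {
  return {
    type: env.GSTACK_SEARCH_TYPE || "",
    query: (env.GSTACK_SEARCH_QUERY || "").toLowerCase(),
    limit: parseInt(env.GSTACK_SEARCH_LIMIT || "10", 10),
    slug: env.GSTACK_SEARCH_SLUG || "",
    crossProject: env.GSTACK_SEARCH_CROSS === "true",
  };
}
```

With interpolation, `parses(interpolatedSource("don't"))` is false because the quote ends the string literal early; a more deliberate value could instead change what the script does. Read through the environment, the same input is just a string.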
diff --git a/bin/gstack-telemetry-sync b/bin/gstack-telemetry-sync
@@ -122,6 +122,11 @@ case "$HTTP_CODE" in
     # Advance by SENT count (not inserted count) because we can't map inserted back to
     # source lines. If inserted==0, something is systemically wrong — don't advance.
     INSERTED="$(grep -o '"inserted":[0-9]*' "$RESP_FILE" 2>/dev/null | grep -o '[0-9]*' || echo "0")"
+    # Check for upsert errors (installation tracking failures) — log but don't block cursor advance
+    UPSERT_ERRORS="$(grep -o '"upsertErrors"' "$RESP_FILE" 2>/dev/null || true)"
+    if [ -n "$UPSERT_ERRORS" ]; then
+      echo "[gstack-telemetry-sync] Warning: installation upsert errors in response" >&2
+    fi
     if [ "${INSERTED:-0}" -gt 0 ] 2>/dev/null; then
       NEW_CURSOR=$(( CURSOR + COUNT ))
       echo "$NEW_CURSOR" > "$CURSOR_FILE" 2>/dev/null || true

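The cursor rule in this hunk (advance by the sent count when anything was inserted, warn on upsert errors without blocking) can be modelled as a small pure function. This is a sketch under assumptions: `decideCursor` and `SyncDecision` are hypothetical names, and the matching is kept as loose as the `grep` calls it mirrors rather than parsing the response as JSON.

```typescript
interface SyncDecision {
  newCursor: number;
  warnings: string[];
}

// Hypothetical mirror of the shell logic for the response branch shown in the hunk.
function decideCursor(
  responseBody: string,
  cursor: number,
  sentCount: number,
): SyncDecision {
  const warnings: string[] = [];

  // Same loose match as: grep -o '"inserted":[0-9]*' | grep -o '[0-9]*'
  const match = /"inserted":(\d+)/.exec(responseBody);
  const inserted = match ? parseInt(match[1], 10) : 0;

  // Upsert errors are logged but never block the cursor advance.
  if (responseBody.includes('"upsertErrors"')) {
    warnings.push("installation upsert errors in response");
  }

  // Advance by the SENT count, and only when something was actually inserted.
  const newCursor = inserted > 0 ? cursor + sentCount : cursor;
  return { newCursor, warnings };
}
```

For example, `decideCursor('{"inserted":3}', 10, 5)` moves the cursor to 15, while a body reporting `"inserted":0` leaves it at 10 so the same batch is retried.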
diff --git a/browse/SKILL.md b/browse/SKILL.md
@@ -499,6 +499,9 @@ After `resume`, you get a fresh snapshot of wherever the user left off.
 ## Snapshot Flags
 
 The snapshot is your primary tool for understanding and interacting with pages.
+`$B` is the browse binary (resolved from `$_ROOT/.claude/skills/gstack/browse/dist/browse` or `~/.claude/skills/gstack/browse/dist/browse`).
+
+**Syntax:** `$B snapshot [flags]`
 
 ```
 -i        --interactive           Interactive elements only (buttons, links, inputs) with @e refs
@@ -514,6 +517,12 @@ The snapshot is your primary tool for understanding and interacting with pages.
 All flags can be combined freely. `-o` only applies when `-a` is also used.
 Example: `$B snapshot -i -a -C -o /tmp/annotated.png`
 
+**Flag details:**
+- `-d <N>`: depth 0 = root element only, 1 = root + direct children, etc. Default: unlimited. Works with all other flags including `-i`.
+- `-s <sel>`: any valid CSS selector (`#main`, `.content`, `nav > ul`, `[data-testid="hero"]`). Scopes the tree to that subtree.
+- `-D`: outputs a unified diff (lines prefixed with `+`/`-`/` `) comparing the current snapshot against the previous one. First call stores the baseline and returns the full tree. Baseline persists across navigations until the next `-D` call resets it.
+- `-a`: saves an annotated screenshot (PNG) with red overlay boxes and @ref labels drawn on each interactive element. The screenshot is a separate output from the text tree — both are produced when `-a` is used.
+
 **Ref numbering:** @e refs are assigned sequentially (@e1, @e2, ...) in tree order.
 @c refs from `-C` are numbered separately (@c1, @c2, ...).
 

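The `-d <N>` rule (depth 0 is the root only, depth 1 adds direct children) amounts to pruning the tree below a cutoff. A minimal sketch, assuming a simple node shape: `TreeNode` and `limitDepth` are hypothetical and not taken from the browse source, and passing `Infinity` stands in for the unlimited default.

```typescript
interface TreeNode {
  label: string;
  children: TreeNode[];
}

// Keep only nodes whose depth is <= maxDepth; the root is depth 0.
function limitDepth(node: TreeNode, maxDepth: number, depth = 0): TreeNode {
  return {
    label: node.label,
    children:
      depth < maxDepth
        ? node.children.map((child) => limitDepth(child, maxDepth, depth + 1))
        : [],
  };
}

// Example tree: main > nav > ul
const exampleTree: TreeNode = {
  label: "main",
  children: [
    { label: "nav", children: [{ label: "ul", children: [] }] },
  ],
};
```

Here `limitDepth(exampleTree, 0)` keeps only `main`, `limitDepth(exampleTree, 1)` keeps `main` and `nav`, and `limitDepth(exampleTree, Infinity)` returns a full copy.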
diff --git a/browse/src/activity.ts b/browse/src/activity.ts
@@ -31,6 +31,7 @@ export interface ActivityEntry {
   result?: string;
   tabs?: number;
   mode?: string;
+  clientId?: string;
 }
 
 // ─── Buffer & Subscribers ───────────────────────────────────────