1 change: 1 addition & 0 deletions .gitignore
@@ -1,5 +1,6 @@
.env
node_modules/
dist/
browse/dist/
design/dist/
bin/gstack-global-discover
210 changes: 209 additions & 1 deletion CHANGELOG.md

Large diffs are not rendered by default.

11 changes: 10 additions & 1 deletion CLAUDE.md
@@ -95,7 +95,8 @@ gstack/
├── cso/ # /cso skill (OWASP Top 10 + STRIDE security audit)
├── design-consultation/ # /design-consultation skill (design system from scratch)
├── design-shotgun/ # /design-shotgun skill (visual design exploration)
├── connect-chrome/ # /connect-chrome skill (headed Chrome with side panel)
├── open-gstack-browser/ # /open-gstack-browser skill (launch GStack Browser)
├── connect-chrome/ # symlink → open-gstack-browser (backwards compat)
├── design/ # Design binary CLI (GPT Image API)
│ ├── src/ # CLI + commands (generate, variants, compare, serve, etc.)
│ ├── test/ # Integration tests
@@ -167,6 +168,14 @@ When you need to interact with a browser (QA, dogfooding, cookie setup), use the
`mcp__claude-in-chrome__*` tools — they are slow, unreliable, and not what this
project uses.

**Sidebar architecture:** Before modifying `sidepanel.js`, `background.js`,
`content.js`, `sidebar-agent.ts`, or sidebar-related server endpoints, read
`docs/designs/SIDEBAR_MESSAGE_FLOW.md`. It documents the full initialization
timeline, message flow, auth token chain, tab concurrency model, and known
failure modes. The sidebar spans 5 files across 2 codebases (extension + server)
with non-obvious ordering dependencies. The doc exists to prevent the kind of
silent failures that come from not understanding the cross-component flow.

## Vendored symlink awareness

When developing gstack, `.claude/skills/gstack` may be a symlink back to this
84 changes: 76 additions & 8 deletions README.md

Large diffs are not rendered by default.

9 changes: 9 additions & 0 deletions SKILL.md
@@ -631,6 +631,9 @@ $B css ".button" "background-color"
## Snapshot System

The snapshot is your primary tool for understanding and interacting with pages.
`$B` is the browse binary (resolved from `$_ROOT/.claude/skills/gstack/browse/dist/browse` or `~/.claude/skills/gstack/browse/dist/browse`).

**Syntax:** `$B snapshot [flags]`

```
-i --interactive Interactive elements only (buttons, links, inputs) with @e refs
@@ -646,6 +649,12 @@ The snapshot is your primary tool for understanding and interacting with pages.
All flags can be combined freely. `-o` only applies when `-a` is also used.
Example: `$B snapshot -i -a -C -o /tmp/annotated.png`

**Flag details:**
- `-d <N>`: depth 0 = root element only, 1 = root + direct children, etc. Default: unlimited. Works with all other flags including `-i`.
- `-s <sel>`: any valid CSS selector (`#main`, `.content`, `nav > ul`, `[data-testid="hero"]`). Scopes the tree to that subtree.
- `-D`: outputs a unified diff (lines prefixed with `+`/`-`/` `) comparing the current snapshot against the previous one. First call stores the baseline and returns the full tree. Baseline persists across navigations until the next `-D` call resets it.
- `-a`: saves an annotated screenshot (PNG) with red overlay boxes and @ref labels drawn on each interactive element. The screenshot is a separate output from the text tree — both are produced when `-a` is used.

**Ref numbering:** @e refs are assigned sequentially (@e1, @e2, ...) in tree order.
@c refs from `-C` are numbered separately (@c1, @c2, ...).
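
The `-D` baseline behaviour described above can be sketched in TypeScript. This is a minimal illustration, not gstack's implementation: `SnapshotDiffer` and `diffLines` are hypothetical names, and the output is a plain line diff with `+`/`-`/space prefixes (no `@@` hunk headers). It shows the first call returning the full tree and every later call diffing against the previous snapshot and then resetting the baseline.

```typescript
// Hypothetical sketch of the -D flow; names are illustrative, not gstack's code.

/** Line diff via longest common subsequence; prefixes lines with ' ', '-', or '+'. */
function diffLines(a: string[], b: string[]): string[] {
  // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
  const lcs: number[][] = Array.from({ length: a.length + 1 }, () =>
    new Array<number>(b.length + 1).fill(0),
  );
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] =
        a[i] === b[j]
          ? lcs[i + 1][j + 1] + 1
          : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }
  const out: string[] = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      out.push(" " + a[i]);
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      out.push("-" + a[i]);
      i++;
    } else {
      out.push("+" + b[j]);
      j++;
    }
  }
  while (i < a.length) out.push("-" + a[i++]);
  while (j < b.length) out.push("+" + b[j++]);
  return out;
}

class SnapshotDiffer {
  private baseline: string[] | null = null;

  /** First call stores the baseline and returns the full tree; later calls return a diff. */
  next(tree: string): string {
    const current = tree.split("\n");
    const previous = this.baseline;
    this.baseline = current; // every call resets the baseline to the newest snapshot
    if (previous === null) return tree;
    return diffLines(previous, current).join("\n");
  }
}
```

Because the baseline lives in the differ rather than in the page, it survives navigations in this sketch, which matches the documented behaviour of persisting until the next `-D` call.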

41 changes: 36 additions & 5 deletions TODOS.md
@@ -199,16 +199,22 @@ Sidebar agent writes structured messages to `.context/sidebar-inbox/`. Workspace
**Priority:** P3
**Depends on:** Headed mode (shipped)

### Sidebar agent needs Write tool + better error visibility
### Sidebar agent needs Write tool + better error visibility — SHIPPED

**What:** Two issues with the sidebar agent (`sidebar-agent.ts`): (1) `--allowedTools` is hardcoded to `Bash,Read,Glob,Grep`, missing `Write`. Claude can't create files (like CSVs) when asked. (2) When Claude errors or returns empty, the sidebar UI shows nothing, just a green dot. No error message, no "I tried but failed", nothing.

**Why:** Users ask "write this to a CSV" and the sidebar silently can't. Then they think it's broken. The UI needs to surface errors visibly, and Claude needs the tools to actually do what's asked.
**Completed:** v0.15.4.0 (2026-04-04). Write tool added to allowedTools. 40+ empty catch blocks replaced with `[gstack sidebar]`, `[gstack bg]`, `[browse]`, `[sidebar-agent]` prefixed console logging across all 4 files (sidepanel.js, background.js, server.ts, sidebar-agent.ts). Error placeholder text now shows in red. Auth token stale-refresh bug fixed.

**Context:** `sidebar-agent.ts:163` hardcodes `--allowedTools`. The event relay (`handleStreamEvent`) handles `agent_done` and `agent_error` but the extension's sidepanel.js may not be rendering error states. The sidebar should show "Error: ..." or "Claude finished but produced no output" instead of staying on the green dot forever.
### Sidebar direct API calls (eliminate claude -p startup tax)

**Effort:** S (human: ~2h / CC: ~10min)
**Priority:** P1
**What:** Each sidebar message spawns a fresh `claude -p` process (~2-3s cold start overhead). For "click @e24" that's absurd. Direct Anthropic API calls would be sub-second.

**Why:** The `claude -p` startup cost is: process spawn (~100ms) + CLI init (~500ms-1s) + API connection (~200ms) + first token. Model routing (Sonnet for actions) helps but doesn't fix the CLI overhead.

**Context:** `server.ts:spawnClaude()` builds args and writes to queue file. `sidebar-agent.ts:askClaude()` spawns `claude -p`. Replace with direct `fetch('https://api.anthropic.com/...')` with tool use. Requires `ANTHROPIC_API_KEY` accessible to the browse server.

**Effort:** M (human: ~1 week / CC: ~30min)
**Priority:** P2
**Depends on:** None
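
As a rough sketch of what the direct-call path in this TODO might look like, the helper below only builds the request and leaves sending it to the caller. Several things here are assumptions rather than anything in the PR: `buildSidebarRequest` and `SidebarRequest` are hypothetical names, the endpoint is passed in by the caller because the TODO elides the URL path, the model is a parameter, `max_tokens` is an arbitrary value, and tool use is omitted entirely. The `x-api-key` and `anthropic-version` headers follow Anthropic's public HTTP API conventions.

```typescript
// Sketch only: builds the request object, does not send it.
interface SidebarRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildSidebarRequest(
  endpoint: string, // full URL supplied by the caller; the TODO elides the path
  apiKey: string, // value of ANTHROPIC_API_KEY as seen by the browse server
  model: string, // assumption: model routing is decided elsewhere
  prompt: string,
): SidebarRequest {
  if (!apiKey) throw new Error("ANTHROPIC_API_KEY is not set");
  return {
    url: endpoint,
    method: "POST",
    headers: {
      "content-type": "application/json",
      "x-api-key": apiKey,
      "anthropic-version": "2023-06-01",
    },
    body: JSON.stringify({
      model,
      max_tokens: 1024, // arbitrary illustrative limit
      messages: [{ role: "user", content: prompt }],
    }),
  };
}
```

A caller would pass the result to `fetch(req.url, req)`. Keeping request construction separate from the network call makes the piece that replaces `claude -p` testable without an API key.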

### Chrome Web Store publishing
@@ -846,6 +852,31 @@ Shipped in v0.6.5. TemplateContext in gen-skill-docs.ts bakes skill name into pr
**Effort:** M (human: ~3 days / CC: ~2 hours)
**Priority:** P3

## GStack Browser

### Anti-bot stealth: Playwright CDP patches (rebrowser-style)

**What:** Write a postinstall script that patches Playwright's CDP layer to suppress `Runtime.enable` and use `addBinding` for context ID discovery, same approach as rebrowser-patches. Eliminates the `navigator.webdriver`, `cdc_` markers, and other CDP artifacts that sites like Google use to detect automation.

**Why:** Our current stealth patches (UA override, navigator.webdriver=false, fake plugins) work on most sites but Google still triggers captchas. The real detection is at the CDP protocol level. rebrowser-patches proved the approach works but their patches target Playwright 1.52.0 and don't apply to our 1.58.2. We need our own patcher using string matching instead of line-number diffs. 6 files, ~200 lines of patches total.

**Context:** Full analysis of rebrowser-patches source: patches 6 files in `playwright-core/lib/server/` (crConnection.js, crDevTools.js, crPage.js, crServiceWorker.js, frames.js, page.js). Key technique: suppress `Runtime.enable` (the main CDP detection vector), use `Runtime.addBinding` + `CustomEvent` trick to discover execution context IDs without it. Our extension communicates via Chrome extension APIs, not CDP Runtime, so it should be unaffected. Write E2E tests that verify: (1) extension still loads and connects, (2) Google.com loads without captcha, (3) sidebar chat still works.

**Effort:** L (human: ~2 weeks / CC: ~3 hours)
**Priority:** P1
**Depends on:** None
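
The "string matching instead of line-number diffs" idea from this TODO could be sketched as follows. This is an illustration under assumptions, not the planned patcher: `StringPatch` and `applyStringPatch` are hypothetical names, and the source text in the usage note is invented rather than taken from Playwright. The point is that a patch anchored on a unique substring keeps applying when surrounding lines shift, and fails loudly when the anchor disappears or becomes ambiguous after an upgrade.

```typescript
// Hypothetical string-anchored patcher; not the actual gstack postinstall script.
interface StringPatch {
  file: string; // used only for error messages in this sketch
  find: string; // exact substring expected to occur once in the file
  replace: string; // text that takes its place
}

function applyStringPatch(source: string, patch: StringPatch): string {
  const first = source.indexOf(patch.find);
  if (first === -1) {
    // The upstream code changed: refuse to guess rather than patch the wrong spot.
    throw new Error(`anchor not found in ${patch.file}`);
  }
  if (source.indexOf(patch.find, first + 1) !== -1) {
    throw new Error(`anchor is ambiguous in ${patch.file}`);
  }
  return (
    source.slice(0, first) +
    patch.replace +
    source.slice(first + patch.find.length)
  );
}
```

For example, with the made-up source `"function init() { enableRuntime(); }"`, a patch with `find: "enableRuntime();"` and `replace: "/* suppressed */"` yields `"function init() { /* suppressed */ }"`, regardless of which line the call sits on.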

### Chromium fork (long-term alternative to CDP patches)

**What:** Maintain a Chromium fork where anti-bot stealth, GStack Browser branding, and native sidebar support live in the source code, not as runtime monkey-patches.

**Why:** The CDP patches are brittle. They break on every Playwright upgrade and target compiled JS with fragile string matching. A proper fork means: (1) stealth is permanent, not patched, (2) branding is native (no plist hacking at launch), (3) native sidebar replaces the extension (Phase 4 of V0 roadmap), (4) custom protocols (gstack://) for internal pages. Companies like Brave, Arc, and Vivaldi maintain Chromium forks with small teams. With CC, the rebase-on-upstream maintenance could be largely automated.

**Context:** Trigger criteria from V0 design doc: fork when extension side panel becomes the bottleneck, when anti-bot patches need to live deeper than CDP, or when native UI integration (sidebar, status bar) can't be done via extension. The Chromium build takes ~4 hours on a 32-core machine and produces ~50GB of build artifacts. CI would need dedicated build infra. See `docs/designs/GSTACK_BROWSER_V0.md` Phase 5 for full analysis.

**Effort:** XL (human: ~1 quarter / CC: ~2-3 weeks of focused work)
**Priority:** P2
**Depends on:** CDP patches proving the value of anti-bot stealth first
## Completed

### CI eval pipeline (v0.9.9)
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
0.21.1
0.21.2
13 changes: 7 additions & 6 deletions bin/gstack-learnings-search
@@ -43,13 +43,14 @@ if [ ${#FILES[@]} -eq 0 ]; then
fi

# Process all files through bun for JSON parsing, decay, dedup, filtering
cat "${FILES[@]}" 2>/dev/null | bun -e "
GSTACK_SEARCH_TYPE="$TYPE" GSTACK_SEARCH_QUERY="$QUERY" GSTACK_SEARCH_LIMIT="$LIMIT" GSTACK_SEARCH_SLUG="$SLUG" GSTACK_SEARCH_CROSS="$CROSS_PROJECT" \
cat "${FILES[@]}" 2>/dev/null | GSTACK_SEARCH_TYPE="$TYPE" GSTACK_SEARCH_QUERY="$QUERY" GSTACK_SEARCH_LIMIT="$LIMIT" GSTACK_SEARCH_SLUG="$SLUG" GSTACK_SEARCH_CROSS="$CROSS_PROJECT" bun -e "
const lines = (await Bun.stdin.text()).trim().split('\n').filter(Boolean);
const now = Date.now();
const type = '${TYPE}';
const query = '${QUERY}'.toLowerCase();
const limit = ${LIMIT};
const slug = '${SLUG}';
const type = process.env.GSTACK_SEARCH_TYPE || '';
const query = (process.env.GSTACK_SEARCH_QUERY || '').toLowerCase();
const limit = parseInt(process.env.GSTACK_SEARCH_LIMIT || '10', 10);
const slug = process.env.GSTACK_SEARCH_SLUG || '';
const entries = [];
for (const line of lines) {
@@ -67,7 +68,7 @@ for (const line of lines) {
// Determine if this is from the current project or cross-project
// Cross-project entries are tagged for display
e._crossProject = !line.includes(slug) && '${CROSS_PROJECT}' === 'true';
e._crossProject = !line.includes(slug) && process.env.GSTACK_SEARCH_CROSS === 'true';
entries.push(e);
} catch {}
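
The motivation for moving from `'${QUERY}'` interpolation to `process.env` reads can be illustrated with a small TypeScript sketch. The helper names (`interpolatedSource`, `parses`) and the sample inputs are hypothetical. The sketch only models what the shell does when it expands a variable into the text of the `bun -e` script: the value becomes part of the program, so a quote character can break the script or smuggle in extra statements, whereas an environment variable is only ever data.

```typescript
// Models the OLD approach: the value is pasted into JavaScript source text.
function interpolatedSource(query: string): string {
  return `const query = '${query}'.toLowerCase();`;
}

// Models the NEW approach: the source is constant and the value arrives via env.
const ENV_SOURCE =
  "const query = (process.env.GSTACK_SEARCH_QUERY || '').toLowerCase();";

/** Returns true when the text parses as a function body (it is never executed). */
function parses(source: string): boolean {
  try {
    new Function(source);
    return true;
  } catch {
    return false;
  }
}

// A benign query works either way.
const benign = parses(interpolatedSource("cache"));

// An apostrophe terminates the string literal early and breaks the script.
const broken = parses(interpolatedSource("it's"));

// A crafted query closes the literal and appends its own statement.
const injected = interpolatedSource("'; console.log('injected'); '");

// The env-based source is the same text no matter what the user typed.
const envSafe = parses(ENV_SOURCE);
```

Here `benign` and `envSafe` are `true`, `broken` is `false`, and `injected` is syntactically valid source containing a statement the user supplied, which is the case the diff above closes off.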
5 changes: 5 additions & 0 deletions bin/gstack-telemetry-sync
@@ -122,6 +122,11 @@ case "$HTTP_CODE" in
# Advance by SENT count (not inserted count) because we can't map inserted back to
# source lines. If inserted==0, something is systemically wrong — don't advance.
INSERTED="$(grep -o '"inserted":[0-9]*' "$RESP_FILE" 2>/dev/null | grep -o '[0-9]*' || echo "0")"
# Check for upsert errors (installation tracking failures) — log but don't block cursor advance
UPSERT_ERRORS="$(grep -o '"upsertErrors"' "$RESP_FILE" 2>/dev/null || true)"
if [ -n "$UPSERT_ERRORS" ]; then
echo "[gstack-telemetry-sync] Warning: installation upsert errors in response" >&2
fi
if [ "${INSERTED:-0}" -gt 0 ] 2>/dev/null; then
NEW_CURSOR=$(( CURSOR + COUNT ))
echo "$NEW_CURSOR" > "$CURSOR_FILE" 2>/dev/null || true
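
The cursor rule in this hunk can be restated as a small pure function. This is a TypeScript paraphrase for illustration, not a port of the script: `nextCursor` and `CursorResult` are hypothetical names, and the response is scanned with regular expressions to mirror the `grep -o` calls rather than parsed as JSON. It advances by the sent count only when at least one row was inserted, and treats `upsertErrors` as a warning that never blocks the advance.

```typescript
interface CursorResult {
  cursor: number;
  warnings: string[];
}

function nextCursor(
  cursor: number,
  sentCount: number,
  responseBody: string,
): CursorResult {
  const warnings: string[] = [];

  // Mirrors: grep -o '"inserted":[0-9]*', falling back to 0 when absent.
  const match = /"inserted":(\d+)/.exec(responseBody);
  const inserted = match ? parseInt(match[1], 10) : 0;

  // Mirrors: grep -o '"upsertErrors"', which logs but does not block the advance.
  if (responseBody.includes('"upsertErrors"')) {
    warnings.push("installation upsert errors in response");
  }

  // Advance by the SENT count, not the inserted count, and only if anything landed.
  const advanced = inserted > 0 ? cursor + sentCount : cursor;
  return { cursor: advanced, warnings };
}
```

For instance, `nextCursor(10, 5, '{"inserted":3,"upsertErrors":[]}')` returns a cursor of 15 plus one warning, while a response with `"inserted":0` leaves the cursor at 10.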
9 changes: 9 additions & 0 deletions browse/SKILL.md
@@ -499,6 +499,9 @@ After `resume`, you get a fresh snapshot of wherever the user left off.
## Snapshot Flags

The snapshot is your primary tool for understanding and interacting with pages.
`$B` is the browse binary (resolved from `$_ROOT/.claude/skills/gstack/browse/dist/browse` or `~/.claude/skills/gstack/browse/dist/browse`).

**Syntax:** `$B snapshot [flags]`

```
-i --interactive Interactive elements only (buttons, links, inputs) with @e refs
@@ -514,6 +517,12 @@ The snapshot is your primary tool for understanding and interacting with pages.
All flags can be combined freely. `-o` only applies when `-a` is also used.
Example: `$B snapshot -i -a -C -o /tmp/annotated.png`

**Flag details:**
- `-d <N>`: depth 0 = root element only, 1 = root + direct children, etc. Default: unlimited. Works with all other flags including `-i`.
- `-s <sel>`: any valid CSS selector (`#main`, `.content`, `nav > ul`, `[data-testid="hero"]`). Scopes the tree to that subtree.
- `-D`: outputs a unified diff (lines prefixed with `+`/`-`/` `) comparing the current snapshot against the previous one. First call stores the baseline and returns the full tree. Baseline persists across navigations until the next `-D` call resets it.
- `-a`: saves an annotated screenshot (PNG) with red overlay boxes and @ref labels drawn on each interactive element. The screenshot is a separate output from the text tree — both are produced when `-a` is used.

**Ref numbering:** @e refs are assigned sequentially (@e1, @e2, ...) in tree order.
@c refs from `-C` are numbered separately (@c1, @c2, ...).

1 change: 1 addition & 0 deletions browse/src/activity.ts
@@ -31,6 +31,7 @@ export interface ActivityEntry {
result?: string;
tabs?: number;
mode?: string;
clientId?: string;
}

// ─── Buffer & Subscribers ───────────────────────────────────────