Created by Michael Olmos · MIT licensed
Tsaagan gives your AI agent real hands on the web — so it can finish the browser work you'd otherwise stop and do yourself (log in, click through, fill forms, pull data from behind a login), on the tasks you're authorized to do. (Full mission below.)
Human-like browser control for AI agents. Tsaagan lets an AI drive a real browser the way a person does — perceive the page, click, type, navigate, handle logins, popups, iframes, downloads — and, crucially, verify that each action actually worked before moving on.
The mental model: your agent is the brain, Tsaagan is the hands. Whatever AI drives it (Claude, a Codex/ChatGPT CLI, Cursor, or any model) supplies intent and judgment; Tsaagan supplies reliable perception → action → verification → memory. The agent decides what; Tsaagan reliably does how, and proves it worked.
Tsaagan gives your AI agent real hands on the web. It lets an agent autonomously do the browser work that today still makes you stop and take over — logging in, clicking through, filling forms, pulling data from behind a login — and gives that time back to you.
One-liner: Tsaagan is the part of your agentic workflow that actually drives the browser — reliably, like a human, on the tasks you're authorized to do.
You can think with an agent, write with an agent, code with an agent — but the moment the work needs a browser ("log in here, click through there, grab this, submit that"), the agent stalls and you become the hands. Tsaagan closes that gap. The brain (Claude, Codex, a small open model, or a person) supplies intent and judgment; Tsaagan supplies reliable perception → action → verification → memory.
Three ideas make it click:
- It hovers, then strikes. Like the bird — patient perception, then one exact move. It reads the page's accessibility tree, picks the right element, acts, and then proves the action worked from real page state (URL changed? expected text? errors?) instead of clicking and hoping. That verify step is the difference between "automation that mostly works" and automation you can trust unattended.
- Lightest tool that does the job. If a site has an API, use the API. If not, drive the real browser like a human (trusted input). Only fall back to raw automation for the broad middle of the web.
- It remembers and improves. It keeps an inspectable, file-based memory of the sites it works — learned selectors, what went wrong last time — so it gets better at your recurring tasks.
- Turns "every week I log into X, click through, and copy Y" into "my agent does it and reports back what it verified."
- Works on real, logged-in sites — dashboards, portals, tools with no API — where
a scraper or
WebFetchcan't go. - Keeps you in control: it stops and hands back at CAPTCHAs, and flags consequential, irreversible actions (buy, send, delete). Empowerment, not autopilot-into-a-wall.
Tsaagan is not a tool for defeating security, solving CAPTCHAs, evading bans, or doing anything against a site's Terms of Service. It is for builders and creators who want their agents to finish the job on work they're already authorized to do — and it's honest about its limits rather than overselling. Only automate accounts and systems you own or are authorized to access. When it meets a CAPTCHA or an anti-abuse wall, it stops and hands control back to you. See the Acceptable Use Policy and the disclaimer — you are solely responsible for how you use it.
I kept hitting the same wall in my own agentic workflows: you can reason, write, and code with an agent — but the moment a task needs a browser (log in, click through, pull this, file that), the agent stalls and a human has to take over. That handoff is where the time goes.
Tsaagan exists to give your AI agents real hands — so they can autonomously do that web work and give the time back to you. It's for builders and creators who want their agents to actually finish the job, not just plan it.
And it's built on a clear ethic: empowerment, not abuse. Tsaagan deliberately does not defeat CAPTCHAs, asks for human confirmation on consequential actions, stores secrets only in your OS keychain, keeps everything local, and respects site Terms of Service. It exists to extend what you can build — not to create risk. That ethic is encoded in SOUL.md and enforced in AGENTS.md.
Use it on tasks you're authorized to do. Tsaagan is for automating your own accounts and authorized work — not for unauthorized access or circumventing security. Please read the Acceptable Use Policy and the disclaimer.
— Michael Olmos, creator
- API layer — if a site has an official API, Tsaagan calls it directly with a key kept in your Keychain (fastest and most reliable). The browser layers can even set the key up for you. → docs/API.md
- Real-browser layer —
extension(any OS) /native(macOS): drives your real Chrome with trusted input, no debug port — for sites that ignore synthetic input. → docs/EXTENSION.md - CDP layer — Playwright on the real Chrome binary for the broad middle of the web.
Plus an agent layer (identity + brain memory) that makes Tsaagan a self-improving agent, not just a tool. → docs/AGENT.md
Tsaagan doesn't just remember — it learns from each run and feeds the lesson back:
- Reflect: a successful run is stored as a reusable procedure; a failure becomes
a one-line lesson (via an LLM reflection); an anti-abuse wall flips the domain to
pace: human. Accumulated lessons are synthesized into durable rules. - Feed back: every
snapshotcarries amemoryblock (rules, lessons, pace, success track) — so the driving agent (Claude, Codex, or any LLM) adapts before it acts. - Research-grade recall: ranked by relevance (embeddings) × importance × recency,
not substring match. Zero-dependency by default; real neural embeddings if
TSG_EMBED_URLor@xenova/transformersis present. - Look-ahead + evaluator gate: it imagines the outcome before irreversible actions, and only stores a procedure when success is validator- and structurally-verified.
Memory lives on your machine (~/.tsaagan/brain.db) — not in a model, not in the cloud.
→ docs/AGENT.md
Most browser-agent stacks fail for the same two reasons: they ground actions on ephemeral element ids that break the moment the page changes, and they trust the model's guess that an action succeeded. Tsaagan fixes both:
- Stable, self-healing grounding. Elements are addressed by accessibility-tree refs; if a ref goes stale, Tsaagan re-snapshots and re-locates the same element by role + name automatically.
- Structural verification. After every action Tsaagan returns a
verifyblock built from real page state — URL delta, expected text, new console errors, failed network requests — so the agent knows whether it worked.
It's three things in one:
- A CLI + daemon you (or a coding agent) drive verb-by-verb.
- A library (
agent.js) with an autonomous planner→navigator→validator loop. - A standalone server you hand goals to over HTTP — no agent framework required.
| Tsaagan | |
|---|---|
| Perception | Accessibility tree (primary) + vision Set-of-Marks fallback for canvas/visual UIs |
| Grounding | Stable refs → self-heal on staleness. Never raw coordinates, never brittle XPath as primary |
| Verification | Structural post-conditions every action (URL / text / console / network) |
| Hard cases | iframes, shadow DOM, popups, cookie banners, file upload/download, multi-tab |
| Memory | Learns each site — durable selectors persisted per domain, replayed across sessions |
| Auth | Cloned/real Chrome profile, one-shot login, TOTP 2FA (RFC 6238) |
| Trusted input | isTrusted=true clicks/keystrokes via native (macOS) / extension modes — for sites that ignore synthetic input |
| Autonomy | Planner → navigator → validator loop; runs headless or as an HTTP goal server |
The browser-agent space is crowded — and Tsaagan is the new, small project here. It doesn't try to out-scale browser-use or out-distribute Microsoft's playwright-mcp. It wins on a different axis: reliability you can audit. The one capability every other tool lacks — structural proof that each action worked — is Tsaagan's core.
| Capability | Tsaagan | browser-use | stagehand | playwright-mcp | skyvern |
|---|---|---|---|---|---|
| Verify-first (post-condition proof per action) | ✅ core | ❌ | ❌ | ❌ | |
| API-first layer (skip the browser when a site has an API) | ✅ | ❌ | ❌ | ❌ | ❌ |
| Self-healing stable a11y refs | ✅ | ❌ | |||
| OS-keychain vault + TOTP/2FA | ✅ | ❌ | ❌ | ❌ | |
| Cross-session site memory (semantic recall) | ✅ | ❌ | ❌ | ❌ | |
| MCP server | ✅ | ❌ | ✅ | ✅ | ❌ |
| Trusted input, no CDP debug port (native + MV3) | ✅ | ❌ | ❌ | ❌ | ❌ |
| Vision fallback (Set-of-Marks) | ✅ | ✅ | ✅ | ❌ | ✅ |
| Local / offline capable | ✅ | ✅ | ❌ | ✅ | ❌ |
| Model-agnostic (bring your own LLM) | ✅ | ✅ | ✅ | ✅ | |
| Language | Node.js | Python | TypeScript | TypeScript | Python |
| License | MIT | MIT | MIT | Apache-2.0 | AGPL-3.0 |
| Published WebVoyager score | ✅ ~89% | ❌ | ✅ ~86% | ||
| GitHub stars (Jun 2026) | early | ~99k | ~23k | ~34k | ~22k |
| Cloud / hosted option | ❌ self-host | ❌ Browserbase | ❌ | ✅ |
✅ supported ·
Honest notes. Tsaagan is young — fewer stars and no hosted option, and its benchmark number is still being run (published scores across tools use different WebVoyager task subsets and aren't directly comparable). What it uniquely offers: proof-of-success on every action, an API-first fast path, a local credential vault, and trusted-input modes that need no debug port.
Tsaagan is one engine you can drive several ways:
- The tool — a CLI (
tsaagan.js) talking to a background daemon that holds a single persistent Playwright page. This is the engine; everything else wraps it. - As a skill — a short instruction file that teaches your AI how to drive the tool (the observe → act → verify loop). "Use the Tsaagan skill" → the AI reads it and drives Tsaagan step by step. (A ready-made skill for Claude Code is included.)
- As an agent — a dispatchable worker that bundles the skill + the tool, so you can hand it a goal and it runs the loop on its own.
- As an MCP server or SDK —
tsaagan mcpexposes the verify-first verbs to Claude Desktop / Claude Code / Cursor, and the JS + Python SDKs call the same daemon directly. (See Use it from Claude/Cursor (MCP) and Use it from your own code above.)
Who is the "brain"? Whatever drives it:
| You drive it with… | Brain | API key needed |
|---|---|---|
| Claude Code (the skill/agent) | your Claude session | none — your Claude plan |
| Codex / ChatGPT CLI, Cursor, etc. | that tool's model | that tool's |
tsaagan run / tsaagan serve (autonomous) |
any OpenAI-compatible LLM — Groq · OpenRouter · OpenAI · Google Gemini · Anthropic · xAI Grok · local | a key for your chosen provider |
Tsaagan itself makes no LLM calls when an agent drives it — it's pure
perception + action + verification. Only the autonomous tiers call an LLM, and that
LLM is your choice: Groq (default, fast), OpenRouter for open-source models
(Llama, Qwen, DeepSeek, Mistral), the enterprise vendors — OpenAI, Google Gemini,
Anthropic, xAI Grok (each via its OpenAI-compatible endpoint) — or any
OpenAI-compatible endpoint (Together, vLLM, Ollama for fully local) via
TSG_LLM_BASE_URL. Pick one with TSG_LLM_PROVIDER=groq|openrouter|openai|google|anthropic|xai.
See docs/AGENT.md.
Tsaagan is just a terminal program, so any agent that can run shell commands can drive it. Tell your agent something like:
"Use Tsaagan (
node tsaagan.js) to log into example.com and download my invoices. Snapshot, act on refs, and check theverifyblock after each step."
It will run tsaagan start, tsaagan snapshot, tsaagan click ref=…, read the
JSON results, and proceed. For Claude Code, the bundled skill/agent make this
automatic.
git clone https://github.com/michaelolmos/tsaagan.git && cd tsaagan
npm install
npx playwright install chromium chromium-headless-shellnode tsaagan.js start # launch browser (fresh isolated profile)
node tsaagan.js goto url=https://example.com expectText="Example"
node tsaagan.js snapshot # accessibility tree with [ref=eN]
node tsaagan.js click ref=e6 expectText="IANA"
node tsaagan.js type ref=e3 text="hello" submit=true
node tsaagan.js stopAdd an alias: alias tsaagan="node $PWD/tsaagan.js".
Tsaagan ships a built-in MCP server, so any Model Context Protocol
host can drive the browser directly. Unlike other browser MCP servers, every mutating
tool returns Tsaagan's verify block (URL changed? console errors? expected text?) —
proof the action worked, in the same response, with no extra snapshot round-trip.
Claude Code:
claude mcp add tsaagan -- node /path/to/tsaagan/tsaagan.js mcp
# or, if you've installed the `tsaagan` bin globally: claude mcp add tsaagan -- tsaagan mcpClaude Desktop (claude_desktop_config.json) or Cursor (~/.cursor/mcp.json):
{
"mcpServers": {
"tsaagan": { "command": "node", "args": ["/path/to/tsaagan/tsaagan.js", "mcp"] }
}
}The server auto-starts a headless daemon on first use. It exposes 21 verify-first
tools — tsaagan_navigate, tsaagan_snapshot, tsaagan_click, tsaagan_fill_form,
tsaagan_extract, tsaagan_network (discover a site's own data API), and more.
Set TSAAGAN_HEADLESS=0 to watch it work, or TSG_TOKEN to lock the control plane on
shared machines. See docs/PROTOCOL.md.
Prefer calling Tsaagan programmatically? Both clients return a verify block on
every call — data and proof, together (the thing other tools make you assert yourself).
JavaScript / TypeScript (sdk/, zero dependencies):
import { createTsaagan } from 'tsaagan/sdk'; // or '/path/to/tsaagan/sdk/index.mjs'
const k = await createTsaagan(); // auto-starts a headless daemon
await k.goto('https://example.com', { expectText: 'Example Domain' });
const r = await k.extract('the page heading');
console.log(r.data, r.verify); // data + { urlChanged, expectTextFound, newConsoleErrors, ... }
await k.stop();Python (pip install tsaagan, zero dependencies):
from tsaagan import Tsaagan
k = Tsaagan() # auto-starts a headless daemon
k.goto("https://example.com", expect_text="Example Domain")
r = k.extract("the page heading")
print(r.data, r.verify)
k.stop()The Python client needs the Node daemon reachable — set TSAAGAN_JS=/path/to/tsaagan.js
or put the tsaagan binary on PATH. See clients/python/.
export GROQ_API_KEY=... # the autonomous brain (fast default)
# …or any other provider — name it and supply that provider's key:
export OPENROUTER_API_KEY=... TSG_LLM_PROVIDER=openrouter # open-source models
export OPENAI_API_KEY=... TSG_LLM_PROVIDER=openai # GPT-4o / o-series
export GEMINI_API_KEY=... TSG_LLM_PROVIDER=google # Gemini
export ANTHROPIC_API_KEY=... TSG_LLM_PROVIDER=anthropic # Claude
export XAI_API_KEY=... TSG_LLM_PROVIDER=xai # Grok
export TSG_LLM_BASE_URL=http://localhost:11434/v1/chat/completions # Ollama, fully local
node tsaagan.js run goal="go to news.ycombinator.com and report the #1 story title"
node tsaagan.js journal # what it has done before (persistent memory)Per-role model overrides work with any provider: TSG_PLANNER_MODEL,
TSG_NAV_MODEL, TSG_VALIDATOR_MODEL (e.g. point the planner at a stronger model
for reasoning while a fast model handles navigation).
node tsaagan.js serve port=39820
curl -s localhost:39820/goal -H 'content-type: application/json' -d '{"goal":"...","max":16}'node tsaagan.js bench # autonomous task suite + deterministic capability matrixLatest run (Groq brain, 4-task suite):
| Metric | Result |
|---|---|
| Task success | 3/4 (75%) |
| Self-heal (recover from stale refs) | ✅ pass |
| Vision Set-of-Marks | ✅ pass |
| Structural verify (post-conditions) | ✅ pass |
A directly-comparable WebVoyager score is in progress — see bench/METHODOLOGY.md for exactly how it will be run and reported honestly (homepage-start, no post-hoc judge overrides, published task set).
| Command | Description |
|---|---|
start [mode=fresh|clone|live] [headless=true] [port=] |
launch / attach a browser |
status · stop |
state · shut down |
snapshot [full=true] · snapshot mode=vision |
a11y tree with refs · numbered visual overlay |
goto url=.. [expectText=..] |
navigate + verify |
click ref=eN|som=N|selector=..|text=.. [key=..] [expectText=..] [expectGone=..] [expectUrl=..] |
click + verify + self-heal |
type ref=eN text=".." [key=..] [submit=true] [expectText=..] |
clear + type |
select · press keys=".." [expectText=..] · scroll [direction=|to_text=] · wait_for [..] |
interaction |
upload_file ref=|selector= path=a,b · handle_dialog accept=true · dismiss_overlays · downloads |
hard cases |
extract [query=..] · screenshot [path=] [fullPage=true] |
read page |
tabs · switch_tab index=N · new_tab url=.. · close_tab · back · forward |
tabs / history |
remember key=.. selector=.. [note=..] · recall · forget |
learn a site (per-domain memory) |
login userSelector=.. passSelector=.. [submitSelector=..] [totpSelector=.. totpSecret=..] |
one-shot login + 2FA |
totp secret=$TSG_TOTP_SECRET · detect_captcha |
2FA code · detect a CAPTCHA/anti-abuse wall → hand off to a human |
paste text=".." [ref=|selector=] [submit=true] · type_human · key_human keys="return" |
OS-level input (macOS) — trusted (isTrusted=true) keystrokes/paste for sites that ignore synthetic input |
pace set=human|slow|fast · click_xy x= y= · keychain service= account= |
cadence governor · coordinate click · macOS Keychain secret |
record_start name=.. · record_stop [path=..] · replay path=.. · report [format=json|md] |
record/replay verified sequences · export audit reports |
doctor · protocol |
local diagnostics · JSON Schema / TypeScript action protocol paths |
run goal=".." [max=16] · serve [port=] · journal [n=] · bench |
autonomy |
network [filter=] [limit=] |
captured XHR/fetch + responses (discover a site's internal API) |
cookies [set=<json>] |
export / import cookies (session reuse) |
pdf [path=] · assert [text=] [url=] [selectorVisible=] |
save page PDF · explicit verification check |
eval js=".." · console_log [limit=] |
escape hatch · debugging |
Start flags also include proxy=http://user:pass@host:port (e.g. a corporate proxy), channel=chrome\|chromium, pace=fast\|slow\|human (politeness), timezone=.
| Command | Description |
|---|---|
vault set service=.. secret=.. [account=..] · vault get/list/delete |
Keychain-encrypted key store (secrets never hit disk) |
api service=.. path=/.. [method=] [body=] · api detect=<url> · api providers |
call an official API with a stored key (Layer 1) |
ext-setup [profile=fresh|clone] [browser=chrome] |
zero-step extension mode: launches Chrome for Testing with the companion extension auto-loaded + starts mode=extension. (browser=chrome targets your branded Chrome instead — needs a one-time manual load, since Chrome 137+ blocks --load-extension.) |
brain · brain recall query=.. · brain advise domain=.. · brain synthesize domain=.. |
evolving SQLite memory: stats · semantic recall · what-it-knows · compress lessons→rules |
advise [domain=..] [query=..] |
what Tsaagan learned about a site (also auto-surfaced in every snapshot.memory) |
journal [n=] |
recent autonomous-run history |
tsaagan record_start name=weekly-report
tsaagan goto url=https://example.com expectText="Example"
tsaagan click ref=e6 expectText="IANA"
tsaagan record_stop # writes ~/.tsaagan/records/*.json
tsaagan replay path=/Users/me/.tsaagan/records/...
tsaagan report format=md # writes ~/.tsaagan/reports/*.mdRecording captures successful actions with structural verify evidence and stores
stable role/name targets when a step was originally addressed by an a11y ref, so
replays are less tied to one transient browser snapshot. Reports summarize actions,
failures, verification blocks, failed requests, and human-handoff moments for audit
or debugging.
tsaagan doctor # local install/config health check
tsaagan protocol # paths to protocol/actions.schema.json + protocol/tsaagan.d.tsdoctor does not start a browser; it checks local prerequisites, extension files,
vault backend availability, daemon reachability, and shared-host token posture.
ref (from a fresh snapshot) → selector (stable CSS) → text → som (vision). Never coordinates.
- fresh (default) — isolated real Chrome (
channel: chrome) with its own profile. CDP. - clone / live — connect over CDP to a Chrome started with
--remote-debugging-port=9222. - native (macOS) — real Chrome, no CDP (AppleScript + OS-level input). For sites that ignore synthetic input. See "Reliability" above.
- extension (any OS) — Chrome via a companion extension (
chrome.debuggerInput → isTrusted=true, viewport coords, no debug port, no coordinate math). One command (ext-setup), zero manual steps; E2E-verified — a page-side listener readsevent.isTrusted === true. The most reliable mode on sites that ignore synthetic input. See docs/EXTENSION.md.
Brain (autonomous mode, pick one): a provider key — GROQ_API_KEY ·
OPENROUTER_API_KEY · OPENAI_API_KEY · GEMINI_API_KEY/GOOGLE_API_KEY ·
ANTHROPIC_API_KEY · XAI_API_KEY/GROK_API_KEY — or TSG_LLM_BASE_URL
(+ TSG_LLM_API_KEY) for any OpenAI-compatible endpoint · TSG_LLM_PROVIDER=groq|openrouter|openai|google|anthropic|xai
to name one · model overrides TSG_PLANNER_MODEL / TSG_NAV_MODEL / TSG_VALIDATOR_MODEL
(legacy GROQ_*_MODEL still honored).
Other: TSG_USER / TSG_PASS / TSG_TOTP_SECRET (login) · TSG_PORT (daemon port) ·
TSG_EMBED_URL (optional embedding endpoint for semantic recall). State lives in ~/.tsaagan/.
tsaagan.js ── thin CLI ──▶ daemon.js (one persistent Playwright page; HTTP control plane)
│ perceive · ground · act · VERIFY · self-heal
agent.js ── planner→navigator→validator loop (LLM-agnostic via lib/llm.js) + task memory
run.js ── one-shot autonomous run server.js ── standalone HTTP goal server
bench.js ── capability benchmark
The daemon holds a single long-lived page so accessibility-tree refs stay valid across discrete CLI calls, and verification/self-heal run server-side with full page state.
Tsaagan has three ways to drive the browser. The reason for more than one is purely
correctness: a DOM event has an isTrusted flag that's true only for genuine
user input, and some well-built sites ignore synthetic clicks/keystrokes for their own
integrity — so your authorized automation silently does nothing. Native and extension
modes deliver isTrusted=true input so your agent's actions actually take effect.
CDP mode (default) — Playwright driving Chrome. Fast, full-featured, parallel,
works on the vast majority of the web (dashboards, SaaS, internal tools). Uses the
real Chrome binary. Input here is isTrusted=false, so the rare strict site ignores
it — that's what the other two modes are for.
Native mode (macOS) — tsaagan start mode=native. Drives your real Chrome with
no debug port: perception via AppleScript, input via real OS-level mouse/keyboard
(cliclick) → isTrusted=true. Trade-offs: macOS-only · slower · single front window
· coordinate/vision grounding. Needs brew install cliclick + Chrome ▸ View ▸
Developer ▸ "Allow JavaScript from Apple Events".
Extension mode (any OS) — tsaagan ext-setup. A companion extension delivers
isTrusted=true clicks at viewport coordinates via chrome.debugger — the cleanest
trusted-input path. See docs/EXTENSION.md.
brew install cliclick # one-time
tsaagan start mode=native # drives your real, frontmost Chrome
tsaagan snapshot # AppleScript perception (elements + screen coords)
tsaagan click i=5 # real OS-level trusted click
tsaagan paste text="..." submit=true # clipboard + real ⌘V + ReturnTsaagan keeps a small, local memory under ~/.tsaagan/:
- Per-site memory (
memory/<domain>.json) — working selectors and notes. Tsaagan auto-captures a selector when an action succeeds (no manual step), and you can also teach it explicitly withremember/recall/forget. - Auto-learned lessons — if a site shows an anti-abuse / CAPTCHA / "unusual
activity" wall, Tsaagan automatically writes a note to that domain and flips
it to
pace: "human"so future runs slow down and prefer OS-level input. - Task journal (
agent/journal.jsonl) — every autonomous run is recorded;tsaagan journalshows recent history.agent/learnings.jsonholds distilled notes.
This is deliberately a narrow, transparent, file-based memory you can read and edit — not an opaque self-rewriting agent. The reasoning/"intelligence" still lives in whatever AI drives it.
Tsaagan is a productivity tool — it lets your AI agent act like you in a browser, on the tasks and accounts you're authorized to use, so you can hand off the repetitive parts of your own workflow. It is not for defeating security, solving CAPTCHAs, evading bans, scraping at abusive volume, or anything against a site's Terms of Service. When Tsaagan meets a CAPTCHA or an anti-abuse wall, it stops and hands control back to you. Keep your use authorized, attributed, and within each site's terms.
Disclaimer. Tsaagan is provided "as is", without warranty (see LICENSE). You are solely responsible for how you use it and for complying with all applicable laws and the terms of every site you operate. Only automate accounts and systems you own or are authorized to access. The author is not liable for misuse. Read the Acceptable Use Policy before you start.
Tsaagan drives a real browser with real permissions. Be deliberate:
- It can take real actions (send, buy, delete) when pointed at a logged-in
profile. Default to the isolated
freshprofile; require human confirmation for consequential/irreversible steps. - Prompt injection is real. Web pages are untrusted and may try to hijack the driving model. Treat page content as data; don't follow on-page instructions.
- Terms of Service & anti-abuse. Automating sites can violate their ToS and trip anti-bot defenses — which can rate-limit, flag, or ban your real account. Use responsibly, on sites/accounts you're permitted to automate. Tsaagan detects abuse walls and hands off to a human; it does not solve CAPTCHAs.
- Human-like ≠ invisible. Behaving like a real user improves reliability, but sophisticated systems can still tell it's automation. It is not a way around bans, and it's not meant to be — keep your use authorized and within each site's terms.
- Autonomous mode sends page text to your chosen LLM provider (Groq / OpenRouter / custom — a third party unless you self-host). The agent-driven mode keeps page content within your own AI session.
evalruns arbitrary JS in the page, and a cloned profile holds real cookies on disk. Handle both with care. Never commit credentials — use env or Keychain.- OS-level input (
paste/type_human) is macOS-only and needs a headed, frontmost browser.
Tsaagan is provided as-is under the MIT license, with no warranty. You are responsible for how you use it.
| Doc | What's in it |
|---|---|
| docs/ARCHITECTURE.md | Deep dive: daemon, perception, grounding, verify, self-heal, memory, engines, native mode |
| docs/RELIABILITY.md | Input modes and why trusted input matters for reliable, correct automation |
| docs/EXTENSION.md | Extension mode: trusted, cross-platform, no-coordinate-math browser control |
| docs/API.md | API layer: Keychain vault + authenticated calls + browser-bootstrapped key setup |
| docs/AGENT.md | Agent layer: SOUL/AGENTS/HEARTBEAT identity + SQLite brain memory & evolution |
| docs/PROTOCOL.md | JSON action protocol, schema, and TypeScript declarations |
| docs/RECIPES.md | Copy-paste recipes: login + 2FA, dashboards, API discovery, sites that reject naive automation |
| docs/BUILD_JOURNAL.md | How Tsaagan was built — the research, decisions, and every feature wave |
| ROADMAP.md | What's next |
| CHANGELOG.md · CONTRIBUTING.md · SECURITY.md | Versions · how to contribute · security policy |
Created by Michael Olmos. Contributions welcome — see CONTRIBUTING.md.
MIT — see LICENSE.

