An agent monitoring hub for macOS. Connect your apps, schedule monitors, and let Claude watch them and take action — your inbox, your ads, your Instagram DMs, your revenue, all in one native, glassmorphic window.
ServicePilot v2 — rebuilt from scratch around agents instead of dev services.
- Describe a workflow — don't want to fill in a form? Type what you want in plain English ("every weekday at 9am, sweep Slack and Gmail and build my task list, and draft replies I can approve") and Claude assembles the whole monitor for you — which apps to read, the schedule, the instruction, notify-vs-auto, and any cross-app actions — then opens the editor pre-filled for you to review and create.
- Integrations — connect the apps you want watched (Gmail, Meta Ads, Instagram, Slack, Stripe, Sentry, Calendar, or any custom HTTP endpoint).
- Monitors — a monitor is a schedule + one or more apps + a plain-English instruction. Every tick, Sentinel pulls fresh context, hands it to Claude, and either notifies you or acts automatically (reply, pause, retry…). Pick several sources and it becomes a multi-source review — one run that sweeps all of them into a single briefing (e.g. a 9am "what needs me" pass across Slack + email that builds your task list).
- Cross-app actions — a monitor can be granted typed write capabilities on other connected apps. Watch your inbox → draft a reply in Gmail → ping yourself in Slack to review it, all from one monitor. Claude reasons freely but can only act through the capabilities you grant, and each write carries its own safety setting (e.g. Gmail draft-vs-send per monitor).
- Activity — a live timeline of everything your agents have observed and done.
- Tasks — a prioritized to-do list monitors write to, sorted by who has to act: Needs you, Claude can handle (often with a one-click "Let Claude do it"), and FYI.
- Claude execution engine — each run is a real Claude call; auto-act mode lets it take the action and log it.
- Tauri 2 + a real NSVisualEffectView (window vibrancy): the glass is the OS compositor, not a CSS fake.
- Hidden title bar, inset traffic lights, draggable chrome, SF typography, aurora wash, frosted panels.
src/ React 19 + Vite + Tailwind 4: the glass UI
  views/ Dashboard · Integrations · Monitors · Activity · Tasks · Memory · Settings
  lib/api.ts invoke() wrappers with a browser-mock fallback (runs in plain Vite too)
src-tauri/
  src/db.rs SQLite (rusqlite): integrations, monitors, runs, settings, memories, tasks
  src/scheduler.rs tokio loop that runs due monitors + periodic memory upkeep
  src/claude.rs Anthropic Messages API client (structured monitor verdicts)
  src/memory.rs tiered memory: retrieval, promotion, consolidation
  src/tasks.rs task list: lanes, parsing agent-written to-dos, dedup
  src/connectors/ connector catalog + credential schemas + live API calls
  src/commands.rs Tauri command surface
Native macOS app (real vibrancy):
npm install
npm run app # tauri dev
Browser preview of the UI (no native chrome, uses the mock backend):
npm run dev # http://localhost:1420
Each connector declares the credentials it needs (src-tauri/src/connectors/mod.rs) and makes a real, read-only API call to gather context (src-tauri/src/connectors/live.rs). Add an app under Integrations, paste credentials, and hit Test connection to validate them with a live call before saving.
| Connector | Auth | What it reads |
|---|---|---|
| Custom HTTP | URL + optional Authorization | GET/POST any endpoint, hands the response to the agent |
| Stripe | secret/restricted key (sk_/rk_) | balance, recent charges, failed payments, open disputes |
| Sentry | auth token + org slug | unresolved issues in the last 24h, ranked by frequency |
| CloudWatch | ambient AWS (local aws CLI) | recent log-group events matching an error filter pattern |
| Slack | bot token (channels optional) | recent messages + @here/@channel pings across every channel the bot is in — one connection, not one per channel |
| GitHub | PAT (+ optional repo) | review requests, notifications, open PRs |
| Meta Ads | Graph token + act_ id | 7-day spend, ROAS, CTR/CPC |
| Instagram | Graph token + IG user id | followers, recent posts, engagement |
| Gmail | Google OAuth token | messages matching a search filter |
| Calendar | Google OAuth token | events in the next 24h |
Token-based connectors (Stripe, Sentry, Slack, GitHub, Meta) work the moment you paste a key. Google connectors take an OAuth access token (e.g. from the OAuth 2.0 Playground); a full hosted OAuth flow slots into the same live.rs functions. CloudWatch stores no secret — it shells out to your local aws CLI and uses whatever credentials are active (profile / SSO session), so keep that signed in (aws sso login).
At the top of Monitors is a composer: tell Claude what you want in plain English and it builds the monitor for you. Sentinel hands Claude your connected integrations and the exact actions each one can be granted, and Claude returns a structured config — name, source app(s), schedule (it picks cron for "weekdays at 9am", an interval otherwise), the agent instruction, notify-vs-auto, and any cross-app grants. Every id and capability it returns is validated server-side against your real integrations before it reaches the UI: a source app or action that isn't actually connected and capable is dropped, never silently invented (commands::draft_monitor → normalize_draft). The draft opens the normal monitor editor pre-filled — with a one-line note from Claude on what it set up and any gaps (e.g. an app you mentioned that isn't connected, or that it chose draft/notify for safety) — so you review, tweak, and hit Create. Nothing is saved until you do; it always defaults to the safe choice (notify, draft) when your intent is ambiguous. Needs the local Claude CLI, same as a run.
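The server-side validation step can be pictured roughly as follows. This is a minimal sketch, not the actual normalize_draft: the Integration, DraftGrant, and MonitorDraft shapes and the normalize_draft_sketch name are assumptions made for illustration. It shows the one property the paragraph above promises: any source or grant that does not match a connected, capable integration is dropped and noted rather than passed through.

```rust
use std::collections::{HashMap, HashSet};

/// A connected integration and the write capabilities it can be granted.
/// Hypothetical shape, for illustration only.
pub struct Integration {
    pub id: String,
    pub capabilities: HashSet<String>,
}

/// A cross-app grant proposed by the model.
#[derive(Debug, Clone, PartialEq)]
pub struct DraftGrant {
    pub integration_id: String,
    pub capability: String,
}

/// The structured config returned by the model, before validation.
#[derive(Debug, Clone, PartialEq)]
pub struct MonitorDraft {
    pub name: String,
    pub source_ids: Vec<String>,
    pub grants: Vec<DraftGrant>,
    pub notes: Vec<String>,
}

/// Keep only sources and grants that match a real, capable integration.
/// Everything dropped is recorded in `notes` so the editor can surface the gap.
pub fn normalize_draft_sketch(mut draft: MonitorDraft, connected: &[Integration]) -> MonitorDraft {
    let by_id: HashMap<&str, &Integration> =
        connected.iter().map(|i| (i.id.as_str(), i)).collect();
    let mut notes = Vec::new();

    // A source must be an integration that is actually connected.
    draft.source_ids.retain(|id| {
        let ok = by_id.contains_key(id.as_str());
        if !ok {
            notes.push(format!("dropped unknown source '{}'", id));
        }
        ok
    });

    // A grant must name a connected integration that offers that capability.
    draft.grants.retain(|g| {
        let ok = by_id
            .get(g.integration_id.as_str())
            .map_or(false, |i| i.capabilities.contains(&g.capability));
        if !ok {
            notes.push(format!(
                "dropped grant '{}' on '{}'",
                g.capability, g.integration_id
            ));
        }
        ok
    });

    draft.notes.extend(notes);
    draft
}
```

Validating against the live integration list, instead of trusting the ids in the model's reply, is what keeps an invented app or capability from ever reaching the editor.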
Monitors run for real — no sample or demo data. Two things are required:
- Credentials per integration — added under Integrations; a card is badged Live once its required credentials are present. Without them a monitor's run errors and tells you to add them.
- The Claude CLI: Sentinel reasons over each monitor through your local claude CLI (claude -p), using your existing Claude Code sign-in, so no Anthropic API key is needed. Install it with npm i -g @anthropic-ai/claude-code and run claude once to sign in; Settings shows whether it's detected. Each tick pulls fresh context from the connector and hands it to Claude; if the CLI isn't found the run errors honestly rather than fabricating a verdict.
When a run is an alert, Sentinel fires a native macOS notification. In auto mode it also executes the agent's plan — but only the typed actions you granted the monitor, against the integrations you named. Anything the agent proposes outside that allow-list is dropped and logged, never run.
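The allow-list check described above could look something like this. It is a sketch under assumed types (Grant, PlannedAction, and filter_plan are illustrative names, not taken from the codebase): the plan is split into actions that match a grant on both integration and capability, and actions that must be dropped and logged.

```rust
/// A write capability the user granted to a monitor (illustrative shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Grant {
    pub integration_id: String,
    pub capability: String,
}

/// One step of the agent's proposed plan (illustrative shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PlannedAction {
    pub integration_id: String,
    pub capability: String,
    pub args: String,
}

/// Split a plan into (allowed, dropped).
/// An action is allowed only if some grant matches both its integration and
/// its capability; everything else is returned separately so it can be logged.
pub fn filter_plan(
    plan: Vec<PlannedAction>,
    grants: &[Grant],
) -> (Vec<PlannedAction>, Vec<PlannedAction>) {
    plan.into_iter().partition(|action| {
        grants.iter().any(|g| {
            g.integration_id == action.integration_id && g.capability == action.capability
        })
    })
}
```

Only the first vector would be executed; the second exists purely for the Activity log.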
A monitor's Actions this monitor can take section lets you grant write capabilities from any connected app:
| Connector | Capability | Safety |
|---|---|---|
| Gmail | gmail.reply: draft or send a reply to a watched thread | per-monitor draft (default) or send |
| Slack | slack.send_message: post a message (e.g. "go review 2 drafts") | channel per grant, or the integration's channel |
| Custom HTTP | custom.post_webhook: POST the briefing to your endpoint | none |
| GitHub | github.open_fix_pr: spawn a Claude coding agent that fixes the error and opens a PR | repo_path per grant; runs with full permissions in an isolated git worktree of that repo |
The classic recipe — watch inbox → draft replies → Slack me to review — is one Gmail monitor in auto mode with two grants: gmail.reply (mode draft) on your Gmail, and slack.send_message on Slack.
The headline recipe: a CloudWatch monitor in auto mode that watches a log group for errors and, when it finds one, opens a fix PR for you.
- Integrations → connect CloudWatch (log group, optional region/profile) and GitHub (PAT). Make sure aws sso login is current and gh auth status shows you signed in.
- Monitors → new monitor, watch the CloudWatch integration, schedule (e.g. every 5 min), mode Auto-act.
- Under Actions this monitor can take, grant GitHub → github.open_fix_pr and set the repo path to the local checkout of the service behind those logs (e.g. /Users/you/Desktop/Edge/repos/edge-server).
- Instruction, e.g.: "Watch these logs for application errors. When a real, code-level error appears, open a PR that fixes it; ignore transient/infra noise and don't open duplicate PRs for the same error."
Each tick the watcher (no tools) reads the error events and decides; in auto mode it hands a qualifying error to a second, tool-enabled claude agent that runs inside a throwaway git worktree of the granted repo — it investigates, branches, fixes, pushes, and runs gh pr create. The worktree keeps the fix off your actual checkout (your files, index, and current branch are never touched) and is removed once the run finishes; the work lives on the pushed branch / PR. That fixer runs with --dangerously-skip-permissions, scoped to the one repo path you set; it only ever runs through this explicit grant, in auto mode. A run can take minutes, and the resulting PR URL is logged to Activity.
Duplicate PRs are prevented deterministically, not by the model. Because the watcher re-detects the same recurring error every tick, Sentinel derives a stable branch name (sentinel/fix-<fingerprint>) from a normalized fingerprint of the error — so the same error always maps to the same branch. Before spawning the fixer it checks GitHub (gh pr list --head <branch> --state open); if an open PR already exists on that branch, it skips the whole run and logs the existing PR instead of opening a second one. Once that PR is merged or closed, a recurring error is free to file a fresh fix.
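To make the stable-branch idea concrete, here is one way such a fingerprint could be derived. The normalization rules (lowercase, collapse digit runs and whitespace) and the FNV-1a hash are assumptions chosen for this sketch; the source only says the fingerprint is normalized and stable, not how it is computed. The function names are hypothetical as well.

```rust
/// Collapse the parts of an error message that vary between occurrences:
/// case, runs of digits (ids, durations, line numbers), and extra whitespace.
/// These rules are an assumption for illustration.
pub fn normalize_error(message: &str) -> String {
    let mut out = String::with_capacity(message.len());
    let mut in_digits = false;
    for ch in message.to_lowercase().chars() {
        if ch.is_ascii_digit() {
            // Replace a whole run of digits with a single placeholder.
            if !in_digits {
                out.push('#');
                in_digits = true;
            }
        } else {
            in_digits = false;
            out.push(ch);
        }
    }
    // Collapse any whitespace runs into single spaces.
    out.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// 64-bit FNV-1a. Written out by hand because the output must be identical
/// across runs and builds, which std's default hasher does not promise.
pub fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for b in bytes {
        hash ^= u64::from(*b);
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// Map an error message to a deterministic branch name.
pub fn fix_branch_name(message: &str) -> String {
    let fingerprint = fnv1a_64(normalize_error(message).as_bytes());
    format!("sentinel/fix-{:016x}", fingerprint)
}
```

With a name like this in hand, the existing-PR check reduces to the gh pr list --head <branch> --state open call quoted above: two ticks that see "the same" error ask about the same branch.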
Gmail writes need the gmail.modify scope. The watch-only default is gmail.readonly, which cannot create drafts. To let a monitor draft/send, re-mint your Gmail refresh token with https://www.googleapis.com/auth/gmail.modify and reconnect. Sentinel surfaces a clear error if a write is attempted without the scope.
Monitors don't start from scratch on every run. Sentinel keeps a shared, three-tier memory store (src-tauri/src/memory.rs, persisted in the memories table) that monitors read from and write to, visible and editable under the Memory tab.
Tiers. Every memory is born short-term (a fresh observation). Each time a memory is pulled into a run's context it's referenced — its reference count rises. Cross a threshold and it's promoted: short → long (3 refs) → core (8 refs). Core memories are the distilled, always-loaded knowledge; the lower tiers are recalled only when relevant.
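The promotion rule can be sketched in a few lines. The thresholds (3 and 8) come from the text above; the Memory shape, the touch name, and the choice to promote on reaching the threshold (rather than exceeding it) are assumptions.

```rust
/// Memory tiers, ordered from least to most durable.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Tier {
    Short,
    Long,
    Core,
}

/// Reference counts at which a memory is promoted.
pub const LONG_AT: u32 = 3;
pub const CORE_AT: u32 = 8;

/// Illustrative memory record; the real table has more columns.
#[derive(Debug, Clone, PartialEq)]
pub struct Memory {
    pub key: String,
    pub tier: Tier,
    pub refs: u32,
}

/// The tier a given reference count qualifies for.
pub fn tier_for(refs: u32) -> Tier {
    if refs >= CORE_AT {
        Tier::Core
    } else if refs >= LONG_AT {
        Tier::Long
    } else {
        Tier::Short
    }
}

/// Record one reference (the memory was pulled into a run's context)
/// and promote if a threshold was reached. Never demotes.
pub fn touch(memory: &mut Memory) {
    memory.refs += 1;
    memory.tier = memory.tier.max(tier_for(memory.refs));
}
```

Taking the max of the current and computed tier means a memory placed in a higher tier by hand is not pulled back down by a low reference count.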
Read / write, per run. Before each run the scheduler retrieves the relevant memories — every in-scope core memory plus the most useful long/short ones for that monitor and integration — and injects them into the agent's prompt (retrieve_for_run → format_for_prompt). Retrieval is a reference, so the memories that keep proving useful climb the tiers (touch_memories). The agent can write new facts back by adding a memory array to its JSON reply; those land as short-term memories (parse_written → store_written), de-duplicated on a (scope, key) handle so refining a known fact doesn't pile up duplicates.
Scope. A memory is either shared (global — available to every monitor) or scoped to the monitor that wrote it. The agent picks per fact; you can also add or re-scope memories by hand.
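De-duplication on a (scope, key) handle might be modeled like this. It is a simplified in-memory sketch: the real store is the SQLite memories table, and Scope, MemoryStore, and upsert are hypothetical names. The point is that the same key in the same scope refines one record, while the same key in a different scope is a separate memory.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Where a memory is visible: to every monitor, or only to the one that wrote it.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Scope {
    Shared,
    Monitor(String),
}

/// In-memory stand-in for the memories table (assumption for illustration).
#[derive(Default)]
pub struct MemoryStore {
    entries: HashMap<(Scope, String), String>,
}

impl MemoryStore {
    /// Insert or refine a memory. Returns true if a new record was created,
    /// false if an existing (scope, key) record was updated in place.
    pub fn upsert(&mut self, scope: Scope, key: &str, content: &str) -> bool {
        match self.entries.entry((scope, key.to_string())) {
            Entry::Occupied(mut existing) => {
                *existing.get_mut() = content.to_string();
                false
            }
            Entry::Vacant(slot) => {
                slot.insert(content.to_string());
                true
            }
        }
    }

    /// Look up a memory by its handle.
    pub fn get(&self, scope: &Scope, key: &str) -> Option<&str> {
        self.entries
            .get(&(scope.clone(), key.to_string()))
            .map(|content| content.as_str())
    }

    /// Number of distinct memories held.
    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```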
Staying lean. A daily maintenance pass (run_maintenance, called from the scheduler loop, self-throttled) does two things: decay — stale, barely-referenced short-term memories are forgotten — and consolidation — any scope that has accumulated too many short-term memories is handed to Claude to merge duplicates, fold related observations into single durable facts, and drop the trivial. Promoted (long/core) and pinned memories are never touched by either. Pin a memory (or add one pre-pinned) to force it to core and protect it permanently. The Consolidate button runs the pass on demand.
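The decay half of that pass could be approximated as below. The age and reference cut-offs in DecayPolicy are invented for the sketch (the source does not state what counts as "stale" or "barely-referenced"), as are the type and function names. What the sketch does take from the text is the protection rule: pinned and promoted memories are never forgotten.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    Short,
    Long,
    Core,
}

/// Illustrative memory record with just the fields decay needs.
#[derive(Debug, Clone, PartialEq)]
pub struct Memory {
    pub key: String,
    pub tier: Tier,
    pub refs: u32,
    pub pinned: bool,
    pub age_days: u32,
}

/// Hypothetical cut-offs; the real values are not given in the source.
pub struct DecayPolicy {
    pub max_age_days: u32,
    pub min_refs: u32,
}

/// Forget stale, barely-referenced short-term memories.
/// Returns how many were removed.
pub fn decay(memories: &mut Vec<Memory>, policy: &DecayPolicy) -> usize {
    let before = memories.len();
    memories.retain(|m| {
        // Pinned and promoted (long/core) memories are exempt.
        let protected = m.pinned || m.tier != Tier::Short;
        let stale = m.age_days > policy.max_age_days && m.refs < policy.min_refs;
        protected || !stale
    });
    before - memories.len()
}
```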
Runs are ephemeral ("here's what's true right now"); the Tasks tab is the durable layer on top — a prioritized to-do list (src-tauri/src/tasks.rs, persisted in the tasks table) that monitors write to and you work through. Every task lands in a lane chosen by who has to act:
- Needs you (user): your judgment, credentials, or a real-world decision.
- Claude can handle (claude): something Sentinel can do for you. If the task carries a concrete proposed action, the card shows Let Claude do it.
- FYI (fyi): worth surfacing, no action required.
Each task also has a priority (high / normal / low) and a status (open → doing → done, or dismissed).
How monitors file them. Alongside memory, the agent adds a tasks array to its JSON reply (parse_written → store_written). Filing is directed, not optional: the protocol requires a matching task whenever a run surfaces something actionable — in particular, any alert status or recommended action must have a corresponding task — and only a clean, all-clear run leaves the array empty. A frequent monitor still doesn't flood the list: tasks are de-duplicated per monitor on title (upsert_task), so re-filing the same item refreshes it in place; a title that recurs after you've resolved it is treated as a genuine new task.
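The de-duplication behaviour can be illustrated with a small sketch. It is not the real upsert_task: the Task shape and the upsert_task_sketch name are assumptions, and so is the reading that "resolved" covers both done and dismissed. A task with the same monitor and title that is still open or in progress is refreshed in place; once it is resolved, the same title files a new task.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Open,
    Doing,
    Done,
    Dismissed,
}

/// Illustrative task record; lane and priority are omitted for brevity.
#[derive(Debug, Clone, PartialEq)]
pub struct Task {
    pub monitor_id: String,
    pub title: String,
    pub detail: String,
    pub status: Status,
}

/// File a task, de-duplicating per monitor on title.
/// Returns the index of the task that was refreshed or created.
pub fn upsert_task_sketch(
    tasks: &mut Vec<Task>,
    monitor_id: &str,
    title: &str,
    detail: &str,
) -> usize {
    // Only unresolved tasks count as duplicates.
    let live = tasks.iter().position(|t| {
        t.monitor_id == monitor_id
            && t.title == title
            && matches!(t.status, Status::Open | Status::Doing)
    });
    match live {
        Some(index) => {
            // Same item re-filed: refresh it in place.
            tasks[index].detail = detail.to_string();
            index
        }
        None => {
            tasks.push(Task {
                monitor_id: monitor_id.to_string(),
                title: title.to_string(),
                detail: detail.to_string(),
                status: Status::Open,
            });
            tasks.len() - 1
        }
    }
}
```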
Let Claude do it. A claude-lane task may include an action (capability + integration_id + args) — the exact write the monitor would make. Approving it runs resolve_task_with_claude, which executes through the same grant-checked path the scheduler uses for an auto plan: the action must still match a grant on the monitor that filed it, or it's refused. This is "propose → you approve → Claude executes" — the safe middle ground between notify-only and full autonomy. Success marks the task done with the outcome; failure leaves it open with the error so you can retry or take it over.
You can also add, edit, dismiss, reopen and delete tasks by hand. Open task counts surface on the Dashboard and as a badge in the sidebar.
A run is no longer a dead end. Open any finished run in Activity and you'll find a Chat with this run thread: it reopens the exact Claude session that did the run (claude -p --resume <session_id>), so the agent still has the run's full live context, fetched data and verdict — for free — and answers conversationally.
Ask it "why did you flag this charge?" and it explains from what it saw. Tell it "draft the reply, keep it short" and it proposes the write. Each turn resumes the latest session in the thread (persisted in run_messages); a run that predates session capture falls back to replaying its stored exchange, so chat works everywhere (claude::chat_turn → commands::send_run_message).
The agent can only propose actions — never fire them. Anything it wants to do appears as an approval card below the thread (and in the Approvals tab), showing the exact capability and arguments. Same model, end to end: a chat-proposed action is queued as a pending_action and, when you approve it, runs through the same grant-checked path the scheduler uses for an auto plan — re-authorized against the source monitor's current grants at approve time, so a conversation can never exceed the powers the monitor was granted. Tweak the draft before approving, or reject it outright.
The Approvals tab is the queue of every write your agents have proposed and not yet run — the trust surface that lets you keep agents on a short leash while still moving fast. Each card carries a one-line preview, the precise arguments, and the monitor it came from; approve, edit the arguments first, or reject. Nothing in the queue has executed. Approving runs the action immediately and records the outcome both on the card and as a note back in the originating run's chat. The pending count shows on the Dashboard and as a sidebar badge (pending_actions table; approve_action / reject_action / update_action_args).
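The approve-time re-authorization can be sketched as follows. The types and the approve_sketch name are hypothetical, the executor is passed in as a closure so the example stays self-contained, and leaving a refused or failed action in the pending state is an assumption. The key step is that grants are checked at the moment of approval, against the monitor's current grants, not at the moment the action was proposed.

```rust
/// A write capability currently granted to the source monitor (illustrative).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Grant {
    pub integration_id: String,
    pub capability: String,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ActionState {
    Pending,
    Done,
    Rejected,
}

/// A proposed write waiting in the approvals queue (illustrative).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PendingAction {
    pub integration_id: String,
    pub capability: String,
    pub args: String,
    pub state: ActionState,
}

/// Approve a queued action: re-check it against the monitor's current grants,
/// then run it. Returns the outcome text, or the reason it did not run.
pub fn approve_sketch<F>(
    action: &mut PendingAction,
    current_grants: &[Grant],
    execute: F,
) -> Result<String, String>
where
    F: FnOnce(&PendingAction) -> Result<String, String>,
{
    if action.state != ActionState::Pending {
        return Err("action is no longer pending".to_string());
    }
    // Re-authorize now: a grant revoked since the proposal blocks the write.
    let granted = current_grants.iter().any(|g| {
        g.integration_id == action.integration_id && g.capability == action.capability
    });
    if !granted {
        return Err(format!(
            "{} is not granted on {}",
            action.capability, action.integration_id
        ));
    }
    // On executor failure the action stays pending so it can be retried.
    let outcome = execute(&*action)?;
    action.state = ActionState::Done;
    Ok(outcome)
}
```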