v0.8.0 feat: goal mode — verifiable goals with an independent verification gate by suncommit · Pull Request #28 · getcrew44/crew44

suncommit · 2026-06-11T08:43:00Z

Summary

Goal mode (daemon). A per-chat mode for long-running, verifiable tasks (bd8ddb2a, 532a61c2): the lead agent scopes the goal with a structured clarify round, locks criteria into a verification gate, and the daemon owns the phase machine — scoping → running → awaiting_signoff → done. Verification is never the crew's to run: a READY declaration triggers an isolated turn by a dedicated anonymous verifier (fresh session, no history, no handover powers, browser MCP disabled, per-chat env dir), and only that turn's VERIFY verdict moves the gate. A held gate auto-continues the crew with the failed criteria attached, capped at 5 attempts per run.
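The phase machine described above can be sketched as a small transition table. This is only an illustration: the four phase names come from this PR, but the `Trigger` type, its values, and the exact transition set (including send-back returning to `running`) are assumptions, not the daemon's actual code.

```go
package main

import "fmt"

// Phase names are taken from the PR description.
type Phase string

const (
	PhaseScoping         Phase = "scoping"
	PhaseRunning         Phase = "running"
	PhaseAwaitingSignoff Phase = "awaiting_signoff"
	PhaseDone            Phase = "done"
)

// Trigger is a hypothetical name for the events that move the machine.
type Trigger string

const (
	TriggerLock       Trigger = "lock"        // lead locks the criteria
	TriggerVerifyPass Trigger = "verify_pass" // verifier passes every criterion
	TriggerAccept     Trigger = "accept"      // user sign-off: accept
	TriggerSendBack   Trigger = "send_back"   // user sign-off: send back (assumed to return to running)
)

// transitions lists the allowed moves; anything absent is rejected.
var transitions = map[Phase]map[Trigger]Phase{
	PhaseScoping:         {TriggerLock: PhaseRunning},
	PhaseRunning:         {TriggerVerifyPass: PhaseAwaitingSignoff},
	PhaseAwaitingSignoff: {TriggerAccept: PhaseDone, TriggerSendBack: PhaseRunning},
}

// Next returns the phase reached by applying t in phase p, or an error
// (and the unchanged phase) when the move is not in the table.
func Next(p Phase, t Trigger) (Phase, error) {
	if next, ok := transitions[p][t]; ok {
		return next, nil
	}
	return p, fmt.Errorf("trigger %q not allowed in phase %q", t, p)
}

func main() {
	p := PhaseScoping
	for _, t := range []Trigger{TriggerLock, TriggerVerifyPass, TriggerSendBack, TriggerVerifyPass, TriggerAccept} {
		next, err := Next(p, t)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Printf("%s --%s--> %s\n", p, t, next)
		p = next
	}
}
```

Keeping the table as data means a crew-emitted marker can never move the phase directly; only the owner of `Next` can, which matches the "daemon owns the phase machine" framing.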

Goal mode (UI). (133f5369) Goal toggle in the new-task composer, interactive clarify cards, a pinned editable criteria checklist, verification result cards, and sign-off (accept closes the task; send-back resets the gate with notes).

Gate hardening (8e644115, 960779cd — from this ship's specialist/adversarial/red-team review): fence-aware marker extraction (quoting a marker can never trigger it; prompt examples are themselves fenced), nested-block inerting, CRLF tolerance, payload size caps + newline collapsing, fail-closed duplicate verdicts, ownership-first marker validation, split correction budgets, READY re-arms the gate in awaiting_signoff, changed verify methods reset verified status, goal-state writes serialized under the app lock (including all run-goroutine chat saves), clarify_seq round binding with revert-on-failed-turn-start, UI error surfacing, optimistic serialized checklist edits, superseded-round answer guard.
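To illustrate the fence-aware extraction and CRLF tolerance listed above, here is a minimal sketch that finds marker names at the start of a line while ignoring anything inside a Markdown code fence. The marker names come from this PR; the function name `LiveMarkers` and the simplified detection (name prefix only, no JSON body parsing, no nested-block handling) are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// Marker names as listed in the PR; the block framing around them is not modeled here.
var markerNames = []string{"CREW44_GOAL_CLARIFY", "CREW44_GOAL_LOCK", "CREW44_GOAL_VERIFY"}

// fence is the Markdown code-fence delimiter (three backticks).
var fence = strings.Repeat("`", 3)

// LiveMarkers returns the marker names that start a line outside any fenced
// code block. Markers inside a fence are treated as quotes, not commands.
func LiveMarkers(text string) []string {
	var found []string
	inFence := false
	for _, line := range strings.Split(text, "\n") {
		line = strings.TrimSuffix(line, "\r") // tolerate CRLF line endings
		if strings.HasPrefix(strings.TrimSpace(line), fence) {
			inFence = !inFence // opening or closing fence line
			continue
		}
		if inFence {
			continue
		}
		for _, name := range markerNames {
			if strings.HasPrefix(line, name) { // line-anchored: must start the line
				found = append(found, name)
				break
			}
		}
	}
	return found
}

func main() {
	msg := "Example of the protocol:\r\n" +
		fence + "\r\nCREW44_GOAL_VERIFY {}\r\n" + fence + "\r\n" +
		"CREW44_GOAL_LOCK {}\r\n"
	fmt.Println(LiveMarkers(msg))
}
```

In this sketch a marker that is indented or preceded by other text on the same line is also ignored, since only a line-start match counts.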

New-task UX. Always lead with the Partner agent, lead picker removed (1403ad01); send controls stay pinned right when the toolbar wraps (0111046a).

Infrastructure. gofmt normalization pass over daemon internals (7899f1a9); README Core concepts row for Goal (5a9b4360).

Test plan

Go suite passes (16/16 packages, 446 tests, exit 0, -race on touched packages)
Vitest passes (341 tests, 21 files)
Mobile suites pass (26 + 19 tests)
gofmt clean

Import ordering, comment alignment, and trailing-newline cleanup picked up by a tree-wide gofmt -w; no behavior change. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Goal mode (docs/goal-0610.md): per-chat long-running tasks with a verification gate. This lays the data layer:

- ChatRecord.Goal *GoalState (nil = zero behavior change), with phases scoping -> running -> awaiting_signoff -> done, criteria, clarify questions/answers, and the attempt counter/cap.
- Five timeline event types: goal_clarify, goal_lock, goal_verify, goal_done, goal_signoff, with payload structs on Event.
- Marker protocol: line-anchored multiline JSON blocks (CREW44_GOAL_CLARIFY / _LOCK / _VERIFY) emitted by the lead agent. ExtractGoalMarkers strips blocks (malformed bodies included, returned with Err) and validates payloads; lock criteria get normalized IDs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
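As a rough picture of the "nil = zero behavior change" point, the sketch below shows an optional pointer field that disappears from serialized records when goal mode is off. `ChatRecord`, `Goal`, and `GoalState` are named in the commit message; every field name, JSON tag, and the `encode` helper are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Criterion is a hypothetical shape for one locked criterion.
type Criterion struct {
	ID     string `json:"id"`
	Text   string `json:"text"`
	Status string `json:"status"` // assumed values such as "pending" or "passed"
}

// GoalState field names and JSON tags are assumptions, not the real schema.
type GoalState struct {
	Phase      string      `json:"phase"`
	Criteria   []Criterion `json:"criteria,omitempty"`
	Attempt    int         `json:"attempt"`
	AttemptCap int         `json:"attempt_cap"`
}

// ChatRecord carries an optional goal; a nil pointer means goal mode is off.
type ChatRecord struct {
	ID   string     `json:"id"`
	Goal *GoalState `json:"goal,omitempty"`
}

// encode serializes a record, returning the error text on failure.
func encode(c ChatRecord) string {
	b, err := json.Marshal(c)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// A chat without goal mode serializes with no goal key at all.
	fmt.Println(encode(ChatRecord{ID: "c1"}))
	// A goal chat seeded in scoping with the attempt cap of 5.
	fmt.Println(encode(ChatRecord{ID: "c2", Goal: &GoalState{Phase: "scoping", AttemptCap: 5}}))
}
```

With a pointer plus `omitempty`, records written before this change and records for non-goal chats have the same serialized form, which is one way to get the zero-behavior-change property.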

The daemon owns the goal lifecycle. The lead agent scopes the goal with a clarify round, locks criteria, and the crew iterates until a verify run passes every criterion:

- chats.create gains goal_mode via new CreateChatWithOptions (legacy CreateChat stays as a wrapper); seeds scoping state with attempt cap 5.
- runChat parses goal markers next to handover extraction; clarify/lock/verify update GoalState, append events, and publish chat.updated.
- Auto-continue: a failed gate immediately queues another lead turn naming the failed criteria (handover chains win; pending steers and cancel suppress; cap 5 consecutive turns per run, then idle with a goal_attempt_cap error event). A lock kicks off the first work turn so the crew never stalls after locking. Malformed markers get one corrective turn.
- New RPCs: chats.goal.answer (structured clarify answers -> internal lock turn), chats.goal.criteria.update (whole-list replacement, edits reset to pending and re-arm the gate), chats.goal.signoff (accept closes the chat; send_back resets criteria and starts a rework turn).
- System prompt: a per-phase Goal Mode section with clarify/lock grammar for the lead in scoping, live criteria + verify protocol in running, and read-only goal context for delegated agents.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
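The auto-continue rule in this commit can be read as a small decision function. The sketch below is one possible reading: the type and constant names are hypothetical, and the relative order of the cancel, handover, and steer checks is an assumption, since the message only says that handover chains win and that pending steers and cancel suppress.

```go
package main

import "fmt"

// GateContext is a hypothetical snapshot taken after a verify run.
type GateContext struct {
	FailedCriteria []string // criteria the verifier did not pass
	HandoverQueued bool     // a handover chain is pending
	PendingSteer   bool     // the user queued a steer message
	Cancelled      bool     // the run was cancelled
	Attempt        int      // consecutive auto-continue turns already used this run
	AttemptCap     int      // 5 in this PR
}

// Action names are illustrative, not the daemon's identifiers.
type Action string

const (
	ActionNone     Action = "none"                  // gate passed, nothing to continue
	ActionSuppress Action = "suppress"              // cancel or a pending steer takes over
	ActionHandover Action = "follow_handover"       // handover chains win
	ActionIdleCap  Action = "idle_goal_attempt_cap" // cap reached: idle with an error event
	ActionContinue Action = "continue_lead"         // queue another lead turn
)

// Decide picks what happens after a gate result. The check order is assumed.
func Decide(c GateContext) Action {
	switch {
	case len(c.FailedCriteria) == 0:
		return ActionNone
	case c.Cancelled:
		return ActionSuppress
	case c.HandoverQueued:
		return ActionHandover
	case c.PendingSteer:
		return ActionSuppress
	case c.Attempt >= c.AttemptCap:
		return ActionIdleCap
	default:
		return ActionContinue
	}
}

func main() {
	base := GateContext{FailedCriteria: []string{"c2"}, AttemptCap: 5}
	for attempt := 3; attempt <= 5; attempt++ {
		base.Attempt = attempt
		fmt.Printf("attempt %d -> %s\n", attempt, Decide(base))
	}
}
```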

Ports the goal mock (mocks/CrewAI v3/goal.jsx) onto the real event pipeline:

- src/GoalMode.jsx: pinned GoalCard criteria checklist (collapsed by default, animated expand/collapse, inline edit/add/remove committing whole-list replacements), interactive GoalClarifyEvent (chip + text answers, collapses once answered via chat.goal), GoalLockDivider, GoalVerifyEvent gate card, GoalDoneEvent banner with accept / send-back-with-notes, GoalSignoffDivider, GoalModeChip/Detail for the composer, and the GoalHeaderPill.
- TaskView: EventRouter cases for the five goal event kinds, pinned GoalCard between header and timeline, header pill, and the answer/criteria/signoff handlers.
- utils.mapBackendEvent maps the new event types; api.js gains goal_mode on createChat plus answerGoal / updateGoalCriteria / signoffGoal.
- New Task composer: Goal mode toggle with detail strip, goal-aware placeholder and 'Set goal' submit label.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The composer toolbar was one wrapping flex row, so on narrow windows the Start button dropped to a stray second line at far left. Split it into a left chip group that wraps internally and a right action group (send-shortcut menu + Start) that never wraps. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The lead is no longer user-selectable: tasks always start with the default-crew Partner agent (preset_id 'default-crew', preset_key 'partner'), falling back to the first agent for setups without the default crew. Drops the Lead picker chip, the lead draft persistence, and the related state. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…ized state writes, restricted verifier

Marker protocol: goal blocks inside Markdown code fences or nested in another block's body are quotes, not commands; CRLF-terminated blocks parse; payload fields are size-capped and newline-collapsed before they reach prompts; duplicate verify results merge fail-closed; duplicate clarify ids are reassigned. Prompt examples are fenced (inert if echoed) with an explicit never-fence-your-marker rule.

Phase machine: ownership is checked before malformedness so non-lead markers can't burn the lead's correction budget; lead and verifier correction budgets are split; lock kickoff clears a same-message READY; the verifier always starts with clean malformed state; READY in awaiting_signoff re-arms the gate (criteria reset, verifier re-runs) as the signoff prompt promises; a changed verify method resets a criterion's verified status; gate-loop store failures surface as goal_state_unavailable instead of stopping silently.

Concurrency: AnswerGoal/UpdateGoalCriteria/SignoffGoal and every run-goroutine chat save now go through the app lock (mutateChat applies only run-owned fields), closing the lost-update race on mid-stream criteria edits. chats.goal.answer requires clarify_seq so answers bind to their round; failed lock/rework turn starts revert the persisted answer or signoff state.

Verifier turns run without browser MCP and with a per-chat runtime env dir; the lead's ready claim is delimited as unverified output in the verifier prompt.

Adds cancel-mid-gate-loop, interleaving, fence, CRLF, nested-marker, re-arm, and revert tests.
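"Duplicate verify results merge fail-closed" can be illustrated with a short sketch: when the same criterion appears more than once in a verdict, it counts as passed only if every occurrence passed. The `Verdict` type and `MergeFailClosed` function are hypothetical names; the real payload shape is not shown in this PR text.

```go
package main

import "fmt"

// Verdict is a hypothetical per-criterion result from one verifier turn.
type Verdict struct {
	CriterionID string
	Passed      bool
}

// MergeFailClosed collapses duplicate results per criterion. A criterion is
// marked passed only when all of its results passed, so a conflicting
// duplicate can never open the gate.
func MergeFailClosed(results []Verdict) map[string]bool {
	merged := make(map[string]bool, len(results))
	for _, r := range results {
		prev, seen := merged[r.CriterionID]
		if !seen {
			merged[r.CriterionID] = r.Passed
			continue
		}
		merged[r.CriterionID] = prev && r.Passed
	}
	return merged
}

func main() {
	results := []Verdict{
		{CriterionID: "c1", Passed: true},
		{CriterionID: "c2", Passed: true},
		{CriterionID: "c2", Passed: false}, // conflicting duplicate
	}
	fmt.Println(MergeFailClosed(results))
}
```

The merge is independent of ordering: a pass followed by a fail and a fail followed by a pass both leave the criterion failed.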

…und binding

Goal RPC failures are no longer silent: rejected answer, sign-off, and checklist saves clear the waiting state, re-enable their controls, and show an inline error near the control that failed; the "gate re-arms" footer only appears once a save actually lands. The GoalCard keeps an optimistic working copy with serialized saves and temp keys for just-added rows, so rapid edits can't resurrect removed criteria or smear edits across unsaved rows. Clarify answers carry clarify_seq and superseded rounds stop borrowing the current round's answers.

Also: Partner-lead predicate extracted to utils.isPartnerAgent, the goal-mode Switch hoisted to module scope so its transitions animate, focus/hover affordances on goal inputs and chips, full send-back notes on hover, and an api-layer seam test for the goal RPC wire format.
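The round binding that clarify_seq provides, together with the revert-on-failed-turn-start behavior mentioned in this PR, might look roughly like the following on the receiving side. This is a sketch under stated assumptions: `Round`, `SubmitAnswers`, both error values, and the `startTurn` callback are hypothetical names, and locking is omitted for brevity.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrStaleRound and ErrTurnStart are illustrative error values.
var (
	ErrStaleRound = errors.New("clarify round superseded")
	ErrTurnStart  = errors.New("turn failed to start")
)

// Round is a hypothetical view of the current clarify round.
type Round struct {
	Seq     int               // sequence number of the current round
	Answers map[string]string // persisted answers, nil until answered
}

// SubmitAnswers accepts answers only for the current round, then tries to
// start the follow-up turn. If the turn cannot start, the persisted answers
// are reverted so the round can be answered again.
func SubmitAnswers(r *Round, seq int, answers map[string]string, startTurn func() error) error {
	if seq != r.Seq {
		return ErrStaleRound // answer was written for an older round
	}
	prev := r.Answers
	r.Answers = answers
	if err := startTurn(); err != nil {
		r.Answers = prev // revert on failed turn start
		return err
	}
	return nil
}

func main() {
	r := &Round{Seq: 2}
	ok := func() error { return nil }
	fmt.Println(SubmitAnswers(r, 1, map[string]string{"q1": "old"}, ok)) // stale round
	fmt.Println(SubmitAnswers(r, 2, map[string]string{"q1": "new"}, ok)) // accepted
	fmt.Println(r.Answers)
}
```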

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

README.md: add Goal to the Core concepts table — goal mode shipped in 0.8.0 and is now discoverable alongside Worktree and Handover. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

suncommit and others added 11 commits June 10, 2026 17:30

chore: gofmt normalization pass over daemon internals

7899f1a


chore: bump version and changelog (v0.8.0)

e802042

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

docs: update project documentation for v0.8.0

5a9b436


fix(goal): harden verification gate races

a22d30e

suncommit merged commit 628627d into main Jun 11, 2026
10 checks passed

suncommit deleted the feat0610 branch June 11, 2026 09:30

suncommit merged 11 commits into main from feat0610
