v0.8.0 feat: goal mode — verifiable goals with an independent verification gate #28

Merged
suncommit merged 11 commits into main from feat0610 on Jun 11, 2026
Conversation

suncommit commented Jun 11, 2026

Contributor

Summary

Goal mode (daemon). A per-chat mode for long-running, verifiable tasks (bd8ddb2a, 532a61c2): the lead agent scopes the goal with a structured clarify round, locks criteria into a verification gate, and the daemon owns the phase machine — scoping → running → awaiting_signoff → done. Verification is never the crew's to run: a READY declaration triggers an isolated turn by a dedicated anonymous verifier (fresh session, no history, no handover powers, browser MCP disabled, per-chat env dir), and only that turn's VERIFY verdict moves the gate. A held gate auto-continues the crew with the failed criteria attached, capped at 5 attempts per run.

Goal mode (UI). Goal toggle in the new-task composer, interactive clarify cards, a pinned editable criteria checklist, verification result cards, and sign-off: accept closes the task; send-back resets the gate with notes (133f5369).

Gate hardening (8e644115, 960779cd — from this ship's specialist/adversarial/red-team review): fence-aware marker extraction (quoting a marker can never trigger it; prompt examples are themselves fenced), nested-block inerting, CRLF tolerance, payload size caps + newline collapsing, fail-closed duplicate verdicts, ownership-first marker validation, split correction budgets, READY re-arms the gate in awaiting_signoff, changed verify methods reset verified status, goal-state writes serialized under the app lock (including all run-goroutine chat saves), clarify_seq round binding with revert-on-failed-turn-start, UI error surfacing, optimistic serialized checklist edits, superseded-round answer guard.
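
To illustrate "fail-closed duplicate verdicts": if a verdict lists the same criterion more than once, the merged result should pass only when every entry passes. This is a sketch under assumptions; the `Result` type and `MergeFailClosed` name are hypothetical and not taken from the daemon.

```go
package main

import "fmt"

// Result is a hypothetical per-criterion entry in a verify verdict.
type Result struct {
	CriterionID string
	Pass        bool
	Note        string
}

// MergeFailClosed collapses duplicate entries per criterion. A criterion
// stays passing only if no entry for it fails; the first failing entry wins.
// Output order follows first appearance of each criterion.
func MergeFailClosed(results []Result) []Result {
	index := make(map[string]int, len(results))
	merged := make([]Result, 0, len(results))
	for _, r := range results {
		i, seen := index[r.CriterionID]
		if !seen {
			index[r.CriterionID] = len(merged)
			merged = append(merged, r)
			continue
		}
		// A later failure overrides an earlier pass; a later pass never
		// overrides an earlier failure.
		if !r.Pass && merged[i].Pass {
			merged[i] = r
		}
	}
	return merged
}

func main() {
	verdict := []Result{
		{CriterionID: "c1", Pass: true},
		{CriterionID: "c2", Pass: true},
		{CriterionID: "c1", Pass: false, Note: "build fails on clean checkout"},
	}
	for _, r := range MergeFailClosed(verdict) {
		fmt.Printf("%s pass=%v note=%q\n", r.CriterionID, r.Pass, r.Note)
	}
}
```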

New-task UX. Tasks always lead with the Partner agent, and the lead picker is removed (1403ad01); send controls stay pinned right when the toolbar wraps (0111046a).

Infrastructure. gofmt normalization pass over daemon internals (7899f1a9); README Core concepts row for Goal (5a9b4360).

Test plan

  • Go suite passes (16/16 packages, 446 tests, exit 0, -race on touched packages)
  • Vitest passes (341 tests, 21 files)
  • Mobile suites pass (26 + 19 tests)
  • gofmt clean

suncommit and others added 11 commits June 10, 2026 17:30
Import ordering, comment alignment, and trailing-newline cleanup picked
up by a tree-wide gofmt -w; no behavior change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Goal mode (docs/goal-0610.md): per-chat long-running tasks with a
verification gate. This lays the data layer:

- ChatRecord.Goal *GoalState (nil = zero behavior change), with phases
  scoping -> running -> awaiting_signoff -> done, criteria, clarify
  questions/answers, and the attempt counter/cap.
- Five timeline event types: goal_clarify, goal_lock, goal_verify,
  goal_done, goal_signoff, with payload structs on Event.
- Marker protocol: line-anchored multiline JSON blocks
  (CREW44_GOAL_CLARIFY / _LOCK / _VERIFY) emitted by the lead agent.
  ExtractGoalMarkers strips blocks (malformed bodies included, returned
  with Err) and validates payloads; lock criteria get normalized IDs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The daemon owns the goal lifecycle. The lead agent scopes the goal with
a clarify round, locks criteria, and the crew iterates until a verify
run passes every criterion:

- chats.create gains goal_mode via new CreateChatWithOptions (legacy
  CreateChat stays as a wrapper); seeds scoping state with attempt cap 5.
- runChat parses goal markers next to handover extraction; clarify/lock/
  verify update GoalState, append events, and publish chat.updated.
- Auto-continue: a failed gate immediately queues another lead turn
  naming the failed criteria (handover chains win; pending steers and
  cancel suppress; cap 5 consecutive turns per run, then idle with a
  goal_attempt_cap error event). A lock kicks off the first work turn so
  the crew never stalls after locking. Malformed markers get one
  corrective turn.
- New RPCs: chats.goal.answer (structured clarify answers -> internal
  lock turn), chats.goal.criteria.update (whole-list replacement, edits
  reset to pending and re-arm the gate), chats.goal.signoff (accept
  closes the chat; send_back resets criteria and starts a rework turn).
- System prompt: per-phase Goal Mode section — clarify/lock grammar for
  the lead in scoping, live criteria + verify protocol in running,
  read-only goal context for delegated agents.
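
The auto-continue rules in this commit can be sketched as one decision function. This is an illustration only: `TurnState`, `Action`, and `Decide` are hypothetical names, and the precedence between cancel, handover, and steer is an assumption read from "handover chains win; pending steers and cancel suppress".

```go
package main

import "fmt"

// attemptCap mirrors the cap of 5 consecutive auto-continue turns per run.
const attemptCap = 5

// TurnState is a hypothetical snapshot taken after a lead turn finishes.
type TurnState struct {
	GateFailed      bool // the verify run held the gate
	HandoverPending bool // the turn ended with a handover to another agent
	SteerPending    bool // the user queued a steer message
	Cancelled       bool // the user cancelled the run
	Attempts        int  // consecutive auto-continue turns already used
}

// Action is what the daemon would do next in this sketch.
type Action string

const (
	ActionNone      Action = "none"
	ActionHandover  Action = "handover"
	ActionContinue  Action = "continue"              // queue a lead turn naming failed criteria
	ActionIdleAtCap Action = "idle_goal_attempt_cap" // go idle and emit the cap error event
)

// Decide applies the assumed precedence: cancel, then handover, then steer,
// then the gate result and the attempt cap.
func Decide(s TurnState) Action {
	switch {
	case s.Cancelled:
		return ActionNone
	case s.HandoverPending:
		return ActionHandover
	case s.SteerPending:
		return ActionNone
	case !s.GateFailed:
		return ActionNone
	case s.Attempts >= attemptCap:
		return ActionIdleAtCap
	default:
		return ActionContinue
	}
}

func main() {
	fmt.Println(Decide(TurnState{GateFailed: true, Attempts: 2}))
	fmt.Println(Decide(TurnState{GateFailed: true, Attempts: 5}))
	fmt.Println(Decide(TurnState{GateFailed: true, HandoverPending: true}))
}
```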

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Ports the goal mock (mocks/CrewAI v3/goal.jsx) onto the real event
pipeline:

- src/GoalMode.jsx: pinned GoalCard criteria checklist (collapsed by
  default, animated expand/collapse, inline edit/add/remove committing
  whole-list replacements), interactive GoalClarifyEvent (chip + text
  answers, collapses once answered via chat.goal), GoalLockDivider,
  GoalVerifyEvent gate card, GoalDoneEvent banner with accept /
  send-back-with-notes, GoalSignoffDivider, GoalModeChip/Detail for the
  composer, and the GoalHeaderPill.
- TaskView: EventRouter cases for the five goal event kinds, pinned
  GoalCard between header and timeline, header pill, and the
  answer/criteria/signoff handlers.
- utils.mapBackendEvent maps the new event types; api.js gains
  goal_mode on createChat plus answerGoal / updateGoalCriteria /
  signoffGoal.
- New Task composer: Goal mode toggle with detail strip, goal-aware
  placeholder and 'Set goal' submit label.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The composer toolbar was one wrapping flex row, so on narrow windows the
Start button dropped to a stray second line at far left. Split it into a
left chip group that wraps internally and a right action group
(send-shortcut menu + Start) that never wraps.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The lead is no longer user-selectable: tasks always start with the
default-crew Partner agent (preset_id 'default-crew', preset_key
'partner'), falling back to the first agent for setups without the
default crew. Drops the Lead picker chip, the lead draft persistence,
and the related state.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ized state writes, restricted verifier

Marker protocol: goal blocks inside Markdown code fences or nested in
another block's body are quotes, not commands; CRLF-terminated blocks
parse; payload fields are size-capped and newline-collapsed before they
reach prompts; duplicate verify results merge fail-closed; duplicate
clarify ids are reassigned. Prompt examples are fenced (inert if echoed)
with an explicit never-fence-your-marker rule.
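
A rough sketch of the fence-aware, CRLF-tolerant scan described above. The real block grammar is not shown in this PR, so the framing here is an assumption: a marker name alone on a line opens a block and a hypothetical `CREW44_GOAL_END` line closes it. `ExtractBlocks` is an illustrative name, not the daemon's `ExtractGoalMarkers`, and it omits payload validation and the `Err` reporting for malformed bodies.

```go
package main

import (
	"fmt"
	"strings"
)

// Block is one extracted marker block with its raw body.
type Block struct {
	Kind string
	Body string
}

// endLine is a hypothetical terminator; the real grammar may differ.
const endLine = "CREW44_GOAL_END"

// fence is the Markdown code-fence prefix (three backticks).
var fence = strings.Repeat("`", 3)

var kinds = map[string]bool{
	"CREW44_GOAL_CLARIFY": true,
	"CREW44_GOAL_LOCK":    true,
	"CREW44_GOAL_VERIFY":  true,
}

// ExtractBlocks returns marker blocks found outside code fences. Marker
// lines inside a fence, or nested inside another block's body, are treated
// as quoted text. An unterminated block is dropped.
func ExtractBlocks(text string) []Block {
	var blocks []Block
	var body []string
	current := ""
	inFence := false
	for _, raw := range strings.Split(text, "\n") {
		line := strings.TrimSuffix(raw, "\r") // tolerate CRLF line endings
		if current == "" && strings.HasPrefix(strings.TrimSpace(line), fence) {
			inFence = !inFence
			continue
		}
		if inFence {
			continue // everything inside a fence is a quote, not a command
		}
		switch {
		case current == "" && kinds[line]:
			current = line
			body = nil
		case current != "" && line == endLine:
			blocks = append(blocks, Block{Kind: current, Body: strings.Join(body, "\n")})
			current = ""
		case current != "":
			body = append(body, line) // nested marker lines land here as inert text
		}
	}
	return blocks
}

func main() {
	text := "Example of the format:\n" +
		fence + "\nCREW44_GOAL_VERIFY\n{\"quoted\":true}\n" + endLine + "\n" + fence + "\r\n" +
		"CREW44_GOAL_LOCK\r\n{\"criteria\":[]}\r\n" + endLine + "\r\n"
	for _, b := range ExtractBlocks(text) {
		fmt.Printf("%s: %s\n", b.Kind, b.Body)
	}
}
```

Because markers are matched on the whole line, a marker mentioned mid-sentence is also ignored; only the fenced and nested cases need explicit state.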

Phase machine: ownership is checked before malformedness so non-lead
markers can't burn the lead's correction budget; lead and verifier
correction budgets are split; lock kickoff clears a same-message READY;
the verifier always starts with clean malformed state; READY in
awaiting_signoff re-arms the gate (criteria reset, verifier re-runs) as
the signoff prompt promises; a changed verify method resets a
criterion's verified status; gate-loop store failures surface as
goal_state_unavailable instead of stopping silently.

Concurrency: AnswerGoal/UpdateGoalCriteria/SignoffGoal and every run-
goroutine chat save now go through the app lock (mutateChat applies only
run-owned fields), closing the lost-update race on mid-stream criteria
edits. chats.goal.answer requires clarify_seq so answers bind to their
round; failed lock/rework turn starts revert the persisted answer or
signoff state.

Verifier turns run without browser MCP and with a per-chat runtime env
dir; the lead's ready claim is delimited as unverified output in the
verifier prompt. Adds cancel-mid-gate-loop, interleaving, fence, CRLF,
nested-marker, re-arm, and revert tests.
…und binding

Goal RPC failures are no longer silent: rejected answer, sign-off, and
checklist saves clear the waiting state, re-enable their controls, and
show an inline error near the control that failed; the "gate re-arms"
footer only appears once a save actually lands. The GoalCard keeps an
optimistic working copy with serialized saves and temp keys for
just-added rows, so rapid edits can't resurrect removed criteria or
smear edits across unsaved rows. Clarify answers carry clarify_seq and
superseded rounds stop borrowing the current round's answers.

Also: Partner-lead predicate extracted to utils.isPartnerAgent, the
goal-mode Switch hoisted to module scope so its transitions animate,
focus/hover affordances on goal inputs and chips, full send-back notes
on hover, and an api-layer seam test for the goal RPC wire format.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
README.md: add Goal to the Core concepts table — goal mode shipped in
0.8.0 and is now discoverable alongside Worktree and Handover.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
suncommit merged commit 628627d into main on Jun 11, 2026
10 checks passed
suncommit deleted the feat0610 branch on June 11, 2026 09:30