feat: phantom_loop - autonomous iteration primitive with evolution integration by electronicBlacksmith · Pull Request #48 · ghostwright/phantom

electronicBlacksmith · 2026-04-08T03:42:37Z

Summary

Adds phantom_loop - an in-process MCP tool that lets the agent spawn iterative tasks where each tick is a fresh SDK session with state persisted in a markdown file.

Core loop:

Each tick is a fresh query() call with the goal + accumulated state
State persisted in markdown frontmatter (the agent reads/writes it naturally)
Termination: agent self-declares done, budget exhausted, success_command exits 0, or operator interrupt via Slack button
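
The termination rules above can be sketched as a pure decision function that the runner would call after each tick. This is an illustrative sketch only: the type, field, and function names are hypothetical and not taken from the PR, and the precedence between the four signals is an assumption.

```typescript
// Hypothetical names throughout; only the four termination signals come from the PR.
type StopReason = "interrupted" | "done" | "success_command" | "budget_exhausted";

interface TickSnapshot {
  status: string;                 // `status` value read from the state file frontmatter
  iteration: number;              // ticks completed so far
  maxIterations: number;          // iteration budget
  costUsd: number;                // cost accumulated so far
  maxCostUsd: number;             // cost budget
  successExitCode: number | null; // exit code of success_command, null when not configured
  stopRequested: boolean;         // operator pressed the Slack stop button
}

// Returns why the loop should stop, or null when another tick should be scheduled.
// The ordering of the checks is an assumption, not the PR's documented precedence.
function stopReason(s: TickSnapshot): StopReason | null {
  if (s.stopRequested) return "interrupted";
  if (s.status === "done") return "done";
  if (s.successExitCode === 0) return "success_command";
  if (s.iteration >= s.maxIterations || s.costUsd >= s.maxCostUsd) {
    return "budget_exhausted";
  }
  return null;
}
```

Keeping this check outside the agent prompt matches the stated design: budgets are enforced in TypeScript and the agent only reasons about the task.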

Slack integration:

AsyncLocalStorage context injection so loop ticks auto-target the operator's thread
Reaction ladder, progress bar, stop button
Status updates land in the originating thread
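
For readers unfamiliar with the mechanism, context injection via AsyncLocalStorage might look roughly like the sketch below. The `SlackThreadContext` shape and the helper names are hypothetical; the actual contents of src/agent/slack-context.ts may differ.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical context shape; the real one in src/agent/slack-context.ts may differ.
interface SlackThreadContext {
  channel: string;
  threadTs: string;
}

const slackContext = new AsyncLocalStorage<SlackThreadContext>();

// Runs `fn` with a Slack thread bound to the current async call chain.
function withSlackContext<T>(ctx: SlackThreadContext, fn: () => T): T {
  return slackContext.run(ctx, fn);
}

// Code called inside `fn`, however deep, can look up the originating thread
// without it being passed through every function signature.
function currentThread(): SlackThreadContext | undefined {
  return slackContext.getStore();
}
```

Outside of a `withSlackContext` call, `currentThread()` returns `undefined`, which gives a natural fallback for non-Slack triggers.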

Evolution integration:

Post-loop evolution pipeline: bounded transcript accumulation, SessionData synthesis, fire-and-forget evolution + memory consolidation
Mid-loop critique checkpoints: optional Sonnet 4.6 review every N ticks
Memory context injection: cached at loop start, injected into every tick
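
The commit message describes the bounded transcript as "first tick + rolling 10 summaries + last tick". A minimal sketch of that shape follows, assuming that each tick is demoted to a summary once a newer tick arrives. All names are hypothetical, and `summarize` here is a plain truncation standing in for whatever summarization the PR actually uses.

```typescript
// Hypothetical structure; only the "first + rolling 10 + last" shape comes from the PR.
interface BoundedTranscript {
  first: string | null; // first tick, kept verbatim
  summaries: string[];  // rolling window of summaries for middle ticks
  last: string | null;  // most recent tick, kept verbatim
}

const MAX_SUMMARIES = 10;

function emptyTranscript(): BoundedTranscript {
  return { first: null, summaries: [], last: null };
}

// Placeholder summarizer (assumption): truncates to a fixed character budget.
function summarize(text: string, maxChars = 200): string {
  return text.length <= maxChars ? text : text.slice(0, maxChars) + "...";
}

// Records one tick while keeping memory use bounded regardless of loop length.
function recordTick(t: BoundedTranscript, tickText: string): BoundedTranscript {
  if (t.first === null) return { ...t, first: tickText };
  // The previous "last" tick is demoted to a summary; only the newest 10 are kept.
  const summaries =
    t.last === null
      ? t.summaries
      : [...t.summaries, summarize(t.last)].slice(-MAX_SUMMARIES);
  return { first: t.first, summaries, last: tickText };
}
```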

Also includes:

OAuth token support for LLM judges (ANTHROPIC_AUTH_TOKEN, CLAUDE_CODE_OAUTH_TOKEN)
Documentation at docs/loop.md
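
As a rough illustration of the credential check, the sketch below looks for any of the three environment variables named in this PR. The lookup order, the return shape, and the function name are assumptions; `resolveJudgeMode()` in the PR may behave differently.

```typescript
// Env var names come from the PR; the precedence order is an assumption.
const JUDGE_CREDENTIAL_VARS = [
  "ANTHROPIC_API_KEY",
  "ANTHROPIC_AUTH_TOKEN",
  "CLAUDE_CODE_OAUTH_TOKEN",
] as const;

type JudgeCredential = {
  source: (typeof JUDGE_CREDENTIAL_VARS)[number];
  value: string;
};

// Hypothetical helper: returns the first non-empty credential, or null if none is set.
function findJudgeCredential(
  env: Record<string, string | undefined>,
): JudgeCredential | null {
  for (const name of JUDGE_CREDENTIAL_VARS) {
    const value = env[name]?.trim();
    if (value) return { source: name, value };
  }
  return null;
}
```

Passing the environment as a parameter, rather than reading `process.env` directly, keeps "no credentials" tests from depending on whatever auth variables happen to be set on the machine.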

Files added/changed

src/loop/ - Runner, store, state file, tool, prompt, notifications, critique, post-loop (8 files)
src/loop/__tests__/ - 6 test files
src/agent/slack-context.ts - AsyncLocalStorage for Slack thread context
src/index.ts - Wiring
docs/loop.md - Documentation

Test plan

945+ tests passing
End-to-end verified in Slack + non-Slack trigger
bun run typecheck clean
bun run lint clean

Introduces phantom_loop - an in-process MCP tool that lets the agent spawn iterative tasks where each tick is a fresh SDK session with state persisted in a markdown file. Termination signals: agent self-declares via status: done in the state file frontmatter, iteration or cost budget exhausted, optional success_command returns exit 0, or operator interrupt via Slack button. Runner is fully deterministic: budgets are enforced by TypeScript, the agent only reasons about the task. Crash recovery is just re-scheduling a tick against any loop still marked running; the state file is the source of truth.

Addresses two issues from code review:

1. parseFrontmatter now strips inline YAML comments (`status: done # yay`) and surrounding quotes (`status: "done"`). Without this, an agent writing natural-looking YAML would leave the loop status unparseable, silently burning through its iteration budget instead of terminating.
2. Loop.conversationId was stored but its only use was gating whether the start notice posted - tick and final updates posted regardless. It was meant to be the Slack thread target for loop status updates. Extend SlackChannel.postToChannel with an optional thread_ts and use it for the start notice, so status updates land in the caller's thread when provided.
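
The frontmatter fix can be illustrated with a small standalone sketch. This is not the PR's `parseFrontmatter`; `normalizeScalar` and `readStatus` are hypothetical helpers that only handle the two cases named above (inline comments and surrounding quotes) for a single `status:` key.

```typescript
// Hypothetical helper: cleans one scalar value taken from a `key: value` line.
function normalizeScalar(raw: string): string {
  const value = raw.trim();
  if (value.startsWith("#")) return ""; // the whole value is a comment
  const quote = value[0];
  if (quote === '"' || quote === "'") {
    const end = value.indexOf(quote, 1);
    // Text after the closing quote (for example a trailing comment) is dropped.
    if (end !== -1) return value.slice(1, end);
  }
  // Unquoted value: an inline comment starts at a '#' preceded by whitespace.
  const hash = value.search(/\s#/);
  return (hash === -1 ? value : value.slice(0, hash)).trim();
}

// Hypothetical helper: reads `status` from a markdown file's frontmatter block.
function readStatus(markdown: string): string | null {
  const block = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  if (!block) return null;
  for (const line of (block[1] ?? "").split(/\r?\n/)) {
    const m = /^status:\s*(.*)$/.exec(line);
    if (m) return normalizeScalar(m[1] ?? "");
  }
  return null;
}
```

With this normalization, `status: done # yay` and `status: "done"` both read as `done`, so the runner sees the termination signal instead of an unrecognized status.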

- Guard workspace paths against traversal out of dataDir in start()
- Document success_command env vars and timeout in MCP tool description
- Narrow RunnerDeps.runtime to Pick<AgentRuntime,"handleMessage">, drop `as never` casts from tests
- Harden SDK internals access in tool.test.ts via exported LOOP_TOOL_NAME constant with SDK version pinned in a comment
- Add end-to-end test exercising autoSchedule:true setImmediate path
- Collapse finalize() UPDATE+SELECT roundtrip: LoopStore.finalize now returns the updated Loop directly
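
A path traversal guard of the kind mentioned in the first item is commonly written with `path.resolve` and `path.relative`. The sketch below shows that general technique; `resolveInside` is a hypothetical name, and the check actually used in start() may differ.

```typescript
import * as path from "node:path";

// Hypothetical helper: resolves `requested` under `dataDir` and rejects
// anything that would land outside it (for example "../etc" or "/etc/passwd").
function resolveInside(dataDir: string, requested: string): string {
  const root = path.resolve(dataDir);
  const target = path.resolve(root, requested);
  const rel = path.relative(root, target);
  if (rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel)) {
    throw new Error(`workspace path escapes data directory: ${requested}`);
  }
  return target;
}
```

Note that this is a lexical check only. It does not follow symlinks, so a deployment that allows symlinks inside the data directory would need an additional `fs.realpath` comparison.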

Closes #5. AsyncLocalStorage context injection, reaction ladder, progress bar, state.md summary on completion. Stop button persists across tick edits. Tick/finalize race eliminated. LoopNotifier extracted from runner.ts. Verified end-to-end in Slack + non-Slack trigger. 945 tests passing.

* feat(loop): integrate evolution, memory, and mid-loop critique into loop ticks

  Loop ticks now use Phantom's full intelligence stack instead of running blind:

  Phase 1 - Memory context injection: cached once at loop start from the goal, injected into every tick prompt via TickPromptOptions. Cleared on finalize, rebuilt on resume.

  Phase 2 - Post-loop evolution and consolidation: bounded transcript accumulation (first tick + rolling 10 summaries + last tick), SessionData synthesis in finalize(), fire-and-forget evolution pipeline and LLM/heuristic memory consolidation with cost-cap guards matching the interactive path.

  Phase 3 - Mid-loop critique checkpoints: optional checkpoint_interval param lets the agent request Sonnet 4.6 review every N ticks. Guard requires evolution enabled, LLM judges active, and cost cap not exceeded. Critique is awaited before next tick to avoid race conditions.

  Closes #8

* fix(loop): address code review findings from PR #9

  - Decouple postLoopDeps so evolution and memory run independently (evolution works when memory is down and vice versa)
  - Skip mid-loop critique on terminal ticks to avoid wasted Sonnet calls
  - Track judge cost on failure paths via JudgeParseError carrying usage data
  - Extract recordTranscript/clamp from runner.ts to post-loop.ts (292 < 300 lines)

* fix(evolution): support OAuth tokens for LLM judge auth

  resolveJudgeMode() and judge client now check ANTHROPIC_AUTH_TOKEN and CLAUDE_CODE_OAUTH_TOKEN in addition to ANTHROPIC_API_KEY. Enables LLM judges on Max subscription deployments using OAuth bearer tokens.

* docs: add phantom_loop documentation for upstream PR

  Covers MCP tool parameters, state file contract, tick lifecycle, Slack integration, mid-loop critique, post-loop evolution pipeline, memory context injection, and tips for writing effective goals.

  Closes #12

* fix(test): stabilize trigger-auth and judge-activation tests for CI

  trigger-auth: use inline Bun.serve instead of startServer to avoid module-level globals and disk I/O that can race across test files.

  judge-activation: save/restore ANTHROPIC_AUTH_TOKEN and CLAUDE_CODE_OAUTH_TOKEN alongside ANTHROPIC_API_KEY so tests that expect "no credentials" actually clear all auth env vars.

---------

Co-authored-by: electronicBlacksmith <electronicBlacksmith@users.noreply.github.com>

electronicBlacksmith and others added 5 commits April 8, 2026 03:41

electronicBlacksmith wants to merge 5 commits into ghostwright:main from electronicBlacksmith:upstream/feat/loop-primitive
