Add coding-loop workflow + fix tmux agent stage-status env forwarding by mattleaverton · Pull Request #82 · danshapiro/kilroy

mattleaverton · 2026-04-17T20:38:31Z

Summary

New workflow package workflows/coding-loop/ — an iterative coding-agent loop: task chooser → implementer → reviewer → done-gate, wrapped in the trapezium/invtrapezium loop primitive with file-based termination. One sub-task per iteration; feedback persisted to .reviews/iter-NNN.md + rolling .reviews/latest.md; done-gate uses LLM judgment against the spec.
Engine fix — the tmux agent session now receives KILROY_STAGE_STATUS_PATH and KILROY_STAGE_STATUS_FALLBACK_PATH (plus the full BuildStageRuntimeEnv set). The engine's status-contract preamble already tells agents to write to these vars, but previously only the API agent_loop path actually set them in the process env. The tmux path didn't. This affects every agent_tool=claude|codex|gemini|opencode node in every workflow.

Motivation

Writing and running the coding-loop workflow exposed a latent tmux/API parity gap. When I ran the workflow end-to-end on a toy-list spec (7 sub-tasks → 7 iterations, full green run), I observed the implementer burning roughly 15 of 45 tool calls per iteration hunting for $KILROY_STAGE_STATUS_PATH that was never set:

echo "$KILROY_STAGE_STATUS_PATH"     → (empty)
printenv | grep -i kilroy            → only KILROY_RUN_ID, KILROY_NODE_ID
grep -r "status_path" <logs_root>    → no match in any config

Agents eventually gave up and wrote status.json to arbitrary fallback paths. Runs still succeeded because the engine already tolerates a missing status file, but each iteration lost roughly 2-3 minutes to the search, and the behavior was confusing: the preamble tells the agent to write to a path held in an env var that is never set.
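
For illustration, the agent-side lookup that the preamble implies might look like the sketch below. Only the two variable names come from this PR; the helper name resolveStatusPath and the "first non-empty wins" order are assumptions, not the engine's documented contract.

```go
package main

import (
	"fmt"
	"os"
)

// resolveStatusPath is a hypothetical helper. It returns the first non-empty
// value among the two status-contract variables named in this PR. The lookup
// function is injected so the logic can be tested without touching the real
// process environment.
func resolveStatusPath(lookup func(string) (string, bool)) (string, bool) {
	for _, key := range []string{
		"KILROY_STAGE_STATUS_PATH",
		"KILROY_STAGE_STATUS_FALLBACK_PATH",
	} {
		if v, ok := lookup(key); ok && v != "" {
			return v, true
		}
	}
	return "", false
}

func main() {
	path, ok := resolveStatusPath(os.LookupEnv)
	if !ok {
		// This branch corresponds to the pre-fix tmux session described above.
		fmt.Println("no status path in env")
		return
	}
	fmt.Println("status path:", path)
}
```

With neither variable set, the lookup has nothing to return, which is the situation the agents were in before the fix.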

What changed

internal/attractor/agents/tmux_handler.go — extracted the session-env construction into buildTmuxAgentEnv, which now merges:

The tool template's BuildEnv() defaults (unchanged)
engine.BuildStageRuntimeEnv — run/node IDs, worktree/logs paths, data dir, inputs manifest, KILROY_INPUT_* (replaces the previous hand-rolled subset)
engine.BuildStageStatusContract(...).EnvVars — the two status-contract paths (new)

This brings the tmux session env into parity with what buildAgentLoopOverrides already provides to the API agent_loop path.
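
The three-way merge listed above can be sketched as a layered map merge. This is not the code from tmux_handler.go: mergeEnv, the sample values, and the choice that later layers override earlier ones are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// mergeEnv is a hypothetical stand-in for the merge step. It copies each
// layer into a fresh map in order, so a key in a later layer overwrites the
// same key from an earlier one. Nil layers are skipped naturally, because
// ranging over a nil map does nothing.
func mergeEnv(layers ...map[string]string) map[string]string {
	out := make(map[string]string)
	for _, layer := range layers {
		for k, v := range layer {
			out[k] = v
		}
	}
	return out
}

func main() {
	// All values below are made up; only the KILROY_* names come from the PR.
	templateDefaults := map[string]string{"TERM": "xterm-256color"}
	runtimeEnv := map[string]string{
		"KILROY_RUN_ID":  "run-1",
		"KILROY_NODE_ID": "implementer",
	}
	statusEnv := map[string]string{
		"KILROY_STAGE_STATUS_PATH":          "/logs/implementer/status.json",
		"KILROY_STAGE_STATUS_FALLBACK_PATH": "/worktree/.kilroy/status.json",
	}

	env := mergeEnv(templateDefaults, runtimeEnv, statusEnv)

	// Sort keys so the printed order is stable across runs.
	keys := make([]string, 0, len(env))
	for k := range env {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s=%s\n", k, env[k])
	}
}
```

Returning a fresh map rather than mutating the template defaults keeps the layers reusable across sessions, which is one reasonable design choice for this kind of helper.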

internal/attractor/agents/tmux_env_test.go — new unit tests covering both populated and nil-template code paths; asserts all expected runtime + status-contract vars are present.
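
A presence check of the kind these tests describe could be written roughly as follows. This is a sketch, not the contents of tmux_env_test.go: missingKeys is a hypothetical helper, and the sample env mirrors the printenv output quoted in the Motivation section, with made-up values.

```go
package main

import (
	"fmt"
	"sort"
)

// missingKeys is a hypothetical helper that returns, in sorted order, every
// required key that is absent from env or mapped to an empty string.
func missingKeys(env map[string]string, required []string) []string {
	var missing []string
	for _, k := range required {
		if v, ok := env[k]; !ok || v == "" {
			missing = append(missing, k)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	// Pre-fix tmux env as observed in the PR: only the run and node IDs.
	preFix := map[string]string{
		"KILROY_RUN_ID":  "run-1",
		"KILROY_NODE_ID": "implementer",
	}
	required := []string{
		"KILROY_RUN_ID",
		"KILROY_NODE_ID",
		"KILROY_STAGE_STATUS_PATH",
		"KILROY_STAGE_STATUS_FALLBACK_PATH",
	}
	fmt.Println(missingKeys(preFix, required))
}
```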

workflows/coding-loop/ — new workflow package (graph.dot, workflow.toml, README.md) exercising the loop primitive end-to-end. Proven on a toy-math spec (1 iter, 4m 24s) and a toy-list spec (7 iters, 30m 54s). Both green.

Test plan

go test ./internal/attractor/agents/ ./internal/attractor/engine/ — green (agents 7.4s, engine 220.4s)
kilroy attractor validate --graph workflows/coding-loop/graph.dot — ok
End-to-end run: toy-math (3 sub-tasks) — 1 iteration, success
End-to-end run: toy-list (7 sub-tasks, empty-safe invariant) — 7 iterations, all features + tests implemented, go test ./... exits 0 in target repo
go build ./cmd/kilroy/ — clean
go vet ./... — clean
gofmt -l on touched files: clean (pre-existing drift elsewhere is unchanged; PR #74, "fix(ci): gofmt all unformatted files (engine.go, worktree_hint_test.go, cli_only_models_test.go, codergen_router_cxdb_test.go)", covers some of it)

Notes

Two pre-existing test failures on main (TestRunWithConfig_ForceModel_BypassesCatalogGate, TestRunWithConfig_AllowsKimiAndZai_WhenCatalogUsesOpenRouterPrefixes) are unrelated to this change.
The workflow package is functional but v0.1.0 — future work could use the housekeeping-LLM primitive (separate exploration) to replace the exact-string termination match, drop the loop_max=12 cap, and make the done-gate more robust to prompt variation.

🤖 Generated with Claude Code

Iterative coding agent workflow: task chooser → implementer → reviewer → done-gate, wrapped in a trapezium/invtrapezium loop primitive with loop_max=8 and loop_until_file_contains-based termination.

- Chooser and done-gate on claude-haiku-4.5 (cheap, API)
- Implementer and reviewer on claude-sonnet-4.6 via agent_tool=claude
- Feedback persisted to .reviews/iter-NNN.md plus .reviews/latest.md; chooser reads latest only, done-gate can list/read any iteration
- Spec passed via --input spec=<abs-path> and read in place; never committed into the target repo
- Reviewer uses git show HEAD (vs git diff HEAD~1 HEAD) so the first-iteration case works without a fallback branch

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Force one-subtask-per-iteration to exercise the loop dynamics.

- Chooser: pick EXACTLY ONE smallest self-contained sub-task; do not bundle. Explicit guardrail written into .kilroy/task.md for the implementer.
- Implementer: implement ONLY what the task asks for; do not guess ahead.
- loop_max bumped 8 → 12 to accommodate multi-iteration specs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Agents run via TmuxAgentHandler (agent_tool=claude|codex|gemini|opencode) were seeing the engine-injected status-contract preamble that instructs them to write status JSON to $KILROY_STAGE_STATUS_PATH, but the tmux session env never actually set those variables. Only the API agent_loop path set them (via buildAgentLoopOverrides). Agents wasted tool calls hunting for the unset env var and eventually gave up.

Consolidate the env build into buildTmuxAgentEnv, which now merges:

- The tool template's BuildEnv() defaults
- Engine runtime invariants (KILROY_RUN_ID, KILROY_NODE_ID, KILROY_LOGS_ROOT, KILROY_STAGE_LOGS_DIR, KILROY_WORKTREE_DIR, KILROY_DATA_DIR, KILROY_INPUTS_MANIFEST_PATH, KILROY_INPUT_*) via BuildStageRuntimeEnv
- Stage status contract paths (KILROY_STAGE_STATUS_PATH, KILROY_STAGE_STATUS_FALLBACK_PATH) via BuildStageStatusContract

This matches the API agent_loop path's env, so tmux and API backends are now consistent with respect to what the status-contract preamble can actually reference.

Observed in the wild: a 7-iteration coding-loop run where the implementer burned ~15 of 45 tool calls per iteration searching for KILROY_STAGE_STATUS_PATH. With this fix the env var is set at session start and the preamble instruction is actionable.

Adds unit test coverage in tmux_env_test.go.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

mattleaverton and others added 3 commits April 17, 2026 15:49

mattleaverton force-pushed the feat/coding-loop-workflow branch from 73bea5e to 4e30264 Compare April 17, 2026 20:49

mattleaverton merged commit 7073ea0 into danshapiro:main Apr 17, 2026
1 check failed
