feat: DAG reassessment via boi plan + dispatch-many #14

Open
mrap wants to merge 3 commits into main from feat/dag-reassess

Conversation


mrap (Owner) commented Apr 29, 2026

Motivation

Multi-spec BOI tracks had to be re-chained manually via --after flags at
dispatch time. In this session's 5-spec track, ordering had to be verified by
hand — the wrong order only surfaces when a dependent task fails mid-run
because upstream work wasn't ready yet.

Before: whoever dispatches must remember in-flight deps + express them
via --after. Easy to mis-order. In-flight specs can expand scope after a
dependent is queued, silently breaking the assumed contract.

After: boi dispatch-many runs a DAG analysis + LLM critique before
any dispatch happens. Wrong orderings are flagged with suggested fixes.
boi dispatch gets a lightweight implicit-dep WARN for free.


Also: typed failure reasons + inline display in boi status (closes SO S12 gap)

Problem: An audit (2026-04-29) found that 8/8 recent failed specs had
spec.error = NULL. boi status only showed ✗ S1234 Title 2m ago, with
nothing about why; users had to dig through daemon logs to find out.
This broke SO S12 ("loud failures") at the user surface.

Fix: Structured FailureReason enum + persistence + rendering.

Typed enum (src/failure.rs)

pub enum FailureReason {
    ModelResolution { model: String, provider: String },
    ProviderRateLimit { provider: String, retry_after_s: u64 },
    ProviderHttp { provider: String, status: u16, body_excerpt: String },
    ProviderAuth { provider: String, env_var: String },
    Timeout { phase: String, secs: u64 },
    ToolError { phase: String, message: String },
    VerifyFailed { task: String, exit_code: i32, stderr_excerpt: String },
    WorkerCrash { phase: String, signal: Option<i32>, message: String },
    Other { message: String },
}

Helpers: short_summary() (one-line for status), detail() (multi-line for boi why).
Serialized as JSON to spec.error / task.error columns. Falls back to Other { message } for legacy NULL/string values.

Failure capture (src/queue.rs, src/worker.rs, src/runner.rs)

New helpers queue.fail_spec(id, FailureReason) and queue.fail_task(spec_id, task_id, FailureReason)
write the JSON reason to the DB and emit structured telemetry events (boi.spec.failed / boi.task.failed).

Every failure path now maps to a typed reason:

  • RuntimeError::Timeout → Timeout { ... }
  • NonZeroExit("HTTP 429") → ProviderRateLimit { ... }
  • NonZeroExit("HTTP 4xx/5xx") → ProviderHttp { ... }
  • Missing API key → ProviderAuth { ... }
  • Model not found → ModelResolution { ... }
  • Verify command failure → VerifyFailed { ... }
  • Subprocess SIGKILL/SIGSEGV → WorkerCrash { ... }
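Sketched roughly, the classification could look like the following (the RuntimeError shape, provider handling, and status parsing are illustrative assumptions, not the real src/runner.rs API):

```rust
// Illustrative mapping from runner errors to typed failure reasons.
#[derive(Debug, PartialEq)]
enum FailureReason {
    Timeout { phase: String, secs: u64 },
    ProviderRateLimit { provider: String },
    ProviderHttp { provider: String, status: u16 },
    Other { message: String },
}

enum RuntimeError {
    Timeout { phase: String, secs: u64 },
    NonZeroExit { stderr: String },
}

fn classify(err: RuntimeError, provider: &str) -> FailureReason {
    match err {
        RuntimeError::Timeout { phase, secs } => FailureReason::Timeout { phase, secs },
        RuntimeError::NonZeroExit { stderr } if stderr.contains("HTTP 429") => {
            FailureReason::ProviderRateLimit { provider: provider.into() }
        }
        RuntimeError::NonZeroExit { stderr } => {
            // Pull a status code out of "HTTP NNN ..." if one is present.
            let status = stderr
                .split("HTTP ")
                .nth(1)
                .and_then(|rest| rest.get(..3))
                .and_then(|code| code.parse::<u16>().ok());
            match status {
                Some(status) => FailureReason::ProviderHttp { provider: provider.into(), status },
                None => FailureReason::Other { message: stderr },
            }
        }
    }
}
```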

Status rendering (src/cli/status.rs)

Failed specs now render a second indented line with the error reason:

✗ SA015  My spec title            2m ago
    └─ Verify failed: exit 1 (stderr: assertion failed at line 42)

Long summaries are truncated with an ellipsis (…). Pass --verbose / -v for the multi-line detail().
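The truncation itself is simple; a hedged sketch (the display budget and glyphs here are assumptions, not the real src/cli/status.rs):

```rust
// Hypothetical second-line renderer: truncate the one-line summary to a
// character budget, appending an ellipsis when it is cut.
fn render_error_line(summary: &str, max_chars: usize) -> String {
    let body = if summary.chars().count() > max_chars {
        let cut: String = summary.chars().take(max_chars.saturating_sub(1)).collect();
        format!("{cut}…")
    } else {
        summary.to_string()
    };
    format!("    └─ {body}")
}
```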

boi why <spec-id> (src/cli/why.rs)

Fast forensics: prints the full FailureReason::detail() for any spec.

boi why SA015

Tests

  • failure_reason: all variants roundtrip, truncation, invalid-JSON fallback
  • failure_capture: every failure path produces typed reason; NO NULL errors
  • status_render_error: no-error → no second line; typed error → short summary; long → ellipsis; verbose → detail

Implementation

New commands

  • boi plan [specs...] — loads in-flight + queued specs from DB, builds
    a DAG, detects cycles, topologically sorts it, then runs an LLM critique
    (claude-haiku) that flags specs that should depend on each other but don't,
    wrongly-serial specs that could run in parallel, and scope contradictions.
    Critique is cached by hash(DAG topology + spec titles) to avoid
    re-spending tokens on identical state.

  • boi dispatch-many <spec1> <spec2> ... — runs plan first, then:

    • block-severity concerns → refuse + print concerns, exit non-zero
    • warn-severity → print proposed order + prompt (or --yes/--force)
    • clean → auto-dispatch in topological order with correct --after chain
  • boi why <spec-id> — explain the last failure for a spec from DB
    (uses the new failure.rs structured failure capture).
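The topological sort with loud cycle detection at the heart of boi plan can be sketched with Kahn's algorithm (spec ids and the edge representation here are illustrative, not the real src/cli/plan.rs types):

```rust
use std::collections::{HashMap, VecDeque};

// Kahn's algorithm: edges are (upstream, downstream) pairs, meaning the
// downstream spec must dispatch --after the upstream one. A cycle leaves
// some node with nonzero in-degree, so the output comes up short and we
// fail loudly instead of dispatching a broken order.
fn topo_sort(specs: &[&str], deps: &[(&str, &str)]) -> Result<Vec<String>, String> {
    let mut indegree: HashMap<&str, usize> = specs.iter().map(|s| (*s, 0)).collect();
    let mut edges: HashMap<&str, Vec<&str>> = HashMap::new();
    for &(up, down) in deps {
        edges.entry(up).or_default().push(down);
        *indegree.get_mut(down).ok_or_else(|| format!("unknown spec {down}"))? += 1;
    }
    let mut ready: VecDeque<&str> = specs.iter().copied().filter(|s| indegree[s] == 0).collect();
    let mut order = Vec::new();
    while let Some(s) = ready.pop_front() {
        order.push(s.to_string());
        for &d in edges.get(s).into_iter().flatten() {
            let n = indegree.get_mut(d).unwrap();
            *n -= 1;
            if *n == 0 {
                ready.push_back(d);
            }
        }
    }
    if order.len() != specs.len() {
        return Err("cycle detected in declared deps".into());
    }
    Ok(order)
}
```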

Lightweight single-dispatch gate

boi dispatch <spec> now runs deterministic implicit-dep detection (no LLM)
when in-flight specs exist. If artifact overlap is detected and no --after
was provided: WARN + suggest --after. Add --skip-plan to bypass.
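The deterministic check could look roughly like this (struct and field names are assumptions, not the real boi types): an in-flight spec that writes an artifact the new spec reads suggests a missing --after.

```rust
use std::collections::HashSet;

// Hypothetical per-spec artifact sets, as collect_artifacts might produce.
struct SpecArtifacts {
    id: String,
    writes: HashSet<String>,
    reads: HashSet<String>,
}

impl SpecArtifacts {
    fn new(id: &str, writes: &[&str], reads: &[&str]) -> Self {
        Self {
            id: id.into(),
            writes: writes.iter().map(|s| s.to_string()).collect(),
            reads: reads.iter().map(|s| s.to_string()).collect(),
        }
    }
}

/// In-flight specs whose written artifacts overlap the new spec's reads;
/// each hit is a candidate for a WARN + suggested --after.
fn implicit_deps<'a>(new: &SpecArtifacts, in_flight: &'a [SpecArtifacts]) -> Vec<&'a str> {
    in_flight
        .iter()
        .filter(|s| !s.writes.is_disjoint(&new.reads))
        .map(|s| s.id.as_str())
        .collect()
}
```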

Core modules

| File | What it does |
| --- | --- |
| src/cli/plan.rs | DAG model: collect_artifacts, detect_implicit_deps, topo-sort, cycle detection + LLM critique pass |
| src/cli/dispatch_many.rs | Gated multi-spec dispatch |
| src/cli/dispatch.rs | Lightweight implicit-dep WARN added to existing command |
| src/runtime/openrouter.rs | HTTP client for OpenRouter (haiku critique calls) |
| src/failure.rs | Structured failure-reason capture + FailureReason enum |
| src/cli/why.rs | boi why command |
| src/cli/status.rs | Error rendering under failed specs |

Test coverage

  • dag_build: empty queue, single spec, two-spec chain, fan-out, diamond,
    cycle detection (errors loud), implicit-dep detection
  • dispatch_many: correct --after chain for 3-spec implicit chain; cycle
    in declared deps → refusal; --force overrides warn but not block
  • dispatch_dag_warn: WARN emitted when artifact overlap with no --after
  • failure_reason: roundtrip, truncation, legacy fallback
  • failure_capture: no NULL errors on any failure path
  • status_render_error: rendering correctness + verbose mode

Example: before vs after (DAG)

Before (manual, fragile):

boi dispatch specs/a.yaml
boi dispatch --after=A specs/b.yaml   # must remember A writes the file B reads
boi dispatch --after=B specs/c.yaml   # must remember order

After (automatic):

boi dispatch-many specs/a.yaml specs/b.yaml specs/c.yaml
# → plan detects b depends on a, c depends on b
# → dispatches in order A → B (--after=A) → C (--after=B)

Do not merge — Mike reviews.

🤖 Generated with Claude Code

mrap and others added 3 commits April 29, 2026 16:54
- Add deterministic runtime (builtin:commit, builtin:merge, builtin:cleanup)
  that skips Claude spawn entirely — cold-start win for post-task phases
- Add spec-critique ↔ spec-improve loop (separate Claude sessions, max 3 rounds)
  replacing the old spec-review phase
- Add commit/merge/cleanup phase TOMLs wired to deterministic builtins
- Wire mode.v2 in pipelines.toml with spec_pre_phases / spec_post_phases
- Add end-to-end v2 smoke test (tests/v2_smoke.rs)
- Update README, SKILL.md, and docs/pipelines.md for v2 mode

v1 modes (default, challenge, discover, generate) are untouched.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add spawns_per_tick config field (default 4) to cap workers spawned per tick
- Rewrite daemon dequeue loop: drain up to spawns_per_tick per tick with 50-150ms jitter
- SIGHUP handler: live-reload max_workers/spawns_per_tick/claude_bin without restart
- Add `boi daemon reload` subcommand (sends SIGHUP to daemon.pid)
- Add try_load()/try_load_from() for fallible config parsing (bad config = no-op)
- docs/daemon.md: tick cadence, spawns_per_tick semantics, hot-reload behavior
- 14 new tests: daemon_batch (8) + daemon_hotreload (6), all passing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds automatic dependency analysis to boi dispatch flow so ordering
mistakes are caught mechanically before a spec queue goes wrong.

Core additions:
- src/cli/plan.rs: DAG model (collect_artifacts, detect_implicit_deps,
  topological sort, cycle detection) + LLM critique pass via claude-haiku.
  Critique is cached by hash(topology + titles) to avoid re-spending tokens.
- src/cli/dispatch_many.rs: boi dispatch-many — accepts N specs, runs plan
  first, gates dispatch on critique result (block=refuse, warn=prompt,
  clean=auto-approve). Dispatches in topological order with --after chain.
- src/cli/dispatch.rs: lightweight implicit-dep WARN on single dispatch when
  no --after is provided. Full LLM pass reserved for plan/dispatch-many.
- src/runtime/openrouter.rs: HTTP client for OpenRouter (haiku critique calls).
- src/failure.rs: structured failure-reason capture for runner diagnostics.
- src/cli/why.rs: boi why <spec-id> — explain last failure from DB.

Documentation:
- README.md: boi plan + dispatch-many sections with examples
- docs/dag-reassess.md: model explanation + command selection guide
- SKILL.md: updated CLI table

Tests: dag_build (empty/single/chain/fan-out/diamond/cycle/implicit-dep),
dispatch_many (right --after chain, cycle refusal, --force behaviour),
dispatch_dag_warn (warn-on-implicit-dep for single dispatch).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>