18 commits
2517366
docs(guide): add user guides with SDD workflow doc
cpplain Mar 28, 2026
a819b6e
docs(guide): add design specs writing guide
cpplain Mar 28, 2026
cfd05a8
docs(guide): simplify AGENTS.md to minimal instructions
cpplain Mar 28, 2026
870749c
docs(guide): restructure design specs from prescriptive to descriptive
cpplain Mar 28, 2026
cc801af
docs(guide): sharpen design specs guide for agent clarity
cpplain Mar 28, 2026
acb9043
docs(guide): add cross-boundary and side-effect examples to design sp…
cpplain Mar 28, 2026
b2eba10
docs(guide): trim design specs guide to reference weight
cpplain Mar 28, 2026
b1ed73a
docs(guide): add configuration guide and consolidate with prompts
cpplain Mar 29, 2026
40a8a76
docs: simplify AGENTS.md to defer to CLAUDE.md
cpplain Mar 31, 2026
df30faa
docs(guide): add prompt, task, and plan templates with router pattern
cpplain Mar 31, 2026
75c4834
docs(guide): fix router prompt and add blocked task handling
cpplain Mar 31, 2026
1df5d7c
docs(guide): fix cross-file inconsistencies
cpplain Mar 31, 2026
6c7e035
docs(guide): clarify ambiguities and missing context across guides
cpplain Apr 1, 2026
651fe78
docs(guide): add glossary location, task invariant, and plan constraints
cpplain Apr 1, 2026
7bb89f8
docs(guide): add plan termination step and fix inconsistencies
cpplain Apr 1, 2026
8bb3eb2
docs(guide): use consistent h1 prefix in workflow guide
cpplain Apr 1, 2026
2d8edb8
docs(guide): relabel workflow sections to distinguish setup from loop
cpplain Apr 1, 2026
fad7932
docs(examples): add latest lorah configuration example
cpplain Apr 4, 2026
53 changes: 1 addition & 52 deletions AGENTS.md
@@ -1,52 +1 @@
# Lorah

## Project Overview

Lorah is a simple infinite-loop harness for long-running autonomous coding agents. It runs Claude Code CLI in a continuous loop, parsing stream-JSON output and formatting it for readability. It includes a task management system for structured agent workflow coordination. The agent manages its own workflow — Lorah just provides the loop, error recovery, output formatting, and task tracking. Follows the Ralph pattern. Distributed as a single self-contained binary with no external runtime dependencies.

## Commands

```bash
make build # Build binary
go run . run PROMPT.md # Development run
lorah run PROMPT.md [flags...] # Run loop (all flags after prompt passed to claude CLI)
lorah task <subcommand> [args...] # Task management
```

Use TDD — write tests before implementation. Use `make fmt` and `make lint`.

## Architecture

```
main.go CLI router: subcommand dispatch, help text, version
internal/loop/
loop.go Run() entry point, signal handling, infinite loop
claude.go Subprocess execution
output.go Stream-JSON parsing and formatted output
constants.go ANSI colors, buffer size, retry delay
internal/task/
task.go Core types: Phase, Section, Task, TaskStatus, TaskList, Filter
storage.go Storage interface
json_storage.go JSONStorage implementation (.lorah/tasks.json)
format.go Output formatters: json, markdown
cmd.go CLI subcommand handlers
docs/design/ Design specifications (authoritative reference)
```

## Design Principles

**Ralph Philosophy**: The agent is smart enough to manage its own workflow. Don't orchestrate — provide a simple loop and trust the model.

**Radical Simplicity**: Every line of code is overhead. The simplest solution that works is the best solution. Prefer deleting code over adding it.

**Agent is in Control**: The harness provides the loop and nice output. The agent reads the codebase, decides what to do, and makes progress. No phase management needed.

**No Ceremony**: No config files, session state, lock files, or scaffolding commands. Just a prompt file and a loop.

**Filesystem as State**: No session files. Git commits show progress. Agent reads files to understand context.

**Design Specifications**: Authoritative design docs live in `docs/design/`. When in doubt about intended behavior, consult the specs: `cli.md`, `run.md`, `output.md`, `task.md`.

## Dependencies

No external runtime dependencies. All functionality uses the Go standard library. The `claude` CLI (separate install) is the only runtime requirement.
If you discover something non-obvious about this project, ask if it should be noted in CLAUDE.md.
13 changes: 13 additions & 0 deletions docs/guide/README.md
@@ -0,0 +1,13 @@
# Lorah Guides

Practical guides for using Lorah effectively. These cover _how to use_ Lorah — for _how Lorah is built_, see the [design specifications](../design/README.md).

## Index

| Guide | Description |
| ------------------------------------ | ---------------------------------------------------------- |
| [workflow.md](workflow.md) | An incremental spec-driven development workflow pattern |
| [design-specs.md](design-specs.md) | How to write design specs that agents can reliably execute |
| [configuration.md](configuration.md) | Setting up a .lorah project directory |
| [prompts.md](prompts.md) | Router and phase prompt templates for the agent loop |
| [tasks.md](tasks.md) | Task file format and status lifecycle |
104 changes: 104 additions & 0 deletions docs/guide/configuration.md
@@ -0,0 +1,104 @@
# Guide: Configuration

A `.lorah` directory contains the files that define a unit of work for the agent loop. It sits at the project root by default (overridable with Lorah's `--dir` flag). Commit `.lorah/` to git — the workflow depends on git history for state and continuity, and task files are committed as part of the loop.

```
.lorah/
├── plan.md # scope and acceptance criteria
├── prompt.md # orient + route to phase prompt
├── prompts/
│ ├── plan.md # task selection
│ ├── test.md # write tests for selected task
│ └── implement.md # make tests pass
├── tasks/
│ ├── 01-<task>.md # one file per task
│ └── ...
└── settings.json # Claude Code CLI settings
```

The `lorah` CLI wraps Claude Code to run the agent loop. It accepts a prompt file and forwards additional flags to `claude`.

## Plan file

The plan file is the output of the scoping step — [Setup](workflow.md#setup-scope-the-work) in the spec-driven workflow. It defines what is being built and what done looks like. The agent loop uses it as the contract between the human and the agents.

A plan file contains:

- **Scope** — what is being built, at the level of a brief description and a list of capabilities. Reference the design specs rather than duplicating them.
- **Boundaries** — constraints and invariants that apply across the work (e.g., "stdlib only", "no external dependencies").
- **Acceptance criteria** — concrete, verifiable conditions that define when the work is complete. An agent should be able to check each criterion against git state and test results.

A plan file does not contain individual tasks. Task selection happens inside the loop, where each agent picks the next task based on current state.

```markdown
# <Project/Feature Name>

## Scope

What is being built — brief description and list of capabilities.
Reference the design specs rather than duplicating them.

## Boundaries

- Constraints and invariants that apply across the work.

## Acceptance Criteria

- [ ] Concrete, verifiable conditions.
```

## Prompt files

The prompt structure splits into a router prompt and phase-specific prompts. This keeps each agent's context small and focused. See the [prompt files guide](prompts.md) for templates.
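A minimal router prompt might look like this (an illustrative sketch only; the file names match the layout above but the wording is an assumption, and the actual templates live in the prompt files guide):

```markdown
# Router

Read `.lorah/plan.md` and the newest file in `.lorah/tasks/`.

- If no task is in progress, follow `.lorah/prompts/plan.md`.
- If the current task has no tests yet, follow `.lorah/prompts/test.md`.
- Otherwise, follow `.lorah/prompts/implement.md`.
```

The router stays short on purpose: it only orients the agent and hands off, so each phase prompt carries the detailed instructions.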

## Task files

Each task gets its own file in `.lorah/tasks/`. The planning agent creates one task file per iteration; the testing and implementation agents update it as they work. See the [task files guide](tasks.md) for the template and status values.
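As a sketch, a task file can be as small as a name, a status, and a goal (field names here are illustrative; the authoritative template and status values are in the task files guide):

```markdown
# 01-parse-flags

Status: in-progress

## Goal

Parse CLI flags after the prompt file and forward them to `claude` unchanged.
```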

## Settings

`settings.json` is a standard Claude Code CLI settings file. Pass it via the `--settings` flag:

```sh
lorah run prompt.md --settings .lorah/settings.json
```

Common fields:

```json
{
"model": "sonnet",
"permissions": {
"defaultMode": "bypassPermissions"
},
"sandbox": {
"enabled": true,
"autoAllowBashIfSandboxed": true
},
"attribution": { "commit": "", "pr": "" }
}
```

- **model** — which Claude model to use.
- **permissions** — `bypassPermissions` is typical for autonomous loops where no human is approving each action.
- **sandbox** — enables sandboxed execution. `autoAllowBashIfSandboxed` avoids permission prompts for shell commands when sandboxing is on.
- **attribution** — text added to commit messages (as git trailers) and PR descriptions. Empty strings disable attribution; omitting the field uses Claude Code's defaults.

See the [Claude Code settings reference](https://code.claude.com/docs/en/settings) for all available settings.

## Claude flags

Additional Claude CLI flags can be passed after the prompt file:

```sh
lorah run prompt.md --settings .lorah/settings.json --model claude-opus-4-6 --max-turns 50
```

Flags are passed through to the `claude` CLI unchanged. Common flags:

- `--settings <file>` — path to settings file
- `--model <model>` — override the model (takes precedence over settings.json)
- `--max-turns <n>` — limit the number of agent turns per iteration
- `--allowedTools <tools>` — restrict which tools the agent can use

See the [Claude Code CLI reference](https://code.claude.com/docs/en/cli-reference) for all available flags.
136 changes: 136 additions & 0 deletions docs/guide/design-specs.md
@@ -0,0 +1,136 @@
# Guide: Writing Design Specs

A design spec is a behavioral contract — it defines what the system does, not how to code it or how to use it. Specs are the foundation of the [incremental spec-driven workflow](workflow.md) and the single source of truth for their domain. If the spec and the code disagree, one of them has a bug.

Specs are stable during execution — modifying a spec mid-loop invalidates the plan file, existing tests, and completed work derived from it. Changes happen between units of work, not during them.

Specs are not tutorials, READMEs, API reference docs, or implementation plans.

A spec emerges through iteration between an engineer and an agent. The engineer holds the intent; the agent drives the structure. The properties below guide each pass — they are not a checklist to complete once. How the engineer-agent pair reaches a spec that meets these properties is up to them.

## Spec Structure

A spec has three invariant parts — Overview, Examples, Related Specifications — and topic-specific behavioral sections in between. The middle sections diverge based on what the component does; their shape is dictated by the content, not a template.

```markdown
# <Title> Specification

---

## 1. Overview

### Purpose

What this component does and why it exists — one paragraph.

### Goals

- Bulleted list of what this spec defines.

### Non-Goals

- Bulleted list of active exclusions.

---

## 2–N. [Topic-specific sections]

The middle sections define the component's behavior. Their shape
depends on what the component does:

- If it has a user-facing interface (CLI, API), define it first —
commands, flags, endpoints, parameters.
- If it has distinct behavioral modes or lifecycle phases, give
each its own section.
- If it has data structures or storage, specify the schema.
- If it has internal rules or algorithms, describe them precisely
enough to test against.

Use tables, code blocks, and subsections as the content demands.

---

## N+1. Examples

5–15 concrete input/output examples. These become test cases.

---

## N+2. Related Specifications

- Links to specs that interact with this one.
```

**One spec per logical unit.** A logical unit is a behavioral domain that can be tested in isolation. Split specs by what a component _does_ — its observable behavior — not by file or package. If two behaviors can be tested without referencing each other, they belong in separate specs. If testing one requires understanding the other, they either belong together or need an explicit cross-reference.

**Cross-reference, do not duplicate.** When two specs interact, link between them. Duplicated content diverges over time, and agents cannot know which copy is authoritative.
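For example, a spec that consumes task data can link to the owning spec instead of restating it (the section anchor here is hypothetical):

```markdown
Task status values are defined in [task.md §4](task.md#4-status-lifecycle);
this spec does not restate them.
```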

**Self-contained sections.** An agent working on output formatting should be able to read the relevant section of the output spec without reading every section that precedes it. Each section should establish its own context.

## Properties of a Good Spec

These properties define what makes a spec effective. When properties conflict, prioritize testability and boundary-completeness over scannability.

### Behavioral, not implementational

Specs define what a component does as observed from the outside — its inputs, outputs, error cases, and side effects. They do not prescribe internal implementation. The exception is cross-boundary contracts: shared data formats, storage schemas, or contracts that multiple components depend on. A useful heuristic: if a test in a _different_ spec would assert on this detail, it is a cross-boundary contract and belongs in the spec.

- Cross-boundary: "Tasks are persisted as a `TaskList` JSON object in `tasks.json` with the schema defined in §3." — Multiple specs depend on this format.
- Not cross-boundary: "The router uses a switch statement to dispatch subcommands." — Only this spec's implementation cares.
- Borderline: "The `Storage` interface defines `Load`, `Save`, `Get`, `List`, `Create`, `Update`, `Delete`." — Specify it now if the intent is to stabilize it for multiple consumers; leave it as an implementation detail if the interface is still in flux.

### Prescriptive tone

Use present-tense declarative statements. "The CLI exits 1 on unknown command" — not "should exit" or "ideally exits."

### Testable

The difference between a testable spec and a vague one is concrete, observable values:

- Vague: "The program handles Ctrl+C gracefully."
Precise: "First SIGINT sets a stopping flag and lets the current iteration complete. Second SIGINT calls os.Exit(0) immediately."
- Vague: "Long tool inputs are truncated."
Precise: "Tool inputs with more than one line display the first line followed by `... +N lines` where N is the remaining line count."
- Vague: "The system creates a default file if none exists."
Precise: "If tasks.json does not exist on Load, return an empty TaskList with Version 1.0. Do not create the file on disk until the first Save."
- Vague: "The system writes results to an output file."
Precise: "On successful completion, the command writes the result JSON to `{dir}/output.json` with 0644 permissions. If the file exists, it is overwritten atomically via write-to-temp-then-rename. If the directory does not exist, the command returns an error — it does not create parent directories."

If a claim is hard to make precise, the behavior is underspecified — return to it and tighten it before moving on.

### Boundary-complete

Every input has defined behavior for empty, missing, and invalid cases. Every output has a defined format and error representation.
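One way to make boundary coverage concrete is a small table per input (the input and behaviors below are illustrative, not taken from any Lorah spec):

```markdown
| Prompt file state | Behavior                                    |
| ----------------- | ------------------------------------------- |
| Missing           | Exit 1 with `prompt file not found: <path>` |
| Empty             | Exit 1 with `prompt file is empty`          |
| Valid             | Contents passed to the agent unchanged      |
```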

### Explicitly scoped

Every spec needs Goals and Non-Goals. Non-Goals are active exclusions, not a "future work" list.

### Decision rationale

Record why, not just what. Rationale belongs with non-obvious constraints and rejected alternatives; self-evident decisions don't need it. Keep rationale inline, close to the decision it explains.

### Defined vocabulary

Define terms once in a central glossary — a shared file (e.g., `glossary.md`) or a section in the specs directory README — and use them consistently. Agents treat synonyms as distinct concepts.
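A glossary can be a single definition line per term, for example (the terms shown are illustrative):

```markdown
- **iteration**: one complete agent run inside the loop.
- **task**: a unit of work tracked in its own file under `.lorah/tasks/`.
- **phase**: one stage of an iteration, such as plan, test, or implement.
```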

### Scannable structure

Each spec uses numbered top-level sections with horizontal rule dividers. Use tables for reference data, code blocks for formats, and subsection headings for distinct behavioral areas.

## Readiness Checklist

A spec is ready for implementation when the answer to every question below is yes:

- [ ] Does the spec define observable behavior without prescribing internal implementation?
- [ ] Can you write a test assertion for every behavioral claim?
- [ ] Are all inputs covered for empty, missing, and invalid cases?
- [ ] Does every output have a defined format and error representation?
- [ ] Are cross-boundary contracts identified and specified?
- [ ] Do Non-Goals actively exclude the most likely scope creep?
- [ ] Do 5–15 concrete examples exist and were they easy to write?
- [ ] Does the spec use present-tense declarative statements without hedging?
- [ ] Do non-obvious decisions include inline rationale?
- [ ] Are terms defined in the glossary and used consistently?

If any criterion fails, the spec needs more work. This is expected — specs tighten through iteration.