18 commits
2517366
docs(guide): add user guides with SDD workflow doc
cpplain Mar 28, 2026
a819b6e
docs(guide): add design specs writing guide
cpplain Mar 28, 2026
cfd05a8
docs(guide): simplify AGENTS.md to minimal instructions
cpplain Mar 28, 2026
870749c
docs(guide): restructure design specs from prescriptive to descriptive
cpplain Mar 28, 2026
cc801af
docs(guide): sharpen design specs guide for agent clarity
cpplain Mar 28, 2026
acb9043
docs(guide): add cross-boundary and side-effect examples to design sp…
cpplain Mar 28, 2026
b2eba10
docs(guide): trim design specs guide to reference weight
cpplain Mar 28, 2026
b1ed73a
docs(guide): add configuration guide and consolidate with prompts
cpplain Mar 29, 2026
40a8a76
docs: simplify AGENTS.md to defer to CLAUDE.md
cpplain Mar 31, 2026
df30faa
docs(guide): add prompt, task, and plan templates with router pattern
cpplain Mar 31, 2026
75c4834
docs(guide): fix router prompt and add blocked task handling
cpplain Mar 31, 2026
1df5d7c
docs(guide): fix cross-file inconsistencies
cpplain Mar 31, 2026
6c7e035
docs(guide): clarify ambiguities and missing context across guides
cpplain Apr 1, 2026
651fe78
docs(guide): add glossary location, task invariant, and plan constraints
cpplain Apr 1, 2026
7bb89f8
docs(guide): add plan termination step and fix inconsistencies
cpplain Apr 1, 2026
8bb3eb2
docs(guide): use consistent h1 prefix in workflow guide
cpplain Apr 1, 2026
2d8edb8
docs(guide): relabel workflow sections to distinguish setup from loop
cpplain Apr 1, 2026
fad7932
docs(examples): add latest lorah configuration example
cpplain Apr 4, 2026
53 changes: 1 addition & 52 deletions AGENTS.md
@@ -1,52 +1 @@
# Lorah

## Project Overview

Lorah is a simple infinite-loop harness for long-running autonomous coding agents. It runs Claude Code CLI in a continuous loop, parsing stream-JSON output and formatting it for readability. It includes a task management system for structured agent workflow coordination. The agent manages its own workflow — Lorah just provides the loop, error recovery, output formatting, and task tracking. Follows the Ralph pattern. Distributed as a single self-contained binary with no external runtime dependencies.

## Commands

```bash
make build # Build binary
go run . run PROMPT.md # Development run
lorah run PROMPT.md [flags...] # Run loop (all flags after prompt passed to claude CLI)
lorah task <subcommand> [args...] # Task management
```

Use TDD — write tests before implementation. Use `make fmt` and `make lint`.

## Architecture

```
main.go CLI router: subcommand dispatch, help text, version
internal/loop/
loop.go Run() entry point, signal handling, infinite loop
claude.go Subprocess execution
output.go Stream-JSON parsing and formatted output
constants.go ANSI colors, buffer size, retry delay
internal/task/
task.go Core types: Phase, Section, Task, TaskStatus, TaskList, Filter
storage.go Storage interface
json_storage.go JSONStorage implementation (.lorah/tasks.json)
format.go Output formatters: json, markdown
cmd.go CLI subcommand handlers
docs/design/ Design specifications (authoritative reference)
```

## Design Principles

**Ralph Philosophy**: The agent is smart enough to manage its own workflow. Don't orchestrate — provide a simple loop and trust the model.

**Radical Simplicity**: Every line of code is overhead. The simplest solution that works is the best solution. Prefer deleting code over adding it.

**Agent is in Control**: The harness provides the loop and nice output. The agent reads the codebase, decides what to do, and makes progress. No phase management needed.

**No Ceremony**: No config files, session state, lock files, or scaffolding commands. Just a prompt file and a loop.

**Filesystem as State**: No session files. Git commits show progress. Agent reads files to understand context.

**Design Specifications**: Authoritative design docs live in `docs/design/`. When in doubt about intended behavior, consult the specs: `cli.md`, `run.md`, `output.md`, `task.md`.

## Dependencies

No external runtime dependencies. All functionality uses the Go standard library. The `claude` CLI (separate install) is the only runtime requirement.
If you discover something non-obvious about this project, ask if it should be noted in CLAUDE.md.
13 changes: 13 additions & 0 deletions docs/guide/README.md
@@ -0,0 +1,13 @@
# Lorah Guides

Practical guides for using Lorah effectively. These cover _how to use_ Lorah — for _how Lorah is built_, see the [design specifications](../design/README.md).

## Index

| Guide | Description |
| ------------------------------------ | ---------------------------------------------------------- |
| [workflow.md](workflow.md) | An incremental spec-driven development workflow pattern |
| [design-specs.md](design-specs.md) | How to write design specs that agents can reliably execute |
| [configuration.md](configuration.md) | Setting up a .lorah project directory |
| [prompts.md](prompts.md) | Router and phase prompt templates for the agent loop |
| [tasks.md](tasks.md) | Task file format and status lifecycle |
104 changes: 104 additions & 0 deletions docs/guide/configuration.md
@@ -0,0 +1,104 @@
# Guide: Configuration

A `.lorah` directory contains the files that define a unit of work for the agent loop. It sits at the project root by default (overridable with Lorah's `--dir` flag). Commit `.lorah/` to git — the workflow depends on git history for state and continuity, and task files are committed as part of the loop.

```
.lorah/
├── plan.md # scope and acceptance criteria
├── prompt.md # orient + route to phase prompt
├── prompts/
│ ├── plan.md # task selection
│ ├── test.md # write tests for selected task
│ └── implement.md # make tests pass
├── tasks/
│ ├── 01-<task>.md # one file per task
│ └── ...
└── settings.json # Claude Code CLI settings
```

The `lorah` CLI wraps Claude Code to run the agent loop. It accepts a prompt file and forwards additional flags to `claude`.

## Plan file

The plan file is the output of the scoping step — [Setup](workflow.md#setup-scope-the-work) in the spec-driven workflow. It defines what is being built and what done looks like. The agent loop uses it as the contract between the human and the agents.

A plan file contains:

- **Scope** — what is being built, at the level of a brief description and a list of capabilities. Reference the design specs rather than duplicating them.
- **Boundaries** — constraints and invariants that apply across the work (e.g., "stdlib only", "no external dependencies").
- **Acceptance criteria** — concrete, verifiable conditions that define when the work is complete. An agent should be able to check each criterion against git state and test results.

A plan file does not contain individual tasks. Task selection happens inside the loop, where each agent picks the next task based on current state.

```markdown
# <Project/Feature Name>

## Scope

What is being built — brief description and list of capabilities.
Reference the design specs rather than duplicating them.

## Boundaries

- Constraints and invariants that apply across the work.

## Acceptance Criteria

- [ ] Concrete, verifiable conditions.
```

## Prompt files

The prompt structure splits into a router prompt and phase-specific prompts. This keeps each agent's context small and focused. See the [prompt files guide](prompts.md) for templates.
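A minimal router prompt might look like this (an illustrative sketch only; the file names match the layout above but the wording is an assumption, and the actual templates live in the prompt files guide):

```markdown
# Router

Read `.lorah/plan.md` and the newest file in `.lorah/tasks/`.

- If no task is in progress, follow `.lorah/prompts/plan.md`.
- If the current task has no tests yet, follow `.lorah/prompts/test.md`.
- Otherwise, follow `.lorah/prompts/implement.md`.
```

The router stays short on purpose: it only orients the agent and hands off, so each phase prompt carries the detailed instructions.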

## Task files

Each task gets its own file in `.lorah/tasks/`. The planning agent creates one task file per iteration; the testing and implementation agents update it as they work. See the [task files guide](tasks.md) for the template and status values.
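As a sketch, a task file can be as small as a name, a status, and a goal (field names here are illustrative; the authoritative template and status values are in the task files guide):

```markdown
# 01-parse-flags

Status: in-progress

## Goal

Parse CLI flags after the prompt file and forward them to `claude` unchanged.
```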

## Settings

`settings.json` is a standard Claude Code CLI settings file. Pass it via the `--settings` flag:

```sh
lorah run prompt.md --settings .lorah/settings.json
```

Common fields:

```json
{
"model": "sonnet",
"permissions": {
"defaultMode": "bypassPermissions"
},
"sandbox": {
"enabled": true,
"autoAllowBashIfSandboxed": true
},
"attribution": { "commit": "", "pr": "" }
}
```

- **model** — which Claude model to use.
- **permissions** — `bypassPermissions` is typical for autonomous loops where no human is approving each action.
- **sandbox** — enables sandboxed execution. `autoAllowBashIfSandboxed` avoids permission prompts for shell commands when sandboxing is on.
- **attribution** — text added to commit messages (as git trailers) and PR descriptions. Empty strings disable attribution; omitting the field uses Claude Code's defaults.

See the [Claude Code settings reference](https://code.claude.com/docs/en/settings) for all available settings.

## Claude flags

Additional Claude CLI flags can be passed after the prompt file:

```sh
lorah run prompt.md --settings .lorah/settings.json --model claude-opus-4-6 --max-turns 50
```

Flags are passed through to the `claude` CLI unchanged. Common flags:

- `--settings <file>` — path to settings file
- `--model <model>` — override the model (takes precedence over settings.json)
- `--max-turns <n>` — limit the number of agent turns per iteration
- `--allowedTools <tools>` — restrict which tools the agent can use

See the [Claude Code CLI reference](https://code.claude.com/docs/en/cli-reference) for all available flags.
136 changes: 136 additions & 0 deletions docs/guide/design-specs.md
@@ -0,0 +1,136 @@
# Guide: Writing Design Specs

A design spec is a behavioral contract — it defines what the system does, not how to code it or how to use it. Specs are the foundation of the [incremental spec-driven workflow](workflow.md) and the single source of truth for their domain. If the spec and the code disagree, one of them has a bug.

Specs are stable during execution — modifying a spec mid-loop invalidates the plan file, existing tests, and completed work derived from it. Changes happen between units of work, not during them.

Specs are not tutorials, READMEs, API reference docs, or implementation plans.

A spec emerges through iteration between an engineer and an agent. The engineer holds the intent; the agent drives the structure. The properties below guide each pass — they are not a checklist to complete once. How the engineer-agent pair reaches a spec that meets these properties is up to them.

## Spec Structure

A spec has three invariant parts — Overview, Examples, Related Specifications — and topic-specific behavioral sections in between. The middle sections diverge based on what the component does; their shape is dictated by the content, not a template.

```markdown
# <Title> Specification

---

## 1. Overview

### Purpose

What this component does and why it exists — one paragraph.

### Goals

- Bulleted list of what this spec defines.

### Non-Goals

- Bulleted list of active exclusions.

---

## 2–N. [Topic-specific sections]

The middle sections define the component's behavior. Their shape
depends on what the component does:

- If it has a user-facing interface (CLI, API), define it first —
commands, flags, endpoints, parameters.
- If it has distinct behavioral modes or lifecycle phases, give
each its own section.
- If it has data structures or storage, specify the schema.
- If it has internal rules or algorithms, describe them precisely
enough to test against.

Use tables, code blocks, and subsections as the content demands.

---

## N+1. Examples

5–15 concrete input/output examples. These become test cases.

---

## N+2. Related Specifications

- Links to specs that interact with this one.
```

**One spec per logical unit.** A logical unit is a behavioral domain that can be tested in isolation. Split specs by what a component _does_ — its observable behavior — not by file or package. If two behaviors can be tested without referencing each other, they belong in separate specs. If testing one requires understanding the other, they either belong together or need an explicit cross-reference.

**Cross-reference, do not duplicate.** When two specs interact, link between them. Duplicated content diverges over time, and agents cannot know which copy is authoritative.
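For example, a spec that consumes task data can link to the owning spec instead of restating it (the section anchor here is hypothetical):

```markdown
Task status values are defined in [task.md §4](task.md#4-status-lifecycle);
this spec does not restate them.
```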

**Self-contained sections.** An agent working on output formatting should be able to read the relevant section of the output spec without reading every section that precedes it. Each section should establish its own context.

## Properties of a Good Spec

These properties define what makes a spec effective. When properties conflict, prioritize testability and boundary-completeness over scannability.

### Behavioral, not implementational

Specs define what a component does as observed from the outside — its inputs, outputs, error cases, and side effects. They do not prescribe internal implementation. The exception is cross-boundary contracts: shared data formats, storage schemas, or contracts that multiple components depend on. A useful heuristic: if a test in a _different_ spec would assert on this detail, it is a cross-boundary contract and belongs in the spec.

- Cross-boundary: "Tasks are persisted as a `TaskList` JSON object in `tasks.json` with the schema defined in §3." — Multiple specs depend on this format.
- Not cross-boundary: "The router uses a switch statement to dispatch subcommands." — Only this spec's implementation cares.
- Borderline: "The `Storage` interface defines `Load`, `Save`, `Get`, `List`, `Create`, `Update`, `Delete`." — Specify it now if the intent is to stabilize it for multiple consumers; leave it as an implementation detail if the interface is still in flux.

### Prescriptive tone

Use present-tense declarative statements. "The CLI exits 1 on unknown command" — not "should exit" or "ideally exits."

### Testable

The difference between a testable spec and a vague one is concrete, observable values:

- Vague: "The program handles Ctrl+C gracefully."
Precise: "First SIGINT sets a stopping flag and lets the current iteration complete. Second SIGINT calls os.Exit(0) immediately."
- Vague: "Long tool inputs are truncated."
Precise: "Tool inputs with more than one line display the first line followed by `... +N lines` where N is the remaining line count."
- Vague: "The system creates a default file if none exists."
Precise: "If tasks.json does not exist on Load, return an empty TaskList with Version 1.0. Do not create the file on disk until the first Save."
- Vague: "The system writes results to an output file."
Precise: "On successful completion, the command writes the result JSON to `{dir}/output.json` with 0644 permissions. If the file exists, it is overwritten atomically via write-to-temp-then-rename. If the directory does not exist, the command returns an error — it does not create parent directories."

If a claim is hard to make precise, the behavior is underspecified — return to it and tighten it before moving on.

### Boundary-complete

Every input has defined behavior for empty, missing, and invalid cases. Every output has a defined format and error representation.
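One way to make boundary coverage concrete is a small table per input (the input and behaviors below are illustrative, not taken from any Lorah spec):

```markdown
| Prompt file state | Behavior                                    |
| ----------------- | ------------------------------------------- |
| Missing           | Exit 1 with `prompt file not found: <path>` |
| Empty             | Exit 1 with `prompt file is empty`          |
| Valid             | Contents passed to the agent unchanged      |
```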

### Explicitly scoped

Every spec needs Goals and Non-Goals. Non-Goals are active exclusions, not a "future work" list.

### Decision rationale

Record why, not just what. Rationale belongs with non-obvious constraints and rejected alternatives; self-evident decisions don't need it. Keep rationale inline, close to the decision it explains.

### Defined vocabulary

Define terms once in a central glossary — a shared file (e.g., `glossary.md`) or a section in the specs directory README — and use them consistently. Agents treat synonyms as distinct concepts.
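A glossary can be a single definition line per term, for example (the terms shown are illustrative):

```markdown
- **iteration**: one complete agent run inside the loop.
- **task**: a unit of work tracked in its own file under `.lorah/tasks/`.
- **phase**: one stage of an iteration, such as plan, test, or implement.
```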

### Scannable structure

Each spec uses numbered top-level sections with horizontal rule dividers. Use tables for reference data, code blocks for formats, and subsection headings for distinct behavioral areas.

## Readiness Checklist

A spec is ready for implementation when the answer to every question below is yes:

- [ ] Does the spec define observable behavior without prescribing internal implementation?
- [ ] Can you write a test assertion for every behavioral claim?
- [ ] Are all inputs covered for empty, missing, and invalid cases?
- [ ] Does every output have a defined format and error representation?
- [ ] Are cross-boundary contracts identified and specified?
- [ ] Do Non-Goals actively exclude the most likely scope creep?
- [ ] Do 5–15 concrete examples exist and were they easy to write?
- [ ] Does the spec use present-tense declarative statements without hedging?
- [ ] Do non-obvious decisions include inline rationale?
- [ ] Are terms defined in the glossary and used consistently?

If any criterion fails, the spec needs more work. This is expected — specs tighten through iteration.