11 changes: 11 additions & 0 deletions .gitattributes
@@ -0,0 +1,11 @@
# The pre-commit hook stack enforces LF line endings. Keep checkout behavior
# aligned across Windows, macOS, and Linux so `pre-commit run --all-files` does
# not rewrite the working tree on Windows clones with global autocrlf enabled.
* text=auto eol=lf

*.png binary
*.jpg binary
*.jpeg binary
*.gif binary
*.ico binary
*.pdf binary
9 changes: 9 additions & 0 deletions .github/pull_request_template.md
@@ -40,6 +40,15 @@
<!-- Required for UI change. Delete this section for non-UI PRs. -->


<!--
## Local Beads

Optional opt-in. Only add this section if your team uses a local Beads queue
(see docs/BEADS.md). Uncomment the heading and replace this comment with the
Bead id. GitHub issue linkage below is still required either way.
-->


## Linked issue

Closes #
3 changes: 2 additions & 1 deletion .github/scripts/check_aspirational_tickets.py
@@ -57,6 +57,7 @@
from pathlib import Path

INVARIANTS_DOC = Path("docs/INVARIANTS.md")
GITHUB_API_ERRORS = (urllib.error.URLError, TimeoutError, json.JSONDecodeError)

# A marker line *starts* with one or two asterisks immediately followed by
# `Aspirational` and a word boundary. Avoids picking up mid-sentence prose
@@ -88,7 +89,7 @@ def _issue_state(repo: str, number: str, token: str) -> str | None:
try:
with urllib.request.urlopen(req, timeout=5) as response: # noqa: S310
payload = json.loads(response.read().decode("utf-8"))
- except urllib.error.URLError, TimeoutError, json.JSONDecodeError:
+ except GITHUB_API_ERRORS:
return None
state = payload.get("state")
return state if isinstance(state, str) else None
3 changes: 2 additions & 1 deletion .github/scripts/check_pin_freshness.py
@@ -84,6 +84,7 @@ def _load_pin_module() -> ModuleType:

_pins = _load_pin_module()
_API_BASE = "https://api.github.com"
GITHUB_API_ERRORS = (urllib.error.URLError, TimeoutError, json.JSONDecodeError)


def _fetch_json(url: str, token: str) -> dict[str, object] | None:
@@ -104,7 +105,7 @@ def _fetch_json(url: str, token: str) -> dict[str, object] | None:
try:
with urllib.request.urlopen(req, timeout=10) as response: # noqa: S310
payload = json.loads(response.read().decode("utf-8"))
- except urllib.error.URLError, TimeoutError, json.JSONDecodeError:
+ except GITHUB_API_ERRORS:
return None
return payload if isinstance(payload, dict) else None

4 changes: 3 additions & 1 deletion .github/scripts/check_tests_present.py
@@ -43,6 +43,8 @@
import sys
from pathlib import Path

EVENT_READ_ERRORS = (OSError, json.JSONDecodeError)

# Prefixes that declare a behaviour change → tests required.
BLOCKING_PREFIXES: frozenset[str] = frozenset({"feat", "fix"})

@@ -59,7 +61,7 @@ def pr_title_from_event() -> str | None:
return None
try:
data = json.loads(Path(event_path).read_text(encoding="utf-8"))
- except OSError, json.JSONDecodeError:
+ except EVENT_READ_ERRORS:
return None
pr = data.get("pull_request")
if not isinstance(pr, dict):
3 changes: 2 additions & 1 deletion .github/scripts/check_version_bump.py
@@ -39,6 +39,7 @@
PYPROJECT = Path("pyproject.toml")
UV_LOCK = Path("uv.lock")
PACKAGE_NAME = "harness-python-react"
EVENT_READ_ERRORS = (OSError, json.JSONDecodeError)

# Match the project's self-version block in uv.lock:
#
@@ -105,7 +106,7 @@ def pr_title_from_event() -> str | None:
return None
try:
data = json.loads(Path(event_path).read_text(encoding="utf-8"))
- except OSError, json.JSONDecodeError:
+ except EVENT_READ_ERRORS:
return None
pr = data.get("pull_request")
if not isinstance(pr, dict):
4 changes: 4 additions & 0 deletions .gitignore
@@ -3,6 +3,10 @@
.claude/bash-log.txt
.claude/worktrees/

# Optional local Beads queue state
.beads/
beads/

# Node / Frontend
node_modules/
frontend/dist/
22 changes: 19 additions & 3 deletions CONTRIBUTING.md
@@ -34,14 +34,15 @@ The subject is **lowercase** after the colon. Title Case prose (`Add the thing`)

1. Open the issue first. Use a feature/bug template; fill every section.
2. Branch off `develop` with the matching name.
- 3. Land one logical change per PR. Stack PRs if the work is naturally split.
- 4. The PR template asks five things — answer each (`None` is valid where applicable):
+ 3. If your team uses Beads, mirror or claim the linked issue in the local Beads queue after the issue exists. Beads track local ready/blocked execution only; GitHub Issues remain canonical for scope, discussion, PR linkage, and closure.
+ 4. Land one logical change per PR. Stack PRs if the work is naturally split.
+ 5. The PR template asks five things — answer each (`None` is valid where applicable):
- **What & why** (1–3 lines)
- **Test plan** (checkbox list; CI covers most of it)
- **Invariants affected** — cite numbered rules from `docs/INVARIANTS.md`
- **New deps / actions / external surface** (anchor for supply-chain review)
- **Screenshots** (UI changes only)
- 5. Wait for green CI + a code-owner review before merging.
+ 6. Wait for green CI + a code-owner review before merging.

### Solo-owner merge policy

@@ -55,6 +56,21 @@ gh pr merge <N> --admin --squash --delete-branch

When a second collaborator joins, drop the `--admin` flag and adopt standard PR review. Update this section + `CODEOWNERS` in the same PR.

## Line endings (Windows clones)

This repo enforces LF line endings via `.gitattributes` (`* text=auto eol=lf`)
and the pre-commit hygiene hook. If you cloned on Windows with
`core.autocrlf=true`, the first checkout after pulling the `.gitattributes`
change can leave the working tree out of sync with the index. Renormalise
once:

```sh
git add --renormalize .
git commit -m "chore: renormalise line endings"
```

After that, day-to-day work is unaffected.
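
To see which files still carry CRLF endings before or after renormalising, a check along these lines may help. It is a minimal sketch and not part of the repo's hook stack: `files_with_crlf` is a hypothetical helper, and it only inspects raw bytes without consulting `.gitattributes`, so a binary file that happens to contain the byte pair would also be reported.

```python
from pathlib import Path


def files_with_crlf(paths: list[str]) -> list[str]:
    """Return the subset of `paths` whose contents include a CRLF byte pair."""
    offenders = []
    for name in paths:
        # Read bytes, not text, so Python's newline translation cannot hide CRLF.
        if b"\r\n" in Path(name).read_bytes():
            offenders.append(name)
    return offenders
```

One way to use it is to pass the text files listed by `git ls-files`; an empty result suggests the working tree already matches the LF policy.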

## Local pre-push gate

```sh
4 changes: 3 additions & 1 deletion README.md
@@ -15,9 +15,10 @@
- **Backend:** Python 3.14, FastAPI, Pydantic v2 (`StrictModel` base), `uv` deps, OpenTelemetry SDK + OTLP exporter, structured JSON logs, generic tool-registry pattern.
- **Frontend:** Node 24 LTS, React 19.2, Vite 8, TypeScript strict, ESLint 10 flat config, Prettier, Vitest + jsdom + Testing Library.
- **Eval harness:** provider-agnostic runner + LLM-judge `Protocol`, three tolerance modes (exact / numeric / semantic), one example golden case, nightly workflow (disabled by default).
- - **CI:** 15 required status checks across `ci.yml` (lint/format, mypy strict, unit tests, coverage ≥75%, import-linter architecture, pre-commit, frontend build, frontend quality, branch-protection sync, commit-type sync) + `security.yml` (gitleaks, pip-audit, npm audit, trivy) + PR-title lint.
+ - **CI:** 21 required status checks across `ci.yml` (lint/format, mypy strict, unit tests, coverage, import-linter architecture, pre-commit, frontend build, frontend quality, branch-protection sync, commit-type sync, version/action/tests/docs audits) + `security.yml` (gitleaks, pip-audit, npm audit, trivy) + PR-title lint.
- **Release:** tag-triggered workflow that builds the image, pushes to `ghcr.io`, generates a CycloneDX SBOM, and publishes the GitHub Release.
- **Agent integration:** `.claude/hooks/` (forbidden-flag blocker, secret scan, formatter dispatch, SessionStart context) + six auto-activating skills (architect / code-reviewer / devops / frontend / qa-engineer / technical-writer).
- **Issue execution:** GitHub Issues remain the external source of truth; optional Beads guidance adds a local dependency-aware execution queue without changing issue closure authority.
- **Docker:** multi-stage Dockerfile (non-root, healthcheck), `docker compose up` boots app + frontend + Jaeger.

## Quickstart
@@ -114,6 +115,7 @@ See [`docs/HARNESS.md`](docs/HARNESS.md) for the full umbrella. Highlights:
| [`docs/BOUNDARIES.md`](docs/BOUNDARIES.md) | Module layering + the import-linter contracts |
| [`docs/DEVELOPMENT.md`](docs/DEVELOPMENT.md) | Local setup, branching, justfile, CI |
| [`docs/EVAL_HARNESS.md`](docs/EVAL_HARNESS.md) | Eval flywheel + opt-in for the nightly workflow |
| [`docs/BEADS.md`](docs/BEADS.md) | Optional local Beads queue layered under GitHub Issues |
| [`docs/SECURITY.md`](docs/SECURITY.md) | Threat model + defence-in-depth map |
| [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md) | Scaffold-level component view |
| [`CONTRIBUTING.md`](CONTRIBUTING.md) | Branching, commit format, PR flow |
149 changes: 149 additions & 0 deletions docs/BEADS.md
@@ -0,0 +1,149 @@
# Optional Beads execution queue

[Beads](https://github.com/steveyegge/beads) is an open-source
dependency-aware issue tracker designed for AI coding agents — it gives an
agent a local ready/blocked view of work, a dependency graph, and restart-safe
task claims that GitHub Issues alone do not.

This document is **optional and additive**. The base harness does not assume
Beads; if your team has no agent or multi-actor execution concern, GitHub
Issues plus the PR template is sufficient and you can skip this doc entirely.
Beads is recommended specifically when you are coordinating an LLM agent (or
several) against this repo and want dependency planning the public issue
tracker does not provide. The README and `docs/HARNESS.md` references describe
Beads as optional infrastructure, not part of the standard contributor flow.

Wherever Beads is used, GitHub Issues remain the external source of truth and
the authority for issue closure.

## Review of existing GitHub issue guidance

The current harness already treats GitHub as the public planning and merge
record:

- `.github/ISSUE_TEMPLATE/bug.md`, `feature.md`, and `eval-regression.md`
define the supported intake paths, and blank issues are disabled in
`.github/ISSUE_TEMPLATE/config.yml`.
- `CONTRIBUTING.md` requires one issue per branch, short-lived branches named
`<type>/<issue-number>-<kebab-title>`, and green CI plus review before merge.
- `.github/pull_request_template.md` requires What & why, Test plan,
Invariants affected, supply-chain surface, Screenshots when relevant, and a
linked issue.
- `CLAUDE.md` and `docs/DEVELOPMENT.md` describe the same one-issue,
one-branch, `develop` to `main` release flow for agent and human operators.
- `docs/TASKS.md` is a project-local planning map cross-referenced with GitHub
issues and the project board.

There is no Beads-specific policy in the base harness today. Any Beads addition
must therefore be additive and must not make GitHub issue state ambiguous.

## GitHub Issues vs Beads

| System | Owns | Does not own |
|---|---|---|
| GitHub Issues | Public backlog, user-facing requirements, labels, project board state, discussion, acceptance criteria, links from PRs, and final issue closure. | Local agent claims, transient execution notes, or dependency scheduling that would be noisy in the public issue. |
| Beads | Local execution queue, ready/blocked views, dependency graph, implementation notes, reviewer handoff notes, and restart-safe task claims. | The canonical requirement, public status, release notes, or authority to close a GitHub issue. |

The rule is simple: **GitHub answers what work exists and whether it is
externally done; Beads answers what the local execution system should pick up
next.**

## Sync contract

When using Beads with this harness:

1. Create or confirm the GitHub issue first.
2. Mirror the issue into Beads with an immutable external reference:
- GitHub repository owner/name.
- GitHub issue number.
- GitHub issue URL.
- Original issue title.
3. Use Beads for local status only: `ready`, `in_progress`, `blocked`,
`review`, or `done` are execution states, not replacements for the GitHub
issue state.
4. Put the Bead id in local notes, branch notes, or PR body when useful, but
keep `Closes #<issue>` pointing at the GitHub issue.
5. Do not close a GitHub issue because a Bead is marked done. Close only after
the PR is merged, required checks are green, any required manual or browser
validation is recorded, and a human-readable note has been added to the
issue or PR.

If the GitHub issue changes after import, update the Bead from GitHub before
continuing. GitHub wins on scope, acceptance criteria, and user-visible status.
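
The contract above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Beads CLI or its data model: `ExternalRef`, `BeadMirror`, `mirror_issue`, and `set_status` are hypothetical names. The sketch freezes the GitHub reference at import time and restricts local status to the five execution states listed in step 3.

```python
from dataclasses import dataclass, replace

# Local execution states named in the sync contract (step 3).
LOCAL_STATES = frozenset({"ready", "in_progress", "blocked", "review", "done"})


@dataclass(frozen=True)
class ExternalRef:
    """Immutable pointer back to the canonical GitHub issue (hypothetical type)."""

    repo: str    # "owner/name"
    number: int  # GitHub issue number
    url: str     # GitHub issue URL
    title: str   # issue title as it was at import time


@dataclass(frozen=True)
class BeadMirror:
    """Local mirror of one GitHub issue (hypothetical type)."""

    ref: ExternalRef
    status: str = "ready"


def mirror_issue(repo: str, number: int, title: str) -> BeadMirror:
    """Create a local mirror for an issue that already exists on GitHub."""
    url = f"https://github.com/{repo}/issues/{number}"
    return BeadMirror(ref=ExternalRef(repo, number, url, title))


def set_status(bead: BeadMirror, status: str) -> BeadMirror:
    """Return a copy with a new local status; the external reference is untouched."""
    if status not in LOCAL_STATES:
        raise ValueError(f"unknown local state: {status!r}")
    return replace(bead, status=status)
```

Because both dataclasses are frozen, a status change produces a new object and the external reference cannot be reassigned by accident. Nothing in the sketch touches GitHub issue state, which matches step 5.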

## Recommended Bead fields

A Bead should carry enough information for a new agent or contributor to resume
without reopening every browser tab:

| Field | Purpose |
|---|---|
| `external_ref` | GitHub issue URL, for example `https://github.com/owner/repo/issues/123`. |
| `github_issue` | Numeric issue id used by branches and PRs. |
| `acceptance` | The current acceptance criteria copied or summarized from GitHub. |
| `dependencies` | Other Beads or GitHub issues that must land first. |
| `status` | Local execution state. |
| `owner` | Optional local agent or human claim. |
| `evidence` | Paths or URLs for test output, review notes, screenshots, or deploy checks. |
| `closeout` | Merge SHA, PR URL, and verification notes once complete. |

A short YAML example:

```yaml
external_ref: https://github.com/owner/repo/issues/123
github_issue: 123
acceptance: |
/api/v1/echo rejects payloads >1KiB with HTTP 413.
dependencies: [122] # other Bead ids or GitHub issues
status: ready
owner: agent-a
evidence:
- tests/test_api.py::test_echo_size_cap
closeout: null
```

Avoid storing secrets, tokens, credentials, private customer data, or raw
production payloads in Beads. Treat Beads data as local operational metadata.
Note that `.beads/` is gitignored, so anything Beads stores locally — including
agent-action audit logs — is wiped by `git clean -fdx`; commit deliberate
summaries to the repo if you need them to survive workspace resets.
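
If Bead records are loaded into plain dictionaries, a small completeness check can flag records that lack the fields in the table above. This is a sketch under that assumption: `RECOMMENDED_FIELDS` and `missing_fields` are hypothetical names, and Beads itself may store and name these values differently.

```python
# Field names copied from the "Recommended Bead fields" table.
RECOMMENDED_FIELDS = (
    "external_ref",
    "github_issue",
    "acceptance",
    "dependencies",
    "status",
    "owner",
    "evidence",
    "closeout",
)


def missing_fields(bead: dict[str, object]) -> list[str]:
    """Return the recommended field names absent from `bead`, in table order."""
    return [name for name in RECOMMENDED_FIELDS if name not in bead]


# The YAML example from this section, expressed as a Python dict.
example = {
    "external_ref": "https://github.com/owner/repo/issues/123",
    "github_issue": 123,
    "acceptance": "/api/v1/echo rejects payloads >1KiB with HTTP 413.",
    "dependencies": [122],
    "status": "ready",
    "owner": "agent-a",
    "evidence": ["tests/test_api.py::test_echo_size_cap"],
    "closeout": None,
}
```

The check tests for key presence only, so `closeout: null` still counts as present. Since `owner` is described as optional, a caller may prefer to drop it from the tuple.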

## PR discipline when Beads are used

The existing PR template still applies. Add Beads information without deleting
any required section:

- `Linked issue` remains `Closes #<issue>`.
- Mention the Bead id or local queue reference under `What & why` or the
optional Beads section.
- Include Beads-derived evidence paths in `Test plan` only when they are useful
to a reviewer.
- If the Bead changed scope, update the GitHub issue before asking for review.
- If the Bead was blocked by an external dependency, note that in the PR or
issue rather than hiding it in the local queue.

## Local artifact hygiene

Beads state is usually local execution metadata. Do not commit raw Beads
databases, scratch exports, or agent logs by default. Commit only intentional
summaries or docs that reviewers need.

If a downstream project decides to version Beads state, document that policy in
that project and make sure secret scanning, review, and retention expectations
are explicit.

## Closure checklist

The PR-merge and issue-closure gates already live in
`.github/pull_request_template.md` and `CONTRIBUTING.md` — don't duplicate them
here. The Bead-specific closure rule is narrower:

- Do not mark a Bead done until the GitHub issue's closure conditions (per the
PR template and `CONTRIBUTING.md`) are met. Beads track the local execution
state of work GitHub already authorised; they don't grant new closure
authority.
- If the Bead and the GitHub issue disagree on scope, acceptance, or status,
stop and reconcile against GitHub before continuing.
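
The two rules above can be expressed as a pair of small predicates. This is a sketch, not repo code: `ClosureEvidence`, `may_mark_bead_done`, and `fields_to_reconcile` are hypothetical names, the four flags restate the closure conditions from step 5 of the sync contract, and the comparison assumes both sides have already been normalised to the same vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClosureEvidence:
    """Closure conditions taken from the sync contract (hypothetical type)."""

    pr_merged: bool
    required_checks_green: bool
    validation_recorded: bool  # True when recorded, or when none was required
    note_added: bool           # human-readable note on the issue or PR


def may_mark_bead_done(evidence: ClosureEvidence) -> bool:
    """A Bead may be marked done only when every closure condition holds."""
    return all(
        (
            evidence.pr_merged,
            evidence.required_checks_green,
            evidence.validation_recorded,
            evidence.note_added,
        )
    )


def fields_to_reconcile(bead: dict[str, str], issue: dict[str, str]) -> list[str]:
    """List the fields on which the Bead and the GitHub issue disagree."""
    keys = ("scope", "acceptance", "status")
    return [key for key in keys if bead.get(key) != issue.get(key)]
```

A non-empty result from `fields_to_reconcile` means stop and update the Bead from GitHub before continuing; the sketch never writes to the GitHub side.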

Beads improve local throughput only if they reduce ambiguity. If a Bead and a
GitHub issue disagree, the GitHub issue wins.
1 change: 1 addition & 0 deletions docs/DEVELOPMENT.md
@@ -73,6 +73,7 @@ Every recipe uses `uv run --frozen` — bare `uv run` silently re-resolves when
- `main` is protected: every required CI context must pass + 1 review + commit-type sync + branch-protection sync.
- `develop` is the integration branch; same gates as `main` minus a strictness flag (`strict: false` so PRs don't need rebases).
- Feature branches are short-lived and named `<type>/<issue-number>-<kebab-title>`.
- Optional Beads queues can mirror GitHub issues for local execution, but GitHub remains the source of truth for requirements, PR linkage, and closure. See `docs/BEADS.md`.

## Commit messages

8 changes: 5 additions & 3 deletions docs/HARNESS.md
@@ -13,14 +13,15 @@ The "harness" is the set of mechanical controls that make LLM-driven coding prod
| **Tests** | Behaviour | `pytest tests/`, `pytest eval/`, `vitest` |
| **Coverage** | ≥ 75% on `src/` | `pyproject.toml` `[tool.coverage.report]` |
| **Pre-commit** | Local-first defence | `.pre-commit-config.yaml` (ruff, gitleaks, commitizen, mypy, hygiene) |
- | **CI** | Non-bypassable | `.github/workflows/ci.yml` (15 contexts) + `security.yml` + `pr-title.yml` + `release.yml` + `release-drafter.yml` |
+ | **CI** | Non-bypassable | `.github/workflows/ci.yml` + `security.yml` + `pr-title.yml` (21 required contexts) plus release and maintenance workflows |
| **Branch protection** | Declarative, drift-checked | `.github/branch-protection/{develop,main}.json` + `branch-protection.yml` apply workflow + `check_required_contexts.py` meta-gate |
| **Commit format** | Seven prefixes only | `[tool.commitizen]` schema + `pr-title.yml` allowlist + `check_commit_types.py` meta-gate |
| **Secret scan** | Three checkpoints | local hook → pre-commit → `security.yml` gitleaks |
| **Container scan** | HIGH/CRITICAL CVEs block | `security.yml` trivy-action |
| **Dep scan** | Pinned + audited | pip-audit, npm audit |
| **Release** | Reproducible artefacts | `release.yml` (image push to GHCR + CycloneDX SBOM) |
| **Eval** | LLM-output regressions | `src/eval/`, `eval/`, `eval-nightly.yml` (workflow_dispatch by default) |
| **Issue execution** | GitHub stays canonical; Beads can drive local ready/blocked work | GitHub issue templates + PR template + optional `docs/BEADS.md` queue guidance |
| **Agent hooks** | LLM coder side enforcement | `.claude/hooks/{pretooluse_bash, posttooluse_writeedit, sessionstart}.py` + `settings.local.json.example` |
| **Skills** | Auto-activated agent guidance | `.claude/skills/{architect, code-reviewer, devops, frontend, qa-engineer, technical-writer}` |

@@ -40,5 +41,6 @@ For an engineer setting up the template:
2. **`docs/BOUNDARIES.md`** — module layering and the import-linter contracts.
3. **`docs/DEVELOPMENT.md`** — local setup, the `justfile`, the CI pipeline.
4. **`docs/EVAL_HARNESS.md`** — the eval flywheel; how to add a case, how to opt the nightly into running.
- 5. **`docs/SECURITY.md`** — threat model + the defence-in-depth map.
- 6. **`docs/ARCHITECTURE.md`** — scaffold-level diagram; expand as your domain lands.
+ 5. **`docs/BEADS.md`** — optional local execution queue layered under GitHub Issues.
+ 6. **`docs/SECURITY.md`** — threat model + the defence-in-depth map.
+ 7. **`docs/ARCHITECTURE.md`** — scaffold-level diagram; expand as your domain lands.
3 changes: 3 additions & 0 deletions docs/HARNESS_PRIMER.md
@@ -269,6 +269,7 @@ Distinct from the **build harness** (everything above), the **evaluation harness
|---|---|
| PR template | [.github/pull_request_template.md](../.github/pull_request_template.md). |
| Issue templates | [.github/ISSUE_TEMPLATE/](../.github/ISSUE_TEMPLATE/): `bug.md`, `feature.md`, `eval-regression.md`. Blank issues disabled. |
| Optional Beads queue | [docs/BEADS.md](BEADS.md): GitHub Issues remain canonical while Beads can track local ready/blocked execution. |
| Code ownership | [.github/CODEOWNERS](../.github/CODEOWNERS). |
| Branch protection | [.github/branch-protection/{main,develop}.json](../.github/branch-protection/) declarative configs, re-applied weekly by [branch-protection.yml](../.github/workflows/branch-protection.yml). |
| Commit message shape | Commitizen, configured in `pyproject.toml`. |
@@ -359,6 +360,7 @@ The error names the offending module, line, and contract — no guessing.
| **OpenTelemetry (OTel)** | Vendor-neutral standard for traces, metrics, logs. The repo follows `gen_ai.*` and `db.*` semantic conventions for attribute names. |
| **CycloneDX** | An SBOM format. Generated per release and attached to the GitHub Release. |
| **gitleaks** | Pattern-based secret scanner. |
| **Beads** | Optional local issue queue used for dependency-aware execution and handoffs; GitHub Issues remain canonical. |

---

@@ -372,4 +374,5 @@ The error names the offending module, line, and contract — no guessing.
| [ARCHITECTURE.md](ARCHITECTURE.md) | The system design — components, request flow. |
| [SECURITY.md](SECURITY.md) | Threat model + defence-in-depth mapping. |
| [EVAL_HARNESS.md](EVAL_HARNESS.md) | The eval flywheel. |
| [BEADS.md](BEADS.md) | Optional local Beads queue layered under GitHub Issues. |
| [DEVELOPMENT.md](DEVELOPMENT.md) | Local setup, branching, releases. |