21 changes: 19 additions & 2 deletions .github/scripts/check_pin_freshness.py
@@ -110,6 +110,22 @@ def _fetch_json(url: str, token: str) -> dict[str, object] | None:
return payload if isinstance(payload, dict) else None


def _action_repo(action: str) -> str:
    """Return `owner/repo` for an action string that may carry a sub-path.

    Action references can be `owner/repo` or `owner/repo/path/to/subaction`
    (e.g. `github/codeql-action/init`). Only the first two slash-segments
    name the GitHub repository; the trailing segments are paths within
    the repo's tree (containing per-subaction `action.yml` files). The
    REST API endpoint we hit (`/repos/<owner>/<repo>/git/...`) only
    accepts the `owner/repo` form; passing the full action string would
    404 on every sub-path action and surface as a false-positive
    "tag no longer resolves" finding.
    """
    parts = action.split("/", 2)
    return "/".join(parts[:2]) if len(parts) >= 2 else action
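
As a quick illustration of the helper above, the following sketch re-declares the same two-line logic under the hypothetical name `action_repo` and runs it on a few sample references. The sample strings other than `github/codeql-action/init` are assumptions chosen for the demonstration.

```python
def action_repo(action: str) -> str:
    """Keep only the leading `owner/repo` part of an action reference."""
    # maxsplit=2 yields at most three parts: owner, repo, and the rest.
    parts = action.split("/", 2)
    return "/".join(parts[:2]) if len(parts) >= 2 else action


# Sample references (illustrative only).
examples = [
    "actions/checkout",
    "github/codeql-action/init",
    "github/codeql-action/analyze",
    "not-a-repo",
]
for ref in examples:
    print(f"{ref!r:35} -> {action_repo(ref)!r}")
```

A string with no slash is returned unchanged, so a malformed reference still reaches the API call and fails there rather than being silently rewritten.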


def _resolve_tag_sha(action: str, tag: str, token: str) -> str | None:
"""Return the commit SHA the tag points at, or None on missing/error.

@@ -118,7 +134,8 @@ def _resolve_tag_sha(action: str, tag: str, token: str) -> str | None:
commit. Lightweight tags resolve in one GET (the ref's `object.sha`
is the commit directly).
"""
ref = _fetch_json(f"{_API_BASE}/repos/{action}/git/refs/tags/{tag}", token)
repo = _action_repo(action)
ref = _fetch_json(f"{_API_BASE}/repos/{repo}/git/refs/tags/{tag}", token)
if ref is None:
return None
obj = ref.get("object")
@@ -132,7 +149,7 @@ def _resolve_tag_sha(action: str, tag: str, token: str) -> str | None:
return obj_sha
if obj_type == "tag":
# Annotated tag — dereference to the commit it points at.
annotated = _fetch_json(f"{_API_BASE}/repos/{action}/git/tags/{obj_sha}", token)
annotated = _fetch_json(f"{_API_BASE}/repos/{repo}/git/tags/{obj_sha}", token)
if annotated is None:
return None
inner = annotated.get("object")
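
The two resolution paths described in the docstring (lightweight tag in one GET, annotated tag in two) can be sketched offline. This is a minimal sketch, not the script's actual code: `resolve_tag_sha`, the injected `fetch` callable, and the canned `FAKE_API` responses are all hypothetical stand-ins, and the remainder of the real function is not shown in this diff. It follows only one annotated-tag hop.

```python
from typing import Callable, Optional

# A fetcher takes a URL path and returns parsed JSON, or None on 404/error.
Fetch = Callable[[str], Optional[dict]]


def resolve_tag_sha(fetch: Fetch, repo: str, tag: str) -> Optional[str]:
    """Resolve a tag to a commit SHA, following one annotated-tag hop."""
    ref = fetch(f"/repos/{repo}/git/refs/tags/{tag}")
    if ref is None:
        return None
    obj = ref.get("object")
    if not isinstance(obj, dict):
        return None
    sha, kind = obj.get("sha"), obj.get("type")
    if not isinstance(sha, str):
        return None
    if kind == "commit":
        # Lightweight tag: the ref points at the commit directly.
        return sha
    if kind == "tag":
        # Annotated tag: a second GET dereferences the tag object.
        annotated = fetch(f"/repos/{repo}/git/tags/{sha}")
        if annotated is None:
            return None
        inner = annotated.get("object")
        if isinstance(inner, dict) and isinstance(inner.get("sha"), str):
            return inner["sha"]
    return None


# Hypothetical canned responses standing in for the REST API.
FAKE_API = {
    "/repos/octo/light/git/refs/tags/v1": {"object": {"type": "commit", "sha": "c0ffee"}},
    "/repos/octo/heavy/git/refs/tags/v2": {"object": {"type": "tag", "sha": "7a9"}},
    "/repos/octo/heavy/git/tags/7a9": {"object": {"type": "commit", "sha": "beef01"}},
}

print(resolve_tag_sha(FAKE_API.get, "octo/light", "v1"))      # lightweight path
print(resolve_tag_sha(FAKE_API.get, "octo/heavy", "v2"))      # annotated path
print(resolve_tag_sha(FAKE_API.get, "octo/heavy/sub", "v2"))  # sub-path left in: no match
```

The last call shows the failure this PR fixes: leaving a sub-path in the repository segment produces a URL the API does not serve, so the lookup returns `None`.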
16 changes: 8 additions & 8 deletions .github/workflows/ci.yml
@@ -18,7 +18,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: astral-sh/setup-uv@cec208311dfd045dd5311c1add060b2062131d57 # v8
- uses: astral-sh/setup-uv@cec208311dfd045dd5311c1add060b2062131d57 # v8.0.0
- uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
with:
python-version: "3.14"
@@ -31,7 +31,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: astral-sh/setup-uv@cec208311dfd045dd5311c1add060b2062131d57 # v8
- uses: astral-sh/setup-uv@cec208311dfd045dd5311c1add060b2062131d57 # v8.0.0
- uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
with:
python-version: "3.14"
@@ -44,7 +44,7 @@ jobs:
# Pure in-process tests — completes fast so PR authors get quick feedback.
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: astral-sh/setup-uv@cec208311dfd045dd5311c1add060b2062131d57 # v8
- uses: astral-sh/setup-uv@cec208311dfd045dd5311c1add060b2062131d57 # v8.0.0
- uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
with:
python-version: "3.14"
@@ -57,7 +57,7 @@ jobs:
# Enforces [tool.coverage.report].fail_under from pyproject.toml (75%).
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: astral-sh/setup-uv@cec208311dfd045dd5311c1add060b2062131d57 # v8
- uses: astral-sh/setup-uv@cec208311dfd045dd5311c1add060b2062131d57 # v8.0.0
- uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
with:
python-version: "3.14"
@@ -69,7 +69,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: astral-sh/setup-uv@cec208311dfd045dd5311c1add060b2062131d57 # v8
- uses: astral-sh/setup-uv@cec208311dfd045dd5311c1add060b2062131d57 # v8.0.0
- uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
with:
python-version: "3.14"
@@ -84,7 +84,7 @@ jobs:
# secret past the first defence layer.
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: astral-sh/setup-uv@cec208311dfd045dd5311c1add060b2062131d57 # v8
- uses: astral-sh/setup-uv@cec208311dfd045dd5311c1add060b2062131d57 # v8.0.0
- uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
with:
python-version: "3.14"
@@ -218,7 +218,7 @@ jobs:
# actual workflow jobs on disk.
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: astral-sh/setup-uv@cec208311dfd045dd5311c1add060b2062131d57 # v8
- uses: astral-sh/setup-uv@cec208311dfd045dd5311c1add060b2062131d57 # v8.0.0
- uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
with:
python-version: "3.14"
@@ -234,7 +234,7 @@ jobs:
# while PR titles fail in CI (or vice versa).
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: astral-sh/setup-uv@cec208311dfd045dd5311c1add060b2062131d57 # v8
- uses: astral-sh/setup-uv@cec208311dfd045dd5311c1add060b2062131d57 # v8.0.0
- uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
with:
python-version: "3.14"
4 changes: 2 additions & 2 deletions .github/workflows/codeql.yml
@@ -44,12 +44,12 @@ jobs:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4

- name: Initialize CodeQL
uses: github/codeql-action/init@v3
uses: github/codeql-action/init@v4
with:
languages: ${{ matrix.language }}
build-mode: ${{ matrix.build-mode }}

- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v3
uses: github/codeql-action/analyze@v4
with:
category: "/language:${{ matrix.language }}"
37 changes: 22 additions & 15 deletions .github/workflows/eval-nightly.yml
@@ -1,22 +1,29 @@
# Eval harness nightly — disabled-by-default.
#
# This workflow runs the golden QA dataset against the agent / LLM loop. It
# is `workflow_dispatch`-only by default to prevent accidental LLM API
# spend. To enable nightly runs:
# This workflow runs the golden QA dataset + worked-pattern cases against a
# real Azure OpenAI deployment. It is `workflow_dispatch`-only by default
# to prevent accidental API spend. To enable nightly runs:
#
# 1. Set the Azure OpenAI secrets in repo settings:
# AZURE_OPENAI_ENDPOINT e.g. https://my.openai.azure.com
# AZURE_OPENAI_API_KEY the Azure resource key
# AZURE_OPENAI_DEPLOYMENT deployment name, e.g. gpt-4o-mini
# AZURE_OPENAI_API_VERSION optional, defaults to 2024-10-21
#
# 1. Set the LLM secrets in repo settings (LLM_API_KEY at minimum;
# LLM_BASE_URL / LLM_MODEL / LLM_PROVIDER if your judge differs from
# OpenAI defaults).
# 2. Replace the `on:` block below with:
#
# on:
# schedule:
# - cron: "0 6 * * *" # daily 06:00 UTC
# workflow_dispatch:
#
# 3. Add the `eval-nightly.yml` to EXEMPT_WORKFLOWS in
# `.github/scripts/check_required_contexts.py` if it's not already
# there (it is, by default — scheduled runs never gate PRs).
# 3. Confirm `eval-nightly.yml` is in EXEMPT_WORKFLOWS in
# `.github/scripts/check_required_contexts.py` (it is, by default
# — scheduled runs never gate PRs).
#
# When the Azure secrets are absent, eval/test_golden_patterns.py is
# skipped via pytestmark — the toy eval/test_golden_qa.py case still
# runs as a smoke check on the runner mechanics.
#
# See docs/EVAL_HARNESS.md for the full setup story.

@@ -39,15 +46,15 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: astral-sh/setup-uv@cec208311dfd045dd5311c1add060b2062131d57 # v8
- uses: astral-sh/setup-uv@cec208311dfd045dd5311c1add060b2062131d57 # v8.0.0
- uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
with:
python-version: ${{ inputs.python_version || '3.14' }}
- run: uv sync --frozen --extra dev
- run: uv sync --frozen --extra dev --extra eval
- name: Run pytest eval/
env:
LLM_PROVIDER: ${{ secrets.LLM_PROVIDER }}
LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
LLM_BASE_URL: ${{ secrets.LLM_BASE_URL }}
LLM_MODEL: ${{ secrets.LLM_MODEL }}
AZURE_OPENAI_ENDPOINT: ${{ secrets.AZURE_OPENAI_ENDPOINT }}
AZURE_OPENAI_API_KEY: ${{ secrets.AZURE_OPENAI_API_KEY }}
AZURE_OPENAI_DEPLOYMENT: ${{ secrets.AZURE_OPENAI_DEPLOYMENT }}
AZURE_OPENAI_API_VERSION: ${{ secrets.AZURE_OPENAI_API_VERSION }}
run: uv run pytest eval/ -v
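
The header comment says the golden-pattern tests are skipped via `pytestmark` when the Azure secrets are absent. A minimal sketch of the check such a skip condition might rest on is below; the helper names `missing_azure_settings` and `api_version` are hypothetical and the real test module is not shown in this diff. One detail worth handling: GitHub Actions passes an unset secret as an empty string, so empty values are treated as missing.

```python
from typing import Mapping

# Variable names and the default API version come from the workflow header comment.
REQUIRED = ("AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_KEY", "AZURE_OPENAI_DEPLOYMENT")
DEFAULT_API_VERSION = "2024-10-21"


def missing_azure_settings(env: Mapping[str, str]) -> list[str]:
    """Return the required variable names that are unset or blank."""
    return [name for name in REQUIRED if not env.get(name, "").strip()]


def api_version(env: Mapping[str, str]) -> str:
    """Return the configured API version, falling back to the documented default."""
    return env.get("AZURE_OPENAI_API_VERSION", "").strip() or DEFAULT_API_VERSION


# An unset secret arrives as "", so this environment counts as unconfigured.
ci_env = {name: "" for name in REQUIRED}
print(missing_azure_settings(ci_env))
print(api_version(ci_env))
```

A test module could then pass `bool(missing_azure_settings(os.environ))` as the condition of a module-level `pytest.mark.skipif`, with the missing names in the skip reason.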
2 changes: 1 addition & 1 deletion .github/workflows/release.yml
@@ -29,7 +29,7 @@ jobs:
# annotation when a new release lands and you've reviewed the diff.
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4

- uses: astral-sh/setup-uv@cec208311dfd045dd5311c1add060b2062131d57 # v8
- uses: astral-sh/setup-uv@cec208311dfd045dd5311c1add060b2062131d57 # v8.0.0

- uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
with:
2 changes: 1 addition & 1 deletion .github/workflows/security.yml
@@ -44,7 +44,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: astral-sh/setup-uv@cec208311dfd045dd5311c1add060b2062131d57 # v8
- uses: astral-sh/setup-uv@cec208311dfd045dd5311c1add060b2062131d57 # v8.0.0
- uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
with:
python-version: "3.14"
30 changes: 27 additions & 3 deletions CONTRIBUTING.md
@@ -46,15 +46,39 @@ The subject is **lowercase** after the colon. Title Case prose (`Add the thing`)

### Solo-owner merge policy

This repo runs with a single code owner (`* @constk` in `CODEOWNERS`). GitHub forbids a PR author from approving their own PR, so the standard "1 code-owner review" gate cannot be satisfied without an admin override. While in this state, the **intended workflow is**:
> **Transitional — only while this repo has a single code owner.** Standard practice is a code-owner review on every PR. The flow below exists because GitHub forbids self-approval, so a single-owner repo cannot satisfy the "1 code-owner review" gate any other way. The exemption is **removed** the moment a second collaborator with merge rights joins.

This repo currently runs with a single code owner (`* @constk` in `CODEOWNERS`). While in this state, the intended merge command is:

```sh
gh pr merge <N> --admin --squash --delete-branch
```

…for `feat:` / `fix:` / `chore:` PRs, and `--admin --merge` (preserves history) for `release:` PRs. The `enforce_admins: false` line in `.github/branch-protection/{develop,main}.json` is the documented escape hatch — admin merge here is the policy, not a deviation from it.
…for `feat:` / `fix:` / `chore:` PRs, and `--admin --merge` (preserves history) for `release:` PRs. The `enforce_admins: false` line in `.github/branch-protection/{develop,main}.json` is the documented escape hatch. Admin merge here is the single-owner workaround, not a bypass of the gates: every required status check still has to pass.

**When the exemption ends.** As soon as a second collaborator with merge rights is onboarded:

1. Drop the `--admin` flag from the merge command and adopt standard PR review.
2. Remove this entire subsection.
3. Update `CODEOWNERS` to add the new collaborator.
4. Flip `enforce_admins` to `true` in the branch-protection JSON for both branches. Leaving it `false` would keep the admin-bypass door open after the single-owner workaround is no longer needed, which defeats the point of removing it.

All four changes land in a single PR.
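
Step 4 is easy to forget, so a small check could guard it. The sketch below assumes each branch-protection file is a JSON object with a top-level boolean `enforce_admins` key; that layout and the helper name `admin_bypass_open` are assumptions, since the JSON files themselves are not shown here.

```python
import json


def admin_bypass_open(protection_json: str) -> bool:
    """Return True unless the config explicitly sets enforce_admins to true."""
    config = json.loads(protection_json)
    # Missing key or any value other than JSON true counts as "still open".
    return config.get("enforce_admins") is not True


# Illustrative inputs, not the repo's real files.
print(admin_bypass_open('{"enforce_admins": false}'))
print(admin_bypass_open('{"enforce_admins": true}'))
```

Treating a missing key as open is a deliberately strict choice for this sketch: the check only passes once the flag has been flipped explicitly.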

## Line endings (Windows clones)

This repo enforces LF line endings via `.gitattributes` (`* text=auto eol=lf`)
and the pre-commit hygiene hook. If you cloned on Windows with
`core.autocrlf=true`, the first checkout after pulling the `.gitattributes`
change can leave the working tree out of sync with the index. Renormalise
once:

```sh
git add --renormalize .
git commit -m "chore: renormalise line endings"
```

When a second collaborator joins, drop the `--admin` flag and adopt standard PR review. Update this section + `CODEOWNERS` in the same PR.
After that, day-to-day work is unaffected.

## Line endings (Windows clones)

26 changes: 17 additions & 9 deletions README.md
@@ -8,7 +8,7 @@
[![React 19.2](https://img.shields.io/badge/react-19.2-61dafb.svg)](https://react.dev/)
[![Coverage 98%](https://img.shields.io/badge/coverage-98%25-brightgreen.svg)](docs/HARNESS.md)

> A production-quality coding harness for Python (FastAPI) + Vite/React/TypeScript projects. Designed for LLM-driven development: every gate lint, types, architecture, security, eval — is enforced mechanically so code quality stays consistent across many human and AI contributors.
> Production-grade SDLC harness for human–LLM coding collaborations — keeping quality consistent regardless of who shipped the code. Python (FastAPI) + Vite/React/TypeScript, with every gate (lint, types, architecture, security, eval) enforced mechanically in CI, not by discipline.

## What ships

@@ -81,23 +81,31 @@ The scaffold's React page hits `/api/v1/health` on load and renders the version

![Hello page](docs/images/hello-page.png)

### Jaeger trace (`docker compose up` + `/api/v1/health`)

The full stack — backend, frontend, Jaeger collector — boots with `docker compose up`. Hitting `/api/v1/health` once produces an OpenTelemetry trace exported via OTLP/gRPC; the span hierarchy is visible at <http://localhost:16686> under the `harness-python-react` service, with `agent_span(...)` attributes attached using only the keys constant-defined at the top of [`src/observability/spans.py`](src/observability/spans.py).

<!--
TODO (#28): one capture left — Jaeger trace.
Screenshot pending: docs/images/jaeger-trace.png

docs/images/jaeger-trace.png
With the full stack running (`docker compose up`), hit /api/v1/health
once, then open http://localhost:16686, select service
`harness-python-react`, click the most recent trace, screenshot the
span timeline.
Capture recipe (run once and commit the PNG to docs/images/):
1. docker compose up
2. curl http://localhost:8000/api/v1/health
3. open http://localhost:16686 -> select service "harness-python-react"
4. click the most recent trace
5. screenshot the span timeline, save as docs/images/jaeger-trace.png

When the PNG lands in docs/images/, replace this comment with a section
analogous to "Hello page" above.
When the PNG is committed, replace this whole comment with:

![Jaeger trace — span timeline for GET /api/v1/health](docs/images/jaeger-trace.png)
-->

## Why a harness

The differentiator isn't the scaffold — it's that every layer of the pipeline catches a different failure class **without relying on the human or LLM coder remembering to run anything**. The same posture protects code regardless of who wrote it.

> **Example.** An agent added `from src.tools import ...` inside `src.models` for type reuse. `lint-imports` failed CI — the `src.models depends on nothing in src/` contract broke — and pointed the next iteration at [`docs/BOUNDARIES.md`](docs/BOUNDARIES.md). The type moved into `src.models` instead. Never shipped.

See [`docs/HARNESS.md`](docs/HARNESS.md) for the full umbrella. Highlights:

- **Pydantic `StrictModel` everywhere a contract crosses a seam** (rejects unknown keys at construction).