1 change: 1 addition & 0 deletions pyproject.toml
@@ -140,6 +140,7 @@ markers = [
"requires_streamlit: Tests requiring Streamlit dashboard on port 28501",
"quarantine: Quarantined flaky tests excluded from default CI runs",
"regression: Regression tests requiring live Langfuse + Qdrant (excluded from default pytest run)",
"process: structural/contract tests for markdown workflow files (no live services required)",
]
# F-012 companion: pytest installs catch_warnings(record=True) + simplefilter("always"),
# which overrides the production filter in src/memory/config.py. Mirror the
90 changes: 90 additions & 0 deletions tests/process/INDEX.md
@@ -0,0 +1,90 @@
# Process Test Index — TASK-071 Phase 4

**Authored**: feat/task071-process-tests
**Work order**: `oversight/specs/TASK-071-PHASE4-PROCESS-TEST-WORK-ORDER.md`
**Harness authority**: `oversight/knowledge/best-practices/BP-017-pytest-contract-testing-markdown-step-file-workflows.md`

All 30 processes from the Phase-2 Lane-5 inventory are listed below with their
coverage status. Testable processes are covered by the parametrized contract
suite in `tests/process/`; there are no dedicated per-process test files.

---

## Coverage Notes

### Cyclic workflows — handled gracefully

`cycles/review-cycle` intentionally loops: `step-03→step-04→step-05→step-03`.
The exit condition is prose-controlled via `exitStepFile: ./step-07-exit-cycle.md`.
`walk_step_chain()` terminates on revisit (cycle is not an error — all referenced
files resolve). The forward link-resolution contract still holds: every
`nextStepFile` reference in the loop resolves to a real file.
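
The revisit-terminated walk can be sketched on an in-memory graph. This is a minimal illustration, assuming a plain dict in place of step files; the `walk` helper and the step names are hypothetical, and the real `walk_step_chain()` reads `nextStepFile` from frontmatter on disk.

```python
def walk(first, next_step):
    """Follow next_step links from `first`, stopping on a dead end or a revisit."""
    visited, seen = [], set()
    current = first
    while current is not None:
        # Forward contract: every referenced step must exist.
        assert current in next_step, f"Dangling step reference: {current}"
        if current in seen:
            break  # loop closed; every link on it was already checked
        seen.add(current)
        visited.append(current)
        current = next_step[current]  # None marks the end of a chain
    return visited


# Hypothetical chain mirroring step-03 -> step-04 -> step-05 -> step-03.
chain = {
    "step-01": "step-02",
    "step-02": "step-03",
    "step-03": "step-04",
    "step-04": "step-05",
    "step-05": "step-03",
}
order = walk("step-01", chain)
```

Each step appears once in `order` even though the chain loops, and a reference to a step missing from the dict still raises `AssertionError`.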

### Reverse-reachability (orphan) test — NOT implemented

The corpus contains branch/mode steps that are reached by prose routing in step
bodies rather than the linear `firstStep`→`nextStepFile` spine:

- `steps-e/` (edit-mode steps)
- `steps-v/` (validate-mode steps)
- `branches/branch-a..d/` (conditional branches)
- `route/step-01-resolve-backend.md` (shared routing step)

A corpus-wide reverse-reachability check (`all step*.md − linear-reachable == ∅`)
would false-fail on all of these. The forward link-resolution contract
(`test_step_chain.py` + `test_step_frontmatter.py::test_step_nextStepFile_resolves`)
is the false-positive-free equivalent: every *referenced* path resolves; prose-routed
paths that are intentionally unreachable from the linear spine are not asserted.
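
To see why the reverse check would false-fail while the forward check stays clean, here is a small sketch on a made-up corpus. All file names (including the `steps-c/` directory) are illustrative assumptions, not the real step files.

```python
# Hypothetical corpus: two spine steps plus two prose-routed steps.
all_steps = {
    "steps-c/step-01.md",
    "steps-c/step-02.md",
    "steps-e/step-01-edit.md",       # reached only by prose routing
    "branches/branch-a/step-01.md",  # reached only by prose routing
}
next_step = {"steps-c/step-01.md": "steps-c/step-02.md"}

# Linear spine: follow nextStepFile links starting from firstStep.
reachable = set()
current = "steps-c/step-01.md"
while current is not None and current not in reachable:
    reachable.add(current)
    current = next_step.get(current)

# Reverse check: flags the prose-routed steps as orphans (false failure).
orphans = all_steps - reachable

# Forward check: only asserts that referenced targets exist.
dangling = {target for target in next_step.values() if target not in all_steps}
```

Here `orphans` contains the two prose-routed files even though nothing is wrong with them, while `dangling` is empty.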

### FIRSTEP_EXEMPT set

Two workflows have no executable step chain and are skipped for `firstStep`-presence
and chain-walk assertions (they still pass name/description/H2 checks):

| Workflow | Reason |
|---|---|
| `session/status/workflow.md` | Single-step inline workflow — `firstStep: null` by design |
| `model-dispatch/claude-native/workflow.md` | Reference doc — no step chain |
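
The exemption logic can be sketched as a membership test against a frozen set of paths. This is an assumption-laden illustration: `ROOT`, `EXEMPT`, and `checks_for` are hypothetical stand-ins, and the actual test modules may instead call `pytest.skip` when a workflow is in `FIRSTEP_EXEMPT`.

```python
from pathlib import Path

# Hypothetical stand-ins for the real roots and exempt set in conftest.py.
ROOT = Path("/repo/_ai-memory/pov")
EXEMPT = frozenset(
    {
        ROOT / "workflows/session/status/workflow.md",
        ROOT / "skills/aim-model-dispatch/workflows/claude-native/workflow.md",
    }
)


def checks_for(workflow_md: Path) -> list:
    """Return the assertion groups that apply to a workflow file."""
    checks = ["name", "description", "h2"]  # always asserted
    if workflow_md not in EXEMPT:
        checks += ["firstStep", "chain-walk"]  # skipped for exempt workflows
    return checks


exempt_checks = checks_for(ROOT / "workflows/session/status/workflow.md")
normal_checks = checks_for(ROOT / "workflows/session/start/workflow.md")
```

Keeping the exempt set explicit and small means a newly added workflow without a step chain fails loudly until someone adds it to the set on purpose.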

### aim-best-practices-researcher — core skill root

`aim-best-practices-researcher` lives under `_ai-memory/skills/` (core skills root),
not `_ai-memory/pov/skills/` (pov skills root). `test_skill_procedures.py` uses
explicit per-skill paths for the three Section-C skills it covers, so that both
roots are accommodated.

---

## Process Coverage Table

| # | Process ID | Root File | Section | Coverage | Test File / Notes |
|---|---|---|---|---|---|
| 1 | cycles/agent-dispatch | `_ai-memory/pov/workflows/cycles/agent-dispatch/workflow.md` | A | ✅ Contract suite | `test_workflow_frontmatter.py`, `test_step_frontmatter.py`, `test_step_chain.py` |
| 2 | cycles/approval-gate | `_ai-memory/pov/workflows/cycles/approval-gate/workflow.md` | A | ✅ Contract suite | `test_workflow_frontmatter.py`, `test_step_frontmatter.py`, `test_step_chain.py` |
| 3 | cycles/legitimacy-check | `_ai-memory/pov/workflows/cycles/legitimacy-check/workflow.md` | A | ✅ Contract suite | `test_workflow_frontmatter.py`, `test_step_frontmatter.py`, `test_step_chain.py` |
| 4 | cycles/research-protocol | `_ai-memory/pov/workflows/cycles/research-protocol/workflow.md` | A | ✅ Contract suite | `test_workflow_frontmatter.py`, `test_step_frontmatter.py`, `test_step_chain.py` |
| 5 | cycles/review-cycle | `_ai-memory/pov/workflows/cycles/review-cycle/workflow.md` | A | ✅ Contract suite | `test_workflow_frontmatter.py`, `test_step_frontmatter.py`, `test_step_chain.py` |
| 6 | first-breath | `_ai-memory/pov/workflows/first-breath/workflow.md` | A | ✅ Contract suite (structural); behavioral testing non-feasible | `test_workflow_frontmatter.py`, `test_step_frontmatter.py`, `test_step_chain.py` |
| 7 | init/existing | `_ai-memory/pov/workflows/init/existing/workflow.md` | A | ✅ Contract suite | `test_workflow_frontmatter.py`, `test_step_frontmatter.py`, `test_step_chain.py` |
| 8 | init/new | `_ai-memory/pov/workflows/init/new/workflow.md` | A | ✅ Contract suite | `test_workflow_frontmatter.py`, `test_step_frontmatter.py`, `test_step_chain.py` |
| 9 | phases/architecture | `_ai-memory/pov/workflows/phases/architecture/workflow.md` | A | ✅ Contract suite | `test_workflow_frontmatter.py`, `test_step_frontmatter.py`, `test_step_chain.py` |
| 10 | phases/discovery | `_ai-memory/pov/workflows/phases/discovery/workflow.md` | A | ✅ Contract suite | `test_workflow_frontmatter.py`, `test_step_frontmatter.py`, `test_step_chain.py` |
| 11 | phases/execution | `_ai-memory/pov/workflows/phases/execution/workflow.md` | A | ✅ Contract suite | `test_workflow_frontmatter.py`, `test_step_frontmatter.py`, `test_step_chain.py` |
| 12 | phases/integration | `_ai-memory/pov/workflows/phases/integration/workflow.md` | A | ✅ Contract suite | `test_workflow_frontmatter.py`, `test_step_frontmatter.py`, `test_step_chain.py` |
| 13 | phases/maintenance | `_ai-memory/pov/workflows/phases/maintenance/workflow.md` | A | ✅ Contract suite | `test_workflow_frontmatter.py`, `test_step_frontmatter.py`, `test_step_chain.py` |
| 14 | phases/planning | `_ai-memory/pov/workflows/phases/planning/workflow.md` | A | ✅ Contract suite | `test_workflow_frontmatter.py`, `test_step_frontmatter.py`, `test_step_chain.py` |
| 15 | phases/release | `_ai-memory/pov/workflows/phases/release/workflow.md` | A | ✅ Contract suite | `test_workflow_frontmatter.py`, `test_step_frontmatter.py`, `test_step_chain.py` |
| 16 | session/blocker | `_ai-memory/pov/workflows/session/blocker/workflow.md` | A | ✅ Contract suite | `test_workflow_frontmatter.py`, `test_step_frontmatter.py`, `test_step_chain.py` |
| 17 | session/close | `_ai-memory/pov/workflows/session/close/workflow.md` | A | ✅ Contract suite | `test_workflow_frontmatter.py`, `test_step_frontmatter.py`, `test_step_chain.py` |
| 18 | session/decision | `_ai-memory/pov/workflows/session/decision/workflow.md` | A | ✅ Contract suite | `test_workflow_frontmatter.py`, `test_step_frontmatter.py`, `test_step_chain.py` |
| 19 | session/handoff | `_ai-memory/pov/workflows/session/handoff/workflow.md` | A | ✅ Contract suite | `test_workflow_frontmatter.py`, `test_step_frontmatter.py`, `test_step_chain.py` |
| 20 | session/start | `_ai-memory/pov/workflows/session/start/workflow.md` | A | ✅ Contract suite | `test_workflow_frontmatter.py`, `test_step_frontmatter.py`, `test_step_chain.py` |
| 21 | session/status | `_ai-memory/pov/workflows/session/status/workflow.md` | A | ⚠️ Partial — EXEMPT | name/description/H2 tested; `firstStep`+chain SKIPPED (`firstStep: null` — single-step inline workflow by design) |
| 22 | session/verify | `_ai-memory/pov/workflows/session/verify/workflow.md` | A | ✅ Contract suite | `test_workflow_frontmatter.py`, `test_step_frontmatter.py`, `test_step_chain.py` |
| 23 | model-dispatch/api-dispatch | `_ai-memory/pov/skills/aim-model-dispatch/workflows/api-dispatch/workflow.md` | B | ✅ Contract suite | `test_workflow_frontmatter.py`, `test_step_frontmatter.py`, `test_step_chain.py` |
| 24 | model-dispatch/bmad-dispatch | `_ai-memory/pov/skills/aim-model-dispatch/workflows/bmad-dispatch/workflow.md` | B | ✅ Contract suite | `test_workflow_frontmatter.py`, `test_step_frontmatter.py`, `test_step_chain.py` |
| 25 | model-dispatch/tmux-dispatch | `_ai-memory/pov/skills/aim-model-dispatch/workflows/tmux-dispatch/workflow.md` | B | ✅ Contract suite | `test_workflow_frontmatter.py`, `test_step_frontmatter.py`, `test_step_chain.py` |
| 26 | model-dispatch/claude-native | `_ai-memory/pov/skills/aim-model-dispatch/workflows/claude-native/workflow.md` | B | ⚠️ Partial — EXEMPT | name/description/H2 tested; `firstStep`+chain SKIPPED (reference doc, no step chain — documented non-feasible) |
| 27 | skill/aim-agent-sanctum-init | `_ai-memory/pov/skills/aim-agent-sanctum-init/SKILL.md` | C | ✅ Existing test | `tests/test_install_sanctum_preservation.py` (idempotency) — not duplicated here per work-order §5 |
| 28 | skill/aim-agent-dispatch | `_ai-memory/pov/skills/aim-agent-dispatch/SKILL.md` | C | ✅ Skill procedures | `test_skill_procedures.py` |
| 29 | skill/aim-agent-lifecycle | `_ai-memory/pov/skills/aim-agent-lifecycle/SKILL.md` | C | ✅ Skill procedures | `test_skill_procedures.py` |
| 30 | skill/aim-best-practices-researcher | `_ai-memory/skills/aim-best-practices-researcher/SKILL.md` | C | ✅ Skill procedures | `test_skill_procedures.py` — core skill root (`_ai-memory/skills/`), distinct from pov skills root |
Empty file added tests/process/__init__.py
Empty file.
174 changes: 174 additions & 0 deletions tests/process/conftest.py
@@ -0,0 +1,174 @@
"""Shared constants, helpers, and fixtures for tests/process/ contract tests.

All tests in this package are pure filesystem operations — no src/ import,
no live services, no LLM calls. Pattern: DAG-integrity style (BP-017).

Path anchoring: REPO_ROOT is derived from this file's location so it is
CWD-independent and safe in CI (BP-017 §10).
"""

from pathlib import Path

import pytest
import yaml

# ---------------------------------------------------------------------------
# Path constants
# ---------------------------------------------------------------------------

# tests/process/conftest.py → .parent = tests/process/
#                             .parent = tests/
#                             .parent = pov-work/  (repo root)
REPO_ROOT = Path(__file__).parent.parent.parent

WORKFLOWS_ROOT = REPO_ROOT / "_ai-memory/pov/workflows"
MODEL_DISPATCH_ROOT = REPO_ROOT / "_ai-memory/pov/skills/aim-model-dispatch/workflows"
SKILLS_ROOT = (
    REPO_ROOT / "_ai-memory/pov/skills"
)  # pov skills (aim-agent-dispatch, aim-agent-lifecycle, …)
CORE_SKILLS_ROOT = (
    REPO_ROOT / "_ai-memory/skills"
)  # core skills (aim-best-practices-researcher, …)

# Placeholder → absolute-path substitutions used in step frontmatter refs.
_PLACEHOLDERS = {
    "{workflows_path}": str(WORKFLOWS_ROOT),
    "{skills_path}": str(SKILLS_ROOT),
    "{project-root}": str(REPO_ROOT),
}

# ---------------------------------------------------------------------------
# Exempt set — workflows with no step chain (inline or reference docs).
#
# claude-native : reference doc (type: reference), no firstStep key.
# session/status: single-step inline workflow (firstStep: null).
#
# Both are skipped for firstStep-presence and chain-walk assertions only.
# name/description/h2-section assertions still run for both.
# ---------------------------------------------------------------------------
FIRSTEP_EXEMPT = frozenset(
    [
        (MODEL_DISPATCH_ROOT / "claude-native/workflow.md").resolve(),
        (WORKFLOWS_ROOT / "session/status/workflow.md").resolve(),
    ]
)


# ---------------------------------------------------------------------------
# Core helpers (plain functions — available at parametrize collection time)
# ---------------------------------------------------------------------------


def parse_frontmatter(path: Path) -> dict:
    """Return the YAML frontmatter dict from a markdown file, or {} if none."""
    text = path.read_text(encoding="utf-8")
    if not text.startswith("---"):
        return {}
    parts = text.split("---", 2)
    if len(parts) < 3:
        return {}
    return yaml.safe_load(parts[1]) or {}


def walk_step_chain(workflow_md: Path) -> list:
    """Walk firstStep→nextStepFile chain from workflow_md.

    Returns the list of visited step Paths. Each nextStepFile is resolved
    relative to the *step file's* parent, not the workflow root (BP-017 §12).
    Raises AssertionError on dangling reference (file does not exist).
    Returns [] when firstStep is absent or null.

    Cycles: some workflows deliberately loop (e.g. cycles/review-cycle, which
    loops step-03→04→05→03 with an exitStepFile exit path controlled by prose
    logic). A revisited step terminates the walk gracefully: the contract is
    that every *referenced* file exists, not that the chain is acyclic.
    """
    fm = parse_frontmatter(workflow_md)
    first = fm.get("firstStep")
    if not first:
        return []

    visited: list = []
    seen: set = set()
    next_path = (workflow_md.parent / first).resolve()

    while next_path:
        assert (
            next_path.exists()
        ), f"Dangling step reference: {next_path} (from {workflow_md})"
        if next_path in seen:
            break  # intentional loop: all links already verified; stop walking
        seen.add(next_path)
        visited.append(next_path)

        step_fm = parse_frontmatter(next_path)
        ref = step_fm.get("nextStepFile")
        if not ref:
            break
        next_path = (next_path.parent / ref).resolve()

    return visited


def resolve_template_ref(ref: str, step_path: Path) -> Path:
    """Resolve a frontmatter template/path ref to an absolute Path.

    Substitutes {workflows_path}, {skills_path}, {project-root} placeholders,
    then resolves relative refs against the step file's parent directory.
    """
    raw = str(ref)
    for placeholder, actual in _PLACEHOLDERS.items():
        raw = raw.replace(placeholder, actual)
    p = Path(raw)
    if p.is_absolute():
        return p.resolve()
    return (step_path.parent / raw).resolve()


# ---------------------------------------------------------------------------
# Discovery functions (called at module level inside test files for parametrize)
# ---------------------------------------------------------------------------


def _all_workflow_mds() -> list:
    """All workflow.md files from both workflow roots, sorted."""
    return sorted(
        list(WORKFLOWS_ROOT.rglob("workflow.md"))
        + list(MODEL_DISPATCH_ROOT.rglob("workflow.md"))
    )


def _all_step_mds() -> list:
    """All step*.md files from both workflow roots, sorted."""
    return sorted(
        list(WORKFLOWS_ROOT.rglob("step*.md"))
        + list(MODEL_DISPATCH_ROOT.rglob("step*.md"))
    )


def _wf_id(p: Path) -> str:
    """Human-readable pytest parametrize ID for a workflow.md path."""
    try:
        return str(p.relative_to(WORKFLOWS_ROOT))
    except ValueError:
        return "model-dispatch/" + str(p.relative_to(MODEL_DISPATCH_ROOT))


def _step_id(p: Path) -> str:
    """Human-readable pytest parametrize ID for a step file path."""
    try:
        return str(p.relative_to(WORKFLOWS_ROOT))
    except ValueError:
        return "model-dispatch/" + str(p.relative_to(MODEL_DISPATCH_ROOT))


# ---------------------------------------------------------------------------
# Fixtures
# ---------------------------------------------------------------------------


@pytest.fixture(scope="session")
def workflows_root() -> Path:
    """Session-scoped fixture: WORKFLOWS_ROOT (Section A root)."""
    assert WORKFLOWS_ROOT.is_dir(), f"WORKFLOWS_ROOT not found: {WORKFLOWS_ROOT}"
    return WORKFLOWS_ROOT
36 changes: 36 additions & 0 deletions tests/process/test_corpus_sentinel.py
@@ -0,0 +1,36 @@
"""Vacuous-green guard: corpus discovery must find minimum expected file counts.

If either path anchor in conftest.py breaks (e.g. WORKFLOWS_ROOT no longer
exists), rglob returns [] → 0 tests collected across the parametrized suite →
false green. This non-parametrized sentinel catches that failure mode by
asserting hard minimums against the known corpus size.

Minimums reflect the corpus at time of authoring (TASK-071 Phase 4):
- workflow.md : 26 (22 under WORKFLOWS_ROOT + 4 under MODEL_DISPATCH_ROOT)
- step*.md : 211 (183 under WORKFLOWS_ROOT + 28 under MODEL_DISPATCH_ROOT)
"""

import pytest

from .conftest import _all_step_mds, _all_workflow_mds


@pytest.mark.process
def test_corpus_sentinel(workflows_root):
    """Discovery functions must find the minimum expected corpus size.

    Uses the workflows_root fixture to also validate that the path anchor is a
    real directory. If the anchor breaks, this test (not a silent 0-collected
    run) is the failure signal.
    """
    wf_count = len(_all_workflow_mds())
    step_count = len(_all_step_mds())

    assert wf_count >= 26, (
        f"workflow.md discovery returned {wf_count} files; expected >= 26. "
        "Check WORKFLOWS_ROOT / MODEL_DISPATCH_ROOT anchors in conftest.py."
    )
    assert step_count >= 211, (
        f"step*.md discovery returned {step_count} files; expected >= 211. "
        "Check WORKFLOWS_ROOT / MODEL_DISPATCH_ROOT anchors in conftest.py."
    )
69 changes: 69 additions & 0 deletions tests/process/test_skill_procedures.py
@@ -0,0 +1,69 @@
"""Contract: Section-C embedded-procedure skill structural assertions.

Tests SKILL.md files for the three Section-C skills covered here:
  1. SKILL.md file exists
  2. Frontmatter has non-empty 'name' and 'description'
  3. Body contains at least one H2 section (## ...)

Skills and their roots (two different roots, NOT a single skills directory):
  aim-agent-dispatch            → _ai-memory/pov/skills/   (pov skill)
  aim-agent-lifecycle           → _ai-memory/pov/skills/   (pov skill)
  aim-best-practices-researcher → _ai-memory/skills/       (core skill)

Scope notes (see INDEX.md for full coverage table):
  - aim-agent-sanctum-init: covered by the existing idempotency test suite
    (tests/test_install_sanctum_preservation.py); not duplicated here.
"""

import pytest

from .conftest import CORE_SKILLS_ROOT, SKILLS_ROOT, parse_frontmatter

# (skill_name, skill_md_path) — explicit per-skill paths because the three
# Section-C skills live under two different roots.
_SECTION_C_SKILLS = [
    ("aim-agent-dispatch", SKILLS_ROOT / "aim-agent-dispatch/SKILL.md"),
    ("aim-agent-lifecycle", SKILLS_ROOT / "aim-agent-lifecycle/SKILL.md"),
    (
        "aim-best-practices-researcher",
        CORE_SKILLS_ROOT / "aim-best-practices-researcher/SKILL.md",
    ),
]


@pytest.mark.process
@pytest.mark.parametrize(
    "skill_name,skill_md",
    _SECTION_C_SKILLS,
    ids=[name for name, _ in _SECTION_C_SKILLS],
)
def test_skill_md_exists(skill_name, skill_md):
    assert skill_md.exists(), f"SKILL.md not found for {skill_name}: {skill_md}"


@pytest.mark.process
@pytest.mark.parametrize(
    "skill_name,skill_md",
    _SECTION_C_SKILLS,
    ids=[name for name, _ in _SECTION_C_SKILLS],
)
def test_skill_frontmatter_schema(skill_name, skill_md):
    if not skill_md.exists():
        pytest.skip(f"SKILL.md not found: {skill_name}")
    fm = parse_frontmatter(skill_md)
    assert fm.get("name"), f"Missing/empty 'name' in {skill_md}"
    assert fm.get("description"), f"Missing/empty 'description' in {skill_md}"


@pytest.mark.process
@pytest.mark.parametrize(
    "skill_name,skill_md",
    _SECTION_C_SKILLS,
    ids=[name for name, _ in _SECTION_C_SKILLS],
)
def test_skill_has_h2_section(skill_name, skill_md):
    if not skill_md.exists():
        pytest.skip(f"SKILL.md not found: {skill_name}")
    text = skill_md.read_text(encoding="utf-8")
    h2_lines = [ln for ln in text.splitlines() if ln.startswith("## ")]
    assert h2_lines, f"{skill_md}: SKILL.md has no '## ' sections"