48 changes: 32 additions & 16 deletions .github/workflows/ci.yml
@@ -5,18 +5,17 @@ on:
branches: [main, develop]
pull_request:
branches: [main, develop]
workflow_dispatch: # allow manual re-runs via gh CLI or GitHub UI
workflow_dispatch:

concurrency:
group: ci-${{ github.ref }}
cancel-in-progress: true

permissions:
contents: read
# Default: deny all permissions. Each job grants only what it needs.
permissions: {}

jobs:
lint:
name: Lint (ruff)
runs-on: ubuntu-latest
permissions:
contents: read
steps:
- uses: actions/checkout@v6
- uses: actions/setup-python@v6
@@ -25,11 +24,16 @@ jobs:
cache: pip
- run: python -m pip install --upgrade pip
- run: pip install ruff
- run: ruff check src/ tests/
- run: ruff format --check src/ tests/
- name: ruff format --check
run: ruff format --check src/ tests/
- name: ruff check
run: ruff check src/ tests/

typecheck:
name: Type check (mypy)
runs-on: ubuntu-latest
permissions:
contents: read
steps:
- uses: actions/checkout@v6
- uses: actions/setup-python@v6
@@ -41,13 +45,15 @@ jobs:
- run: mypy src/specsmith/

test:
needs: [lint, typecheck]
name: Test (Python ${{ matrix.python-version }} / ${{ matrix.os }})
runs-on: ${{ matrix.os }}
permissions:
contents: read
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest, windows-latest, macos-latest]
python-version: ["3.10", "3.12", "3.13"]
runs-on: ${{ matrix.os }}
python-version: ["3.10", "3.11", "3.12", "3.13"]
os: [ubuntu-latest, windows-latest]
steps:
- uses: actions/checkout@v6
- uses: actions/setup-python@v6
@@ -59,7 +65,10 @@ jobs:
- run: pytest --cov=specsmith --cov-report=term-missing

security:
name: Security audit (pip-audit)
runs-on: ubuntu-latest
permissions:
contents: read
steps:
- uses: actions/checkout@v6
- uses: actions/setup-python@v6
@@ -73,7 +82,10 @@ jobs:

sync-check:
# REQ-003 guard: fail CI if .specsmith/ JSON drifts from docs/ Markdown.
name: Sync check
runs-on: ubuntu-latest
permissions:
contents: read
steps:
- uses: actions/checkout@v6
- uses: actions/setup-python@v6
@@ -90,7 +102,10 @@ jobs:

validate-strict:
# YAML governance schema guard: duplicate IDs, orphan tests, missing fields.
name: Validate strict
runs-on: ubuntu-latest
permissions:
contents: read
steps:
- uses: actions/checkout@v6
- uses: actions/setup-python@v6
@@ -107,10 +122,11 @@ jobs:

api-surface:
# REQ-140 guard: regenerates the public CLI surface and fails the build
# if the live output drifts from the committed fixture. Catches accidental
# command additions / removals in PRs without forcing every contributor
# to remember to run `specsmith api-surface > tests/fixtures/api_surface.json`.
# if the live output drifts from the committed fixture.
name: API surface
runs-on: ubuntu-latest
permissions:
contents: read
steps:
- uses: actions/checkout@v6
- uses: actions/setup-python@v6
161 changes: 161 additions & 0 deletions .specsmith/requirements.json
@@ -2981,5 +2981,166 @@
"test_ids": [
"TEST-340"
]
},
{
"id": "REQ-341",
"title": "Terminal Awareness Skill in Skills Catalog",
"description": "specsmith.skills MUST include a terminal-awareness skill in the CROSS_PLATFORM domain covering: (1) shell detection from Python and from the shell itself; (2) PowerShell 5 vs 7 syntax differences (null-coalescing, ternary, parallel ForEach-Object, encoding, &&/|| availability); (3) cmd.exe rules (no PowerShell cmdlets in pipelines, % variables, ^ continuation); (4) bash/zsh/fish patterns (background PID capture, trap cleanup, timeout); (5) Python subprocess spawn with PID tracking using communicate(timeout) and DEVNULL stdin; (6) PowerShell Start-Process -PassThru PID tracking with WaitForExit; (7) a cross-platform command equivalents table; (8) a cleanup checklist for spawned processes.",
"source": "ARCHITECTURE.md §37 Skills Catalog",
"status": "implemented",
"test_ids": [
"TEST-341"
]
},
{
"id": "REQ-342",
"title": "Shell-Aware Command Generation",
"description": "Agents operating on behalf of specsmith MUST detect the active shell before emitting shell commands. PowerShell cmdlets (Write-Host, Get-ChildItem, Start-Process, etc.) MUST NOT be emitted when the active shell is bash, zsh, fish, or cmd.exe. bash-isms (, export, $!) MUST NOT be emitted in PowerShell or cmd.exe contexts. The terminal-awareness skill provides the detection and equivalents reference that agents MUST consult.",
"source": "ARCHITECTURE.md §37 Skills Catalog",
"status": "implemented",
"test_ids": [
"TEST-342"
]
},
{
"id": "REQ-343",
"title": "Subprocess Spawn with PID Tracking and Cleanup",
"description": "specsmith process execution (specsmith exec, run_tracked) MUST spawn subprocesses using communicate(timeout=N) with stdin=DEVNULL to prevent hanging. Spawned PIDs MUST be written to .specsmith/pids/<pid>.json so specsmith ps and specsmith abort can list and kill them. On timeout, the implementation MUST call proc.kill() then proc.communicate() to drain pipes and avoid zombie processes. On Windows, CREATE_NEW_PROCESS_GROUP MUST be set for clean signal forwarding.",
"source": "ARCHITECTURE.md §37 Skills Catalog",
"status": "implemented",
"test_ids": [
"TEST-343"
]
},
{
"id": "REQ-344",
"title": "specsmith.esdb Namespace Re-exports chronomemory v0.1.1",
"description": "src/specsmith/esdb/__init__.py MUST re-export the full chronomemory v0.1.1 public API surface under the specsmith.esdb namespace: ChronoStore, ChronoRecord, WalEvent, open_store, EsdbBridge, EsdbRecord, EsdbStatus, DepGraph, DependencyEdge, RollbackReport, invalidate, ContextPack, ContextPackCompiler, ContextPackEntry, RustChronoStore, RustRecord, RUST_BACKEND, plus module-level references to query and metrics. specsmith.esdb.bridge MUST expose EsdbBridge, ContextPackCompiler, DepGraph, RUST_BACKEND, query, and metrics.",
"source": "ARCHITECTURE.md §36 specsmith.esdb Namespace",
"status": "implemented",
"test_ids": [
"TEST-344"
]
},
{
"id": "REQ-345",
"title": "LLM Context MUST Use query.what_is_known Not store.query(rag_filter)",
"description": "All specsmith code paths that inject ESDB ChronoRecords into LLM context (retrieval index building, context seed generation, context orchestrator eviction decisions) MUST use query.what_is_known(store) instead of store.query(rag_filter=True). query.what_is_known excludes infrastructure record kinds (edge, rollback_event, token_metric, skill_run) in addition to applying the confidence >= 0.6 filter. Infrastructure records MUST NEVER appear in agent-facing context.",
"source": "ARCHITECTURE.md §36 specsmith.esdb Namespace",
"status": "implemented",
"test_ids": [
"TEST-345"
]
},
{
"id": "REQ-346",
"title": "specsmith save --force Propagates Force to Push",
"description": "specsmith save MUST accept a --force flag that propagates to the underlying run_push() call, bypassing the gitflow direct-to-main guard and any other push safety checks. The push MUST use git push --force-with-lease (not --force) to avoid overwriting concurrent remote changes. --force has no effect when --no-push is also passed. When --force is omitted, all existing safety checks apply unchanged.",
"source": "ARCHITECTURE.md §38 VCS Force Operations",
"status": "implemented",
"test_ids": [
"TEST-346"
]
},
{
"id": "REQ-347",
"title": "specsmith pull --discard Hard-Resets to Remote Branch",
"description": "specsmith pull MUST accept a --discard flag. When passed, the implementation MUST: (1) run git fetch origin <branch> to bring the remote ref current; (2) run git reset --hard origin/<branch> to hard-reset the working tree; (3) report success with the branch name. All local uncommitted changes are discarded. This replaces the normal git pull (which preserves local state) when a clean reset to remote is required.",
"source": "ARCHITECTURE.md §38 VCS Force Operations",
"status": "implemented",
"test_ids": [
"TEST-347"
]
},
{
"id": "REQ-348",
"title": "specsmith pull --clean Removes Untracked Files After Discard",
"description": "When specsmith pull --clean is passed, the implementation MUST perform the same hard-reset sequence as --discard and additionally run git clean -fd to remove all untracked files and directories. The success message MUST note that untracked files were removed. --clean implies --discard; passing --clean without --discard MUST produce the same result.",
"source": "ARCHITECTURE.md §38 VCS Force Operations",
"status": "implemented",
"test_ids": [
"TEST-348"
]
},
{
"id": "REQ-349",
"title": "gh-ci-polling Skill Prohibits Sleep-Based CI Waiting",
"description": "specsmith.skills MUST include a gh-ci-polling skill in the GOVERNANCE domain documenting gh run watch as the correct CI-wait primitive. The skill MUST explicitly prohibit Start-Sleep, sleep, and time.sleep as CI wait mechanisms. It MUST provide: (1) the canonical gh run watch pattern for bash and PowerShell; (2) non-blocking gh run list --json conclusion status check; (3) the one acceptable polling loop (with state check, minimum 15-second interval) for when gh run watch is unavailable; (4) gh run view --log-failed for immediate failure triage.",
"source": "ARCHITECTURE.md §37 Skills Catalog",
"status": "implemented",
"test_ids": [
"TEST-349"
]
},
{
"id": "REQ-350",
"title": "Epistemic Metadata Passthrough in Sync Pipeline",
"description": "specsmith sync MUST pass through platform, boundary, and confidence fields from YAML requirement sources into the .specsmith/requirements.json machine-state entries when those fields are present in the YAML. These fields are used by generate_requirements_md to render them into REQUIREMENTS.md and by belief.py to parse Platform/Boundary/Confidence metadata. Absent fields MUST be omitted from the JSON entry (not written as null).",
"source": "ARCHITECTURE.md §YAML-Native Governance Layer",
"status": "implemented",
"test_ids": [
"TEST-350"
]
},
{
"id": "REQ-351",
"title": "specsmith checkpoint Governance Anchor Command",
"description": "specsmith MUST provide a checkpoint CLI command that emits a compact GOVERNANCE ANCHOR summarising the current project state: project name (from scaffold.yml), AEE phase with readiness percentage, audit health and failed check count, REQ count, TEST count, ESDB record count with chain validity, up to 3 recent WI- identifiers from LEDGER.md, and the last preflight acceptance line. With --json it MUST emit a JSON payload containing ts, project, phase, phase_label, phase_pct, health, audit_failed, req_count, test_count, esdb_records, esdb_chain_valid, recent_wis, last_preflight, and anchor fields. Without --json it MUST emit a human-readable bordered GOVERNANCE ANCHOR block with a footer instructing agents to include it verbatim in any context summary. All data gathering MUST be best-effort (exceptions silently swallowed) so the command never fails even on projects with no ESDB or LEDGER.",
"source": "ARCHITECTURE.md §Session Governance Protocol",
"status": "implemented",
"test_ids": [
"TEST-351"
]
},
{
"id": "REQ-352",
"title": "M006 Session Governance Migration Auto-injects Protocol into AGENTS.md",
"description": "specsmith MUST include migration M006 (version=6) that detects whether AGENTS.md contains any of the sentinel strings 'specsmith checkpoint', 'Session Governance Protocol', 'GOVERNANCE ANCHOR', or 'governance heartbeat'. When none are present, M006 MUST back up AGENTS.md to .specsmith/agents.md.m006.bak and inject the full Session Governance Protocol section (heartbeat every 8-10 turns, preflight gate, drift detection checklist, checkpoint-in-summary rule, session end). M006 MUST be idempotent (re-running when section is present is a no-op), non-destructive (original always backed up), and registered in MigrationRegistry so it runs automatically via specsmith migrate-project and specsmith upgrade --full.",
"source": "ARCHITECTURE.md §Session Governance Protocol",
"status": "implemented",
"test_ids": [
"TEST-352"
]
},
{
"id": "REQ-353",
"title": "Modern Web Framework Project Types",
"description": "specsmith MUST support the following modern web framework project types in addition to the existing web-frontend and fullstack-js types: nextjs-app (Next.js / React with SSR/SSG, next lint, jest/playwright), nuxt-app (Nuxt.js / Vue, vitest, playwright), sveltekit-app (SvelteKit, vitest, playwright), remix-app (Remix React, vitest, playwright), astro-site (Astro static/SSR, vitest, playwright). Each MUST have a corresponding ToolSet entry in the tool registry with appropriate lint, typecheck, test, security, build, and format tools. Each MUST appear in _TYPE_LABELS with a human-readable label.",
"source": "ARCHITECTURE.md §Implemented Specsmith System",
"status": "implemented",
"test_ids": [
"TEST-353"
]
},
{
"id": "REQ-354",
"title": "CodityAdapter Scaffolds AI Code Review CI Workflow",
"description": "specsmith MUST provide a CodityAdapter registered as 'codity' in the integrations registry. CodityAdapter.generate() MUST detect the VCS host from scaffold.yml content ('gitlab' keyword → gitlab, 'azure' keyword → azure, else github) and from directory heuristics (.gitlab-ci.yml → gitlab, azure-pipelines.yml → azure). For github it MUST write .github/workflows/codity-review.yml; for gitlab it MUST write .gitlab-ci-codity.yml; for azure it MUST write .azure-pipelines/codity-review.yml. All variants MUST install the Codity CLI via the official install script, run 'codity review --staged', and require CODITY_ACCESS_TOKEN. GitLab and Azure variants MUST additionally call 'codity config set-pat --provider <vcs>'. generate() MUST also write docs/codity-setup.md (one-time setup checklist) and append a TODO checklist to LEDGER.md if it exists. The adapter MUST be discoverable via specsmith integrate codity.",
"source": "ARCHITECTURE.md §39",
"status": "implemented",
"test_ids": [
"TEST-354",
"TEST-355"
]
},
{
"id": "REQ-355",
"title": "AGENTS.md Template Includes Codity.ai Pre-commit Rule",
"description": "The AGENTS.md Jinja2 template (agents.md.j2) MUST include a 'Codity.ai Code Review' section that instructs agents: if 'codity doctor' exits 0 (Codity is configured), run 'codity review --staged' before any commit touching production code; HIGH-severity findings are blocking; MEDIUM-severity findings require inline acknowledgement in the commit message; setup is via 'specsmith integrate codity --project-dir .'.",
"source": "ARCHITECTURE.md §39",
"status": "implemented",
"test_ids": [
"TEST-357"
]
},
{
"id": "REQ-356",
"title": "codity-ai-review Governance Skill in Skills Catalog",
"description": "specsmith MUST include a 'codity-ai-review' SkillEntry in the governance domain skills catalog. The skill MUST document: Codity CLI install command (curl install script), codity login (magic-link browser auth), codity init (per-repo initialisation), daily commands (review --staged, scan --staged, test-gen --staged, doctor), the AGENTS.md blocking rule (HIGH severity = commit blocked, MEDIUM = acknowledgement required), CI integration via specsmith integrate codity, GitHub App setup, GitLab PAT setup (codity config set-pat --provider gitlab), and Azure DevOps PAT setup. The skill MUST be tagged with codity, ai-review, code-review, security, test-gen, ci, github, gitlab, azure, staged, pre-commit and discoverable via specsmith skill list.",
"source": "ARCHITECTURE.md §39",
"status": "implemented",
"test_ids": [
"TEST-356"
]
}
]
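Reviewer sketch for REQ-343: the spawn, tracking, and timeout rules in that requirement can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not specsmith's actual implementation: the `run_tracked` signature, the PID-file fields, and the removal of the PID file on exit are all hypothetical. Only the use of `communicate(timeout=N)`, `stdin=DEVNULL`, the `.specsmith/pids/<pid>.json` path, `kill()` followed by `communicate()`, and `CREATE_NEW_PROCESS_GROUP` on Windows come from the requirement text.

```python
import json
import subprocess
import sys
from pathlib import Path


def run_tracked(cmd, timeout, pid_dir=Path(".specsmith/pids")):
    """Hypothetical tracked spawn; the signature and return shape are assumptions."""
    pid_dir.mkdir(parents=True, exist_ok=True)
    # CREATE_NEW_PROCESS_GROUP only exists on Windows, so it is read conditionally.
    flags = subprocess.CREATE_NEW_PROCESS_GROUP if sys.platform == "win32" else 0
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.DEVNULL,  # the child can never block waiting for input
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        creationflags=flags,
    )
    # Record the PID so a separate command could list or kill the process.
    # The JSON fields here are illustrative, not a documented schema.
    pid_file = pid_dir / f"{proc.pid}.json"
    pid_file.write_text(json.dumps({"pid": proc.pid, "cmd": cmd}))
    timed_out = False
    try:
        out, err = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        timed_out = True
        proc.kill()
        # Second communicate() drains the pipes and reaps the child.
        out, err = proc.communicate()
    finally:
        # Assumption: the PID file is removed once the process has exited.
        pid_file.unlink(missing_ok=True)
    return proc.returncode, out, err, timed_out
```

Calling `communicate()` again after `kill()` is the pattern the Python `subprocess` documentation recommends for timeouts; skipping it can leave an unreaped child and unread pipe buffers.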
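Reviewer sketch for REQ-347 and REQ-348: the git sequence those requirements describe can be expressed as a small pure function that only builds the command lists. The function name `discard_commands` and its parameters are hypothetical; the three git invocations are taken from the requirement text, and nothing here executes git.

```python
def discard_commands(branch: str, clean: bool = False) -> list[list[str]]:
    """Build the git commands for a hard reset to the remote branch.

    Hypothetical helper: specsmith's real pull implementation may be
    structured differently.
    """
    cmds = [
        # (1) bring the remote ref current
        ["git", "fetch", "origin", branch],
        # (2) hard-reset the working tree to the remote branch
        ["git", "reset", "--hard", f"origin/{branch}"],
    ]
    if clean:
        # --clean implies --discard and additionally removes untracked
        # files and directories.
        cmds.append(["git", "clean", "-fd"])
    return cmds
```

Keeping command construction separate from execution makes the `--clean implies --discard` rule easy to assert in a unit test without touching a real repository.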
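Reviewer sketch for REQ-350: the passthrough rule (copy `platform`, `boundary`, and `confidence` when present, omit them otherwise rather than writing null) might look like the following. The function name `build_entry`, the `id`/`title` base fields, and the choice to treat an explicit YAML null as absent are assumptions made for illustration.

```python
EPISTEMIC_FIELDS = ("platform", "boundary", "confidence")


def build_entry(yaml_req: dict) -> dict:
    """Hypothetical conversion of one parsed YAML requirement to a JSON entry."""
    entry = {"id": yaml_req["id"], "title": yaml_req["title"]}
    for field in EPISTEMIC_FIELDS:
        value = yaml_req.get(field)
        # Assumption: an explicit null in YAML is treated the same as a
        # missing key, so the JSON entry never carries a null value.
        if value is not None:
            entry[field] = value
    return entry
```

With this shape, a requirement that declares only `confidence` produces an entry containing `confidence` and neither a `platform` nor a `boundary` key.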