Signum has multiple documentation surfaces, but they do not have equal authority.
- Canonical pipeline behavior: root `commands/signum.md`
- Platform-specific overlays: `platforms/*/commands/signum.md`
- Derived docs: `README.md`, `docs/how-it-works.md`, `docs/reference.md`, roadmap docs
If a derived doc disagrees with root commands/signum.md, treat the command file as the source of truth. Platform overlays may add surface-specific behavior, but they should be read as explicit deviations from the root pipeline rather than independent definitions of core Signum behavior.
`platforms/claude-code/commands/signum.md` currently adds Phase 5: RECONCILE after PACK.
- This is an overlay-only deviation, not canonical core pipeline behavior.
- Treat it as valid only while it remains listed in `docs/overlay-deviations.json`.
/signum <task description>
Signum parses the task description and runs the full 4-phase pipeline automatically.
/signum add a health check endpoint that returns 200 OK
Pipeline: contractor → baseline → engineer (1 attempt) → scope gate → mechanic + Claude review → proofpack. Estimated cost: ~$0.10-0.20.
/signum add user authentication with JWT tokens
Pipeline: contractor → baseline → engineer (up to 3 repair attempts) → scope gate → mechanic + holdouts + Claude + Codex (security) + Gemini (performance) → synthesizer → proofpack. Estimated cost: ~$0.30-0.60.
/signum migrate user table from MongoDB to PostgreSQL
Pipeline: same as medium but contractor flags high risk with risk signals and holdout scenarios. All 3 model reviews weighted equally in synthesis. Estimated cost: ~$0.50-1.00.
```
# Start a pipeline
/signum refactor the payment module
# ...interrupt (Ctrl+C or close session)...
# Reopen and run the same command
/signum refactor the payment module
# Signum checks contracts/index.json.activeContractId and asks: resume from Phase 2, or restart?
```
CONTRACT → EXECUTE → AUDIT → PACK
Contractor agent (haiku) scans the codebase and produces contract.json under the active contract artifact root (.signum/contracts/<contractId>/), with a root .signum/contract.json compatibility path during the migration.
Hard stop if openQuestions is non-empty — the user must answer before proceeding.
- Baseline capture — orchestrator runs lint/typecheck/tests BEFORE any changes and saves `baseline.json` under the active contract artifact root.
- Engineer agent (sonnet) implements the contract. Repair loop: up to 3 attempts of implement → check acceptance criteria → fix failures.
- Scope gate — deterministic check that all modified files are within `inScope` or `allowNewFilesUnder`. Pipeline stops on scope violation.
Outputs under the active contract artifact root: baseline.json, combined.patch, execute_log.json.
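
The scope gate can be pictured with a minimal Python sketch. It assumes that `inScope` and `allowNewFilesUnder` entries are literal paths or directory prefixes (not globs), that edits to existing files must fall under `inScope`, and that new files may fall under either list. The function names are hypothetical and are not taken from Signum itself.

```python
from pathlib import PurePosixPath


def is_under(path: str, root: str) -> bool:
    """True when path equals root or sits inside it (match on whole path parts)."""
    p, r = PurePosixPath(path).parts, PurePosixPath(root).parts
    return p[: len(r)] == r


def scope_violations(modified, new_files, in_scope, allow_new_files_under=()):
    """Return the changed files that fall outside the declared scope (assumed semantics)."""
    violations = []
    for path in modified:
        # Existing files must sit under an inScope entry.
        if not any(is_under(path, entry) for entry in in_scope):
            violations.append(path)
    allowed_for_new = list(in_scope) + list(allow_new_files_under)
    for path in new_files:
        # New files may also live under allowNewFilesUnder.
        if not any(is_under(path, entry) for entry in allowed_for_new):
            violations.append(path)
    return violations
```

An empty result would let the pipeline continue; any entry in the list corresponds to the "Pipeline stops on scope violation" case.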
Five independent verification layers:
- Mechanic (bash, zero LLM) — runs linter, typechecker, tests. Compares with baseline to detect regressions vs pre-existing failures.
- Holdout validation — runs hidden acceptance criteria the Engineer never saw (edge cases, negative tests from contract).
- Claude reviewer (opus agent) — semantic review of contract + diff + mechanic results.
- Codex reviewer (CLI, security-focused) — analyzes diff for security defects using `review-template-security.md`.
- Gemini reviewer (CLI, performance-focused) — analyzes diff for performance defects using `review-template-performance.md`.
Synthesizer agent applies deterministic rules:
- AUTO_OK: no regressions + all reviews APPROVE + 2+ reviews parsed + holdouts pass
- AUTO_BLOCK: any regression (NEW failure vs baseline) OR any REJECT OR any CRITICAL finding
- HUMAN_REVIEW: everything else (mixed signals, only 1 review, CONDITIONAL verdicts, holdout failures)
Pre-existing failures (checks that failed in baseline AND still fail) no longer auto-block.
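
The three rules above can be sketched as a small Python function. This is an illustration, not the synthesizer's actual code: the function name and parameters are hypothetical, and checking the AUTO_BLOCK conditions first is an assumption about precedence that the rules do not state explicitly.

```python
def synthesize(regressions, verdicts, has_critical, holdouts_pass):
    """Apply the decision rules, checking blocking conditions first (assumed order).

    regressions   -- number of NEW failures versus baseline
    verdicts      -- parsed review verdicts: "APPROVE", "REJECT" or "CONDITIONAL"
    has_critical  -- True when any review reports a CRITICAL finding
    holdouts_pass -- True when the holdout scenarios passed
    """
    if regressions > 0 or "REJECT" in verdicts or has_critical:
        return "AUTO_BLOCK"
    all_approve = bool(verdicts) and all(v == "APPROVE" for v in verdicts)
    if all_approve and len(verdicts) >= 2 and holdouts_pass:
        return "AUTO_OK"
    # Mixed signals, a single review, CONDITIONAL verdicts, holdout failures.
    return "HUMAN_REVIEW"
```

Pre-existing failures never reach `regressions` in this sketch, because only failures that are new relative to the baseline are counted.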
Assembles proofpack.json under the active contract artifact root — a self-contained evidence bundle with embedded artifact contents, SHA-256 checksums, and confidence score.
Canonical run artifacts live under the active contract artifact root .signum/contracts/<contractId>/. Root .signum/ stays auto-added to .gitignore and now mainly holds registry/state surfaces plus compatibility views during the contract-dir migration. The contract, pre-execute metadata, execute outputs, selected audit/pack file artifacts (contract.json, spec_quality.json, spec_validation.json, clover_report.json, intent_check.json, approval.json, contract-hash.txt, contract-engineer.json, contract-policy.json, execution_context.json, baseline.json, combined.patch, execute_log.json, iteration_delta.patch, mechanic_report.json, holdout_report.json, policy_violations.json, policy_scan.json, audit_iteration_log.json, repair_brief.json, flaky_tests.json, audit_summary.json, proofpack.json, anti_entropy_report.json), and active run directories (reviews/, iterations/, receipts/, runs/, snapshots/) are canonical under that contract directory.
| File | Phase | Contents |
|---|---|---|
| `contract.json` | Contract | Goal, scope, acceptance criteria, holdout scenarios, risk level |
| `baseline.json` | Execute | Pre-change lint/typecheck/test exit codes |
| `combined.patch` | Execute | Full git diff of all changes |
| `execute_log.json` | Execute | Attempt history, check results, status |
| `mechanic_report.json` | Audit | Lint, typecheck, test results with baseline comparison and regression flags |
| `holdout_report.json` | Audit | Holdout scenario pass/fail counts |
| `reviews/claude.json` | Audit | Claude opus semantic review |
| `reviews/codex.json` | Audit | Codex CLI security review (or unavailable marker) |
| `reviews/gemini.json` | Audit | Gemini CLI performance review (or unavailable marker) |
| `audit_summary.json` | Audit | Synthesized decision with consensus reasoning and confidence scores |
| `proofpack.json` | Pack | Self-contained evidence bundle with embedded artifacts, checksums, and confidence |
| `anti_entropy_report.json` | Pack | Advisory anti-entropy follow-up findings; report-only, does not change pipeline decision |
Canonical contract directories typically contain:
- `contract.json`
- `proofpack.json`
- `anti_entropy_report.json`
- `audit_summary.json`
- `approval.json`
- `execution_context.json`
- `reviews/`
- `iterations/`
- `receipts/`
- `runs/<runId>/`
- `snapshots/`
| Field | Type | Description |
|---|---|---|
| `schemaVersion` | `"3.0"`–`"3.8"` | Schema version |
| `glossaryVersion` | string | Version from project.glossary.json at contract creation time (optional, omitted when file absent) |
| `goal` | string | What to build (min 10 chars) |
| `inScope` | string[] | Items in scope (min 1) |
| `allowNewFilesUnder` | string[] | Directories where new files may be created (optional) |
| `outOfScope` | string[] | Explicitly excluded items |
| `acceptanceCriteria` | object[] | AC-N items with verify commands |
| `holdoutScenarios` | object[] | Hidden ACs not shown to Engineer (optional) |
| `riskLevel` | `low\|medium\|high` | Deterministic risk assessment |
| `riskSignals` | string[] | Why risk level was assigned |
| `openQuestions` | string[] | Must be empty to proceed |
| `contextInheritance` | object | Project context references (optional) |
| `contextInheritance.projectRef` | string\|null | Path to project.intent.md, "not_found", null (waiver), or absent (legacy) |
| `contextInheritance.projectIntentSha256` | string | SHA-256 of project.intent.md at contract creation |
| `contextInheritance.contextSnapshotHash` | string | SHA-256 hex digest over concatenated byte contents of all staleIfChanged files in array order, computed at contract creation time |
| `contextInheritance.staleIfChanged` | string[] | Upstream artifact paths tracked for staleness; at minimum includes project.intent.md when loaded |
| `contextInheritance.stalenessStatus` | `"fresh"\|"warning"\|"stale"` | Current staleness state: fresh=hash matches, warning=soft mismatch, stale=hash differs and policy=block |
| `contextInheritance.stalenessPolicy` | `"block"\|"warn"` | Action when upstream hash differs: block=halt pipeline (BLOCK), warn=continue with warning (default: "warn") |
| `dependsOnContractIds` | string[] | ContractIds that must complete before this contract executes (user-declared, optional) |
| `supersedesContractIds` | string[] | ContractIds this contract replaces (user-declared, optional) |
| `supersededByContractId` | string | ContractId of the contract that replaces this one (optional) |
| `interfacesTouched` | string[] | Named interfaces, APIs, or module boundaries this contract modifies (optional) |
| `ambiguityCandidates` | object[] | Typed findings from ambiguity review pass: {text, location, severity} (optional, v3.7+) |
| `contradictionsFound` | object[] | Typed findings from contradiction review: {claim_a, claim_b, type} (optional, v3.7+) |
| `clarificationDecisions` | object[] | Decisions made during critique: {question, decision, rationale} (optional, v3.7+) |
| `assumptionProvenance` | object[] | Source tracking for assumptions: {id, text, source, confidence} (optional, v3.7+) |
| `readinessForPlanning` | object | Go/no-go gate: {verdict: "go"\|"no-go", summary: string} (optional, v3.7+) |
Optional file at PROJECT_ROOT/project.glossary.json. When present, contractor reads it and sets glossaryVersion in the contract.
```json
{
  "version": "1.0.0",
  "canonicalTerms": ["term1", "term2", "..."],
  "aliases": {
    "forbidden-synonym": "canonical-term",
    "another-synonym": "another-canonical"
  }
}
```

| Field | Type | Description |
|---|---|---|
| `version` | string | Glossary version string (mirrors glossaryVersion in contract) |
| `canonicalTerms` | string[] | Approved terminology for this project |
| `aliases` | object | Map of forbidden synonyms to their canonical replacements |
All Phase 1 quality checks are standalone shell scripts in lib/. Each follows the same interface:
```
lib/<check>.sh <contract.json> [--flag value ...]
stdout: {"check":"<name>","status":"ok|warn|block|skip|error","summary":"...","findings":[...]}
exit 0: check completed (any status)
exit 1+: infra error (bad args, missing jq, corrupt input)
```
| Script | Purpose | Extra args |
|---|---|---|
| `lib/glossary-check.sh` | Forbidden synonym scan | `--glossary <path>` |
| `lib/terminology-check.sh` | Cross-contract synonym proliferation | `--index <path> --glossary <path>` |
| `lib/overlap-check.sh` | inScope overlap between active contracts | `--index <path>` |
| `lib/assumption-check.sh` | Assumption contradiction detection | `--index <path>` |
| `lib/adr-check.sh` | ADR relevance for inScope paths | `--project-root <dir>` |
| `lib/staleness-check.sh` | Upstream artifact staleness (pure, no mutation) | `--project-root <dir>` |
| `lib/prose-check.sh` | Prose quality gate (banned phrases, quantifiers, passive voice) | — |
The orchestrator (commands/signum.md) calls each script, reads JSON output, merges findings into spec_quality.json, and applies mutations/blocking decisions. Scripts never modify contract.json or spec_quality.json directly.
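
As a rough illustration of that contract, the following Python sketch interprets one script run from its exit code and stdout, then groups findings by check name. The helper names and the way an infra error is represented are assumptions made for the example; the real orchestrator is the command file, not Python.

```python
import json

VALID_STATUSES = {"ok", "warn", "block", "skip", "error"}


def parse_check_result(exit_code: int, stdout: str) -> dict:
    """Interpret one check script run according to the shared interface."""
    if exit_code != 0:
        # Non-zero exit means an infra error, so stdout is not trusted (assumed handling).
        return {
            "check": None,
            "status": "error",
            "summary": f"infra error (exit {exit_code})",
            "findings": [],
        }
    result = json.loads(stdout)
    if result.get("status") not in VALID_STATUSES:
        raise ValueError(f"unexpected status: {result.get('status')!r}")
    return result


def merge_findings(results):
    """Group findings by check name, roughly what a merge into spec_quality.json needs."""
    return {r["check"]: r.get("findings", []) for r in results if r.get("check")}
```

Keeping the scripts read-only and letting a single caller merge results means each check can be run and tested in isolation.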
Runs during Phase 1 spec quality gate (after the adr_relevance_check). Skipped when contextInheritance.staleIfChanged is absent or empty.
When staleIfChanged is a non-empty array, the check always executes:
- Concatenates the byte contents of all files listed in `staleIfChanged` (in array order)
- Computes SHA-256 of the concatenated bytes
- Compares the result to `contextInheritance.contextSnapshotHash`
Outcome depends on contextInheritance.stalenessPolicy (default "warn"):
| Hash result | Policy | Outcome |
|---|---|---|
| Matches | any | `fresh` — pipeline continues |
| Differs | `"warn"` | `warning` — WARN emitted, pipeline continues |
| Differs | `"block"` | `stale` — BLOCK emitted, pipeline stops; re-run Contractor to refresh |
After the check, the orchestrator updates contextInheritance.stalenessStatus in place in contract.json; the staleness script itself only reports the result and does not modify the file.
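
The hash and the policy mapping can be sketched in a few lines of Python. The function names are hypothetical, paths are assumed to be relative to the project root, and a missing file simply raises here because the source does not say how that case is handled.

```python
import hashlib
from pathlib import Path


def context_snapshot_hash(project_root, stale_if_changed):
    """SHA-256 hex digest over the concatenated bytes of the files, in array order."""
    digest = hashlib.sha256()
    for rel_path in stale_if_changed:
        # Feeding chunks in order is equivalent to hashing the concatenation.
        digest.update(Path(project_root, rel_path).read_bytes())
    return digest.hexdigest()


def staleness_status(current_hash, snapshot_hash, policy="warn"):
    """Map the hash comparison and the policy onto fresh / warning / stale."""
    if current_hash == snapshot_hash:
        return "fresh"
    return "stale" if policy == "block" else "warning"
```

Because the digest depends on array order, reordering `staleIfChanged` without changing any file would still produce a different hash.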
Runs during Phase 1 spec quality gate (Step 1.3.5). Scans the contract's goal, inScope items, and AC description fields for any term appearing in the aliases map (case-insensitive whole-word match). Emits a WARN line for each match with the forbidden term and its canonical replacement. Results are written to glossary_warnings in spec_quality.json. This check is non-blocking — it never fails the pipeline or reduces the numeric spec quality score.
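
A case-insensitive whole-word alias scan could look like the Python sketch below. It is an approximation: the function name and the WARN wording are invented for illustration, and it emits one warning per alias rather than one per occurrence.

```python
import re


def glossary_warnings(texts, aliases):
    """Return one WARN string per forbidden alias found as a whole word."""
    warnings = []
    for alias, canonical in aliases.items():
        # Lookarounds keep "sign" from matching inside "designing".
        pattern = re.compile(r"(?<!\w)" + re.escape(alias) + r"(?!\w)", re.IGNORECASE)
        if any(pattern.search(text) for text in texts):
            warnings.append(f'WARN: "{alias}" is a forbidden synonym; use "{canonical}"')
    return warnings
```

Here `texts` would hold the contract's goal, inScope items, and AC descriptions, and `aliases` the map from project.glossary.json.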
Runs during Phase 1 spec quality gate (Step 1.3.5) after glossary_check. Reads .signum/contracts/index.json, extracts goal text from active contracts, and scans for synonym proliferation (same concept appearing under two different terms across contracts). Emits WARN lines on synonym proliferation. When .signum/contracts/index.json is absent or contains no contracts with active status, the check outputs a skip message and does not block or fail. This check is non-blocking.
Runs during Phase 1 spec quality gate. Reads .signum/contracts/index.json, compares the new contract's inScope against active contracts' inScope arrays. Emits WARN when files overlap with another active contract, listing the overlapping files and the conflicting contract ID. Skips gracefully when index is absent or has no active contracts. Non-blocking.
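
In the simplest case the overlap comparison reduces to a set intersection, as in this Python sketch. It assumes each active contract record exposes `contractId` and `inScope` keys and that entries are compared by exact string match; both are assumptions, since the index.json layout is not described here.

```python
def overlap_warnings(new_contract, active_contracts):
    """Return one WARN string per active contract sharing inScope entries."""
    new_scope = set(new_contract["inScope"])
    warnings = []
    for other in active_contracts:
        shared = sorted(new_scope & set(other["inScope"]))
        if shared:
            # Hypothetical message format; lists the files and the conflicting contract.
            warnings.append(f"WARN: overlaps with {other['contractId']}: {', '.join(shared)}")
    return warnings
```

An empty list of active contracts yields no warnings, which matches the graceful skip described above.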
Runs during Phase 1 spec quality gate after cross_contract_overlap_check. Reads assumptions from the new contract and compares against assumptions of active contracts in index.json. Emits WARN when assumption text contains contradictory terms (e.g., one contract assumes "X is true" while another assumes "X is false"). Non-blocking.
Runs during Phase 1 spec quality gate. Scans for docs/adr/ or docs/decisions/ directories. If ADR files exist and the contract's inScope touches paths that match ADR file globs, emits WARN suggesting the contract reference relevant ADRs. Skips when no ADR directories exist. Non-blocking.
When AUDIT finds MAJOR or CRITICAL issues, it enters an iterative repair loop:
- Engineer fixes findings (fresh agent, clean context)
- Full review cycle re-runs from scratch
- Repeats until convergence or max iterations
| Environment Variable | Default | Description |
|---|---|---|
| `SIGNUM_AUDIT_MAX_ITERATIONS` | `20` | Maximum audit fix iterations before terminal decision |
| `SIGNUM_CI_RELAXED` | `false` | If `"true"`, HUMAN_REVIEW maps to exit 0 instead of 78 |
Iteration artifacts are stored under the active contract artifact root, for example .signum/contracts/<contractId>/iterations/01/, .signum/contracts/<contractId>/iterations/02/, etc. Each contains the full set of audit artifacts for that pass.
The proofpack includes an iterativeAudit section when >1 iteration was used, with per-iteration summaries, resolved/remaining findings, and the best iteration number.
| Field | Type | Description |
|---|---|---|
| `schemaVersion` | `"4.6"` | Schema version (v4.6 adds iterativeAudit, ciContext, baselineComparison, contractSource) |
| `signumVersion` | string | Signum version that generated this proofpack |
| `createdAt` | string | ISO 8601 timestamp of proofpack creation |
| `runId` | string | signum-YYYY-MM-DD-XXXXXX |
| `decision` | `AUTO_OK\|AUTO_BLOCK\|HUMAN_REVIEW` | Final verdict |
| `summary` | string | One-line human-readable summary |
| `confidence` | object | { overall: 0-100 } — weighted confidence score |
| `auditChain` | object | { contractSha256, approvedAt, baseCommit } — immutable audit anchors |
| `contract` | envelope | Redacted contract (holdouts stripped), fullSha256 for original |
| `diff` | envelope | Patch content (omitted if >100KB) |
| `baseline` | envelope | Pre-change lint/typecheck/test results |
| `executeLog` | envelope | Attempt history and check results |
| `checks.mechanic` | envelope | Lint, typecheck, test with regression flags |
| `checks.holdout` | envelope | Holdout scenario pass/fail (if applicable) |
| `checks.reviews.*` | envelope | Per-provider review (dynamic keys) |
| `checks.auditSummary` | envelope | Synthesized decision with confidence |
| `iterativeAudit` | object | Iteration metadata (v4.6+, present only when >1 iteration) |
| `iterativeAudit.iterationsUsed` | integer | Total iterations run |
| `iterativeAudit.bestIteration` | integer | Iteration with best score |
| `iterativeAudit.auditIterations` | array | Per-iteration summaries (score, findings count, decision) |
| `iterativeAudit.resolvedFindings` | array | Findings fixed during iterations |
| `iterativeAudit.remainingFindings` | array | Findings still present after all iterations |
Each artifact uses the envelope format: { content, sha256, sizeBytes, status, omitReason? }.
- `status: present` — content embedded
- `status: omitted` — content null, validate by sha256
- `status: error` — generation failed, see omitReason
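
To make the envelope shape concrete, here is a hedged Python sketch that builds and verifies one. It assumes `content` is a text string hashed as UTF-8 bytes; the `max_bytes` parameter, the `omitReason` wording, and both function names are hypothetical and may differ from what Signum actually writes.

```python
import hashlib


def make_envelope(content, max_bytes=None):
    """Wrap text in the { content, sha256, sizeBytes, status, omitReason? } shape."""
    raw = content.encode("utf-8")
    envelope = {
        "content": content,
        "sha256": hashlib.sha256(raw).hexdigest(),
        "sizeBytes": len(raw),
        "status": "present",
    }
    if max_bytes is not None and len(raw) > max_bytes:
        # Drop the body but keep the checksum so the artifact can be validated by sha256.
        envelope.update(content=None, status="omitted", omitReason="size limit exceeded")
    return envelope


def verify_envelope(envelope):
    """True when embedded content matches its recorded checksum and size."""
    if envelope["status"] != "present":
        return False  # nothing embedded to verify in this sketch
    raw = envelope["content"].encode("utf-8")
    return (
        hashlib.sha256(raw).hexdigest() == envelope["sha256"]
        and len(raw) == envelope["sizeBytes"]
    )
```

For an omitted envelope, a consumer would instead hash the artifact file on disk and compare it with the recorded `sha256`.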
The synthesizer computes a weighted confidence score (0-100):
| Component | Weight | Source |
|---|---|---|
| `execution_health` | 40% | ACs passed ratio minus repair attempt penalty |
| `baseline_stability` | 30% | Proportion of checks with no regressions |
| `review_alignment` | 30% | Reviewer agreement level (100=unanimous approve, 0=no approvals) |
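
The weighting amounts to a plain weighted sum. The Python sketch below assumes each component is already expressed on a 0-100 scale; rounding to an integer and clamping to that range are assumptions, not documented behavior.

```python
WEIGHTS = {
    "execution_health": 0.40,
    "baseline_stability": 0.30,
    "review_alignment": 0.30,
}


def overall_confidence(components):
    """Weighted sum of the three component scores, each on a 0-100 scale."""
    score = sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
    # Clamp and round so the result fits the documented 0-100 range.
    return round(max(0.0, min(100.0, score)))
```

For example, component scores of 80, 100, and 50 give 0.4·80 + 0.3·100 + 0.3·50 = 77.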
Each reviewer produces:
```json
{
  "verdict": "APPROVE|REJECT|CONDITIONAL",
  "findings": [
    {
      "severity": "CRITICAL|MAJOR|MINOR",
      "category": "bug|security|performance|spec-gap|missing-test",
      "file": "src/auth.ts",
      "line": 42,
      "description": "...",
      "suggestion": "..."
    }
  ],
  "summary": "..."
}
```

| Dependency | Required | Purpose |
|---|---|---|
| Claude Code | Yes | Runtime environment |
| git | Yes | Diff generation, scope gate |
| jq | Yes | JSON validation and assembly |
| python3 | Yes | Review prompt template substitution |
| sha256sum or shasum | Yes | Checksum computation (auto-detected) |
| Codex CLI | No | Security-focused review in AUDIT phase |
| Gemini CLI | No | Performance-focused review in AUDIT phase |
Install jq:
- macOS: `brew install jq`
- Ubuntu/Debian: `apt install jq`
- Other: jq downloads
```
codex: auth expired → run: codex auth
gemini: auth expired → run: gemini login
```
Signum continues without the provider if auth fails.
External providers are killed after 180 seconds. The review continues with remaining providers. Check reviews/ under the canonical artifact root for provider status.
Normal behavior. Signum detects existing contract.json and offers:
- Resume: continue from Phase 2
- Restart: clear artifacts, start fresh
In jj-managed repositories, the contractor can detect ghost solutions — functions that are semantically superseded but still present in the codebase. This requires jj-supersede:
```
uv tool install jj-supersede
```

When both jj and jj-supersede are available, the contractor automatically:
- Runs `jj-supersede report --json` during CONTRACT phase (step 1.8)
- Generates `removals` entries with `type: "function"` for superseded functions
- Creates non-blocking `cleanupObligations` with `action: "remove_code"`
If jj-supersede is not installed or the project is not a jj repo, this step is silently skipped. No configuration needed.
- Verify installation: `claude plugin list | grep signum`
- Reinstall: `claude plugin install signum@emporium`
- Open a new Claude Code session (plugins load at session start)