Merged
79 changes: 79 additions & 0 deletions canon/methods/discernment-transfer-ladder.md
@@ -0,0 +1,79 @@
---
uri: klappy://canon/methods/discernment-transfer-ladder
title: "Method: The Discernment Transfer Ladder — Empirical Rulebook Transfer Across Adjacent Tiers"
audience: canon
exposure: nav
tier: 2
voice: neutral
stability: provisional
target_repo: "outcomes-driven-development"
date: 2026-06-30
derives_from: "canon/principles/rulebook-transfer, canon/principles/discernment-layer, canon/values/axioms, canon/constraints/measure-before-you-object, canon/methods/revision-lens-sequence, canon/principles/verification-requires-fresh-context"
status: proposed
---

# Method: The Discernment Transfer Ladder

> Discernment a frontier model performs once can be written down as a rulebook a lesser model runs — `rulebook-transfer` states the principle. This method is the empirical procedure that *derives* that rulebook from real evidence, and the observation that the same procedure runs at every adjacent rung of a ladder whose top is a human and whose floor is reality itself.

> This method **operationalizes** `klappy://canon/principles/rulebook-transfer` (Tier-1 principle, E0010). It does not restate the theory — it is the empirical *procedure* that derives and validates the rulebooks the principle describes. All theory (articulability, step-size, the execution/stewardship split, the authority asymmetry) defers to that principle.

## Summary

A task that a higher tier performs by judgment can be handed to a lower tier — but only by writing the judgment down, only between **adjacent** tiers, and only for the part of the judgment that is writable at all. The transfer is not assumed; it is **measured and compensated** in a loop. The same loop runs at every rung: human → SOTA model → mid model → small model. Each rung treats the tier above as its reference and the tier below as its student. The ladder bottoms out not at the human but at **reality** — the only sovereign reference (Axiom 1). The output is a per-task **tier-assignment map**: the cheapest tier that clears the fidelity bar for each task, and the rulebook that gets it there.

## The recursion, who authors, and the bottleneck

The theory — n-tier recursion, why hops have a maximum step size, the split between **execution capacity** (run a rulebook from above; extends far down) and **stewardship capacity** (author a rulebook *for the tier below*; scarcer, fewer rungs) — lives in `rulebook-transfer` and is not restated here. Canon names the dominant unresolved question: **star or chain** — does authoring concentrate at the frontier (one steward writes for everyone) or re-instantiate at each tier (each authors for the one below)?

This method answers it not as a *capability* question but as a *constraint* one, per the operating contract's third pillar (operator attention is the bottleneck). The scarcest resource is not model capability — it is the human's stewardship time. So the topology is decided by **availability × proven stewardship**, not tier-distance:

- **The human authors only for the frontier.** Stewardship-capable but maximally scarce, the human spends their authoring time only where it is irreplaceable: writing the rulebooks the top models run (Fable, Opus). Human authoring time does not stretch to the lower tiers, and the human should not try to spend it there.
- **The frontier authors for everyone below.** Fable and Opus have *observed* stewardship capacity and *abundant* availability, so the authoring of lower-tier rulebooks (for Sonnet, Haiku) is delegated to them. Authoring labor flows to the highest tier that has both proven stewardship and spare time.
- **Mid-tier stewardship is unproven and not depended on.** There is no evidence yet that Sonnet can author a working rulebook for Haiku. The topology does not assume it: if the frontier has the availability, the chain need not route authoring through the mid tier at all.

The result is a **shallow star**: human → {frontier}, thin and bottleneck-rationed; {frontier} → {everything below}, abundant and delegated. This is canon's star/chain question answered by the constraint — *author as high as the bottleneck forces, and delegate authoring downward to the most-capable tier that has time.*

Evidence so far, from this session only: the v2 compensation rulebook was authored by the top tier and run directly by Haiku, which supports **frontier → bottom authoring** (the star arm). Whether a mid tier can author for the tier below it (the chain arm) was not tested. The COO project is itself an instance of the top arm: Wickman, Grove, Geary, and Gawande are human experts who wrote their discernment into **books**, which are frontier-grade rulebooks, and the model now runs them.

## The loop at a single rung

1. **Establish the reference and the answer key — and keep them separate.** Run the task on the higher tier to produce a reference output. Separately establish a held-out, ground-truth answer key checked against reality (or the closest available proxy). The higher tier's output seeds the rulebook's worked examples; it is **not** the answer key. Tuning the student to match the reference transfers the reference's *errors* too.
2. **Measure the gap on two axes.** *Structural* — counts, tiering, budget adherence, edge integrity; cheap and scriptable. *Semantic fidelity* — is the content true to the source/reality; requires an independent grader with fresh context (`verification-requires-fresh-context`). Structural pass without fidelity pass is shape posing as substance.
3. **Write the missing discernment down.** Diagnose the gap and compensate the rulebook with explicit, writable rules: hard budgets and caps, decision tests, anti-fragmentation rules, worked examples at the target grain, and self-count gates the student runs before returning.
4. **Re-run the student AND regression-run the reference.** Confirm the student converged — and that the new constraints did not lower the higher tier's ceiling. A compensation that helps the weak tier can over-constrain the strong one; lift the floor without dropping the ceiling.
5. **Loop over UNSEEN cases, and know when to stop.** Tune on case A, then run the *frozen* rulebook on an unseen case B (train/test split) — convergence on the unseen case is the only honest signal; tuning and testing on the same case is memorization. If after N rounds a capability still will not transfer to a tier, that capability is **not writable-down for that tier**: assign it permanently to a higher tier and stop spending on compensation.
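
The control flow of steps 3 to 5 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method's implementation: `Rulebook`, `transfer_loop`, and the three callables (`run_student`, `passes`, `compensate`) are hypothetical names, and the toy student at the bottom exists only to make the sketch runnable. The regression run of the reference (step 4) is omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple


@dataclass
class Rulebook:
    """Hypothetical container: a version counter plus the written-down rules."""
    version: int = 1
    rules: List[str] = field(default_factory=list)


def transfer_loop(
    run_student: Callable[[Rulebook, Any], Any],
    passes: Callable[[Any, Any], bool],
    compensate: Callable[[Rulebook, Any], Rulebook],
    train_case: Any,
    unseen_case: Any,
    max_rounds: int,
) -> Tuple[str, Rulebook]:
    """Tune on train_case, then run the frozen rulebook once on unseen_case."""
    rulebook = Rulebook()
    for _ in range(max_rounds):
        output = run_student(rulebook, train_case)
        if passes(output, train_case):
            break  # converged on the tuning case; stop editing the rulebook
        rulebook = compensate(rulebook, output)  # write the missing rule down
    else:
        # N rounds spent without convergence: treat as not writable for this tier.
        return "assign_up_tier", rulebook

    # The rulebook is frozen here; the unseen case is never used for tuning.
    unseen_output = run_student(rulebook, unseen_case)
    if passes(unseen_output, unseen_case):
        return "transferred", rulebook
    return "memorized", rulebook


# Toy stand-ins (assumptions, for illustration only).
def toy_student(rulebook: Rulebook, case: dict) -> int:
    # Over-produces rows until a budget rule has been written down.
    return case["target"] if "row budget" in rulebook.rules else case["target"] * 2


def toy_passes(output: int, case: dict) -> bool:
    return output == case["target"]


def toy_compensate(rulebook: Rulebook, output: int) -> Rulebook:
    return Rulebook(rulebook.version + 1, rulebook.rules + ["row budget"])


status, book = transfer_loop(
    toy_student, toy_passes, toy_compensate,
    train_case={"target": 41}, unseen_case={"target": 30}, max_rounds=3,
)
print(status, book.version)
```

Keeping the unseen case outside the tuning loop is the design point: a failure there is reported as `memorized` rather than fed back into compensation, which would turn the test case into a second training case.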

## Ground truth is tiered; reality is sovereign

At each rung the reference is the tier-above's *approximation*, not Truth. The human is the closest rung to reality but is still approximating it. Therefore: validate against **reality** wherever it can be reached (does the checklist actually work in the field; did the distilled procedure produce the right action), and fall back to the tier-above only when reality is out of reach. This is Axiom 1 applied to the ladder — observe before asserting, at every rung.

This is the same loop Gawande prescribes for checklists: *first drafts fall apart in the real world; test, study the failures, revise, re-test until it works.* The method we are encoding is checklist-design applied to the transfer of discernment itself — and the COO foldout that surfaced it contains the very row (`tra-`/`gaw-` test-and-refine) that states it. The method is self-evidencing.

## Output: the tier-assignment map

The loop does not crown one model. It produces a portfolio: for each task or lens, the cheapest tier that clears the fidelity bar, plus the rulebook that gets it there, plus the capabilities that refused to transfer and stay up-tier. Cost selection becomes a measured decision (`measure-before-you-object`), not a guess.
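
As a rough sketch of what such a map could look like in code: the function below picks, per task, the first tier in a cheapest-first list whose measured fidelity clears the bar. The function name, the cost ordering, the 0.9 bar, and every score in the example are assumptions made up for illustration; none of them are measurements from this session.

```python
from typing import Dict, List, Optional

# Assumed cost ordering, cheapest first. The human sits above this list.
TIERS_CHEAPEST_FIRST: List[str] = ["Haiku", "Sonnet", "Opus"]


def assign_tiers(
    fidelity: Dict[str, Dict[str, float]], bar: float
) -> Dict[str, Optional[str]]:
    """Map each task to the cheapest tier whose fidelity score clears the bar.

    A value of None means no model tier cleared the bar, so the task
    stays up-tier (with the human).
    """
    assignment: Dict[str, Optional[str]] = {}
    for task, scores in fidelity.items():
        assignment[task] = next(
            (tier for tier in TIERS_CHEAPEST_FIRST if scores.get(tier, 0.0) >= bar),
            None,
        )
    return assignment


# Illustrative scores only (hypothetical, not measured).
example_scores = {
    "fold_chapter": {"Haiku": 0.95, "Sonnet": 0.96, "Opus": 0.98},
    "cross_book_reconciliation": {"Haiku": 0.40, "Sonnet": 0.60, "Opus": 0.92},
    "ratify_tier1_canon": {"Haiku": 0.10, "Sonnet": 0.20, "Opus": 0.50},
}

tier_map = assign_tiers(example_scores, bar=0.9)
for task, tier in tier_map.items():
    print(f"{task}: {tier if tier else 'stays up-tier'}")
```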

## Evaluation transfers too — as bounded delegated stewardship

Production is not the only thing that moves down the ladder. **Evaluation and ratification** — the act of judging an output, shaping it, and admitting it to a store — transfer by the same loop. But they transfer under a shape canon already names. Per `klappy://canon/decisions/models-do-not-mutate-canon`: *a model may hold delegated, bounded, revocable stewardship over a sub-scope — governing within it, never above it, never over itself.* The empirical loop is how that stewardship is **earned**: the steward's verdicts are measured against the tier above (and reality), its evaluation rulebook is compensated, and only once it converges on unseen cases is the authority granted — and it stays revocable.

Three guardrails travel with the delegation, each already in canon:

- **Within scope, never above it (domain bound).** A steward ratifies only artifacts inside the sub-scope it was granted. The COO agent may ratify COO-domain state; it does not ratify universal canon or another domain's. Higher-tier artifacts carry higher epistemic obligation (`klappy://canon/definitions/epistemic-obligation-and-document-tiers`), and the obligation rises faster than the delegation does.
- **Never over itself (independence).** A steward cannot ratify its own production. The critic cannot be the resolver and the creator cannot be its own critic (`klappy://canon/constraints/critic-cannot-be-resolver`, `klappy://canon/principles/verification-requires-fresh-context`). Delegated evaluation runs in a separate context/instance from the one that produced the artifact.
- **Proportional to stakes (calibration).** The ratification bar scales with the artifact's tier and reversibility (`klappy://odd/challenge/stakes-calibration`). Low-stakes, reversible, in-domain, high-frequency items are delegable to the tier whose eval rulebook clears the bar; high-stakes, irreversible, novel, or cross-domain items escalate up the ladder — ultimately to the human, who alone ratifies Tier-1 canon.
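
The three guardrails can be read as an ordered routing check. The sketch below is one possible encoding, assuming hypothetical `Artifact` and `Steward` records and a deliberately coarse stakes test (a boolean flag plus reversibility); the actual calibration in `stakes-calibration` may be richer than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    domain: str        # sub-scope the artifact belongs to, e.g. "coo"
    producer: str      # identity of the context/instance that produced it
    high_stakes: bool
    reversible: bool


@dataclass(frozen=True)
class Steward:
    name: str
    scope: str         # the single sub-scope this steward was granted


def route(artifact: Artifact, steward: Steward) -> str:
    """Return 'ratify' or an escalation reason, checking guardrails in order."""
    if artifact.domain != steward.scope:
        return "escalate: out of scope"        # within scope, never above it
    if artifact.producer == steward.name:
        return "escalate: self-ratification"   # never over itself
    if artifact.high_stakes or not artifact.reversible:
        return "escalate: stakes"              # proportional to stakes
    return "ratify"


coo_steward = Steward(name="coo-evaluator", scope="coo")

routine = Artifact(domain="coo", producer="coo-producer", high_stakes=False, reversible=True)
own_work = Artifact(domain="coo", producer="coo-evaluator", high_stakes=False, reversible=True)
canon_edit = Artifact(domain="canon", producer="coo-producer", high_stakes=True, reversible=False)

for item in (routine, own_work, canon_edit):
    print(route(item, coo_steward))
```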

There is therefore **no conflict** with "models do not mutate canon." That decision forbids governing *above* one's scope and *over* oneself; it explicitly permits delegated, bounded, in-domain stewardship. The human keeps two things no rung below inherits: ratification of the top tier (universal canon), and the authority to grant and revoke every steward's scope.

This is not abstract for Covenynt. The COO's own design is this principle instantiated at the business layer: act-with-approval, solo-approve for low-stakes, 2-of-3 partners for high-stakes (`0 Context/about.md`). The agent is a bounded, revocable steward of the COO domain — ratifying the reversible in-domain work itself, escalating the rest. The ladder and the COO approval model are the same structure at two altitudes.

## Worked example (the session that produced this method)

Task: fold one chapter of a COO book into a Kirigami foldout. Reference: Opus, 41 rows / 11 tier-1. Students under rulebook v1: Sonnet 37/9 (converged; adjacent tier), Haiku 88/4 (collapsed; two tiers down, judgment did not transfer, though *format* such as JSON and edge syntax did). Compensation v2: budgets, tier-1 test, anti-fragmentation rule, worked example, self-count gate. Student under v2: Haiku 45/10, back in the reference band. Not yet done: the semantic-fidelity gate (Step 2 axis two) and the unseen-case test (Step 5). Predicted non-transferable rung: cross-book reconciliation (L6) stays on the top model tier.
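
The structural axis of this example is cheap to script. The sketch below computes each run's relative deviation from the Opus reference using the row and tier-1 counts reported above. The width of the "reference band" is not stated in this document, so the 25% tolerance is an assumption chosen only to make the check concrete, and the function names are hypothetical.

```python
from typing import Dict, Tuple

# (rows, tier-1 rows) as reported in the worked example.
REFERENCE: Tuple[int, int] = (41, 11)  # Opus
RUNS: Dict[str, Tuple[int, int]] = {
    "Sonnet v1": (37, 9),
    "Haiku v1": (88, 4),
    "Haiku v2": (45, 10),
}

TOLERANCE = 0.25  # assumed band width; not specified by the method


def relative_gap(student: int, reference: int) -> float:
    """Signed deviation of the student count from the reference count."""
    return (student - reference) / reference


def in_band(run: Tuple[int, int], tol: float = TOLERANCE) -> bool:
    """True when both counts sit within the tolerance of the reference."""
    return all(abs(relative_gap(s, r)) <= tol for s, r in zip(run, REFERENCE))


for name, (rows, tier1) in RUNS.items():
    rows_gap = relative_gap(rows, REFERENCE[0])
    tier1_gap = relative_gap(tier1, REFERENCE[1])
    print(f"{name}: rows {rows_gap:+.0%}, tier-1 {tier1_gap:+.0%}, in band: {in_band((rows, tier1))}")
```

A pass here says nothing about semantic fidelity; it only confirms the shape of the output is close to the reference.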

## Open

- **Semantic fidelity unproven.** Convergence so far is structural only. The fidelity grader against a held-out answer key has not run.
- **N is uncalibrated.** How many compensation rounds before declaring a capability non-transferable is not yet known.
- **Reality-grounding for this domain.** For the COO, the ultimate answer key is whether the agent's actions are right in the field — which only the pilot (Brief 07 Step 5 / Brief 08 Phase 2) supplies.
10 changes: 10 additions & 0 deletions canon/principles/rulebook-transfer.md
@@ -76,6 +76,16 @@ Two layers of evidence, each scoped honestly.

A further hazard sharpens both: an articulated rulebook can be fluent and still not describe what its author actually did. A confabulated rulebook reads well and reproduces badly. Fidelity is therefore never inferred from how good the rulebook reads; it is measured by testing that the lesser tier makes the same cuts — consistent with [Code Claims Require Code Observation](klappy://canon/principles/code-claims-require-code-observation): the claim is verified against observation, not against its own prose.

### Second probe — COO canon, 2026-06-30 (real corpus, three tiers)

A live cross-model run folded chapters from four COO books through one lens, graded by a fresh-context grader against held-out truth (`kirigami://docs/eval/coo-corpus-fidelity-run-2026-06-30`). It extends both evidence layers:

**Execution transfer — strengthened.** Under a sufficiently explicit lens a Haiku-class model reached ship-grade fidelity (8/8 held-out questions on the hardest chapter). The *articulability step-size* mechanism was observed directly: under a thin lens Haiku collapsed (88 rows, 4 tier-1 — a starved skeleton); writing the missing discernment down (budgets, tier test, anti-fragmentation rule, worked example, self-count gate) moved the same model into the frontier band (45 rows, 10 tier-1) with no model change. Articulability, not raw capability, was the binding variable — exactly as the principle predicts.

**Stewardship transfer — first data on the chain arm.** The prior evidence was N=1 at the top rung. This probe tested the rung below: a mid-tier model (Sonnet) *authored* a compensating rule for a Haiku failure, and it worked when Haiku ran it (the targeted hallucination and orphan edges disappeared) — but it **over-corrected**, dropping coverage. A frontier model (Opus), given the identical brief, authored the same fix **surgically** — failure removed, coverage preserved. Reading: stewardship *does* transfer below the top rung, but it is **lossy in the tacit dimension** — the mid tier fixes the articulable bug and fumbles the balance the frontier holds. This is a first, concrete data point on the **star-or-chain** question: frontier→bottom authoring is clean (the star arm holds); mid-tier→bottom authoring is real but degraded — consistent with the claim that the delegation ladder has fewer rungs than the execution ladder.

**Still open.** The device tier is untested (all three models are cloud). The multi-rung chain remains thin: one rung of mid-tier stewardship, observed once, lossy. Star-vs-chain is now *informed*, not resolved.

## Open questions

- **Star or chain?** Does stewardship capacity re-instantiate at each tier (a chain — every tier stewards the one below), or does it concentrate at the frontier (a star — one steward authoring for everyone)? The single observed rung does not distinguish them. This is the dominant empirical question for the whole model.