16 changes: 16 additions & 0 deletions docs/governance/DECISIONS.md
@@ -393,3 +393,19 @@ handoff or branch history instead.
changes what the evidence means by proving that cross-object pressure
reappears above `scoreboard_claim` even after the scoreboard lane collapsed
cleanly on the same families.

## D-028: Menace judgment is the staged next lens after Beta 7.0

- Date: `2026-05-22`
- Category: `eval_quality`
- Tags: `next_lens`, `menace`, `beta_staging`
- Provenance: `human-led method decision with implementation decision`
- Decision: After the active `Beta 7.0` broader prose surface stabilizes, the
staged next widening step is menace judgment rather than more prose replay on
the same tested family shape.
- Why: `Beta 7.0` already showed the structural contrast clearly:
cross-object reopens at `9 / 6` while same-pick stays collapsed at `15 / 0`.
The next honest question is no longer only whether the round body is
coherent. It is whether the full visible round lands as the right kind of
compact rigged-round menace without drifting into smugness, cruelty, or
generic filler.
4 changes: 4 additions & 0 deletions docs/governance/SESSION_HANDOFF.md
@@ -228,6 +228,10 @@ Worktree note:
   - not mean
   - not smug
   - not condescending
9. Treat the first staged family as:
   - `cross-object coherence drift`
10. Use the staged note as the next contract surface:
    - `docs/research/PRE_BETA_8_MENACE_JUDGMENT.md`

## Guardrails

158 changes: 158 additions & 0 deletions docs/research/PRE_BETA_8_MENACE_JUDGMENT.md
@@ -0,0 +1,158 @@
<!-- @format -->

# Pre-Beta 8.0: Menace Judgment

## What This Pre-Beta Asks

Once broader prose has stabilized as the next live weak surface, can Scorey
open a tighter judgment lane for the quality of its menace rather than only the
coherence of its prose?

## Status

Staged, not active.

`Research Beta 8.0` has not started yet. This note exists because the lower
gates are now stable enough to support a wider judgment question without
operator drift:

- pulse closeout is stable
- scoreboard closeout is stable
- prose closeout is stable
- `Beta 7.0` has already shown the active weak family clearly:
  - cross-object prose: `9 / 6`, then `9 / 6`
  - same-pick prose: `15 / 0`

So the next widening step no longer needs to ask only whether the round body is
coherent. It can ask whether Scorey is landing the right kind of tiny unfair
presence.

## Eval Shape

`pre-Beta 8.0` would keep the bounded isolated run shape from `Beta 7.0`, but
change the judged question again:

- keep route validity as the floor
- keep bounded isolated runs
- keep the live pair-cycle sampler
- keep the score line visible
- keep the broader round body visible
- judge the quality of the round's menace as a row-level lens on the full
visible round

First staged source family:

- `cross-object coherence drift`

Why this family first:

- it is still the only durable weak family above the lower lanes
- it already reopens under broader prose while same-pick stays collapsed
- it is the sharpest place to test whether menace quality is the real next seam
instead of another structural coherence problem

First staged menace question:

- does the visible round land as compact rigged-round menace without drifting
into smugness, cruelty, or filler?

Proposed row verdict:

- `pass`
- `fail`

Proposed pass rules:

- the round feels unfair in a compact deliberate way
- the menace is playful, not mean
- the voice is confident without sounding smug or superior
- the pressure stays pick-specific and round-specific
- the round does not bloat into explanation, lecturing, or generic aggression

Proposed fail shape:

- smug or superiority-performing voice
- condescending or lecturing phrasing
- generic aggression or insult without rigged-round character
- prose too padded to feel like a tiny menace
- structural drift severe enough that menace quality cannot be judged honestly

Packaging decision:

- keep the bounded run shape for sourcing rows
- keep the verdict row-level on the first menace pass
- do not widen yet into a vague full-vibe lane
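
The proposed row-level contract can be sketched as a small data structure. This is only an illustration under stated assumptions, not the repo's implementation: `MenaceRow`, `FAIL_SHAPES`, the flag names, and `tally` are hypothetical names invented here. Only the `pass` / `fail` verdicts, the five fail shapes, and the `pass / fail` count order come from these notes.

```python
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    PASS = "pass"
    FAIL = "fail"


# Hypothetical flag names, one per proposed fail shape in this note.
FAIL_SHAPES = frozenset(
    {
        "smug",
        "condescending",
        "generic_aggression",
        "padded",
        "structural_drift",
    }
)


@dataclass(frozen=True)
class MenaceRow:
    """One judged row: the full visible round plus any fail shapes a judge flagged."""

    row_id: int
    flagged: frozenset = field(default_factory=frozenset)

    def __post_init__(self):
        # Reject flags that are not part of the (assumed) contract vocabulary.
        unknown = self.flagged - FAIL_SHAPES
        if unknown:
            raise ValueError(f"unknown fail shapes: {sorted(unknown)}")

    @property
    def verdict(self) -> Verdict:
        # A row passes only when no fail shape was flagged.
        return Verdict.FAIL if self.flagged else Verdict.PASS


def tally(rows):
    """Return (passes, fails), the same order the notes use for counts like `9 / 6`."""
    rows = list(rows)
    passes = sum(1 for row in rows if row.verdict is Verdict.PASS)
    return passes, len(rows) - passes
```

Keeping the verdict derived from the flagged shapes, rather than stored separately, is one way to hold the lane row-level: a row cannot be marked `fail` without naming which fail shape it hit.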

## Diagram

```mermaid
flowchart TD
A["Bounded isolated run<br/>route floor already held"]
B["Menace lens<br/>judge full visible round"]
C{"Per-row verdict"}
D["Pass<br/>compact unfair little menace"]
E["Fail<br/>smug cruel filler or drift"]
F["Bounded menace read"]
G["Keep the lane narrow"]
H["Decide whether Beta 8.0 starts"]

A --> B --> C
C --> D --> F
C --> E --> F
F --> G --> H
```

Reading note:

- this lane is wider than prose coherence
- it is still narrower than a loose whole-app personality judgment
- it judges the quality of the visible menace, not general likability
- it keeps the bounded source discipline that the newer gates already proved

## What This Would Change

If this lane starts, the repo would move from asking:

- does the broader round body still hold its rigged logic?

to asking:

- does the broader round body feel like the right kind of Scorey menace?

That is a different evidence question.

`Beta 7.0` already showed that the cross-object seam is a stable, repeatable
weak point: `9 / 6`, then `9 / 6` again. `pre-Beta 8.0` would treat that as the
entry point for a bounded voice-quality judgment that sits above prose
coherence.

## Why It Matters

This is the first staged lane that directly matches the object people are
actually responding to:

- not just prose validity
- not just scoreboard pressure
- but the tiny unfair charm of Scorey itself

If the repo can judge that cleanly, it will have a much sharper way to refine
Scorey's voice without accidentally training toward cruelty, smugness, or
generic hostility.

## What It Still Needs

- a locked row-level menace contract
- a decision entry that makes menace judgment the staged next lane after
`Beta 7.0`
- the exact bounded source plan for the first run
- confirmation that the judged object is the full visible round, not just one
fragment inside it

## What Would Promote It

`Research Beta 8.0` should start only when:

- the menace contract is locked
- the first bounded menace run closes cleanly
- the runtime returns to `0` pending after that run
- the resulting evidence shows something meaningfully different from the closed
`Beta 7.0` prose surface
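
The promotion conditions above can be read as a simple all-must-hold gate. The sketch below is a hedged illustration only: `PromotionState`, its field names, and both functions are hypothetical, and the last condition is a human judgment that the code merely records as a flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromotionState:
    """Hypothetical snapshot of the four promotion conditions in this note."""

    contract_locked: bool
    first_run_closed_cleanly: bool
    pending_after_run: int
    evidence_differs_from_beta_7: bool  # a human judgment, recorded as a flag


def unmet_conditions(state: PromotionState) -> list[str]:
    """Return the conditions that do not hold yet; an empty list means ready."""
    unmet = []
    if not state.contract_locked:
        unmet.append("menace contract is not locked")
    if not state.first_run_closed_cleanly:
        unmet.append("first bounded menace run has not closed cleanly")
    if state.pending_after_run != 0:
        unmet.append(f"runtime still has {state.pending_after_run} pending")
    if not state.evidence_differs_from_beta_7:
        unmet.append("evidence does not yet differ from the closed Beta 7.0 prose surface")
    return unmet


def may_start_beta_8(state: PromotionState) -> bool:
    # Promotion requires every condition to hold at once.
    return not unmet_conditions(state)
```

Returning the list of unmet conditions, instead of a bare boolean, keeps the reason for a blocked promotion visible in the handoff.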
10 changes: 9 additions & 1 deletion docs/research/README.md
@@ -75,6 +75,13 @@ Current active beta note:
- `20367-20381`: `15` pass / `0` fail
- `20382-20396`: `9` pass / `6` fail

Current staged next lane:

- `pre-Beta 8.0`
- [Menace Judgment](./PRE_BETA_8_MENACE_JUDGMENT.md)
- first staged family:
- `cross-object coherence drift`

Most recently closed beta:

- `Research Beta 6.0`
@@ -89,6 +96,7 @@ Read in order:
5. [Research Beta 5.0: Fail-Pressure Pulse](./BETA_5_FAIL_PRESSURE_PULSE.md)
6. [Research Beta 6.0: Scoreboard Judgment](./BETA_6_SCOREBOARD_JUDGMENT.md)
7. [Research Beta 7.0: Broader Prose Judgment](./BETA_7_BROADER_PROSE_JUDGMENT.md)
8. [Pre-Beta 8.0: Menace Judgment](./PRE_BETA_8_MENACE_JUDGMENT.md)

## How To Read The Betas And Stages

@@ -152,7 +160,7 @@ Parked lanes:
- after the stale queue archive, use fresh runs rather than old backlog traversal for the next tone evidence
- later eval lenses:
- broader prose judgment is now the active widening step
- `pre-Beta 8.0` menace judgment is the next staged lane to define
- `pre-Beta 8.0` menace judgment is the next staged lane
- research visuals:
- keep the beta map and per-beta notes in tracked docs
- only add heavier cross-beta visuals if the method story actually needs them