Skip to content

Commit 3272c74

Browse files
committed
fix(consolidation): codex P0 — swap to Skilling Hilbert + cargo fmt --all
Codex P0 audit (Opus) returned NEEDS-FIX with 3 P0: 1. A12b Butz Hilbert-3D failed bijection at level 4 ([15,15,15] → 2925 not 4095). Swap to A12 Skilling 2004 impl from claude/pr-x4-splat-cascade-design-v2 (verified: 13/13 hilbert tests pass on primary branch). 2. cargo fmt --all -- --check failed with 141 violations across the new linalg/pillar/ogit_bridge files. cargo fmt --all auto-fixed all of them. 3. Hilbert decode doctest had unused import — Skilling impl doesn't have this issue (different import shape). P1 advisory (deferred to follow-on cleanup sweep, per PP-15 Interface-Signal auditor's recommendation): - ~30 public fns missing # Example doctests - 18/22 linalg APIs are free-fn-with-out-buffer instead of Click P-1 methods-on-carriers (e.g., attention_f32(q,k,v,out,...) should be q.attend(k,v,&config) -> Tensor). Bodies reusable verbatim; only signatures change. Estimated as 1-2 day follow-on sprint. - 15 `#![allow(missing_docs)]` suppressions — keep for now; tighten in the docstring sweep. Production-3am scan (PP-13) and P2 codex savant still in flight. Resolves codex P0 audit. CI format/stable + clippy/1.95.0 should re-run green on push.
1 parent 5e266d1 commit 3272c74

21 files changed

Lines changed: 1077 additions & 861 deletions
Lines changed: 147 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,147 @@
1+
# PP-15 Interface-Signal Audit — Verdict
2+
3+
Reviewer: Opus PP-15 Interface-Signal Auditor
4+
Branch: `claude/pr-x4-splat-cascade-design` @ `5e266d19`
5+
Mindset: signal-in-the-interface, no materialization between cascade steps
6+
7+
Scope: ~125 public surfaces across `src/hpc/linalg/`, `src/hpc/pillar/`,
8+
`src/hpc/ogit_bridge/`. Test 1 (Click P-1), Test 2 (signal-in-interface),
9+
Test 3 (cascade composition).
10+
11+
The codebase has **two house dialects**:
12+
13+
1. **carrier-method dialect**`Spd3::sqrt(&self) -> Spd3`,
14+
`CovHighD::sandwich(&self, m: &Self) -> Self`, `MotionBand::sigma(self) -> f32`,
15+
`CognitiveBridge::nearest_basin(&self, …)`. Honors Click P-1.
16+
2. **flat-slice imperative dialect**`fn op_f32(x: &mut [f32], gamma: &[f32], …)`.
17+
The vast majority of `hpc/linalg/*` adopts this; it is **structurally
18+
anti-Click-P-1** because `&[f32]` is not a carrier — every caller has to
19+
pre-materialize an `out` buffer and there is no return type to chain.
20+
21+
The data-flow rule in `.claude/rules/data-flow.md` ("No `&mut self` during
22+
computation. Ever.") rules out the literal `x.layer_norm(...)` mutating-self
23+
Click-P-1 rewrite. The correct Click-P-1 form is therefore **compute-returns-new**:
24+
`fn layer_norm(&self, gamma, beta, eps) -> Self` on a typed carrier
25+
(`ArrayView1<f32>` extension trait or a thin `Tensor1<f32>` newtype).
26+
27+
## Click P-1 violations (free-function-with-carrier-arg)
28+
29+
| File:Line | Current signature | Click-P-1 rewrite |
30+
|---|---|---|
31+
| `src/hpc/linalg/attention.rs:100` | `attention_f32(q, k, v, &mut out, &cfg, b, s)` | `q.attend(k, v, &cfg) -> AttentionOut` where `AttentionOut: Deref<[f32]>` |
32+
| `src/hpc/linalg/attention.rs:221` | `flash_attention_f32(q, k, v, &mut out, &cfg, b, s, block)` | `q.flash_attend(k, v, &cfg, block) -> AttentionOut` |
33+
| `src/hpc/linalg/batched.rs:60` | `batched_gemm_f32(x, y, &mut out, b, m, k, n, α, β)` | `BatchedMat{x,m,k,batch}.matmul(&BatchedMat{y,k,n,batch}, α, β) -> BatchedMat` — shape is part of the type, not loose args |
34+
| `src/hpc/linalg/batched.rs:132` | `batched_gemm_4d_f32(...)` | `BatchedMat4D.matmul(&Self, α, β) -> BatchedMat4D` |
35+
| `src/hpc/linalg/conv.rs:60` | `conv1d_f32(input, kernel, stride, pad, &mut out)` | `Signal1D.conv(&Kernel1D, stride, pad) -> Signal1D` |
36+
| `src/hpc/linalg/conv.rs:123,202,285,392` | `conv2d{,_3x3,_5x5,_im2col}_f32(input, in_shape, kernel, kshape, stride, pad, &mut out)` | `Image.conv2d(&Kernel, stride, pad) -> Image` — shapes ride inside `Image`/`Kernel` |
37+
| `src/hpc/linalg/norm.rs:54` | `layer_norm_f32(&mut x, γ, β, ε)` | `x.layer_norm(γ, β, ε) -> Tensor1` (NOT `&mut self`) |
38+
| `src/hpc/linalg/norm.rs:100` | `rms_norm_f32(&mut x, γ, ε)` | `x.rms_norm(γ, ε) -> Tensor1` |
39+
| `src/hpc/linalg/norm.rs:143` | `group_norm_f32(&mut x, γ, β, groups, ε)` | `x.group_norm(γ, β, groups, ε) -> Tensor1` |
40+
| `src/hpc/linalg/activations_ext.rs:67,93,117,143,168` | `{gelu,gelu_tanh,silu,swish,mish}_f32(&mut x[, β])` | `x.gelu()`, `x.silu()`, … each returning `Tensor1` |
41+
| `src/hpc/linalg/loss.rs:244` | `softmax_xent_backward_f32(logits, targets, &mut grad_out, b, v)` | `logits.softmax_xent_backward(targets) -> GradTensor` |
42+
| `src/hpc/linalg/loss.rs:155` | `cross_entropy_with_logits_batched_f32(logits, targets, b, v) -> f32` | `logits.xent_batched(targets) -> f32` — return-shape is fine, but carrier-method form keeps it composable |
43+
| `src/hpc/linalg/wasserstein.rs:35` | `sinkhorn_knopp_f32(cost, m, n, a, b, ε, iter, tol) -> Vec<f32>` | `CostMatrix.sinkhorn(&a, &b, ε, iter, tol) -> TransportPlan` |
44+
| `src/hpc/linalg/wasserstein.rs:287` | `wasserstein_1_f32(cost, plan, m, n) -> f32` | `plan.cost_against(&CostMatrix) -> f32` |
45+
| `src/hpc/linalg/wasserstein.rs:112` | `hungarian_f32(cost, m) -> Vec<u32>` | `CostMatrix.hungarian() -> Assignment` |
46+
| `src/hpc/linalg/rope.rs:125` | `RopeCache::apply_qk_f32(&self, &mut q, &mut k, positions, b, s, h)` | `q.with_rope(&cache, positions) -> Tensor` — current shape is half-Click-P-1 (carrier IS `self`) but it mutates two *other* slices, so the receiver is wrong |
47+
| `src/hpc/pillar/temporal_sandwich.rs:165` | `sandwich_update_3x3(σ, m) -> [[f32;3];3]` | `Spd3.sandwich(&self, m: &Spd3) -> Spd3` — the typed version *already exists* in `pillar/cov_high_d.rs:124`; this 3×3 hardcoded copy is the violation |
48+
| `src/hpc/pillar/temporal_sandwich.rs:201` | `is_spd_3x3(m) -> bool` | `Spd3.is_spd(&self) -> bool` |
49+
| `src/hpc/pillar/koestenberger.rs:78,116` | `path1_direct_sandwich(σ, m) -> Spd3`, `path2_spectral(σ, m) -> Spd3` | `σ.sandwich_direct(&m)`, `σ.sandwich_spectral(&m)` |
50+
| `src/hpc/pillar/koestenberger.rs:194` | `max_abs_error_spd3(a, b) -> f64` | `Spd3.max_abs_error(&self, other) -> f64` |
51+
| `src/hpc/pillar/signature.rs:91,188,211` | `signature_d2_deg3(path, n) -> [f32;…]`, `sigker_hl(p, q) -> f32`, `brownian_path_d2(rng, n) -> Vec<f32>` | `Path2D.signature_deg3() -> Signature`, `sig_p.kernel(&sig_q) -> f32`, `rng.brownian_d2(n) -> Path2D` |
52+
53+
## Materialization-forced (out: &mut Buffer args that should be typed return)
54+
55+
Every `_f32` function in `linalg/{attention, batched, conv, norm,
56+
activations_ext, loss}` takes `out: &mut [f32]` (or mutates `x` in place).
57+
That is **15 out of ~22 sprint-introduced surfaces** forcing the caller to
58+
pre-size and materialize a buffer for the next step.
59+
60+
Concretely, the cascade `attention → layer_norm → gelu → batched_gemm`
61+
requires 4 pre-allocated buffers and 4 manual length-arithmetic asserts at
62+
the call site, instead of `q.attend(k,v,&cfg).layer_norm(γ,β,ε).gelu().matmul(&w)`.
63+
64+
**Severity heuristic**: this is the dominant failure mode of the sprint
65+
(15/22 surfaces), and the reason consumers will be forced to write the
66+
materialization-glue PP-15 is supposed to prevent.
67+
68+
## Cascade composition gaps (where A.method() doesn't fit B's receiver)
69+
70+
1. **Splat tick**`TileBinning::from_projected(&ProjectedBatch, &Camera) -> Self`
71+
is good (`tile.rs:105`), and `binning.tile_instances(tx,ty) -> &[TileInstance]`
72+
is good. But `rasterize_tile` and `rasterize_frame` (`raster.rs:71,213`)
73+
are free functions taking the binning + a `&mut framebuffer` — they
74+
should be `binning.rasterize(&projected, &camera, bg) -> Framebuffer`.
75+
Net: 3 chain steps that *don't* compose: `binning → rasterize_*(out)`.
76+
2. **NARS revision**`nars_revision(a, b)`, `nars_deduction`, `nars_abduction`
77+
(`nars.rs:322,342,362`) all take **two `NarsTruth` by value, return
78+
`NarsTruth`***no materialization*, *no buffers*, and the return type IS
79+
the next call's receiver. Test 2 and Test 3 pass. Test 1 fails on the
80+
surface (free fn, not method), but this is the strongest form of
81+
"signal-in-the-interface" in the audit. Trivial method rewrite:
82+
`a.revise(b) -> NarsTruth`.
83+
3. **Cognitive cell encode**`bridge.nearest_basin(cell_value, hint) -> u16`
84+
(`cognitive_bridge.rs:335`). The bare `u16` is a stringly-typed key. The
85+
downstream `codec::rdo_cell(basin, …)` (TTL-referenced; not yet
86+
implemented in this branch) will accept a `u16` or `usize` and the
87+
compiler will not catch a basin/family/codebook-index mixup. Return type
88+
should be `BasinHandle(u16)` so the next click in the cascade is
89+
`bridge.family_of(handle)` (the method already takes `u16` at line 311 —
90+
change to `BasinHandle` and the entire ogit_bridge surface becomes
91+
type-safe).
92+
4. **`PillarReport`**`prove_pillar_7() -> PillarReport` honors return-type.
93+
But `PillarReport` has only one method (`print(&self)`, line 172 of
94+
`prove_runner.rs`); there is no `report.assert_passed() -> &Self` or
95+
`.merge(other) -> PillarReport`. So cascading the eleven pillar probes
96+
into a single `PillarSuite` requires manual `Vec<PillarReport>` glue
97+
(and indeed `prove_pillar_8()` at `temporal_sandwich.rs:323` returns
98+
`Vec<PillarReport>`, breaking the chain shape vs. its peers).
99+
100+
## Click P-1 honors (sprint-introduced surfaces that do it right)
101+
102+
- `Spd3::sqrt`, `Spd3::sandwich`, `Spd3::eig`, etc. in `linalg/matrix.rs` — methods on the SPD carrier.
103+
- `CovHighD::sandwich(&self, m: &Self) -> Self` at `pillar/cov_high_d.rs:124`**textbook Click P-1**: typed carrier, typed args, typed return, composes (`a.sandwich(&b).sandwich(&c).frobenius_sq()`).
104+
- `CovHighD::log_spd(&self) -> Self` at `cov_high_d.rs:202` — ditto.
105+
- `MotionBand::sigma(self) -> f32` at `temporal_sandwich.rs:111` — enum-as-carrier.
106+
- `TileBinning::from_projected(&ProjectedBatch, &Camera) -> Self` and `.tile_instances(...) -> &[TileInstance]` — typed-surface returns, no out-buffer.
107+
- `CognitiveBridge::load_embedded() -> Result<Self, OgitError>`, `.codebook() -> &CamCodebook`, `.family_of(idx) -> &FamilyBitmap` — clean carrier-methods. Only `nearest_basin`'s bare `u16` return type lets the chain down.
108+
- `nars_revision/deduction/abduction(a, b) -> NarsTruth` — value-in, value-out, zero materialization; the *cleanest* signal-in-interface in the audit even though the surface is a free fn.
109+
110+
## Net call
111+
112+
Of ~22 sprint-introduced public surfaces in `linalg/{attention, batched,
113+
conv, norm, activations_ext, loss, wasserstein, rope}`, **roughly 18 fail
114+
Test 1 (free-fn-with-carrier-arg) and 15 fail Test 2 (materialization-forced
115+
`out: &mut`)**. Pillar code is split: `pillar/cov_high_d.rs` and
116+
`pillar/ewa_sandwich_3d.rs` honor Click P-1; `pillar/{temporal_sandwich,
117+
koestenberger, signature}` are free-function-on-bare-arrays.
118+
`ogit_bridge` is mostly clean — single fix needed: lift `u16` to `BasinHandle`.
119+
120+
**Severity**: high in count, low-to-medium in difficulty. None of the
121+
violations are algorithmically wrong; they are all skin-deep signature
122+
rewrites where the kernel body is reusable verbatim. A same-day cleanup
123+
sprint could mechanically wrap each `op_f32(&[f32], …, &mut [f32])` in an
124+
extension-trait method on `ArrayView1<f32>` (or a thin `Tensor1` newtype)
125+
returning `Tensor1`. The data-flow rule "No `&mut self` during computation"
126+
forces the new-allocation Click-P-1 form anyway — and that allocation is
127+
already happening at every consumer site, just expressed as
128+
`let mut out = vec![0.0; n]` instead of inside the method.
129+
130+
The **structural** decision needed before the cleanup: is the carrier
131+
`ArrayView1<f32>` (ndarray-native), a new `Tensor1<f32>` newtype, or a
132+
shape-aware `Tensor<D>`? Without that decision the cleanup will recreate
133+
the inconsistency in a different shape. Recommend: pick `Tensor1` /
134+
`Tensor2` / `Tensor4` thin newtypes around `Vec<f32>` + shape tuple, with
135+
`Deref<Target=[f32]>` for SIMD escape-hatch. This matches the
136+
`BatchedMat`/`Image`/`Signal1D` carriers suggested in the violation table.
137+
138+
**Recommended sequence**:
139+
1. Land `BasinHandle(u16)` — 1-hour change, type-safety dividend across `ogit_bridge`.
140+
2. Decide carrier shape (`Tensor1`/`Tensor2`/`Tensor4`).
141+
3. Add extension-trait methods that wrap each `*_f32` free fn — keep the free fns as `#[doc(hidden)]` shims for one release.
142+
4. Migrate `pillar/temporal_sandwich.rs` and `pillar/koestenberger.rs` to call methods on the existing `Spd3` carrier (the typed version already exists in `matrix.rs` — these modules are reimplementing what they could be using).
143+
144+
This is a **follow-on cleanup sprint** (one to two days for a single
145+
worker), not a same-day patch — the carrier-type decision is load-bearing.
146+
147+
## Sentinel: pp15-interface-signal-completed

0 commit comments

Comments
 (0)