From 1a4181fdace0ad6fc27247410201f306fe8918dc Mon Sep 17 00:00:00 2001
From: Cemil ILIK
Date: Fri, 29 May 2026 04:08:34 +0300
Subject: [PATCH 1/3] docs(roadmap): close B4 milestone via closure trio + verify master-review remediation

B4 (Task loader) flips implementation-complete -> Closed via the closure
trio (business retrospective + consolidated security review [Approve] +
performance baseline), modelled on the 2026-05-14 B3 closure. T-019
merged 2026-05-16 (PR #31); ADR-0029 Accepted 2026-05-14 (PR #30).

The period under review included the 2026-05-22 full-tree master review
(verdict: APPROVE the shipped kernel -- 0 code-correctness/security
Blockers; issues clustered in CI/doc/ADR) and its remediation PR #32.
All 24 Blocker+Major findings were re-verified adversarially against the
live tree: 23 confirmed-fixed, 1 partial (MR-009). MR-009 is now fully
closed in-branch -- phase-b.md gains a "Miri green = Phase-B exit
prerequisite" note (the CI-gate half was already done by PR #32).

Closing metrics (reproduced live, HEAD 3ab029f, pinned nightly):
- cargo host-test 286/286 (43 hal + 187 kernel + 53 test-hal + 3
  doc-tests; was 260 at the T-019 merge, +26 from PR #32);
  fmt/clippy/kernel-build clean.
- QEMU smoke runs the full demo through "tyrne: all tasks complete" with
  the new "tyrne: image loaded (...)" line; -d int,unimp,guest_errors =
  629 events, 100% pre-existing PL011 noise, zero fault classes.
- Release perf band p10/p50/p90 = 15.641/17.587/19.150 ms (+5.3-5.7 ms
  vs B3 -- one-time boot cost of the loader's first post-bootstrap
  cap_map walks under QEMU TCG; real-hardware projection ~40 us).
- Audit log 28 entries (UNSAFE-2026-0027 + 0028 added; 0025/0026
  Pending-smoke notes lifted by T-019).

Side-effects: current.md refreshed (B4 Closed, milestone -> B5,
260->286, trio in Last reviews); 3 review-type README indexes updated;
perf-baseline report added.

Next milestone: B5 (syscall boundary) -- ADR-0030 + ADR-0031.

Refs: ADR-0013, ADR-0029, ADR-0036
Co-Authored-By: Claude Opus 4.8 (1M context)
---
 .../perf-baseline-2026-05-28-B4-closure.md    |  91 +++++++
 .../business-reviews/2026-05-28-B4-closure.md | 222 +++++++++++++++++
 .../reviews/business-reviews/README.md        |   1 +
 .../2026-05-28-B4-closure.md                  | 223 ++++++++++++++++++
 .../README.md                                 |   1 +
 .../security-reviews/2026-05-28-B4-closure.md | 116 +++++++++
 .../reviews/security-reviews/README.md        |   1 +
 docs/roadmap/current.md                       |  20 +-
 docs/roadmap/phases/phase-b.md                |   2 +
 9 files changed, 670 insertions(+), 7 deletions(-)
 create mode 100644 docs/analysis/reports/perf-baseline-2026-05-28-B4-closure.md
 create mode 100644 docs/analysis/reviews/business-reviews/2026-05-28-B4-closure.md
 create mode 100644 docs/analysis/reviews/performance-optimization-reviews/2026-05-28-B4-closure.md
 create mode 100644 docs/analysis/reviews/security-reviews/2026-05-28-B4-closure.md

diff --git a/docs/analysis/reports/perf-baseline-2026-05-28-B4-closure.md b/docs/analysis/reports/perf-baseline-2026-05-28-B4-closure.md
new file mode 100644
index 0000000..b39b4de
--- /dev/null
+++ b/docs/analysis/reports/perf-baseline-2026-05-28-B4-closure.md
@@ -0,0 +1,91 @@
+# Boot-to-end perf baseline — 2026-05-28 — B4-closure
+
+Generated by `tools/perf-harness.sh` — multi-run aggregation of the kernel's
+`boot-to-end elapsed = X ns` emission (P10 from the [2026-05-06 Track D
+review](../reviews/code-reviews/2026-05-06-full-tree/track-d-performance.md)).
+
+## Inputs
+
+| Field | Value |
+|-------|-------|
+| Run timestamp (UTC) | `2026-05-28T20:28:44Z` |
+| Iterations requested | 20 |
+| Iterations valid | 20 |
+| Iterations failed | 0 |
+| Per-run timeout | 5 s |
+| Build profile | release |
+| Kernel ELF | `target/aarch64-unknown-none/release/tyrne-bsp-qemu-virt` |
+| Git HEAD | `3ab029f` on `main` |
+| QEMU | `QEMU emulator version 10.2.2` |
+| QEMU machine | `-M virt -cpu cortex-a72 -m 128M -smp 1` |
+| Host `uname -a` | `Darwin MacBookPro.hgw.local 24.6.0 Darwin Kernel Version 24.6.0: Wed Nov 5 21:30:23 PST 2025; root:xnu-11417.140.69.705.2~1/RELEASE_X86_64 x86_64` |
+| Wall-clock (full harness run) | 102 s |
+
+## Methodology
+
+Each iteration invokes `tools/run-qemu.sh` under a per-run watchdog;
+QEMU emits the boot trace through to `tyrne: all tasks complete` plus
+the `boot-to-end elapsed = X ns` line, then halts in WFI. The watchdog
+kills the QEMU process after the per-run timeout (the kernel never
+exits on its own). The integer ns delta is parsed out of stdout.
+
+Counter source: the kernel's `now_ns()` (`hal::Timer`) reads the EL1
+virtual generic-timer counter and converts to nanoseconds via the
+cached `CNTFRQ_EL0` resolution (62 500 000 Hz, 16 ns). Under QEMU TCG
+the counter advances based on emulated instructions rather than
+wall-clock time, so variance reflects translation-cache behaviour and
+host scheduler jitter, not real hardware performance. Each iteration
+is a fresh QEMU process; the TCG translation cache is destroyed
+between iterations, so every iteration pays the full translation cost.
+
+Statistics are computed across the valid samples only. Percentile
+convention is *nearest-rank* (1-indexed; `idx = ceil(p/100 * n)`).
+Stddev is the population formula (`n` divisor) — descriptive.
+
+**Note on p99 at small `n`.** Under nearest-rank, `p99 = a[ceil(0.99 *
+n)]`; for any `n < 100` the index rounds up to `n` and `p99 == max`
+by construction. The number is reported as-computed (matching p10 /
+p50 / p90's convention) but readers should not over-read it as a
+tail-latency signal at small `n`. p99 becomes statistically
+informative when `n >= 100`.
+
+## Metric — boot-to-end elapsed (nanoseconds)
+
+| Statistic | ns | ms |
+|-----------|---:|---:|
+| min | 15,624,992 | 15.625 |
+| p10 | 15,640,992 | 15.641 |
+| p50 | 17,587,008 | 17.587 |
+| p90 | 19,150,000 | 19.150 |
+| p99 | 21,154,992 | 21.155 |
+| max | 21,154,992 | 21.155 |
+| mean | 17,586,899 | 17.587 |
+| stddev | 1,428,711 | 1.429 |
+
+## Δ vs prior baseline (B3 closure, release)
+
+| Statistic | B3 closure (ms) | B4 closure (ms) | Δ ms |
+|-----------|---:|---:|---:|
+| p10 | 10.311 | 15.641 | +5.330 |
+| p50 | 11.884 | 17.587 | +5.703 |
+| p90 | 13.823 | 19.150 | +5.327 |
+
+The +5.3 to +5.7 ms increase is the T-019 task loader running at boot
+— the first post-bootstrap exercise of the per-call `Mmu::map`
+page-table-walk + TLB-flush sequence under a live MMU, amplified by
+QEMU TCG software-MMU emulation. One-time-at-boot; tightly clustered
+across percentiles (the signature of a uniform fixed cost, not a
+variance regression). Real-hardware projection: ~40 µs on a Cortex-A72.
+See the [B4 closure performance review](../reviews/performance-optimization-reviews/2026-05-28-B4-closure.md)
+§"Hotspot" for the per-component decomposition.
+
+## Verdict
+
+Baseline only — no proposal under measurement. This is the
+baseline-of-record for B5+ regression checks against B4's closing
+release-build performance. Cite the band above (p10 / p50 / p90) when
+comparing later changes against this snapshot. Single-run boot-to-end
+claims in PR bodies should be replaced with a fresh harness run when a
+non-trivial perf-relevant change lands; see
+[`docs/standards/infrastructure.md`](../../standards/infrastructure.md)
+§"Performance harness".
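Reviewer note on the perf-baseline report: the nearest-rank convention and the population standard deviation described under Methodology can be sketched in a few lines. This is an illustrative sketch only, not the code in `tools/perf-harness.sh`; the function names `nearest_rank` and `population_stddev` and the sample values are hypothetical and are not the harness data.

```python
import math


def nearest_rank(sorted_samples, p):
    """Nearest-rank percentile with a 1-indexed rank: idx = ceil(p/100 * n)."""
    n = len(sorted_samples)
    # Multiply the integers first so exact ranks (e.g. 10 * 20 / 100) stay exact.
    idx = max(1, math.ceil(p * n / 100))
    return sorted_samples[idx - 1]


def population_stddev(samples):
    """Population standard deviation (divisor n), used as a descriptive statistic."""
    n = len(samples)
    mean = sum(samples) / n
    return math.sqrt(sum((x - mean) ** 2 for x in samples) / n)


# Hypothetical boot-to-end samples in ns (assumption, NOT measured values):
# 20 evenly spaced points standing in for one 20-iteration harness run.
samples = sorted(15_000_000 + 16_000 * i for i in range(20))

band = {p: nearest_rank(samples, p) for p in (10, 50, 90, 99)}
spread = population_stddev(samples)

# With n = 20, ceil(0.99 * 20) = 20, so the p99 rank is the last sample.
p99_is_max = band[99] == max(samples)
```

For every `n` below 100 the p99 rank evaluates to `n`, which is why the report shows p99 equal to max at 20 iterations; at `n = 100` the rank becomes 99 and the two values can differ.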
diff --git a/docs/analysis/reviews/business-reviews/2026-05-28-B4-closure.md b/docs/analysis/reviews/business-reviews/2026-05-28-B4-closure.md
new file mode 100644
index 0000000..9327c42
--- /dev/null
+++ b/docs/analysis/reviews/business-reviews/2026-05-28-B4-closure.md
@@ -0,0 +1,222 @@
+# Business review 2026-05-28 — B4 closure retrospective (post-T-019 + master-review interlude)
+
+- **Trigger:** milestone-completion. Phase B / Milestone B4 ("Task loader") reached implementation-complete on 2026-05-16 when T-019's [PR #31](https://github.com/HodeTech/Tyrne/pull/31) merged into `main` (merge commit [`7f876af`](https://github.com/HodeTech/Tyrne/commit/7f876af)); this closure trio formally promotes B4 from "implementation-complete" to `Done`. The B4 arc spans one ADR + one task: [ADR-0029](../../../decisions/0029-initial-userspace-image-format.md) ("Initial userspace image format") Accepted 2026-05-14 via [PR #30](https://github.com/HodeTech/Tyrne/pull/30) (merge [`e09755d`](https://github.com/HodeTech/Tyrne/commit/e09755d)); [T-019](../../tasks/phase-b/T-019-task-loader.md) (the `load_image` → `LoadedImage` loader) merged 2026-05-16. **B4 does not mint a runnable task** — the `task_create_from_image` wrapper that turns a `LoadedImage` into a runnable `CapHandle{CapObject::Task(...)}` gates on B5/B6 (kernel mappings in the userspace AS, an EL0-ready context register file, and the syscall entry path), per the [phase-b §B4 §Revision-notes rider](../../../roadmap/phases/phase-b.md#milestone-b4--task-loader).
+- **Scope:** the B4 loader arc (ADR-0029 design + T-019 implementation across 7 bisectable commits + 4 review-round follow-ups + doc polish), **plus the dominant out-of-band event of the period — the 2026-05-22 full-tree master review** (run id `2026-05-22-152729`, anchor [`288ddb2`](https://github.com/HodeTech/Tyrne/commit/288ddb2)) and its remediation [PR #32](https://github.com/HodeTech/Tyrne/pull/32) (merge [`50bffe9`](https://github.com/HodeTech/Tyrne/commit/50bffe9)). The master-review interlude is treated as a first-class event here: it is the bulk of both [§"What changed in the plan"](#what-changed-in-the-plan) and [§"What we learned"](#what-we-learned). Two infrastructure changes that also landed in the period are recorded as §Plan-diff items rather than B4 implementation items: the [HodeTech org migration](#what-changed-in-the-plan) (`cemililik/Tyrne` → `HodeTech/Tyrne`, commit [`cd4cb6e`](https://github.com/HodeTech/Tyrne/commit/cd4cb6e)) and the root-README rewrite (commit [`3ab029f`](https://github.com/HodeTech/Tyrne/commit/3ab029f)).
+- **Period:** 2026-05-14 (B3 closed; PR #29 closure trio merged; ADR-0029 Accepted same day) → 2026-05-28 (today; B4 formally closed by this trio).
+- **Participants:** @cemililik (+ Claude Sonnet 4.6 agent as scribe for the T-019 implementation + review-round arc; the 2026-05-22 master review was executed by a 25-track multi-agent harness with a consolidation agent — see [`master-review/2026-05-22-152729/consolidated.md`](../master-review/2026-05-22-152729/consolidated.md); bot-driven review-rounds on PRs #30 / #31 / #32 from coderabbitai-bot, gemini-code-assist-bot, sourcery-ai-bot, qodo-code-review-bot; Claude Opus 4.8 agent as scribe for this closure trio).
+
+> **Canonical source for B4 closure metrics.** This artefact + the [security review](../security-reviews/2026-05-28-B4-closure.md) + the [performance baseline](../performance-optimization-reviews/2026-05-28-B4-closure.md) are the source of truth for B4's closing numbers (test counts, ELF section sizes, smoke trace, perf band, audit-log entries and Amendments). Every other location that mentions B4 metrics ([`current.md`](../../../roadmap/current.md), [`phase-b.md`](../../../roadmap/phases/phase-b.md), the [T-019](../../tasks/phase-b/T-019-task-loader.md) review-history row, [`memory-management.md`](../../../architecture/memory-management.md), [`task-loader.md`](../../../architecture/task-loader.md)) is a *summary at its layer of abstraction* and should be read alongside this trio rather than as an independent record. Note in particular: `current.md` still cites the merge-time **260** test count; the post-remediation truth is **286** (see [§Adjustments](#adjustments)).
+
+---
+
+## What landed
+
+### Tasks promoted to Done
+
+| Task | Promoted | Description |
+|------|----------|-------------|
+| [T-019](../../tasks/phase-b/T-019-task-loader.md) | 2026-05-16 (`date_done` frontmatter; PR #31 merge, [`7f876af`](https://github.com/HodeTech/Tyrne/commit/7f876af)) | Task loader: `pub fn load_image(image, pmm, mmu, table, as_arena, parent_as_cap, new_rights, image_base_va, stack_size_pages) -> Result` in [`kernel/src/obj/task_loader.rs`](../../../../kernel/src/obj/task_loader.rs). Consumes an `include_bytes!`-embedded raw-flat userspace blob (per ADR-0029), allocates frames via `Pmm`, creates a fresh AS via `cap_create_address_space`, `copy_nonoverlapping`-copies the image into the new frames, maps the image (`USER \| EXECUTE`) + a stack (`USER \| WRITE`), and returns an opaque `LoadedImage { as_cap, entry_va, stack_top_va, image_bytes, stack_bytes }`. **Does NOT** mint a `CapHandle{CapObject::Task(...)}` — running gates on B5/B6. Leak-path-closure preflight discipline inherited from T-018 and extended for `cap_map`'s up-to-3-intermediate-frame consumption (every rejectable check before the first `pmm.alloc_frame()`; cap-side rollback via `cap_drop(loaded_as_cap)`, not `cap_revoke`). Ten-variant `LoadError` taxonomy (`InvalidImage` / `InvalidStackSize` / `MisalignedImageBaseVa(VirtAddr)` / `InvalidImageBaseVa { base, end }` / `InvalidParentCap(CapError)` / `FrameBudgetExceeded { needed, available }` / `ImageOverlapsAllocatableMemory` / `AddressSpaceCreationFailed(AddressSpaceError)` / `OutOfFrames` / `MapFailed(AddressSpaceError)`). New audit entry [UNSAFE-2026-0027][unsafe-27] for the `copy_nonoverlapping` byte-copy site. DoD fully met (all acceptance-criteria checkboxes ticked; the B5/B6 runnability deferrals are in the task's §Out-of-scope, not unchecked DoD items). Smoke trace gains one new line `tyrne: image loaded (...)` between the address-space-arena banner and the timer banner. |
+
+[unsafe-27]: ../../../audits/unsafe-log.md#unsafe-2026-0027--task-loader-frame-byte-copy-via-coreptrcopy_nonoverlapping-in-task_loaderload_image
+
+### ADRs
+
+| ADR | Action | Notes |
+|-----|--------|-------|
+| [ADR-0029 — Initial userspace image format](../../../decisions/0029-initial-userspace-image-format.md) | **Accepted 2026-05-14** (Propose `ffd712c` + careful-re-read Accept `f90e382`; pre-Accept review rounds `fd91567` / `d888e0c` / `d31ab24`, round-1 `926f788`; PR #30 merge `e09755d`) | Settles the byte-level userspace image format: a **raw flat binary** embedded into the kernel via `include_bytes!` (entry point at offset 0; filesystem / ELF subset deferred to Phase C/D). T-019 opens at `Draft` with the Propose commit per [ADR-0025 §Rule 1](../../../decisions/0025-adr-governance-amendments.md). The §Revision-notes rider added in the Propose commit records the B4-loads-but-does-not-run split (`task_create_from_image` is a B5/B6 wrapper). |
+| [ADR-0036 — QEMU virt is GICv2 / no-IOMMU in v1](../../../decisions/0036-qemu-virt-gicv2-no-iommu-v1.md) | **Accepted 2026-05-22** (via master-review remediation PR #32, commit `8063ee2`) | A **retroactive-recovery / supersession ADR** correcting the GICv3/SMMUv3 statements frozen into [ADR-0004](../../../decisions/0004-target-platforms.md) / [ADR-0006](../../../decisions/0006-workspace-layout.md) / [ADR-0012](../../../decisions/0012-boot-flow-qemu-virt.md). The shipped BSP is a GICv2 driver and `hal::Iommu` is an empty marker trait; the append-only rule had frozen the contradiction (a reader following "the ADR is authoritative" would believe the *wrong* document). ADR-0036 is the corrective record + authorises one-line top-of-file redirect riders on 0004/0006/0012 (append-only-legal — riders do not alter the original bodies). The first new ADR allocated above the Phase-B ceiling. |
+| ADR-0008 / ADR-0019 / ADR-0020 | **Append-only §Revision-notes riders (2026-05-22, PR #32 commit `8063ee2` + `59f9309`)** | Riders added during remediation. The load-bearing one is the [ADR-0020](../../../decisions/0020-cpu-trait-v2-context-switch.md) rider recording that the FP callee-saved set (`d8`–`d15`) was implemented in the same arc (not deferred), the context struct is 168 bytes not 104, and the §Neutral "deferred" note is superseded — paired with the `d8`–`d15` enumeration added to the `ContextSwitch` `# Safety` contract (MR-005). |
+| ADR-0033 / ADR-0034 | **Still slot-reserved** | Named-but-unallocated placeholders per [ADR-0025 §Rule 1](../../../decisions/0025-adr-governance-amendments.md). Gates unchanged: ADR-0033 (high-half migration) opens with the B5 per-task `TTBR0_EL1` swap; ADR-0034 (kernel-image section permissions) opens with the first attacker-observable EL0 execution (likely B5/B6). |
+| ADR-0030 / ADR-0031 (B5) | **Reserved** | Syscall ABI (0030) + initial syscall set (0031). Tentative numbers per [ADR-0013](../../../decisions/0013-roadmap-and-planning.md); they open with the B5 prep arc — see [§Next](#next). The Phase C–I placeholders were renumbered to ≥ ADR-0037 off the live ceiling (MR-001, commit `a6e909d`). |
+
+### Pull requests merged into `main`
+
+| PR | Merge | Scope |
+|---|---|---|
+| #30 | `e09755d` | ADR-0029 Propose + Accept (separate commits per [`write-adr` §10](../../../../.agents/skills/write-adr/SKILL.md)); T-019 opens at `Draft`. Pre-Accept review rounds + round-1. |
+| #31 | `7f876af` | T-019 implementation arc. **7 bisectable commits**: `911f2ad` (skeleton) / `5711756` (`load_image` + UNSAFE-2026-0027) / `ae31bc8` (BSP wiring + arch doc + UNSAFE-2026-0025/0026 smoke-verification Amendments) / `196d3fb` (rr1) / `164522d` (rr2) / `5b1f153` (rr3) / `95efd62` (rr4), plus doc polish `74694d4` + follow-ups `5078944` (rr5 — one PMM host test, suite → 260) / `eb14c51` (rr6 — 5 valid findings). |
+| #32 | `50bffe9` | **Master-review remediation** (the dominant PR of the period). Commits: `a6e909d` (MR-001 renumber Phase C–I ADR placeholders off the live ceiling) / `8063ee2` (MR-006/005/019/020 supersede GICv3/SMMUv3 via new ADR-0036 + append-only riders) / `fbc3d3f` (MR-002/003/007/008/009 CI honesty: drop the rustup-default-stable lie, align the gate set, stop the RUSTFLAGS clobber, SHA-pin actions + `permissions:` block, Miri made a blocking gate) / `59f9309` (MR-005/011/017/018 hal/bsp/test-hal: `d8`–`d15` context-switch contract, `from_existing_root` audit gap, FakeMmu fidelity, IrqState polarity) / `57bc2e6` (MR-010/018 O(R) PMM overlap helper + failure-path tests) / `348971e` (MR-022/017/018 centralise infallible enqueue, consolidate fakes, `load_image` failure tests) / `24530fb` (MR-012/013/014 arch docs: IPC objects, Cpu/ContextSwitch split, Iommu framing, index) / `4e241d9` (MR-016/019 refresh `current.md` + phase-b task index) / `4141158` (MR-015/004 front-door delivery status + de-hardcode counts + link sweep) / `a2e7257` (D3-005/006/007 X3-003 standards reconcile); review-round commits `ae8fbd7` / `8ceb4fb` / `c843ecd`. |
+
+Two further non-PR commits landed: `cd4cb6e` (org migration `cemililik/Tyrne` → `HodeTech/Tyrne`) and `3ab029f` (root README clarity/formatting — the current HEAD).
+
+### Audit-log surface
+
+The audit log ([`docs/audits/unsafe-log.md`](../../../audits/unsafe-log.md)) now holds **28** `UNSAFE-2026-####` entries (0001–0028; 0012 `Removed`, so **27 Active**). The period's changes:
+
+- **UNSAFE-2026-0027** introduced (2026-05-14, T-019 commit 2): the `task_loader::load_image` frame byte-copy via `core::ptr::copy_nonoverlapping`. Opened **standalone** rather than as an Amendment of UNSAFE-2026-0026 — same surface shape ("kernel writes into a PMM-allocated frame via raw pointer") but a different source (caller-supplied `.rodata` slice vs constant zeros), operation primitive (`copy_nonoverlapping` vs `write_bytes`), size (`0..=PAGE_SIZE` vs always `PAGE_SIZE`), and ownership-proof chain — fresh entry per [`justify-unsafe`](../../../../.agents/skills/justify-unsafe/SKILL.md) audit-tag scoping. Carries a 2026-05-15 review-round-1 Amendment recording that the non-overlap invariant is now runtime-enforced via the `Pmm::could_yield_pa_overlapping` preflight + `LoadError::ImageOverlapsAllocatableMemory`, and a review-round-2 amendment relocating the integer-to-pointer cast behind `crate::mm::phys_frame_kernel_ptr`.
+- **UNSAFE-2026-0028** opened by the **master-review remediation** (2026-05-22, MR-011, PR #32 commit `59f9309`): `QemuVirtAddressSpace::from_existing_root` — wrapping the already-live, populated VMSAv8 L0 root that `mmu_bootstrap` installed into `TTBR0_EL1`. This was the **only production `unsafe fn` in the tree with no audit entry**; worse, its sole call site mis-attributed the unsafety to UNSAFE-2026-0010 (StaticCell) + 0014 (momentary `&mut` to the arena), neither of which covers wrapping a live non-zero root. The remediation opened the entry and narrowed the call-site `Audit:` trailer to the lines 0010/0014 actually cover. The code was always sound; this is an audit-trail completeness fix.
+- **UNSAFE-2026-0025** (post-bootstrap `Mmu::map` page-table descriptor writes) and **UNSAFE-2026-0026** (PMM `alloc_frame` zero-fill) had their `Pending QEMU smoke verification` status notes **lifted** via 2026-05-14 Amendments. T-019 is the first runtime exerciser of both post-bootstrap: the boot smoke now runs `load_image`, which calls `cap_create_address_space` (→ per-call `Mmu::map` walks) + the byte-copy into freshly zero-filled frames. The verification gap these two entries carried since B2/B3 is now closed by observed runtime, not paper argument.
+- **Still forward-flagged:** UNSAFE-2026-0019 / 0020 / 0021 retain their `Pending QEMU smoke verification` notes for the IRQ-take / deadline-fire path (gates on the first deadline-arming caller — B5's first preemption-relevant syscall per the entries' 2026-05-08 closure-path Amendments).
+- Append-only discipline holds across all 28 entries — the master-review unsafe-audit reconciliation (track X3) verified every entry resolves to live code with zero stale entries and zero append-only violations (UNSAFE-2026-0014 alone now carries six amendments across five tasks with no in-place body edit).
+
+### Test counts at B4 closure
+
+| Crate | Tests | B3 closure (2026-05-14) | T-019 / PR #31 merge | B4 closure (now) |
+|-------|-------|------------------------:|---------------------:|------------------:|
+| `tyrne-hal` (lib) | **43** | 42 | 42 | 43 |
+| `tyrne-kernel` (lib) | **187** | 141 | 175 | 187 |
+| `tyrne-test-hal` (lib) | **53** | 43 | 43 | 53 |
+| doc-tests | **3** | 0 | 0 | 3 |
+| **Total** | **286 / 286** | **226** | **260** | **286** |
+
+**Trajectory:** **226** at B3 closure (42 hal + 141 kernel + 43 test-hal) → **260** at the T-019 / PR #31 merge (42 hal + 175 kernel + 43 test-hal; **+34 kernel** from the B4 implementation arc — the loader's preflight/rollback/`LoadError`-per-row coverage) → **286** now (**+26** from the master-review remediation PR #32: **+1 hal, +12 kernel, +10 test-hal, +3 doc-tests**). The remediation delta decomposes as:
+
+- **+10 test-hal** — the `OutOfFramesMmu` / `BlockMappedMmu` failure-injecting fakes (MR-018, the FakeMmu-fidelity gap) plus their doctests.
+- **+12 kernel** — MR-010 PMM failure-path tests (the O(R) interval-arithmetic `could_yield_pa_overlapping` rewrite) + MR-017 `IrqState` polarity test + MR-018 `cap_map` / `load_image` rollback tests + MR-022 (centralised infallible enqueue).
+- **+1 hal / +3 doc-tests** — the contract-fidelity additions accompanying the `d8`–`d15` and fake-consolidation work.
+
+> **Drift note (flag for §Adjustments).** [`current.md`](../../../roadmap/current.md) still cites **260** — accurate at the PR #31 merge, but stale after PR #32 added 26 tests. The README was de-hardcoded to a link (MR-015) and is fine. The 260 → 286 drift is recorded as an Adjustment below.
+
+**Gates (reproduced live, pinned `nightly-2026-01-15`, HEAD `3ab029f`):**
+
+- `cargo host-test` — **286 passed / 0 failed**.
+- `cargo fmt --check` — clean.
+- `cargo host-clippy` (`clippy --all-targets -D warnings`) — clean.
+- `cargo kernel-clippy` — clean.
+- `cargo kernel-build` — clean.
+- Miri — **not run locally** (not installed on the pinned toolchain on this host). Per the PR #32 remediation, Miri is now a **blocking CI gate** (no `continue-on-error`) and a listed required gate in [`infrastructure.md`](../../../standards/infrastructure.md). The master-review gate reproduction recorded 260/260 under Miri with zero detected UB at the `288ddb2` anchor; the +26 remediation tests are host + Miri-eligible.
+
+#### Smoke trace (release build, B4 closure)
+
+QEMU 10.2.2, `-M virt -cpu cortex-a72 -m 128M -smp 1`, serial multiplexed to file. Built with `cargo build --release --target aarch64-unknown-none -p tyrne-bsp-qemu-virt` at HEAD `3ab029f`. **This trace is the load-bearing closure evidence** — per the [business master-plan §Acceptance criteria](master-plan.md#acceptance-criteria), a milestone cannot promote past `In Review` to `Done` without a recorded smoke trace; narrative claims of smoke-pass are insufficient (codified after the 2026-05-06 B1 smoke regression, where host tests + Miri + paper-review all passed a kernel that hung at runtime).
+
+```text
+tyrne: hello from kernel_main
+tyrne: mmu activated
+tyrne: pmm initialized (32603 frames available; 165 reserved)
+tyrne: address-space-arena ready (1 / 8 slots used; bootstrap AS root = 0x40091000)
+tyrne: image loaded (entry = 0x800000; sp = 0x802000; image bytes 8; stack bytes 4096; AS cap = idx 1)
+tyrne: timer ready (62500000 Hz, resolution 16 ns)
+tyrne: starting cooperative scheduler
+tyrne: task B — waiting for IPC
+tyrne: task A -- sending IPC
+tyrne: task B — received IPC (label=0xaaaa); replying
+tyrne: task A — received reply (label=0xbbbb); done
+tyrne: all tasks complete
+tyrne: boot-to-end elapsed = 19257008 ns
+```
+
+**Smoke deltas vs B3 closure:**
+
+- **New line:** `tyrne: image loaded (entry = 0x800000; sp = 0x802000; image bytes 8; stack bytes 4096; AS cap = idx 1)`, inserted between the address-space-arena banner and the timer banner. This is the B4 marker — the loader runs at boot and reports the populated AS.
+- **PMM banner shifted:** `165 reserved` (B3: 161) and `32603 available` (B3: 32607). The **+4 reserved frames** are the embedded image frame + the loader's intermediate page-table frames.
+- **Bootstrap AS root shifted** `0x4008d000` → `0x40091000` because the kernel image grew (the loader module + the master-review remediation kernel changes; see [§ELF footprint](#elf-footprint) below).
+- Every other line is byte-stable from the B3 trace through to `tyrne: all tasks complete`.
+
+**Single-run boot-to-end:** ~19.26 ms (release, no `-d` flags). The single-run number is anecdotal; the harness band below is the load-bearing claim.
+
+**`-d int,unimp,guest_errors` (captured live):** **629 total events; 629/629 (100%) are the pre-existing `PL011 data written to disabled UART` warnings** (one per UART byte while QEMU's PL011 model has `UARTCR.UARTEN = 0` — Tyrne rides QEMU's reset state; queued as a follow-on BSP task since B2). **Zero Translation faults, zero Permission faults, zero "Taking exception" lines, zero unallocated/unimplemented events.** The Δ from B3 (~526) to B4 (629) is exactly the new banner-line UART bytes (the image-loaded line + the slightly longer PMM/AS lines); **no new fault classes** were introduced by T-019 — the first post-bootstrap per-call `Mmu::map` exercise runs clean.
+
+#### Perf band
+
+20 iterations, release build, 5 s per-run timeout, HEAD `3ab029f`, QEMU 10.2.2, 20/20 valid, wall 102 s:
+
+| metric | ns | ms |
+|--------|---:|---:|
+| min | 15,624,992 | 15.625 |
+| p10 | 15,640,992 | 15.641 |
+| p50 | 17,587,008 | 17.587 |
+| p90 | 19,150,000 | 19.150 |
+| p99 | 21,154,992 | 21.155 |
+| max | 21,154,992 | 21.155 |
+| mean | 17,586,899 | 17.587 |
+| stddev | 1,428,711 | 1.429 |
+
+(`p99 == max` by nearest-rank construction for `n < 100`.) **Δ vs B3 closure** (p10/p50/p90 = 10.311 / 11.884 / 13.823 ms): **+5.33 / +5.70 / +5.33 ms per percentile.** The cause is the T-019 task loader now running at boot — `cap_create_address_space` + per-page `cap_map` page-table walks + the `copy_nonoverlapping` image copy + the stack mapping — the **first post-bootstrap per-call `Mmu::map` exercise**. This is a one-time boot cost amplified by QEMU TCG translation-cache overhead and is `<<` real-hardware activation cost; real hardware is unaffected by the TCG amplification. The [performance baseline leg](../performance-optimization-reviews/2026-05-28-B4-closure.md) decomposes the increase per contributor and records the canonical B4 footprint/timing reference for B5+ regression checks.
+
+#### ELF footprint
+
+Release ELF section sizes (`llvm-size -A -d`): `.text` **33,124** / `.rodata` **4,560** / `.bss` **48,320** (total text+rodata+bss = **86,004 B ≈ 84.0 KiB**). **Δ vs B3 closure** (`.text` 24,008 / `.rodata` 3,536 / `.bss` 42,080): `.text` **+9,116 (+38%)** / `.rodata` **+1,024 (+29%)** / `.bss` **+6,240 (+14.8%)**. The `.text` growth is the T-019 `task_loader` module (`load_image` + the 10-variant `LoadError` + the preflight chain + the `intermediate_frame_count` exact-count helper) plus the master-review remediation kernel changes (MR-010's PMM interval-arithmetic rewrite, MR-022's `enqueue_ready` helper). `.rodata` grew with the `LoadError` Debug strings + the `include_bytes!` image; `.bss` grew with the loader / AS working set. No new linker-script reservation beyond the existing `.boot_pt`. (Test-only fakes are `#[cfg(test)]` and not in the release BSP ELF.) Full breakdown in the [performance leg](../performance-optimization-reviews/2026-05-28-B4-closure.md).
+
+### Documentation surface
+
+- [`docs/decisions/0029-initial-userspace-image-format.md`](../../../decisions/0029-initial-userspace-image-format.md) — new ADR; raw-flat-binary format choice + the loads-but-does-not-run §Revision-notes rider.
+- [`docs/decisions/0036-qemu-virt-gicv2-no-iommu-v1.md`](../../../decisions/0036-qemu-virt-gicv2-no-iommu-v1.md) — new supersession ADR correcting the GICv3/SMMUv3 statements in ADR-0004/0006/0012 (+ append-only top-of-file riders on each).
+- [`docs/architecture/task-loader.md`](../../../architecture/task-loader.md) — new chapter (landed with T-019) synthesising the loader sequence, the userspace linker layout the loader assumes, the explicit rollback contract, and the v1 baseline leaks; cross-linked from `memory-management.md` §"Address-space objects" and `boot.md` §Stage 3. The master review (MR-014) added its row to the architecture index (it had been omitted).
+- [`docs/audits/unsafe-log.md`](../../../audits/unsafe-log.md) — UNSAFE-2026-0027 added (+ two Amendments); UNSAFE-2026-0028 added by the remediation; UNSAFE-2026-0025 / 0026 `Pending QEMU smoke verification` notes lifted via 2026-05-14 Amendments.
+- [`master-review/2026-05-22-152729/`](../master-review/2026-05-22-152729/consolidated.md) — the full-tree review: `consolidated.md` (de-duplicated MR-NNN register), `00-coverage-manifest.md` (251 files / 45,757 lines), 25 `tracks/` outputs, `gate-reproduction.md`.
+- Front-door + CI honesty sweep (PR #32): `CLAUDE.md` / `CONTRIBUTING.md` / `README.md` delivery-status corrections + de-hardcoded counts (MR-015), `.github/workflows/ci.yml` + `docs/guides/ci.md` + `docs/standards/infrastructure.md` gate reconciliation (MR-002/003/007/008/009), the ~49 broken-link sweep (`.claude/skills/` + `hal/src/mmu.rs` path-rot, MR-004), and the org migration to `HodeTech/Tyrne`.
+- [`docs/analysis/reports/perf-baseline-2026-05-28-B4-closure.md`](../../reports/perf-baseline-2026-05-28-B4-closure.md) — 20-iteration release-build harness band at HEAD `3ab029f`; the baseline-of-record for B5+ regression checks against B4's closing release-build performance.
+
+## What changed in the plan
+
+- **B4 closed, but with an interlude no plan anticipated.** B4's loader half landed cleanly on first attempt (T-019 merged 2026-05-16, no re-open arc) — the **fourth consecutive clean milestone closure** (B2, B3, B4 loader; B1 remains the only re-open). But between the T-019 merge and this closure trio, the maintainer inserted a **deliberate full-tree audit pause** — the 2026-05-22 master review — before declaring B4 Closed. This is a structural change to the pace: the methodical phased rhythm gained an on-demand "stop and sweep the whole tree" beat that is *not* milestone-triggered. It is the dominant plan-diff of the period and the source of most of [§"What we learned"](#what-we-learned).
+
+- **The master review re-sequenced the immediate work from "open B5" to "remediate first."** Rather than proceeding straight from B4 into B5's syscall-ABI ADR, the maintainer paused to land PR #32 (the remediation of the review's 4 Blocker + 18 Major findings) before opening B5. The remediation is the bulk of the period's commit volume. Crucially, the review's verdict was **APPROVE the shipped kernel** — 0 code-correctness Blockers, security PASS, all gates green — so the pause was about *the records and the gate*, not the kernel. This is the inverse of the B1 smoke regression (where the *code* was broken and the docs looked fine); here the code is strong and the *bookkeeping* had drifted.
+
+- **The Phase C–I ADR-number collision was corrected before any Phase-C work began (MR-001).** The Phase-C and Phase-D plans had reused ADR numbers (0027–0036) already Accepted and live on `main` — an agent told to "write ADR-0027" by `phase-c.md` would have overwritten the live kernel-virtual-memory-layout decision. PR #32 commit `a6e909d` renumbered all Phase C–I placeholders above the live ceiling (≥ ADR-0037), recording the "renumbered, was ADR-00xx" provenance in the ledger Notes column. This is a roadmap-structure change that protects the project's "decisions on the record are trustworthy" guarantee.
+
+- **The append-only ADR contradiction was resolved via the supersession mechanism that exists for exactly this (ADR-0036).** Three foundational platform ADRs asserted GICv3/SMMUv3 while the build ships GICv2 + an empty `Iommu` stub; the 2026-05-06 review had corrected the *architecture docs* but not the *ADRs* (append-only forbids in-place body edits), freezing the contradiction. ADR-0036 + one-line redirect riders on 0004/0006/0012 close it append-only-legally. The plan now also states the DMA-scoping invariant honestly: "DMA is capability-scoped where hardware permits" is currently aspirational even on QEMU because there is no `Iommu` impl — recorded as a strategic (not implementation) gap.
+ +- **CI moved from "claims gates it does not run" to honest + complete (MR-002/003/007/008/009).** The "stable" jobs actually ran on the pinned nightly (the `rust-toolchain.toml` override shadowed `rustup default stable`), the documented required-gate set listed `cargo-audit` / `cargo-vet` / QEMU-smoke jobs that did not exist, the coverage job ran `continue-on-error: true` while the header called it required, and a global `RUSTFLAGS: -D warnings` clobbered the per-target `panic=abort` + frame-pointer config so CI compiled a *different binary* than every local build. PR #32 reconciled config + job names + docs, stopped the RUSTFLAGS clobber, SHA-pinned third-party Actions + added a `permissions: contents: read` block, and made **Miri a blocking gate**. This is a gate-integrity change to the project's quality bar, not a feature change. + +- **The `ContextSwitch` safety contract gained `d8`–`d15` (MR-005).** The trait `# Safety` text and ADR-0020 enumerated only the GP callee-saved set; the only correct implementor (the QEMU BSP) saves `d8`–`d15` and is 168 bytes (ADR-0020 said 104). v1 is sound, but a second BSP author (Pi 4 / Jetson — the entire reason the HAL trait exists) implementing to the literal contract would ship a context switch that silently corrupts FP state across every yield. PR #32 added the FP enumeration to the contract + an ADR-0020 §Revision-notes rider. This is a cross-board-correctness contract change with no v1 behaviour change. + +- **Org migration + README rewrite (housekeeping).** The repo migrated `cemililik/Tyrne` → `HodeTech/Tyrne` (commit `cd4cb6e`) and the root README was rewritten for a first-time reader (commit `3ab029f`, current HEAD). Neither touches kernel surface; recorded here for traceability since cross-references and badges shifted. + +- **No B-phase milestone re-shuffling.** B5 (Syscall boundary) becomes active after this closure trio. 
The phase-b plan's B5 shape stands: ADR-0030 (syscall ABI) + ADR-0031 (initial syscall set), then EL0→EL1 SVC dispatch. The deferred `task_create_from_image` wrapper (phase-b §B4 §3) that turns a `LoadedImage` into a runnable `TaskCap` is unchanged — it gates on B5/B6. + +## What we learned + +### Periodic full-tree review catches a class of drift that per-PR review structurally cannot + +This is the central learning of the period. Tyrne's per-PR discipline is genuinely strong — B2/B3/B4 each closed with multi-round bot + agent review, simulation-table-driven ADRs, and clean smoke. Yet the 2026-05-22 full-tree sweep found **4 Blockers and 18 Majors**, *none* of them kernel-correctness or security defects in the shipped binary. They clustered in three places per-PR review never looks: + +1. **Cross-file consistency across the whole corpus.** The ADR-number collision (MR-001), the GICv3/SMMUv3 frozen contradiction across 0004/0006/0012 + Phase C (MR-006), and the ~49 broken cross-references (MR-004, 42 `.claude/skills/` link-rot + 7 `hal/src/mmu.rs` path-rot) are all invisible to a reviewer scoped to one PR's diff. The `.agents/skills/` rename (B3 arc) did not run a repo-wide `grep -F` sweep — a textbook "post-fix produces stale documentation" miss that *no* per-PR review would surface because the broken links were in files the renaming PR didn't touch. + +2. **The gate vs the claims about the gate.** The CI integrity findings (MR-002/003/007/008) are about the *workflow file and the docs describing it* — exactly the artefacts that change rarely and get reviewed lightly. "The stable job runs on nightly" and "the documented required gates don't exist" can persist for milestones because every PR's CI is *green*, just not testing what its name says. + +3. 
**Contracts a second implementor will read but no current code exercises.** The `d8`–`d15` gap (MR-005) is sound in v1 because the only implementor is correct; it is a latent cross-board bug that surfaces only when a second BSP is written. Per-PR review of a single-BSP tree has no signal that the *contract text* is wrong — the code passes every test. Five independent master-review tracks corroborated this finding; one track could not have. + +The honest framing: the per-PR discipline is doing its job (the *code* is strong), but it has a blind spot for whole-tree consistency, gate honesty, and unexercised contracts. A periodic full-tree audit is the only instrument that sees those. The maintainer inserting one *before* declaring B4 Closed — rather than after — is the right ordering: it means B4 closes on a tree whose records match reality. + +### The remediation closed 23 of 24 verified Blocker+Major findings; the one residual is a one-line doc gap + +The closure-trio session re-verified all 24 Blocker+Major findings adversarially against the live tree: **23 confirmed fixed, 1 partially fixed.** The single partial is **MR-009 (Miri-as-CI-gate):** Miri *is* now a blocking CI job (no `continue-on-error`) and *is* listed as a required gate in `infrastructure.md` — but at re-verification time the prescription "make Miri green a Phase-B exit prerequisite" was not yet written into [`phase-b.md`](../../../roadmap/phases/phase-b.md)'s §Exit bar or a Phase-B exit checklist (the gap has since been closed in-branch; see [§Adjustments](#adjustments)). This is the cleanest single B4-closure Adjustment to record: a one-line in-tree doc gap, non-blocking, and the only item left open by a 24-finding remediation. The signal is that the remediation was thorough — the residual is a documentation phrasing, not a missing mechanism. 
+ +### The append-only ADR mechanism absorbed a frozen contradiction exactly as designed (ADR-0036) + +The GICv3/SMMUv3 contradiction was the append-only policy working *against* the project: the rule that makes the record trustworthy ("never rewrite a decision") had frozen a *wrong* decision in place, and the conflict-resolution convention ("disagree by writing a superseding ADR") had no forward pointer telling a reader the GICv3 line was wrong. ADR-0036 is the first time the project used the supersession mechanism to correct a *factual* error (rather than a design change) in a foundational ADR, and it confirmed the mechanism scales: a short corrective ADR + append-only one-line redirect riders on the offending ADRs resolves the contradiction without rewriting history. The learning generalises — any future "Accepted ADR contradicts the build" finding has a known, append-only-legal playbook now demonstrated end-to-end. + +### The loader's leak-path-closure discipline scaled from one PMM-consumer to a multi-step composite + +T-018 established "preflight every rejectable check before the first PMM commitment" for a single creation path. T-019 had to scale it to a *composite* path: `cap_create_address_space` (1 root frame) + a `cap_map` loop where each call may allocate up to 3 intermediate page-table frames, across two disjoint VA ranges (image low, stack high). The discipline held — the frame budget is a safe upper bound computed before the first `alloc_frame()`, and the rollback contract names which API reverses which commitment on which `LoadError` variant (notably `cap_drop`, not `cap_revoke`, for the freshly-minted leaf AS cap). The review-round arc tightened it further: round 4 added an alignment preflight at row 1 because a misaligned `image_base_va` previously surfaced as `MapFailed` *after* the root L0 frame was already allocated (a preventable leak on internal-API misuse). 
The pattern is now load-bearing for B5's `task_create_from_image` wrapper, which will compose a `LoadedImage` with task-creation — another multi-step path that must preflight before its first commit. + +### Bot + agent review-round signal quality remained consistent with prior calibration + +Same pattern as B1/B2/B3: factual-mechanical findings (path drift, stale links, missing barriers, doc-vs-code mismatch, missing test counts) had a high apply rate across PRs #30/#31/#32; architectural/refactor preferences had a lower apply rate, filtered by the CLAUDE.md minimum-surface discipline. The master review re-confirmed the B3 lesson: trust a finding's *direction* more than its *fix prescription* — the prescriptions in the consolidated report were sound, but several (e.g. the exact toolchain-intent decision for CI) required the maintainer to choose the repo-specific resolution rather than apply verbatim. + +## B3 closure §Adjustments — closure status + +The [2026-05-14 B3 closure retrospective](2026-05-14-B3-closure.md#adjustments) listed adjustment items. This audit closes the B4 milestone item and confirms the remainder still trigger-deferred (their triggers have not fired): + +| B3 Adjustment | Status (2026-05-28) | Closing reference | +|---|---|---| +| **B4 milestone — initial userspace image format** | **Closed** | ADR-0029 Accepted 2026-05-14 (PR #30); T-019 Done 2026-05-16 (PR #31). The loader exercises `cap_create_address_space` + `cap_map` at runtime for the first time (the `tyrne: image loaded (...)` smoke line proves it). | +| **B5+ `MemoryRegion` cap variant + per-operation rights set extension** | **Open (trigger-deferred)** | The T-018 review-round F2 `cap_map`/`cap_unmap` rights gap (any AS cap currently grants full mapping authority). Trigger unchanged: opens with the first B5 ADR introducing a per-task AS cap not co-resident with the bootstrap-everything cap. 
| +| **B-phase BSP task — proper PL011 init** | **Open (trigger-deferred)** | The `-d guest_errors` count is still **100% PL011 noise** (629 at B4 closure; the +103 vs B3's ~526 is exactly the new banner-line bytes). Trigger unchanged: opens when a B-phase task needs a clean MMIO baseline (likely B5's first userspace fault-test). | +| **BSP host-test infrastructure for block-descriptor `unmap` regression** | **Open (trigger-deferred)** | `bsp-qemu-virt` still has no host-test crate. The master review's MR-018 added failure-injecting fakes (`OutOfFramesMmu` / `BlockMappedMmu`) at the *test-hal* layer, which partially de-risks this, but the BSP-side end-to-end block-collision unmap path is still uncovered. Trigger unchanged: opens with the first runtime `cap_unmap` block-descriptor case (B5+ teardown). | +| **High-half kernel migration (ADR-0033 placeholder)** | **Still trigger-deferred** | Gate unchanged: opens when B5 surfaces the per-task `TTBR0_EL1` swap. T-019 reinforced the trigger — its userspace AS holds *only* image + stack (no kernel mappings), so an EL1 exception under that AS would translation-fault on the vector fetch; this is precisely ADR-0033's responsibility. | +| **Kernel-image section permissions (ADR-0034 placeholder)** | **Still trigger-deferred** | Gate unchanged: opens with the first attacker-observable EL0 execution (likely B5/B6). T-019 maps the image `USER \| EXECUTE` globally + stack `USER \| WRITE`; per-section RX/.text + R/.rodata + RW/.data + NX discipline is ADR-0034's job. | + +Net: **1 of 6 closed** (B4 milestone via the loader arc); 5 trigger-deferred (2 BSP hygiene, 1 B5+ rights ADR, 2 future-phase placeholders). No trigger fired in the period for the deferred items — the carry-forward is honest, not neglected. 
+ +## Adjustments + +- [x] **MR-009 residual — add "Miri green = Phase-B exit prerequisite" to the phase-b.md exit bar.** **✅ Closed in-branch 2026-05-28** — the §"Exit-quality prerequisite — Miri" paragraph was added to the top of [`phase-b.md`](../../../roadmap/phases/phase-b.md), stating a green `cargo +nightly miri test` run as a Phase-B *milestone-exit* prerequisite (with weight on `kernel/src/sched/**` + `kernel/src/ipc/**`) and linking [`infrastructure.md` §"Miri as a blocking gate"](../../../standards/infrastructure.md#miri-as-a-blocking-gate). The CI-gate half was already done by PR #32; MR-009 is now fully closed. *As originally recorded:* this was the one open item from the 24-finding master-review remediation. Miri was already a blocking CI job + a listed required gate in `infrastructure.md`; the only gap was that [`phase-b.md`](../../../roadmap/phases/phase-b.md)'s §Exit bar (or a new Phase-B exit checklist) did not yet state that a green Miri run is an exit prerequisite — consistent with the 2026-04-21 security review that made the scheduler/IPC aliasing discipline the #1 Phase-B blocker. **Original trigger:** a one-line edit to `phase-b.md` §Exit bar, planned as the first commit of the B5 prep arc (alongside the ADR-0030 propose commit) so the Phase-B exit bar is correct before Phase B's final milestones run; it landed earlier, in this branch. Pure documentation change; no code. +- [x] **`current.md` test-count drift 260 → 286.** **✅ Closed in-branch 2026-05-28** — `current.md` was refreshed for the closure: a new 2026-05-28 B4-closure top banner (carrying the 286 count + 629 guest-errors + the perf band), the Last-completed-milestone bullet flipped from "B4 implementation-complete (260 tests)" to "B4 Closed 2026-05-28 (286 tests)", and the four Pathfinder bullets (Active phase / milestone / task + Next review trigger) flipped to B5. The README was already fine (de-hardcoded to a link per MR-015). 
+- [ ] **B5 opens — ADR-0030 (syscall ABI) + ADR-0031 (initial syscall set).** ADR-0030 settles the register calling convention + error-return convention + the K2-5 `IpcError::InvalidCapability` split into `StaleHandle` / `MissingRight` / `WrongObjectKind`; ADR-0031 settles the initial syscall set (`send`, `recv`, `console_write`, `task_yield`, `task_exit`) — no more in v1. Then EL0→EL1 SVC dispatch, a panic-free syscall dispatcher, validated copy-from/to-user through the active AS, and Capability Debug redaction (K3-9). ADR numbers tentative per [ADR-0013](../../../decisions/0013-roadmap-and-planning.md). **Trigger:** opens with the next ADR-prep arc — the planned slot is ADR-0030 paired with a T-NNN syscall-dispatch task, per [phase-b.md §B5](../../../roadmap/phases/phase-b.md). B5 is the prerequisite for the deferred `task_create_from_image` wrapper (phase-b §B4 §3) that turns a `LoadedImage` into a runnable `CapHandle{CapObject::Task(...)}`, then B6 (first userspace "hello"). +- [ ] **B5+ `MemoryRegion` cap variant + per-operation rights set extension.** Carry forward from B3 unchanged (the T-018 F2 rights gap). **Trigger:** the first B5 ADR that introduces a per-task AS cap not co-resident with the bootstrap-everything cap. +- [ ] **B-phase BSP task — proper PL011 init.** Carry forward unchanged. **Trigger:** when any B-phase work needs a clean `-d guest_errors` baseline (likely B5's first userspace fault-test, where a clean baseline lets the test distinguish real fault classes from the 629-count PL011 noise). +- [ ] **BSP host-test infrastructure for block-descriptor `unmap` regression.** Carry forward; partially de-risked by MR-018's test-hal failure-injecting fakes but the BSP-side end-to-end path is still uncovered. **Trigger:** the first runtime `cap_unmap` block-descriptor case (B5+ teardown). 
+- [ ] **High-half kernel migration (ADR-0033 placeholder).** Trigger unchanged: opens when B5 surfaces the per-task `TTBR0_EL1` swap (reinforced by T-019's kernel-mappings-absent userspace AS). +- [ ] **Kernel-image section permissions (ADR-0034 placeholder).** Trigger unchanged: opens with the first attacker-observable EL0 execution (likely B5/B6). + +> **Closure-trio actions taken in-branch (2026-05-28).** Of this list, the two *actionable-now* Adjustments were executed during the closure; the rest are forward / trigger-deferred and remain open with their stated triggers. **Done in-branch:** (1) **MR-009** — `phase-b.md` §"Exit-quality prerequisite — Miri"; (2) **`current.md` 260 → 286** — banner + Pathfinder bullets refreshed. **Still open (correctly):** the **B5** ADR-0030/0031 thread (forward milestone work — opens with the B5 prep arc, maintainer-sequenced); and the trigger-deferred carry-forwards — **B5+ `MemoryRegion` cap + per-op rights** (trigger: first per-task AS cap not co-resident with the bootstrap cap), **PL011 init BSP task** (trigger: first B-phase work needing a clean `-d guest_errors` baseline), **BSP host-test crate for block-descriptor `unmap`** (trigger: first runtime `cap_unmap` block case), **ADR-0033** (trigger: B5 per-task `TTBR0_EL1` swap), **ADR-0034** (trigger: first attacker-observable EL0 execution). None of these triggers has fired; leaving them open is the honest state, not neglect. + +## Next + +- **Active phase:** B (unchanged). +- **Active milestone:** **B5 — Syscall boundary.** Phase-b plan places B5 after B4; the B5 ADR pair (ADR-0030 + ADR-0031) is the first concrete artefact. The deferred `task_create_from_image` wrapper (phase-b §B4 §3) gates on B5/B6. +- **Active task:** none — B4 closed via this closure trio; B5 prep / first ADR (ADR-0030) opens next. 
Per [ADR-0025 §Rule 1](../../../decisions/0025-adr-governance-amendments.md), the implementation task opens in the same commit as the first B5 ADR's *Dependency chain* section. The MR-009 `phase-b.md` exit-bar note has already landed in-branch (see §Adjustments), so it no longer needs to ride in that commit. +- **Next review trigger:** **B5 closure trio.** Produced when the first B5 milestone reaches `In Review`. Possible interim triggers: a mini-retro if EL0/syscall bring-up surfaces a learning worth capturing mid-arc; a maintainer-initiated review or a second on-demand full-tree master review if the corpus drifts again before B5 closes. diff --git a/docs/analysis/reviews/business-reviews/README.md b/docs/analysis/reviews/business-reviews/README.md index 3e022eb..3cd0f48 100644 --- a/docs/analysis/reviews/business-reviews/README.md +++ b/docs/analysis/reviews/business-reviews/README.md @@ -34,3 +34,4 @@ A business review may point at outcomes from those other reviews as part of "wha | 2026-05-07 | B1 closure retrospective (post-T-014) — fresh closure trio replacing the 2026-04-28 trio's load-bearing role; T-014 + ADR-0026 fixed the smoke regression; α/β/γ closed comprehensive-review Track-E/J/A/B/F/G/I non-blockers | [2026-05-07-B1-closure.md](2026-05-07-B1-closure.md) | | 2026-05-09 | B2 closure retrospective — MMU activation + kernel-half mapping (T-016); ADR-0027 + `MapperFlush` flush-token discipline; closed cleanly on first attempt (no smoke-regression arc) | [2026-05-09-B2-closure.md](2026-05-09-B2-closure.md) | | 2026-05-14 | B3 closure retrospective — Address-space abstraction (T-017 PMM + T-018 `AddressSpace` kernel object); ADR-0035 + ADR-0028; five-round PR #28 review arc + cross-cutting `MmuError::BlockMapped` + `.claude/skills/` → `.agents/skills/` migration | [2026-05-14-B3-closure.md](2026-05-14-B3-closure.md) | +| 2026-05-28 | B4 closure retrospective — Task loader (T-019 `load_image` → `LoadedImage`); ADR-0029 + the 2026-05-22 master-review interlude (4 Blocker / 18 Major full-tree audit) + PR #32 
remediation closing 23/24 verified findings; ADR-0036 supersession of GICv3/SMMUv3; UNSAFE-2026-0027 + 0028 added, 0025/0026 lifted; HodeTech org migration | [2026-05-28-B4-closure.md](2026-05-28-B4-closure.md) | diff --git a/docs/analysis/reviews/performance-optimization-reviews/2026-05-28-B4-closure.md b/docs/analysis/reviews/performance-optimization-reviews/2026-05-28-B4-closure.md new file mode 100644 index 0000000..efbf81b --- /dev/null +++ b/docs/analysis/reviews/performance-optimization-reviews/2026-05-28-B4-closure.md @@ -0,0 +1,223 @@ +# Performance baseline 2026-05-28 — B4 closure (post-T-019 + master-review remediation) + +- **Concern:** Did the T-019 task loader (`load_image` → `LoadedImage`; first runtime exerciser of the post-bootstrap per-call `Mmu::map` path) plus the 2026-05-22 master-review remediation (PR #32) shift the kernel image footprint, RAM use, or boot-to-end timing versus the [post-T-018 B3 closure baseline](2026-05-14-B3-closure.md)? +- **Scope:** All committed code on `main` from [`47b0a86`](https://github.com/HodeTech/Tyrne/commit/47b0a86) (PR #28 merge — T-018 / B3 closure baseline) through [`3ab029f`](https://github.com/HodeTech/Tyrne/commit/3ab029f) (HEAD — README clarity/formatting). 
The arc covers: ADR-0029 (Initial userspace image format) Accepted 2026-05-14 via [PR #30](https://github.com/HodeTech/Tyrne/pull/30) (`e09755d`); T-019 (`load_image` + `LoadedImage` + 10-variant `LoadError` + the exact `intermediate_frame_count` budget helper + `core::ptr::copy_nonoverlapping` byte-copy site → UNSAFE-2026-0027) via [PR #31](https://github.com/HodeTech/Tyrne/pull/31) (`7f876af`, 7 bisectable commits + 6 review-round/polish commits); the [2026-05-22 master review](../master-review/2026-05-22-152729/consolidated.md) interlude (run id `2026-05-22-152729`, anchor `288ddb2`); and the master-review remediation via [PR #32](https://github.com/HodeTech/Tyrne/pull/32) (`50bffe9`) — which added ADR-0036 (QEMU virt is GICv2 / no-IOMMU in v1), the MR-010 PMM interval-arithmetic rewrite of `could_yield_pa_overlapping`, MR-022's centralised infallible enqueue helper, MR-011's UNSAFE-2026-0028 (`from_existing_root` audit entry), MR-005's d8–d15 `ContextSwitch` contract fix, and the failure-injecting test fakes (MR-018). Then the `cemililik/Tyrne → HodeTech/Tyrne` org migration (`cd4cb6e`) and the README rewrite (`3ab029f`). +- **Hypothesis:** Re-baseline artefact, not a hypothesis-driven optimisation cycle. Per the [master plan's pre-flight](master-plan.md#pre-flight-hypothesis), no concrete improvement target is set — the goal is to record the post-T-019 / post-remediation baseline so future hypothesis-driven cycles (B5 syscall ABI, B6 first userspace, B5+ ASID isolation, first real-hardware BSP) have a fresh reference point. 
Implicit non-hypothesis: T-019 should add bounded `.text` (the loader module — `load_image` page-loop + the 10-variant `LoadError` + the preflight chain + the exact `intermediate_frame_count` helper), bounded `.rodata` (the `LoadError` `Debug` strings + the embedded 8-byte image), bounded `.bss` (the loader/AS working set), and a measurable boot-to-end timing cost (the first post-bootstrap `cap_create_address_space` + per-page `cap_map` page-table walks + `copy_nonoverlapping` image copy + stack mapping, all exercised at boot under a live MMU translation regime). The master-review remediation should add bounded test-only and production surface (MR-010 PMM interval rewrite, MR-022 enqueue helper) with no new fault class. +- **Reviewer:** @cemililik (+ Claude Opus 4.8 (1M context) agent acting in the Baseline + Hotspot + Reporter roles below; Proposal / Measurement / Regression-check sections are short because no proposal is being measured this cycle). +- **Target:** QEMU `virt`, aarch64, Cortex-A72 model, single core, 128 MiB RAM (`-M virt -cpu cortex-a72 -m 128M -smp 1`; unchanged). +- **Build:** release profile (`cargo build --release --target aarch64-unknown-none -p tyrne-bsp-qemu-virt`). + +> **Canonical source for B4 closure metrics.** This artefact + the [business retrospective](../business-reviews/2026-05-28-B4-closure.md) + the [security review](../security-reviews/2026-05-28-B4-closure.md) are the source of truth for B4's closing footprint / timing numbers. Other locations referencing kernel image size, test counts, or boot-to-end timing ([`current.md`](../../../roadmap/current.md), [`phase-b.md`](../../../roadmap/phases/phase-b.md), the [T-019](../../tasks/phase-b/T-019-task-loader.md) review-history rows) are *summaries at their layer of abstraction*; corrections start here. **Note:** `current.md` still cites 260 host tests (the post-T-019/PR-#31 count); the post-remediation count is 286 — see §"Regression check" for the drift flag. 
+ +--- + +## Baseline + +### Methodology + +- ELF section sizes via `llvm-size -A -d` (workspace pinned nightly-2026-01-15 toolchain), release profile (`-C opt-level=3`). The `-A` (all-sections, sysv style) `-d` (decimal) form is used so the per-section byte counts compose directly into the trajectory table; the value reported is the section's allocated size (test-only `#[cfg(test)]` fakes are not in the release BSP ELF). +- Boot-to-end timing measured via [`tools/perf-harness.sh`](../../../../tools/perf-harness.sh) (P10 wall-clock harness; see [`infrastructure.md` §"Performance harness"](../../../standards/infrastructure.md#performance-harness)). 20 iterations, 5 s per-run timeout, release build, single host (Darwin 24.6.0 / x86_64; Apple Silicon Rosetta 2 emulating x86_64 for QEMU TCG translation). Each iteration is a fresh QEMU process; the TCG translation cache is destroyed between iterations. +- Single-run smoke trace recorded for the canonical record; `-d int,unimp,guest_errors` event count captured for the fault-class regression check. +- The harness report lives at [`docs/analysis/reports/perf-baseline-2026-05-28-B4-closure.md`](../../reports/perf-baseline-2026-05-28-B4-closure.md) (20-iteration release-build band at HEAD `3ab029f`, QEMU 10.2.2). +- Counter source: the kernel's `now_ns()` (`hal::Timer`) reads the EL1 virtual generic-timer counter and converts to nanoseconds via the cached `CNTFRQ_EL0` resolution (62 500 000 Hz, 16 ns). Under QEMU TCG the counter advances based on emulated instructions rather than wall-clock time, so variance reflects translation-cache behaviour and host scheduler jitter, not real-hardware performance. 
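The counter-to-nanosecond conversion described in the last methodology bullet can be sketched as below. This is a minimal illustration, not the kernel's actual `now_ns()`: the helper name `ticks_to_ns` and the `u128` widening are assumptions made for the example; only the 62 500 000 Hz frequency and the 16 ns resolution come from the text.

```rust
/// Convert a generic-timer tick count to nanoseconds.
/// Hypothetical helper for illustration; the kernel's real `now_ns()` may differ.
fn ticks_to_ns(ticks: u64, freq_hz: u64) -> u64 {
    // Widen to u128 so `ticks * 1_000_000_000` cannot overflow before the division.
    ((ticks as u128 * 1_000_000_000u128) / freq_hz as u128) as u64
}

fn main() {
    // CNTFRQ_EL0 value quoted in the methodology above.
    let freq_hz: u64 = 62_500_000;
    println!("resolution   = {} ns per tick", ticks_to_ns(1, freq_hz));
    println!("one second   = {} ns", ticks_to_ns(freq_hz, freq_hz));
}
```

At this frequency every reported duration is a whole multiple of the per-tick resolution, which is a quick sanity check on any `boot-to-end elapsed` value.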
+ +### Metric 1 — Kernel image size + +| Section | post-T-018 (2026-05-14 B3 closure baseline) | post-T-019 + remediation (2026-05-28) | Δ bytes | Δ % | +|---------|---------------------------------------------|----------------------------------------|--------:|----:| +| `.text` | 24,008 | **33,124** | **+9,116** | **+38.0 %** | +| `.rodata` | 3,536 | **4,560** | **+1,024** | **+29.0 %** | +| `.bss` | 42,080 | **48,320** | **+6,240** | **+14.8 %** | + +Total kernel image (`.text + .rodata + .bss`) = **86,004 B ≈ 84.0 KiB**, up from B3's 69,624 B ≈ 68.0 KiB — a **+16,380-byte (+23.5 %)** combined growth. This is the largest single-milestone `.text` jump of the project so far (B3's was +1,624); B4 is the first milestone whose `.text` growth is dominated by a genuinely new *runtime* subsystem (the loader) rather than by data-structure scaffolding. + +**Observations:** + +- **`.text` grew by +9,116 bytes (+38.0 %)** across two distinct contributors: + - **T-019 task loader** (the bulk): `load_image`'s body (the eight-step state machine — argument preflight → cap preflight → VA-range + frame-budget preflight → image-PA-overlap preflight → `cap_create_address_space` → image page-loop with `copy_nonoverlapping` + `cap_map` → stack page-loop → `LoadedImage` construct), the **10-variant `LoadError` enum** with its `Debug` impl, the **exact `intermediate_frame_count` helper** (VMSAv8 4-level index decomposition at shifts 21/30/39 — replaced the prior hard-coded `6` constant per PR #31 review-round 3 Finding 1), and the rollback machinery (`cap_drop` cap-side cleanup + the leaf-frame/partial-mapping undo). The loader is the first kernel module whose body is a multi-step fallible state machine with per-page loops, so its compiled footprint is materially larger than the B3-era cap-wrapper preflights. 
+ - **Master-review remediation kernel changes** (the remainder): MR-010's O(R) interval-arithmetic rewrite of `could_yield_pa_overlapping` (replaced the O(range_frames × R) per-frame loop the X2-001 finding flagged); MR-022's centralised infallible `enqueue_ready` helper; MR-005's d8–d15 FP callee-saved enumeration threading into the `ContextSwitch` contract (the BSP context-switch asm was already correct — the change is contract/doc-side plus the test that pins it, so its `.text` contribution is small). +- **`.rodata` grew by +1,024 bytes (+29.0 %)** — the `LoadError`'s 10 variant `Debug` strings (each carries a static name string for the panic-free typed-error discipline) + the `include_bytes!`-style embedded 8-byte placeholder image (`[0x40, 0x05, 0x80, 0x52, 0xc0, 0x03, 0x5f, 0xd6]` = `mov w0, #42; ret` per [ADR-0029 §Revision notes](../../../decisions/0029-initial-userspace-image-format.md#2026-05-16--placeholder-byte-sequence-documented-intent-vs-accept-state-literal)) + the new banner-line format string. +- **`.bss` grew by +6,240 bytes (+14.8 %)** — the loader/AS working set: the BSP wiring now drives a real `cap_create_address_space` + two `cap_map` calls at boot, which exercises (and therefore reserves the static footprint behind) the loader's stack-resident working state and the AS arena's now-actually-populated second slot path. **No new linker-script reservation beyond the existing `.boot_pt`** — the growth lives in the regular `.bss` section, consistent with the B3 `AddressSpaceArena`/`Pmm` static convention. 
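The index decomposition at shifts 21/30/39 mentioned for the `intermediate_frame_count` helper can be illustrated with a small sketch. This is not the T-019 implementation: the function name, the `Option` return, and the simplifying assumptions (4 KiB granule, a fresh address space where only the root table exists, one contiguous range with no sharing between image and stack ranges) are all hypothetical.

```rust
/// Coverage shifts for a 4 KiB granule: one L3 table spans 2 MiB (shift 21),
/// one L2 table spans 1 GiB (shift 30), one L1 table spans 512 GiB (shift 39).
const TABLE_SHIFTS: [u32; 3] = [21, 30, 39];

/// Count the intermediate page-table frames needed to map `len` bytes at
/// `va_start` into an address space that so far holds only its root table.
/// Hypothetical sketch: returns `None` if the range wraps the address space.
fn intermediate_frames_for_range(va_start: u64, len: u64) -> Option<u64> {
    if len == 0 {
        return Some(0);
    }
    let last = va_start.checked_add(len - 1)?;
    // At each level, the range touches one table per distinct index value.
    let frames = TABLE_SHIFTS
        .iter()
        .map(|&shift| (last >> shift) - (va_start >> shift) + 1)
        .sum::<u64>();
    Some(frames)
}

fn main() {
    // One page at the entry VA shown in the smoke trace.
    println!("{:?}", intermediate_frames_for_range(0x80_0000, 4096));
    // Two pages straddling a 2 MiB boundary need a second L3 table.
    println!("{:?}", intermediate_frames_for_range(0x9F_F000, 0x2000));
}
```

A real budget helper would also have to account for tables shared between separately mapped ranges; the sketch deliberately ignores that and treats each range in isolation.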
+ +**A6 → … → B4 cumulative trajectory** (relative to the A6 baseline of `.text 13,940` / `.rodata 1,960` / `.bss 17,872`; this table extends the [B3 closure trajectory table](2026-05-14-B3-closure.md#metric-1--kernel-image-size) with the B4 column): + +| Section | A6 | B1 (2026-04-28) | post-T-014 (2026-05-07) | post-T-015 (2026-05-07) | post-T-016 (2026-05-09) | post-T-018 (2026-05-14) | **post-T-019 + remediation (2026-05-28)** | A6 → today total Δ | +|---------|---:|---:|---:|---:|---:|---:|---:|---| +| `.text` | 13,940 | 21,908 | 21,792 | 22,020 | 22,384 | 24,008 | **33,124** | **+19,184 (+137.6 %)** | +| `.rodata` | 1,960 | 2,784 | 2,928 | 2,928 | 2,944 | 3,536 | **4,560** | **+2,600 (+132.7 %)** | +| `.bss` | 17,872 | 22,248 | 22,256 | 22,256 | 40,208 | 42,080 | **48,320** | **+30,448 (+170.4 %)** | + +The total kernel image (text + rodata + bss) is now **~84.0 KiB**, up from ~68.0 KiB at B3 closure. For the first time the `.text` per-milestone delta (+9,116) is larger than the `.bss` per-milestone delta (+6,240) — B4 is a code-heavy milestone (the loader is real executable logic), whereas B2/B3 were data-structure-heavy milestones (the `.boot_pt` reservation, the `Pmm`/`AddressSpaceArena` statics). The `.bss` still dominates the *cumulative* trajectory because B2's `.boot_pt` 16 KiB reservation is the single largest contribution to date. 
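The footprint arithmetic in this metric can be re-derived from the per-section byte counts alone. The sketch below is illustrative: the `Sections` struct and `growth_pct` helper are names invented for the example, and the inputs are the B3 and B4 columns of the table above.

```rust
/// Allocated section sizes in bytes, as reported by `llvm-size -A -d`.
struct Sections {
    text: u64,
    rodata: u64,
    bss: u64,
}

impl Sections {
    /// Combined image footprint (`.text + .rodata + .bss`).
    fn total(&self) -> u64 {
        self.text + self.rodata + self.bss
    }
}

/// Percentage growth from `before` to `after`.
fn growth_pct(before: u64, after: u64) -> f64 {
    (after as f64 - before as f64) / before as f64 * 100.0
}

fn main() {
    let b3 = Sections { text: 24_008, rodata: 3_536, bss: 42_080 };
    let b4 = Sections { text: 33_124, rodata: 4_560, bss: 48_320 };

    println!("B3 total = {} B, B4 total = {} B", b3.total(), b4.total());
    println!(
        "combined growth = +{} B ({:.1} %)",
        b4.total() - b3.total(),
        growth_pct(b3.total(), b4.total())
    );
    println!(".text growth   = {:.1} %", growth_pct(b3.text, b4.text));
    println!(".rodata growth = {:.1} %", growth_pct(b3.rodata, b4.rodata));
    println!(".bss growth    = {:.1} %", growth_pct(b3.bss, b4.bss));
}
```

The combined delta it prints must equal the sum of the three per-section deltas in the table, which gives a cheap cross-check when the trajectory table is extended for B5.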
+ +### Metric 2 — Test count + +| Crate | B3 closure (2026-05-14) | T-019 / PR #31 merge | **B4 closure (2026-05-28)** | B3 → B4 Δ | +|-------|------------------------:|---------------------:|----------------------------:|-----------| +| `tyrne-hal` (lib) | 42 | 42 | **43** | +1 | +| `tyrne-kernel` (lib) | 141 | 175 | **187** | +46 | +| `tyrne-test-hal` (lib) | 43 | 43 | **53** | +10 | +| doc-tests | 0 | 0 | **3** | +3 | +| **Total** | **226** | **260** | **286** | **+60** | + +The +60 net delta decomposes into two distinct arcs: + +- **+34 from the B4 implementation arc** (B3's 226 → 260 at the T-019 / PR #31 merge): all +34 in `tyrne-kernel`, the new `task_loader::tests` module — the eight §Simulation rows each mapped to a host test per the [`write-adr` skill §Procedure step 5 row-to-verification discipline](../../../../.agents/skills/write-adr/SKILL.md) (argument/cap/VA-range/frame-budget/image-PA-overlap preflights, the `intermediate_frame_count_*` helper unit tests, the happy-path metadata assertions, the tail-zeroing check, and the rollback-on-failure tests via the `Pmm::force_alloc_failure_after` test-only injection). +- **+26 from the master-review remediation PR #32** (260 → 286): +1 hal, +12 kernel, +10 test-hal unit, +3 doc-tests. The **+10 test-hal** are the `OutOfFramesMmu` / `BlockMappedMmu` failure-injecting fakes (MR-018) plus their doctests; the **+12 kernel** are the MR-010 PMM failure-path tests + the MR-017 `IrqState` polarity test + the MR-018 `cap_map` / `load_image` rollback tests + MR-022. + +All 286 host tests pass at HEAD `3ab029f`; per-crate counts: 43 + 187 + 53 + 3 doc-tests. The per-test breakdown aligns with the business retrospective's [test-counts table](../business-reviews/2026-05-28-B4-closure.md). + +### Metric 3 — Boot-to-end timing + +QEMU smoke at HEAD `3ab029f` (`main`), QEMU 10.2.2, `-M virt -cpu cortex-a72 -m 128M -smp 1`, captured live 2026-05-28. 
+ +#### Source A — single-run release smoke trace (canonical record) + +```text +tyrne: hello from kernel_main +tyrne: mmu activated +tyrne: pmm initialized (32603 frames available; 165 reserved) +tyrne: address-space-arena ready (1 / 8 slots used; bootstrap AS root = 0x40091000) +tyrne: image loaded (entry = 0x800000; sp = 0x802000; image bytes 8; stack bytes 4096; AS cap = idx 1) +tyrne: timer ready (62500000 Hz, resolution 16 ns) +tyrne: starting cooperative scheduler +tyrne: task B — waiting for IPC +tyrne: task A -- sending IPC +tyrne: task B — received IPC (label=0xaaaa); replying +tyrne: task A — received reply (label=0xbbbb); done +tyrne: all tasks complete +tyrne: boot-to-end elapsed = 19257008 ns +``` + +Single run: **~19.26 ms boot-to-end** (release, no `-d` flags). The single-run number is anecdotal; the harness band (Source B) is the load-bearing claim. + +**What changed in the trace vs B3.** The new B4 line is **`tyrne: image loaded (...)`**, inserted between the `address-space-arena` line and the `timer ready` line — this is the loader's success banner, confirming `load_image` ran end-to-end and returned a `LoadedImage` (entry `0x800000`; sp `0x802000` = one-past-the-highest mapped stack VA; image bytes 8; stack bytes 4096 = 1 page; AS cap = idx 1, the new second arena slot). Two other lines shifted numerically: + +- **`pmm initialized`** now reports **165 reserved** (B3: 161) and **32603 available** (B3: 32607). The +4 reserved frames are the embedded image + the loader's page-table frames now counted in the boot reservation set. +- **bootstrap AS root** shifted `0x4008d000` → `0x40091000` because the kernel image grew (the loader added `.text`/`.rodata`), pushing the bootstrap L0 root frame higher. + +Both shifts are direct, expected consequences of the loader landing — no fault class, no behaviour change in the IPC demo (which still runs to `tyrne: all tasks complete`). 
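The harness band reported under Source B is an aggregation of the `boot-to-end elapsed = X ns` line from Source A across many runs. The sketch below shows one way to do that aggregation in Rust, assuming nearest-rank percentiles. The real harness is the shell script `tools/perf-harness.sh`, so `parse_elapsed_ns` and `nearest_rank` are hypothetical names used only for illustration, and the sample values in `main` are synthetic.

```rust
/// Extract `X` from a `tyrne: boot-to-end elapsed = X ns` trace line.
/// Returns `None` for any other line.
fn parse_elapsed_ns(line: &str) -> Option<u64> {
    let rest = line.trim().strip_prefix("tyrne: boot-to-end elapsed = ")?;
    let digits = rest.strip_suffix(" ns")?;
    digits.parse().ok()
}

/// Nearest-rank percentile over an ascending-sorted slice.
/// The 1-based rank is ceil(p / 100 * n), clamped to at least 1.
fn nearest_rank(sorted: &[u64], p: u32) -> Option<u64> {
    if sorted.is_empty() || p > 100 {
        return None;
    }
    let n = sorted.len();
    // Integer ceiling of (p * n) / 100.
    let rank = ((p as usize * n + 99) / 100).max(1);
    sorted.get(rank - 1).copied()
}

fn main() {
    // Synthetic sample: 20 runs of 1 ms, 2 ms, ... 20 ms (not measured data).
    let mut runs: Vec<u64> = (1..=20u64).map(|i| i * 1_000_000).collect();
    runs.sort_unstable();

    for p in [10, 50, 90, 99] {
        println!("p{p} = {:?} ns", nearest_rank(&runs, p));
    }

    // With n = 20 the p99 rank is ceil(19.8) = 20, i.e. the maximum.
    assert_eq!(nearest_rank(&runs, 99), runs.last().copied());
}
```

Under this definition any sample with fewer than 100 runs reports the maximum as p99, so p99 carries no independent tail-latency information at n = 20.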
+ +#### Source B — perf-harness band (release, 20 iterations) + +Recorded in [`docs/analysis/reports/perf-baseline-2026-05-28-B4-closure.md`](../../reports/perf-baseline-2026-05-28-B4-closure.md). 20/20 valid, wall-clock 102 s: + +| Statistic | ns | ms | +|-----------|----:|----:| +| min | 15,624,992 | 15.625 | +| p10 | 15,640,992 | 15.641 | +| p50 | 17,587,008 | 17.587 | +| p90 | 19,150,000 | 19.150 | +| p99 | 21,154,992 | 21.155 | +| max | 21,154,992 | 21.155 | +| mean | 17,586,899 | 17.587 | +| stddev | 1,428,711 | 1.429 | + +`p99 == max` by nearest-rank construction for `n < 100` (the index rounds up to `n`); it is reported as-computed and is not a tail-latency signal at this `n` — see the [report's §"Note on p99 at small n"](../../reports/perf-baseline-2026-05-28-B4-closure.md#methodology). + +**Δ vs B3 closure (release, post-T-018 baseline):** + +| Statistic | B3 closure (ms) | B4 closure (ms) | Δ ms | Δ % | +|-----------|---:|---:|---:|---:| +| p10 | 10.311 | 15.641 | **+5.330** | **+51.7 %** | +| p50 | 11.884 | 17.587 | **+5.703** | **+48.0 %** | +| p90 | 13.823 | 19.150 | **+5.327** | **+38.5 %** | +| p99 | 14.372 | 21.155 | +6.783 | +47.2 % | +| stddev | 1.195 | 1.429 | +0.234 | +19.6 % | + +**The boot-to-end band rose +5.33 / +5.70 / +5.33 ms at p10 / p50 / p90.** The increase is tightly clustered (the three percentile deltas are within ~0.4 ms of each other), which is the signature of a *one-time-at-boot* cost added uniformly to every iteration rather than a variance/jitter regression. The §"Hotspot" section decomposes it. + +## Hotspot + +The +5.33 to +5.70 ms p10/p50/p90 increase versus B3 closure is dominated by **one new one-time-at-boot cost: the T-019 task loader running at boot** — the first post-bootstrap exercise of the per-call `Mmu::map` path under a live MMU translation regime, amplified by QEMU TCG's software-MMU emulation overhead. 
The decomposition below builds on the master-review [X2-performance track's `load_image` cost breakdown](../master-review/2026-05-22-152729/tracks/X2-performance.md#load_image-cost-breakdown-v1-bsp-8-byte-image-1-stack-page), which performed the per-component attribution arithmetic against the code at anchor `288ddb2`; the numbers below are consistent with that pass. + +### Hotspot 1 — `load_image`'s boot-time address-space population (dominant) + +`load_image` is called once at `kernel_entry` with the 8-byte placeholder image and `stack_size_pages = 1`. Its per-boot work decomposes into: + +1. **`cap_create_address_space`** — allocates the new AS's root L0 frame (1 `alloc_frame`, each carrying a 4 KiB zero-fill per UNSAFE-2026-0026's `Pmm` contract), wraps it, publishes into arena slot 1, mints the AS cap via `cap_derive`. Roughly O(1) in-memory plus one zero-fill. +2. **Per-page `cap_map` page-table walks** — for the v1 BSP (8-byte image at VA `0x0080_0000`, 1 stack page), `intermediate_frame_count` resolves to **3** distinct intermediate page-table tables (1 × L1 + 1 × L2 + 1 × L3, shared across the image+stack span since both land in the same 1 GiB / 2 MiB block at this VA). Two `cap_map` calls (image leaf + stack leaf) each drive `walk_or_alloc_table` down three levels, performing `read_volatile`/`write_volatile` descriptor accesses, plus a `MapperFlush::flush` (`TLBI VAE1` + `DSB ISH` + `ISB`) per call. +3. **`copy_nonoverlapping` image copy** — the UNSAFE-2026-0027 byte-copy site: 8 bytes from `.rodata` into the freshly-allocated, zero-filled, identity-mapped image frame. Trivial in isolation. +4. **Stack mapping** — the second `cap_map`, same shape as the image page. + +Total PMM commitment for the v1 boot: **6 `alloc_frame` calls** (1 AS root + 1 image leaf + 1 stack leaf + 3 intermediates), 6 × 4 KiB zero-fills, 2 `cap_map` page-table walks, and 2 TLB flushes. 
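The frame budget above can be reproduced with a little shift arithmetic. This is a sketch under stated assumptions, not the kernel's actual `intermediate_frame_count` helper: it assumes a 4 KiB granule with a four-level table layout, a freshly created address space (so no intermediate table is already present), and a single contiguous, page-aligned VA span covering image plus stack. `tables_touched`, `intermediate_tables`, and `total_frames` are hypothetical names.

```rust
/// Table coverage for a 4 KiB granule, four-level layout (assumed here):
/// one L3 table spans 2 MiB, one L2 table 1 GiB, one L1 table 512 GiB.
const L3_TABLE_SHIFT: u32 = 21;
const L2_TABLE_SHIFT: u32 = 30;
const L1_TABLE_SHIFT: u32 = 39;
const PAGE_SIZE: u64 = 4096;

/// Number of distinct tables at one level touched by the half-open
/// VA span [start, end). Requires start < end.
fn tables_touched(start: u64, end: u64, shift: u32) -> u64 {
    debug_assert!(start < end);
    ((end - 1) >> shift) - (start >> shift) + 1
}

/// Intermediate (L1 + L2 + L3) tables needed below a fresh L0 root.
fn intermediate_tables(start: u64, end: u64) -> u64 {
    tables_touched(start, end, L1_TABLE_SHIFT)
        + tables_touched(start, end, L2_TABLE_SHIFT)
        + tables_touched(start, end, L3_TABLE_SHIFT)
}

/// Root frame + intermediate tables + one leaf frame per mapped page.
fn total_frames(start: u64, end: u64) -> u64 {
    1 + intermediate_tables(start, end) + (end - start) / PAGE_SIZE
}

fn main() {
    // v1 boot: one image page at 0x80_0000 followed by one stack page.
    let (start, end) = (0x0080_0000u64, 0x0080_2000u64);
    println!("intermediates = {}", intermediate_tables(start, end));
    println!("total frames  = {}", total_frames(start, end));

    // An 8 MiB image plus one stack page, assumed contiguous from the same
    // base: the span crosses five 2 MiB L3 tables, so 5 + 1 + 1 tables.
    let big_end = start + 8 * 1024 * 1024 + PAGE_SIZE;
    println!("8 MiB image intermediates = {}", intermediate_tables(start, big_end));
}
```

For the v1 span the sketch yields 3 intermediates and 6 frames in total, matching the commitment described above. A constant would stop being correct as soon as the span crosses a 2 MiB boundary, which is why an exact count matters for larger images.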
+ +The **dominant cost is the volatile page-table descriptor writes + the per-`cap_map` TLB flush under a live MMU in QEMU TCG**, not the zero-fills or the byte copy. Under TCG, every `write_volatile` to a page-table descriptor and every `TLBI VAE1`/`DSB`/`ISB` forces QEMU to invalidate its internal software-TLB and translated basic-block chains that map through the affected pages — an operation that community benchmarks place at 10–100× the cost of the equivalent hardware operation. This is the first time these descriptor-write + flush sequences run *after* the MMU is active (B3's `Pmm::new` + arena publish were in-memory; the bootstrap-AS wrap reused the live L0 root rather than walking it), so B4 is where the cost first surfaces. The master-review X2 attribution placed the `walk_and_install_leaf` volatile-write + TLB-invalidation share at ~5–6 ms, the 6-frame zero-fill at ~0.5–1 ms, and the remaining banner/UART/arena items at < 0.5 ms combined, which is roughly in line with the +5.3 to +5.7 ms band measured live at B4 closure. + +**Real-hardware projection:** the same 6 `alloc_frame` + 2 `cap_map` + 2 TLB-flush sequence is on the order of **~40 µs on a real Cortex-A72** (the zero-fill dominates at the memory-bandwidth limit; the descriptor writes + hardware TLBI are tens of cycles). The QEMU TCG cost is a software-MMU emulation artifact, not a kernel performance defect. Real-hardware boot-to-end is unaffected by the TCG amplification and is projected to remain in the sub-5 ms range, dominated by UART output bandwidth (see Hotspot 3). + +### Hotspot 2 — Sustained TCG translation-cache churn from the larger `.text` + +The post-B4 kernel image is +16,380 bytes larger (text + rodata + bss combined; +9,116 of it `.text`) than post-B3. 
QEMU TCG's translation cache is far larger than any Tyrne image, so the cache is not the limit — but the *first-time-translated* footprint grows linearly with `.text`, and the boot-to-end window now includes more first-time-translated code (the loader module's `load_image` state machine + the `intermediate_frame_count` helper + the MR-010 PMM interval rewrite). Each new first-time-translated function adds roughly 0.2–0.5 ms of TCG translation cost per run, and **every iteration pays the full translation cost** because the harness wraps each iteration in a fresh QEMU process (the TCG cache is destroyed between iterations). This is a small contributor relative to Hotspot 1's MMU-emulation cost but is part of why the band shifted uniformly across percentiles. + +**Real-hardware projection:** translation-cache cost is zero on real hardware; the per-iteration cost would be the actual code execution time. + +### Hotspot 3 — UART output bandwidth (unchanged dominant on real hardware) + +The B4 boot emits one additional banner line (`tyrne: image loaded (...)`) plus slightly longer `pmm`/`address-space-arena` lines. The `-d int,unimp,guest_errors` count rose from B3's ~526 to B4's **629** events — the Δ is exactly the new banner-line UART bytes (one `PL011 data written to disabled UART` warning per byte while QEMU's PL011 rides reset state with `UARTCR.UARTEN = 0`). Under QEMU TCG each UART byte is a device-model MMIO emulation path; the aggregate is bounded (< 1 ms given the 629-event count). On *real* hardware, UART output bandwidth is projected to be the dominant boot-to-end cost (a real PL011 at 9600 baud transmits ~960 bytes/sec; the multi-line boot trace, about 629 bytes, would take hundreds of milliseconds to flush at that rate) — which is why the real-hardware boot-to-end baseline, once it exists, will be UART-bound rather than MMU-bound. 
+ +### Why this is a baseline, not a regression + +- The Δ is **purely QEMU TCG software-MMU + translation overhead** on a single emulated host; no real-hardware regression is projected (the same sequence is ~40 µs on a Cortex-A72). +- The cost is **one-time-at-boot** — the v1 IPC demo's per-iteration steady-state IPC + scheduler costs are unchanged (the activation hook still short-circuits because all demo tasks share the bootstrap AS; the loader-produced AS is not entered at runtime in B4 — running gates on B5/B6). +- The pattern is **consistent with the B3 closure observation** (B3's +6 to +7 ms came from `Pmm::new` + the AS-arena publish; B4's +5.3 to +5.7 ms comes from the loader's *first-ever post-MMU descriptor-write + flush* sequence). Each milestone's QEMU TCG cost is the new boot-path code it adds, amplified by emulation. +- The band is tightly clustered across percentiles (+5.33 / +5.70 / +5.33 ms), the signature of a uniform one-time cost rather than a variance regression; stddev rose only +0.234 ms. +- The harness band is **the canonical measurement convention** per [`infrastructure.md` §"Performance harness"](../../../standards/infrastructure.md#performance-harness); single-run claims are deprecated. + +## Proposal + +None this cycle. This artefact records the baseline for future B5+ regression checks; it does not propose an optimisation. The master-review's one production-side performance finding (X2-001, `could_yield_pa_overlapping`'s O(range_frames × R) inner loop) was already **remediated** in PR #32 (MR-010: O(R) interval-arithmetic rewrite) — so there is no open optimisation hypothesis carried into B4 closure. + +If a B5+ task surfaces a real-hardware perf concern (the projected sub-5 ms boot-to-end claim is testable only when real hardware lands), a follow-up perf review can hypothesis-test the projection. Until then, the QEMU TCG band is informational, not normative. 
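The X2-001 remediation mentioned above replaces a per-frame scan with interval arithmetic. The sketch below illustrates that general technique only; it is not the body of `could_yield_pa_overlapping`. `PaRange`, `overlaps_any`, and `overlaps_any_per_frame` are hypothetical names, and the ranges are assumed to be half-open, non-empty, and frame-aligned.

```rust
/// Half-open physical address range [start, end). Hypothetical type.
#[derive(Clone, Copy)]
struct PaRange {
    start: u64,
    end: u64,
}

/// O(R): one pair of comparisons per reserved range. Two half-open ranges
/// overlap exactly when each one starts before the other ends.
fn overlaps_any(candidate: PaRange, reserved: &[PaRange]) -> bool {
    reserved
        .iter()
        .any(|r| candidate.start < r.end && r.start < candidate.end)
}

/// O(range_frames * R) baseline: probe every frame base in the candidate
/// against every reserved range. Equivalent to `overlaps_any` only when all
/// ranges are aligned to `frame_size`.
fn overlaps_any_per_frame(candidate: PaRange, reserved: &[PaRange], frame_size: u64) -> bool {
    let mut pa = candidate.start;
    while pa < candidate.end {
        if reserved.iter().any(|r| r.start <= pa && pa < r.end) {
            return true;
        }
        pa += frame_size;
    }
    false
}

fn main() {
    let reserved = [
        PaRange { start: 0x4000_0000, end: 0x4009_0000 },
        PaRange { start: 0x4800_0000, end: 0x4800_4000 },
    ];
    let free = PaRange { start: 0x4100_0000, end: 0x4100_2000 };
    let clash = PaRange { start: 0x4008_f000, end: 0x4009_2000 };

    for c in [free, clash] {
        // Both formulations must agree; only the cost differs.
        assert_eq!(
            overlaps_any(c, &reserved),
            overlaps_any_per_frame(c, &reserved, 4096)
        );
        println!("{:#x}..{:#x} overlaps: {}", c.start, c.end, overlaps_any(c, &reserved));
    }
}
```

The interval form does the same number of comparisons whether the candidate spans two frames or two million, which is what removes the `range_frames` factor from the cost.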
+ +**Rejected proposals (with reasoning):** + +- *"Range-map the image + stack instead of per-page `cap_map`."* Rejected — the `cap_map` surface from ADR-0028 is per-page by design (each call maps one frame at one VA); a range-mapping API is a larger ADR-scoped change, not a perf patch, and the per-page cost is dominated by the QEMU TCG flush overhead (one flush per `cap_map`), not by the number of `cap_map` calls (2 in v1). For the v1 8-byte image the saving is at most one TLB flush; the design churn is not justified by the QEMU-only cost. Per the master-plan §Anti-patterns, "rewriting the design under the banner of performance" belongs to an ADR, not a perf review. +- *"Batch the TLB flushes into a single `TLBI VMALLE1` after the whole loader runs."* Rejected for v1 — the per-`cap_map` flush is the correct conservative discipline (the `MapperFlush` token must be discharged before the mapping is relied upon), and the loader-produced AS is not entered at runtime in B4, so there is no correctness pressure to flush. A batched-flush optimisation is a B5+ consideration once `task_create_from_image` makes the AS live; it gates on the same trigger as the activation-hook differ-path measurement. +- *"Shrink `.text` by collapsing the 10-variant `LoadError` into a smaller enum."* Rejected — each `LoadError` variant is load-bearing for the panic-free typed-error discipline (the kernel crate's `#![deny(clippy::panic)]` forces every fallible step to a typed error, and the variants distinguish rollback-discharging cases per [T-019 §Rollback contract](../../tasks/phase-b/T-019-task-loader.md#approach)); collapsing them would trade a few hundred `.rodata` bytes for a loss of diagnostic + rollback precision. The footprint is well within ADR-0029's bounded-loader budget. 
+- *"Replace the exact `intermediate_frame_count` helper with the prior hard-coded `6` to save `.text`."* Rejected — the exact count was the fix for PR #31 review-round 3 Finding 1 (the constant `6` under-counted for image spans crossing more than one 2 MiB L2 slot — e.g. an 8 MiB image needs 7 intermediates, not 6); reverting it would re-introduce a correctness hazard for B5+ larger images. Correctness over `.text` bytes. + +## Measurement + +Not applicable — no proposal under measurement. + +## Regression check + +- `cargo host-test`: **286 / 286 pass** (43 hal + 187 kernel + 53 test-hal + 3 doc-tests; +60 vs B3 closure, of which +34 is the B4 implementation arc and +26 is the master-review remediation). 0 failed. +- `cargo fmt --check`: clean. +- `cargo host-clippy` (`clippy --all-targets -D warnings`): clean. +- `cargo kernel-clippy` (`-D warnings`): clean — the kernel crate's `#![deny(clippy::panic)]` discipline confirms `load_image` has no `panic!` / `unwrap` / `expect` on the kernel-reachable path. +- `cargo kernel-build`: clean. +- `cargo +nightly miri test`: **not run locally** — Miri is not installed on the pinned nightly-2026-01-15 toolchain on this host. It **is** a blocking CI gate: the `miri (Stacked Borrows)` job runs the host-test suite under Stacked Borrows with no `continue-on-error` and is listed as a required status check (see [`infrastructure.md` §"Miri as a blocking gate"](../../../standards/infrastructure.md#miri-as-a-blocking-gate); MR-009 remediation). The master review reproduced 260/260 under Miri with zero detected UB at anchor `288ddb2`; the +26 remediation tests are additive (failure-injecting fakes + rollback paths) and do not introduce new aliasing surface. 
+- **QEMU smoke trace:** verbatim in [§"Source A"](#source-a--single-run-release-smoke-trace-canonical-record) above; full demo through `tyrne: all tasks complete`; `-d int,unimp,guest_errors` reports **629 events; 629/629 (100 %) are the pre-existing `PL011 data written to disabled UART` warnings**. Zero non-PL011 events: **zero Translation faults, zero Permission faults, zero "Taking exception" lines, zero unallocated/unimplemented events.** The Δ from B3 (~526) to B4 (629) is exactly the new banner-line + longer-line UART bytes; no new fault class. Critically, **the loader's two real `cap_map` page-table walks produced zero Translation/Permission faults** — the first runtime exercise of the post-bootstrap `Mmu::map` path is fault-clean. +- **Security cross-reference:** see [B4 closure security review](../security-reviews/2026-05-28-B4-closure.md) — no security-sensitive path regressed by perf-relevant changes; UNSAFE-2026-0027 (loader `copy_nonoverlapping`) and UNSAFE-2026-0028 (`from_existing_root`) are the audit-log surface for this period, both adjudicated against `unsafe-policy.md §3`. +- **`unsafe` diff (this is a measurement-only review — no `unsafe` was added by *perf* work):** the period's audit-log surface is functional, not performance-driven. UNSAFE-2026-0027 (T-019 `task_loader` `copy_nonoverlapping` byte-copy) introduced standalone; UNSAFE-2026-0028 (`QemuVirtAddressSpace::from_existing_root`) opened by the MR-011 remediation (the only production `unsafe fn` that previously lacked an audit entry — the call site had mis-attributed its unsafety to 0010+0014). UNSAFE-2026-0025 (post-bootstrap `Mmu::map` descriptor writes) and 0026 (`Pmm::alloc_frame` zero-fill) had their "Pending QEMU smoke verification" notes **lifted** via 2026-05-14 Amendments — T-019 is the first runtime exerciser of both (the boot smoke now runs `load_image` → `cap_create_address_space` + `cap_map`). Total audit-log entries: 28 (0001–0028; 0012 Removed → 27 Active). 
The MR-010 PMM interval rewrite and MR-022 enqueue helper — the actual perf-relevant code changes this period — added **zero** `unsafe`. +- **No code added to enable the boot-to-end timing increase** — the increase is purely QEMU TCG software-MMU emulation overhead for the new post-bootstrap `cap_map` descriptor-write + flush sequence, not the result of an inefficient implementation. The §"Hotspot" analysis decomposes the contributions; the production hot-path complexity table in the [master-review X2 track](../master-review/2026-05-22-152729/tracks/X2-performance.md#hot-path-complexity-table) confirms every production path is O(1) or bounded. + +**Drift flagged (non-blocking, hand off to the business retro's §Adjustments):** + +- **`current.md` still cites 260 host tests** — the post-T-019/PR-#31 count, captured before the master-review remediation's +26. The post-remediation count is **286**. The README was de-hardcoded to a link per MR-015 and is fine; the residual stale literal is in `current.md`. This is post-remediation drift and is the cleanest single B4-closure footprint Adjustment to record. + +## Verdict + +**Baseline — no proposal under measurement.** + +The B4 closure baseline records a **+5.33 / +5.70 / +5.33 ms p10 / p50 / p90 boot-to-end increase under QEMU TCG** versus the B3 closure baseline, alongside a **+9,116-byte `.text` / +1,024-byte `.rodata` / +6,240-byte `.bss`** image-footprint growth (total ~84.0 KiB, +23.5 % vs B3). The timing increase is attributable almost entirely to **T-019's task loader running at boot** — the first post-bootstrap exercise of the per-call `Mmu::map` page-table-walk + TLB-flush sequence under a live MMU, amplified by QEMU TCG's software-MMU emulation (community-benchmarked at 10–100× the equivalent hardware cost). The increase is one-time-at-boot, tightly clustered across percentiles, and projected to be ~40 µs on a real Cortex-A72; it is **not** a kernel performance defect. 
The footprint growth is the loader module + the master-review remediation kernel changes, all within ADR-0029's bounded-loader budget and the project's bounded-footprint discipline. + +This is a **measured non-change** in the master-plan's sense: a re-baseline with no optimisation proposed (the one production perf finding from the master review, X2-001, was already remediated in PR #32). The baseline is recorded as the canonical reference for B5+ regression checks. Cite the band (p10 / p50 / p90 = 15.641 / 17.587 / 19.150 ms; release; 20-iteration harness; QEMU 10.2.2 on Apple Silicon / Rosetta) when comparing later changes against this snapshot. Single-run boot-to-end claims in PR bodies should be replaced with a fresh harness run when a non-trivial perf-relevant change lands; see [`docs/standards/infrastructure.md` §"Performance harness"](../../../standards/infrastructure.md#performance-harness). + +### B3 closure §Forward-flagged items — closure status + +The [2026-05-14 B3 closure baseline §Forward-flagged items](2026-05-14-B3-closure.md#forward-flagged-items) listed 4 items; all 4 remain trigger-deferred (none of their triggers have fired in a way that resolves the flag, though T-019 partially advanced two): + +| B3 forward-flag | Status (2026-05-28) | Note | +|---|---|---| +| Real-hardware perf measurement | **Still trigger-deferred** | The QEMU TCG band remains informational; the projected sub-5 ms real-hardware boot-to-end is still untested. **Trigger:** opens with the first Raspberry Pi 4 BSP. This is still the load-bearing missing data point. | +| `Pmm::new` bitmap-init cost linear in extent | **Still trigger-deferred** | v1's 128 MiB extent keeps the cost trivial; T-019 did not change the extent. **Trigger:** opens with the first BSP with > 128 MiB RAM. 
| +| `AddressSpaceArena` slot-count revisit (N=8) | **Still trigger-deferred (partially advanced)** | T-019 made the *first runtime use of a second arena slot* (`AS cap = idx 1` in the smoke trace) — the bootstrap occupies slot 0, the loader-produced AS occupies slot 1, so v1 now uses 2 / 8 slots at boot. N=8 still comfortably fits. **Trigger:** opens with the first B5/B6 task that lands actual multi-task userspace AS allocation pressure. | +| Activation-hook cost when AS handles differ | **Still trigger-deferred** | The loader produces an AS but does not enter it (running gates on B5/B6); all demo tasks still share the bootstrap AS, so the activation hook still short-circuits to `None` and the differ-path (`Mmu::activate` + `TLBI VMALLE1` global flush) is still unmeasured at runtime. **Trigger:** opens with the first B5+ task that owns and runs in a non-bootstrap AS. | + +### New B4 forward-flagged items + +- **Per-`cap_map` TLB-flush batching for the loader.** v1 flushes once per `cap_map` (correct conservative discipline). When `task_create_from_image` (B5/B6) makes the loader-produced AS live, a batched `TLBI VMALLE1` after the whole loader runs becomes a candidate optimisation. **Trigger:** opens with the first B5/B6 task that enters a loader-produced AS. +- **Loader cost linear in image size.** v1's 8-byte image needs 6 frames; B6's real `hello` binary + larger B5+ images scale the `alloc_frame` (+ zero-fill) and `cap_map` (+ intermediate-frame) cost linearly, and a multi-2 MiB-L2-slot image is the first caller of the exact `intermediate_frame_count` path. **Trigger:** opens with B6's first real `userland/hello/` binary or any B5+ image larger than one L2 slot. 
+- **Miri-as-Phase-B-exit-bar codification (one-line in-tree doc gap; carried from the master-review remediation).** MR-009 made Miri a blocking CI job (no `continue-on-error`) and listed it as a required gate in [`infrastructure.md`](../../../standards/infrastructure.md#miri-as-a-blocking-gate), but "Miri green = Phase-B exit prerequisite" is **not yet written into the [phase-b.md](../../../roadmap/phases/phase-b.md) exit bar / a Phase-B exit checklist**. This is non-blocking and is the cleanest single B4-closure Adjustment to record (hand off to the business retro). **Trigger:** the Phase-B exit checklist authoring, naturally folded into the B5 prep arc. diff --git a/docs/analysis/reviews/performance-optimization-reviews/README.md b/docs/analysis/reviews/performance-optimization-reviews/README.md index b25d853..04dec10 100644 --- a/docs/analysis/reviews/performance-optimization-reviews/README.md +++ b/docs/analysis/reviews/performance-optimization-reviews/README.md @@ -27,5 +27,6 @@ A dated file `YYYY-MM-DD-.md` in this folder, following the shape in [` | 2026-05-07 | B1 closure post-T-014 re-baseline — net footprint-neutral (`.text` −116 / `.rodata` +144 / `.bss` +8 bytes) after the T-014 idle-dispatch refactor and the comprehensive-review follow-up sweeps; smoke ~5.8 ms boot-to-end | [2026-05-07-B1-closure.md](2026-05-07-B1-closure.md) | | 2026-05-09 | B2 closure baseline — post-T-016 footprint (`.text +364` / `.rodata +16` / `.bss +17,952` — dominated by `.boot_pt` 16 KiB reservation); first release-build harness band p10/p50/p90 = 4.262 / 4.642 / 6.456 ms; `-d guest_errors` 379 events (all pre-existing PL011 noise) | [2026-05-09-B2-closure.md](2026-05-09-B2-closure.md) | | 2026-05-14 | B3 closure baseline — post-T-017 + T-018 footprint (`.text +1,624` / `.rodata +592` / `.bss +1,872`); release-build harness band p10/p50/p90 = 10.311 / 11.884 / 13.823 ms (+6 to +7 ms vs B2 — pure QEMU TCG translation overhead from new code paths; real-hardware projection sub-5 
ms); `-d guest_errors` 526 events (all pre-existing PL011 noise; zero non-PL011) | [2026-05-14-B3-closure.md](2026-05-14-B3-closure.md) | +| 2026-05-28 | B4 closure baseline — post-T-019 + master-review remediation footprint (`.text +9,116` / `.rodata +1,024` / `.bss +6,240`; total ~84.0 KiB, +23.5 % vs B3); release-build harness band p10/p50/p90 = 15.641 / 17.587 / 19.150 ms (+5.33 to +5.70 ms vs B3 — pure QEMU TCG software-MMU overhead from T-019's first post-bootstrap `cap_map` page-table walks + TLB flushes; real-hardware projection ~40 µs); `-d guest_errors` 629 events (all pre-existing PL011 noise; zero fault classes; loader's two real `cap_map` walks fault-clean) | [2026-05-28-B4-closure.md](2026-05-28-B4-closure.md) | > First full hypothesis-driven cycle is now infrastructure-unblocked — T-009 + T-012 lit up `now_ns()` at EL1 and provide the measurement primitive IPC round-trip latency needs. The B1 closure baseline above records the static-only metrics; future hypothesis-driven cycles will add IPC round-trip wall-clock measurement, stack high-water-mark probes, and `TrapFrame` slimming for ack-and-ignore IRQ handlers. 
diff --git a/docs/analysis/reviews/security-reviews/2026-05-28-B4-closure.md b/docs/analysis/reviews/security-reviews/2026-05-28-B4-closure.md new file mode 100644 index 0000000..56be905 --- /dev/null +++ b/docs/analysis/reviews/security-reviews/2026-05-28-B4-closure.md @@ -0,0 +1,116 @@ +# Security review 2026-05-28 — B4 closure consolidated pass (post-T-019 + master-review remediation) + +- **Change:** the B4 arc on `main` — [T-019 task loader](../../../analysis/tasks/phase-b/T-019-task-loader.md) merged via PR #31 ([merge `7f876af`](https://github.com/HodeTech/Tyrne/commit/7f876af); 7 bisectable commits `911f2ad`/`5711756`/`ae31bc8`/`196d3fb`/`164522d`/`5b1f153`/`95efd62` + doc/round-fix commits `74694d4`/`5078944`/`eb14c51`), preceded by [ADR-0029](../../../decisions/0029-initial-userspace-image-format.md) (Initial userspace image format, `Accepted` 2026-05-14, PR #30 [merge `e09755d`](https://github.com/HodeTech/Tyrne/commit/e09755d)) — *plus* the **master-review remediation** PR #32 ([merge `50bffe9`](https://github.com/HodeTech/Tyrne/commit/50bffe9)) that closed the 2026-05-22 full-tree review's Blocker+Major backlog (commits `a6e909d` MR-001 / `8063ee2` MR-006/005/019/020 + ADR-0036 / `fbc3d3f` MR-002/003/007/008/009 CI honesty / `59f9309` MR-005/011/017/018 / `57bc2e6` MR-010/018 / `348971e` MR-022/017/018 / `24530fb` MR-012/013/014 / `4e241d9` MR-016/019 / `4141158` MR-015/004 / `a2e7257` D3-005/006/007 + review-round commits `ae8fbd7`/`8ceb4fb`/`c843ecd`), the org migration `cd4cb6e` (cemililik/Tyrne → HodeTech/Tyrne), and the README clarity pass `3ab029f` (HEAD). Period under review: 2026-05-14 → 2026-05-28. +- **Reviewer:** @cemililik (+ Claude Opus 4.8 (1M context) agent acting adversarial across the eight axes the [security-review master plan](master-plan.md) defines). +- **Separation from code review:** standalone consolidated pass scoped to the post-B3 → post-T-019 + post-remediation surface. 
T-019 itself went through **six PR-rounds** on PR #31 (rounds 1–6; reviews #1–#6 — review #2 P1 surfaced the PA-overlap soundness gap, round 4 P2 surfaced the misaligned-base root-frame-leak path; see [business retrospective §"Review-round arc on PR #31"](../business-reviews/2026-05-28-B4-closure.md)) and the whole shipped tree went through the **2026-05-22 master review** (4 waves, 25 track agents, including a dedicated X1-security pass that returned PASS and an X3-unsafe-audit pass that confirmed the log fully in sync). Those passes were per-PR / whole-tree; this artefact is the *closure-trio* security pass scoped to the B4 milestone, performed with a fresh checklist after a deliberate context switch. Sibling trio legs: [business retrospective](../business-reviews/2026-05-28-B4-closure.md) and [performance baseline](../performance-optimization-reviews/2026-05-28-B4-closure.md). +- **Unsafe audit cross-reference:** [UNSAFE-2026-0027](../../../audits/unsafe-log.md#unsafe-2026-0027--task-loader-frame-byte-copy-via-coreptrcopy_nonoverlapping-in-task_loaderload_image) (**new**; T-019 `task_loader::load_image` frame byte-copy via `core::ptr::copy_nonoverlapping`, with four post-introduction Amendments hardening the safe-API boundary — PA-overlap preflight, `phys_frame_kernel_ptr` helper + VA-range preflight, exact intermediate-frame count, and the misaligned-base alignment preflight); [UNSAFE-2026-0028](../../../audits/unsafe-log.md#unsafe-2026-0028--wrap-an-already-live-populated-vmsav8-l0-root-via-qemuvirtaddressspacefrom_existing_root) (**new**; `QemuVirtAddressSpace::from_existing_root` — audit-trail completion for a pre-existing `unsafe fn`, opened by the MR-011 / X3-001 remediation, second-reviewer-required per unsafe-policy §Review.4); [UNSAFE-2026-0025](../../../audits/unsafe-log.md#unsafe-2026-0025--qemuvirtmmumap--unmap-page-table-descriptor-writes) + 
[UNSAFE-2026-0026](../../../audits/unsafe-log.md#unsafe-2026-0026--pmm-frame-zeroing-via-coreptrwrite_bytes-in-pmmalloc_frame) (`Pending QEMU smoke verification` status notes **lifted** via 2026-05-14 Amendments — T-019 is the first runtime exerciser of both post-bootstrap `Mmu::map` and `Pmm::alloc_frame`'s zero-fill); the MR-005 `d8`–`d15` FP callee-saved enumeration added to the `ContextSwitch` `# Safety` contract + the ADR-0020 rider. Entries 0001..0028 (with 0012 `Removed`) re-verified against the post-PR-#32 source — append-only discipline holds, no in-place body edits (X3-unsafe-audit pass confirmed this whole-tree; the two new entries land under the Operation / Invariants / Rejected-alternatives shape). + +> **Canonical source for B4 closure metrics.** The [business retrospective](../business-reviews/2026-05-28-B4-closure.md) + this artefact + the [performance baseline](../performance-optimization-reviews/2026-05-28-B4-closure.md) are the source of truth for B4's closing audit / security numbers. Other locations referencing B4 audit state ([`current.md`](../../../roadmap/current.md), [`phase-b.md`](../../../roadmap/phases/phase-b.md), [`docs/audits/unsafe-log.md`](../../../audits/unsafe-log.md) entry-body Amendments, T-019 review-history rows, the [2026-05-22 master-review consolidated report](../master-review/2026-05-22-152729/consolidated.md)) are *summaries at their layer of abstraction*; corrections start here. + +> **Closure-trio smoke gate (inherited, satisfied).** Per [master-plan §Closure-trio coordination cross-reference](master-plan.md), a security review shipped *as a leg of a closure trio* inherits the business master-plan's "no closure-trio without recorded smoke" rule. 
The verbatim QEMU smoke trace + `-d int,unimp,guest_errors` event count (629, 100% pre-existing PL011-disabled-UART noise, zero Translation/Permission/Taking-exception/unimp events) live in the [business retrospective §"Smoke trace"](../business-reviews/2026-05-28-B4-closure.md); this leg inherits that satisfied gate. The smoke trace is the closure's most important verification artefact — T-019's `load_image` is the **first runtime exerciser** of every audited post-bootstrap raw-pointer site (UNSAFE-2026-0025/0026/0027) and the boot now prints `tyrne: image loaded (entry = 0x800000; sp = 0x802000; image bytes 8; stack bytes 4096; AS cap = idx 1)`. + +--- + +## 1. Capability correctness + +Adversarial frame: *can a caller invoke `load_image` (or any operation it composes) without holding the capability the security model requires? Does the task-loader surface widen any existing capability check, open an authority-leak path, or break a revocation-cascade property the B3 closure verified?* + +- **`load_image` mints a child AS cap exclusively through `cap_create_address_space` — no new cap-minting site.** OK — [`task_loader::load_image`](../../../../kernel/src/obj/task_loader.rs) does not insert into the capability table directly; it composes [`cap_create_address_space`](../../../../kernel/src/mm/address_space.rs) (which the B3 closure verified runs the full eight-step preflight chain: parent lookup → `CapKind::AddressSpace` check → `parent.rights ⊇ DERIVE` → `new_rights ⊆ parent.rights` no-widening → depth preflight → capacity preflight → `pmm.alloc_frame` → `cap_derive`). The new AS cap is wired as a **child of `parent_as_cap`** via `cap_derive` (not `insert_root`), so revoking the parent cascades to it — the same revocation-tree property T-018 established. 
**Adversarial probe — does T-019's own pre-step re-implement (and possibly weaken) the DERIVE check?** No: the §Simulation row-2 pre-step does a `parent_as_cap` lookup + `CapKind::AddressSpace` sanity check **only** (the `resolve_address_space_cap` helper is kind-only by v1 design), and **delegates** the DERIVE-rights enforcement to `cap_create_address_space`'s step 2a — surfacing it as `LoadError::AddressSpaceCreationFailed(CapError(InsufficientRights))` rather than duplicating a check that could drift. Verified by `task_loader::tests::missing_derive_surfaces_via_address_space_creation_failed`. +- **No-widening holds end-to-end: the loaded AS cap cannot carry more rights than the parent.** OK — `new_rights` is passed through to `cap_create_address_space`'s step 2c (`new_rights ⊆ parent.rights`); a caller cannot use `load_image` as a rights-amplification primitive. The master-review X1-security pass independently re-verified narrowing-only enforcement across `cap_copy`/`cap_derive` (rejecting any non-subset `new_rights` before `pop_free`) — confirming no side effect or allocation precedes the subset check. +- **`load_image` is capability-gated at the right granularity for v1, with the per-operation-rights gap unchanged and re-flagged forward.** **Flagged (non-blocking; deferred to B5+ ADR — carried forward unchanged from B3).** The image / stack mappings inside `load_image` route through [`cap_map`](../../../../kernel/src/mm/address_space.rs), which (per the B3 closure §1 and re-confirmed here against the post-remediation source) checks `CapObject::AddressSpace(_)` **kind only** — the rights bits are not consulted before `mmu.map` mutates the page tables. A holder of *any* AS cap (even one minted with `CapRights::empty()`) gets full mapping authority on that AS. **v1 risk surface:** kernel-init holds the bootstrap AS cap with all rights; no userspace exists; the gap is unreachable from any attacker-observable path. 
The master-review found the same property (X1-F3: kind-checking is correctly caller-side; X1-005: the IPC-transferred-cap depth re-rooting is a bounded-resource property, not an authority escalation). The forward mitigation is the B5+ ADR pairing `CapRights::{MAP, UNMAP, ACTIVATE}` with `CapKind::MemoryRegion` (the `CapRights` enum exposes no `MAP`/`UNMAP` bit today — confirmed at [`kernel/src/cap/rights.rs`](../../../../kernel/src/cap/rights.rs); there is literally no bit to pass). T-019's BSP smoke call site carries a multi-line comment recording the v1 kind-only contract + the additive-update path (the deliberate reject-with-documentation of PR #31 round-2 finding F3). +- **The `LoadedImage` return value carries a `CapHandle`, not a forged or ambient authority.** OK — `LoadedImage { as_cap, entry_va, stack_top_va, image_bytes, stack_bytes }`'s `as_cap` is the `CapHandle` for the newly-`cap_derive`d AS cap; it is a normal table-local handle with generation-tag staleness protection, indistinguishable from any other minted cap. `LoadedImage` does **not** mint a `CapHandle{CapObject::Task(...)}` and confers no execute-at-EL0 authority — the runnable-task surface (and its attendant authority) opens in B5/B6 (see §2 and §8). **Adversarial probe — can the rollback path leak or strand an authority?** On any post-`cap_create_address_space` failure (`OutOfFrames`/`MapFailed`), the loader's rollback uses `cap_drop(loaded_as_cap)` — **not** `cap_revoke` — which is the correct choice: the freshly-minted AS cap is a childless leaf, `cap_drop` `free_slot`s it (generation bump + `entry = None`), requires only `!has_children` (satisfied) and is rights-agnostic, whereas `cap_revoke` would walk *descendants* (none exist) and additionally demand `CapRights::REVOKE` the caller's `new_rights` may omit. 
No authority is stranded in the table; the AS-arena slot / L0 root / intermediate page-table frames leak per the documented v1 baseline (a *resource* leak, not an *authority* leak — see §4). +- **B3 + prior-phase capability invariants intact under the remediation churn.** OK — re-verified against post-PR-#32 source: + - The master-review remediation touched `cap`/`ipc`/`mm`/`sched` but **added no capability-minting site and removed no check**: MR-010 rewrote the PMM overlap helper to O(R) interval arithmetic (allocator internals, no cap surface); MR-022 centralised the infallible `enqueue_ready` helper (scheduler internals); MR-017 fixed an `IrqState` polarity inversion *between test fakes* (no production cap path); MR-018 added failure-injecting fakes + `cap_map`/`load_image` rollback tests (test-only). None of these is a capability-check change. + - `cap_copy` peer-depth, `cap_derive` saturating depth cap, `cap_revoke` BFS reachability (kind-agnostic walk; carries a written size-proof + debug-assert/release-break corruption guard per X1-security), `cap_take` move-out ordering, `CapRights::from_raw` reserved-bit masking (X1-P1: masks unknown bits before the ABI exists, with a test), `CapabilityTable::new` build-time capacity assertion — all unchanged. + - **Adversarial probe — did MR-011's new `from_existing_root` audit entry change any capability behaviour?** No: UNSAFE-2026-0028 is an **audit-trail-completeness** fix — the `unsafe fn` and its sole call site (bootstrap-AS wrap in `kernel_entry`) are pre-existing and behaviourally unchanged; the remediation only narrowed the call-site SAFETY block (which had mis-attributed the wrap's unsafety to 0010+0014) and opened the missing entry. No cap path moved. + +## 2. Trust boundaries + +Adversarial frame: *what new boundaries did T-019 introduce, and what untrusted bytes (or values) cross them? 
Does the master-review remediation reshape any boundary?* + +- **No userspace → kernel boundary opens in B4 — but B4 builds the surface the B5/B6 boundary will guard, and that forthcoming attack surface is flagged.** **Flagged (forward; N/A-until-B5, non-blocking).** v1 still has **no EL0, no syscall, no userspace task** — `load_image` produces a `LoadedImage` *descriptor* and does **not** execute the image. The CPU-privilege boundary (security-model.md boundary 1) therefore does not open in B4; the loader runs entirely in kernel mode at boot consuming a BSP compile-time-constant `&[u8]`. **The forthcoming B5/B6 attack surface, named so the closure carries it:** (a) the EL0→EL1 SVC dispatch + copy-from/to-user path (ADR-0030/0031) is where the *first real* untrusted-input boundary lands; (b) the `task_create_from_image` wrapper that turns this `LoadedImage` into a runnable `CapHandle{CapObject::Task(...)}` (phase-b §B4 §3, deferred to B5/B6) is the point an attacker-observable execution context first exists; (c) until ADR-0033's high-half migration (or an interim shared-kernel-mapping shape) gives the userspace AS kernel mappings, an EL1 exception taken while that AS is active would translation-fault on the vector fetch — a correctness gate B5 must close before the image is ever run. The master-review X1-security pass framed this identically ("v1 has no userspace yet; the boundary code is forward-built for B5+"). +- **The task-loader image is a raw-flat byte stream with ZERO structured-metadata parser — the strongest possible posture for the v1 loader.** OK — per [ADR-0029](../../../decisions/0029-initial-userspace-image-format.md), the embedded blob has no headers, no segment table, no symbol table: offset 0 is the entry instruction; every other byte is `copy_nonoverlapping`-ed verbatim into a freshly-zeroed frame. **There is no parser and no attacker-controllable structured-input surface in v1** (master-review X1-P-equivalent finding). 
The format choice deliberately defers the parser-as-attack-surface to a B5+ successor ADR (when filesystem-loaded modules eventually land); ADR-0029 §Consequences names this explicitly — "Option 1's complexity sits in the *loop* rather than in *parsing* (which is an attacker-controllable input surface in B5+)". The only *interpreted* loader inputs are scalars (`image_base_va`, `stack_size_pages`, `parent_as_cap`, `new_rights`), each validated with total/saturating arithmetic (`div_ceil`, `is_multiple_of`, `saturating_*`) — re-read for truncation/off-by-one and found clean. +- **The `copy_nonoverlapping` non-overlap invariant is now discharged at runtime, not by BSP-layout trust.** OK — the load-bearing soundness argument for UNSAFE-2026-0027's "source and destination do not overlap" was originally upheld at the BSP-wiring layer (`.rodata`-resident `USERSPACE_IMAGE` ⊆ PMM-reserved kernel-image range per ADR-0027 + ADR-0035). PR #31 review-round 1 P1 (review #2) observed this was not mechanically enforced at the safe `load_image` API boundary — a safe in-crate caller could construct a `Pmm` over an extent overlapping the image slice's PA range, at which point a future `alloc_frame()` could return a frame aliasing the image source and `copy_nonoverlapping` would invoke UB. The fix landed as [`Pmm::could_yield_pa_overlapping`](../../../../kernel/src/mm/pmm.rs) + a §Simulation row-4 preflight that converts the image VA → PA range under v1's identity-mapped kernel AS and rejects with `LoadError::ImageOverlapsAllocatableMemory` **before any state change**. The master-review independently rated this a strong boundary (**X1-P2**: "converts a 'trust the BSP linker script' argument into a typed fail-fast rejection covering root, intermediate, *and* leaf frames"). Pinned by `task_loader::tests::rejects_when_image_overlaps_allocatable_memory` + `accepts_image_disjoint_from_pmm_extent`. 
**Adversarial probe — does the runtime preflight depend on the identity-map assumption?** Yes, the VA→PA conversion is correct only under ADR-0027's v1 identity-mapped kernel AS; this is the same identity-mapping dependence every kernel-resident raw-pointer site shares (UNSAFE-2026-0025/0026/0027), and the future ADR-0033 high-half migration introduces a `virt_to_phys` helper at the loader's call site as part of a project-wide sweep — not a T-019-local change. +- **The kernel never dereferences a raw userspace pointer through the loader path.** OK — the only raw-pointer dereferences `load_image` performs are (a) the `copy_nonoverlapping` *source* (a Rust slice `&[u8]` with the slice invariant) and (b) the *destination* (a `PhysFrame` the loader itself just got from `pmm.alloc_frame()`, identity-mapped to a kernel-writable VA). Neither is a userspace-supplied pointer; there is no userspace to supply one. The destination cast routes through the [`crate::mm::phys_frame_kernel_ptr`](../../../../kernel/src/mm/mod.rs) helper (added in PR #31 round-2 F1) which centralises the identity-map invariant so the future high-half migration replaces one helper body, not every caller. +- **No new MMIO surface; no cross-task IPC authority change.** OK — T-019 + the remediation add no new device range and no new memory-mapped peripheral. IPC authority semantics are unchanged: the master-review re-verified `ipc_send` enforces `TRANSFER` via a non-mutating `lookup` *before* the irreversible `cap_take`, and `ipc_recv` pre-checks `ReceiverTableFull` before the state-moving `replace` — so a sender cannot transfer authority it does not hold and a full receiver table never drops an in-flight cap. + +## 3. Memory safety + +Adversarial frame: *can a hostile caller, or an unexpected control-flow sequence, cause any audited unsafe region (UNSAFE-2026-0001..0028) to violate the invariants it claims, in light of the new T-019 surface and the master-review remediation? 
Are the two new unsafe entries policy-conformant?* + +- **UNSAFE-2026-0027 (task-loader frame byte-copy) lands cleanly under [`unsafe-policy.md`](../../../standards/unsafe-policy.md) §3, with four boundary-hardening Amendments.** OK — the entry covers `core::ptr::copy_nonoverlapping(src, dst, chunk.len())` per image/stack page, with six enumerated invariants: (a) **source validity** (the `image: &[u8]` slice invariant bounds the chunk read); (b) **destination alignment + exclusive ownership** (`frame: PhysFrame` is `PAGE_SIZE`-aligned by `PhysFrame::from_aligned`, exclusively owned between `alloc_frame` and `cap_map` — no peer alias); (c) **destination identity-mapped to a kernel-writable VA** (ADR-0027); (d) **destination freshly zero-filled** (UNSAFE-2026-0026's contract makes tail bytes `chunk.len()..PAGE_SIZE` automatically zero — the partial-last-page tail-zeroing is not a second write); (e) **source/destination disjoint** (`.rodata` vs PMM-managed RAM — now *runtime-enforced* per the round-1 Amendment, see §2); (f) **`chunk.len() ≤ PAGE_SIZE` overflow-free**. The "Rejected alternatives" subsection records why `write_volatile`/`volatile_copy_nonoverlapping` (plain RAM needs no MMIO ordering; volatile would falsely advertise it), `from_raw_parts_mut` + slice copy (same `unsafe`, different surface), an Amendment-of-0026 (different operation primitive / source / ownership-proof chain), a HAL `copy_into_frame` method (relocates not removes the `unsafe`), and a stack-staging buffer (4 KiB/iter for zero benefit) were all rejected. The four Amendments (PA-overlap preflight; `phys_frame_kernel_ptr` helper + VA-range preflight; exact `intermediate_frame_count`; misaligned-base alignment preflight) each **tighten the safe-API boundary in front of** the audited operation without changing the operation itself — the canonical shape for append-only hardening. 
**Adversarial probe — could the exact-intermediate-count fix (round-3 F1) have masked a soundness regression?** No: the old `INTERMEDIATE_FRAME_BUDGET = 6` constant *under-counted* for multi-L2-slot spans, which would have surfaced as `LoadError::MapFailed(OutOfFrames)` mid-loop with the rollback still correct — a *taxonomy* drift, not a soundness break; the fix makes the public `LoadError` match the §Simulation promise. Smoke-verified: the single `copy_nonoverlapping` writes 8 bytes from `.rodata` per boot; host-tested by `task_loader::tests::tail_zeroing_on_partial_last_page` (100-byte image; payload 0..100 matches, 100..4096 zero) under Miri's Stacked Borrows discipline. +- **UNSAFE-2026-0028 (`from_existing_root` wrap of a live non-zero L0 root) is policy-conformant and closes the one audit gap the master-review found.** OK — the entry covers constructing a `QemuVirtAddressSpace` naming the **already-live, already-populated** bootstrap `VMSAv8` L0 root **without** zero-filling it (a contract deliberately distinct from `Mmu::create_address_space`, which requires a zero-filled root). Five invariants: root is the live `__boot_pt_l0` frame `mmu_bootstrap` installed into `TTBR0_EL1`; caller runs strictly after `mmu_bootstrap`; exactly-one-wrapper alias-freedom; **no zero-fill is performed (and none must be — zero-filling a live root would unmap the running kernel out from under itself)**; subsequent `map`/`unmap` rely on the UNSAFE-2026-0025 walker invariants. Rejected alternatives: routing through `create_address_space` (would demand the kernel-erasing zero-fill), making it `safe` (the "live-and-populated" precondition is not type-expressible), attributing to 0010+0014 (the pre-fix mis-attribution — those cover the surrounding StaticCell/arena publish mechanics, not the wrap operation). 
**Adversarial probe — was this a behaviour change disguised as an audit fix?** No: X3-001 / the X3-unsafe-audit pass confirmed `mmu_bootstrap` populates the exact frame and `kernel_entry` runs post-bootstrap; the contract was always sound and the sole caller always honoured it. The fix is the audit *record* catching up to live code. Security-sensitive (boot + MMU root install) → second-reviewer required per [unsafe-policy §Review.4](../../../standards/unsafe-policy.md), satisfied. +- **UNSAFE-2026-0025 / 0026 `Pending QEMU smoke verification` notes lifted — T-019 is the genuine first runtime exerciser.** OK — both notes lifted via 2026-05-14 Amendments. The B3 closure *predicted* T-018 might be the first `alloc_frame` exerciser; the Amendment honestly corrects this — T-018's bootstrap AS uses `from_existing_root`/`wrap_bootstrap` (it wraps the live `.boot_pt` L0 frame and does **not** call `alloc_frame` for an L0 root), so the **first** kernel-side `alloc_frame` for a translation table came when T-019 minted the *second* AS via `cap_create_address_space`. The smoke now runs `load_image` → `cap_create_address_space` (1 root frame) + per-image-page + per-stack-page + up-to-6 intermediate page-table frames, all through 0026's zero-fill and 0025's per-call `Mmu::map` page-table descriptor writes, with the trace reaching `tyrne: all tasks complete` and `-d int,unimp,guest_errors` showing zero new fault classes. The map-path invariants (root-frame validity inherited via `create_address_space`'s contract → induction through the table-descriptor chain; index bounds `[0,511]`; volatile discipline; leaf-written-last ordering; `MapperFlush` discharge via `cap_map`'s internal `token.flush(mmu)`; host-tested encoders) all hold under runtime evidence. +- **The unmap path stays runtime-unexercised in v1 — host-tests + Miri are the evidence base.** OK (forward-flag, carried). 
v1 has no userspace caller of `cap_unmap`; the rollback paths inside `load_image` exercise the cap-side cleanup but the v1 demo never arms a block-descriptor unmap. Host-test coverage in `cap_unmap_returns_unmapped_frame` + the `task_loader::tests` rollback tests is the v1 evidence; first runtime exercise gates on B5+ userspace teardown. (Carries forward the B3 closure's "BSP host-test infra for block-descriptor unmap" Adjustment, still trigger-deferred.) +- **The whole audit log is in sync at HEAD; the post-T-019 + post-remediation smoke produces no observable aliasing or UB.** OK — the X3-unsafe-audit master-review pass read all entries with their Amendment chains in full and confirmed **all entries map to real, current code sites; zero stale; zero append-only violations** (the *only* gap at that commit was the missing `from_existing_root` entry, now closed as UNSAFE-2026-0028). At HEAD `3ab029f` the count is 28 entries (0001..0028; 0012 `Removed` → 27 Active). Live gates reproduced this session (pinned nightly-2026-01-15): `cargo host-test` **286 passed / 0 failed**, `cargo fmt --check` clean, `cargo host-clippy` (`--all-targets -D warnings`) clean, `cargo kernel-clippy` clean, `cargo kernel-build` clean. **Miri** is the one verifier not run locally (not installed on the pinned toolchain on this host) — it is the CI gate, and at the master-review commit it passed 260/260 with zero Stacked-Borrows violations / zero detected UB (the +26 tests since are MR-010 PMM failure-path, MR-017 polarity, MR-018 rollback/fake-injection, MR-022 — all Miri-relevant raw-pointer-adjacent paths). The standing residual is X1-002 / MR-009 (Miri-as-CI-gate); see §4 and Verdict. 
+- **The master-review remediation introduced no new `unsafe`-widening site.** OK — `59f9309` (MR-005 `d8`–`d15` contract) is a **doc/contract** change to the `ContextSwitch` `# Safety` section + an ADR-0020 rider (the shipping BSP already saved `d8`–`d15`; the gap was contract *text* a second BSP author could implement wrong — see §8); MR-011's UNSAFE-2026-0028 is audit-trail-only; MR-010/022/017/018 are safe-Rust allocator/scheduler/test changes. No production `unsafe` block widened. + +## 4. Kernel-mode discipline + +Adversarial frame: *does any new T-019 or remediation code path violate the principle that kernel-mode work be minimal, deterministic, and panic-only-on-impossibility?* + +- **`load_image` is structurally bounded — every step is O(1), O(pages), or the bounded PMM scan.** OK — the loader is: argument preflight (O(1)) → cap preflight (O(1)) → VA-range preflight (O(1)) → frame-budget preflight (O(1), via the exact `intermediate_frame_count` VMSAv8 index decomposition) → PA-overlap preflight (O(R) interval arithmetic per the MR-010 rewrite) → `cap_create_address_space` (O(1) + bounded PMM scan) → image-page loop (O(image_pages), each iteration: `alloc_frame` bounded scan + `copy_nonoverlapping` of ≤ 4 KiB + a constant-depth `cap_map` walk of at most four table levels) → stack-page loop (same shape). No unbounded loop; the loops are bounded by the page count, which the frame-budget preflight bounds by `pmm.stats().free_frames`. The master-review (X1 Axis 4) independently confirmed "no unbounded loops on the hot path" tree-wide and that the MR-010 PMM overlap helper is now O(R) not O(n).
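The O(R) interval-arithmetic shape of the PA-overlap preflight named in the bullet above can be sketched as follows. This is a minimal host-side illustration, not the kernel's `Pmm::could_yield_pa_overlapping`: the `Region` type, the `overlaps_any` function, and every address used with it are hypothetical, and the sketch assumes half-open `[start, end)` physical ranges.

```rust
/// Hypothetical half-open physical range `[start, end)`.
/// This is an illustration type, not a type from the kernel tree.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Region {
    pub start: u64,
    pub end: u64,
}

/// Returns `true` if `probe` intersects any allocatable region.
///
/// Each region costs one pair of comparisons, so the whole check is O(R)
/// in the number of regions and does not depend on how many frames a
/// region spans.
pub fn overlaps_any(allocatable: &[Region], probe: Region) -> bool {
    if probe.start >= probe.end {
        // An empty (or inverted) probe range overlaps nothing.
        return false;
    }
    allocatable.iter().any(|r| {
        // Two non-empty half-open ranges intersect exactly when each one
        // starts before the other ends.
        r.start < r.end && probe.start < r.end && r.start < probe.end
    })
}
```

Under these assumptions, a loader would run such a check on the image's physical range before its first allocation and return a typed error on `true`, so a rejected call leaves the allocator untouched.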
+- **Every `load_image` failure path returns a typed `LoadError`, never a panic.** OK — the 10-variant `LoadError` (`InvalidImage`/`InvalidStackSize`/`MisalignedImageBaseVa`/`InvalidParentCap`/`FrameBudgetExceeded`/`InvalidImageBaseVa`/`ImageOverlapsAllocatableMemory`/`AddressSpaceCreationFailed`/`OutOfFrames`/`MapFailed`) covers every fallible step; `cargo kernel-clippy -D warnings` (enforcing `#![deny(clippy::panic)]`) is clean at HEAD. The exhaustiveness regression `load_error_variants_pattern_match_exhaustively` would compile-fail if a variant were silently removed. **Adversarial probe — does the rollback path panic on an "impossible" state?** No: rollback frees the leaf frames + reverses committed mappings + `cap_drop`s the leaf AS cap; the documented v1 baseline leaks (AS-arena slot, L0 root, intermediate L1/L2/L3 frames) are a *resource* trade-off, not a panic. The master-review (X1 Axis 4) confirmed the analogous `ipc_recv_and_yield` deadlock path "is handled, not hung" with symmetric ADR-0032 rollback returning a typed `SchedError::Deadlock`. +- **The leak-path-closure discipline is preserved and extended: every fallible check runs before the first irreversible PMM commitment.** OK — T-019 inherits T-018's creation-side preflight (the depth check runs before `pmm.alloc_frame`) and *adds* the round-4 fix that moved the `image_base_va` alignment check into the **argument preflight (row 1)** before any `cap_create_address_space` call — closing a path where an unaligned base would surface from the first `cap_map` only *after* the root L0 frame was allocated, leaking it. The renamed test `rejects_misaligned_image_base_va_with_pmm_byte_stable` asserts `pmm.stats().free_frames == pmm_before`. This is exactly the "no irreversible commitment before the last fallible check" discipline the B3 closure verified, now extended to the loader's argument surface. 
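The "no irreversible commitment before the last fallible check" discipline described above can be illustrated with a toy loader. This is a sketch under stated assumptions, not the real `load_image`: `ToyPmm`, `SketchLoadError`, and `load_sketch` are hypothetical names, the frame budget is simplified to one root frame plus one frame per image page, and no mapping is performed.

```rust
pub const PAGE_SIZE: usize = 4096;

/// Hypothetical error type; the real `LoadError` has more variants.
#[derive(Debug, PartialEq, Eq)]
pub enum SketchLoadError {
    InvalidImage,
    MisalignedImageBaseVa,
    FrameBudgetExceeded,
    OutOfFrames,
}

/// Toy frame counter standing in for a physical memory manager.
pub struct ToyPmm {
    pub free_frames: usize,
}

impl ToyPmm {
    fn alloc_frame(&mut self) -> Result<(), SketchLoadError> {
        if self.free_frames == 0 {
            return Err(SketchLoadError::OutOfFrames);
        }
        self.free_frames -= 1;
        Ok(())
    }
}

/// Runs every side-effect-free check first, then commits frames.
/// Returns the number of frames consumed on success.
pub fn load_sketch(
    pmm: &mut ToyPmm,
    image: &[u8],
    image_base_va: usize,
) -> Result<usize, SketchLoadError> {
    // Preflight phase: nothing below touches the allocator.
    if image.is_empty() {
        return Err(SketchLoadError::InvalidImage);
    }
    if image_base_va % PAGE_SIZE != 0 {
        return Err(SketchLoadError::MisalignedImageBaseVa);
    }
    // Simplified budget: one root frame plus one frame per image page.
    let needed = image.len().div_ceil(PAGE_SIZE) + 1;
    if needed > pmm.free_frames {
        return Err(SketchLoadError::FrameBudgetExceeded);
    }
    // Commit phase: the first irreversible step happens only here.
    for _ in 0..needed {
        pmm.alloc_frame()?;
    }
    Ok(needed)
}
```

The property a test such as `rejects_misaligned_image_base_va_with_pmm_byte_stable` pins is visible in this shape: a misaligned base returns before the commit phase, so the free-frame count observed afterwards equals the count observed before the call.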
+- **No allocation in ISRs; bounded kernel resources hold; DAIF discipline preserved.** OK — T-019 runs at boot in `kernel_entry`, not in any interrupt context; it allocates only PMM frames (typed `OutOfFrames` on exhaustion, never a panic — matching the security-model "Bounded kernel state" invariant). The master-review re-verified the ISR (`irq_entry`) allocates nothing and the bounded-resource invariant holds tree-wide. The MR-022 centralised-`enqueue_ready` helper consolidates the scheduler's infallible-enqueue invariant into one auditable home (addressing X1-F4's "the invariant should have one home") — a defence-in-depth refactor, not a behaviour change. +- **Standing residual: Miri — the only mechanical verifier of the raw-pointer aliasing discipline — is now a *blocking* CI job but not yet written into the Phase-B exit bar.** **Flagged (process; non-blocking for this closure; the cleanest single B4-closure Adjustment to record).** The master-review's MR-009 / X1-002 (Major-for-Phase-B-exit) observed Miri was a *manual* gate. The remediation (`fbc3d3f`) **closed the mechanical half**: Miri is now a blocking CI job (no `continue-on-error`) and is listed as a required gate in [`infrastructure.md`](../../../standards/infrastructure.md). On adversarial re-verification this session, MR-009 is the **one partial** of the 24 Blocker+Major findings (23 confirmed-fixed): "Miri green = Phase-B exit prerequisite" is **not yet written** into the [`phase-b.md`](../../../roadmap/phases/phase-b.md) exit bar or a Phase-B exit checklist (the exit bar at phase-b.md:3 names the userspace-task milestone but not the Miri gate). This is a one-line in-tree doc gap — non-blocking (Miri *is* enforced mechanically; the gap is the *documented* exit-bar text), and recorded as this closure's single Adjustment (§Forward-flagged items + business-retro §Adjustments).
It matters because Miri is the *only* catcher of a future `&mut`-escapes-its-block aliasing regression in the `sched`/`ipc` raw-pointer bridge (the UNSAFE-2026-0012-class UB the bridge was built to remove) — host tests + clippy + the audit log cannot detect it. + +## 5. Cryptography + +Not applicable — no cryptographic primitives, no RNG, no key handling. Unchanged since [`2026-05-14-B3-closure.md`](2026-05-14-B3-closure.md)'s N/A finding; the master-review X1 Axis 5 re-confirmed by grep + dependency-graph read: no hash, cipher, signature, KDF, MAC, or RNG anywhere in the tree, and zero external crates to pull one in transitively. The model's ADR-per-primitive + separate-security-review + redacted-key-types gates remain un-triggered. + +## 6. Secrets and logging + +- **One new serial-output banner introduced by T-019.** OK — `tyrne: image loaded (entry = 0x800000; sp = 0x802000; image bytes 8; stack bytes 4096; AS cap = idx 1)`, inserted between the `address-space-arena ready` line and the `timer ready` line. It contains runtime data: the entry/sp VAs are the userspace-linker-defined base (`0x800000`) + computed stack-top (public layout, ADR-0027-bounded); `image bytes`/`stack bytes` are sizes; **`AS cap = idx 1` is a `CapHandle.index`, not raw capability bits.** The master-review X1 Axis 6 examined this exact banner: "the AS-cap 'idx 1' is a handle index, harmless... not raw rights bits or object contents." There are no raw capability bits to leak in the first place — userspace never sees a cap token; it references a table by handle (`{index, generation}`), which is meaningful only inside one table and is not an unforgeable token. **Adversarial probe — is the userspace-base VA a useful side-channel for a future attacker?** In v1 no userspace exists; the line is logged once at boot to a kernel-only serial console no userspace can observe. In B5+ the kernel-side serial output remains userspace-unreachable; the banner stays a kernel-only diagnostic. 
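The reason a logged handle index is harmless can be illustrated with a generation-tagged table, where a handle is only an `{index, generation}` pair that the owning table must validate. This is a minimal host-side sketch, not the kernel's `CapabilityTable`: `Handle`, `Slot`, `Table`, and their methods are hypothetical, and it uses `Vec` for brevity where the kernel would use fixed-capacity storage.

```rust
/// Hypothetical handle: an index plus the generation it was minted under.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Handle {
    pub index: usize,
    pub generation: u32,
}

struct Slot<T> {
    generation: u32,
    entry: Option<T>,
}

/// Toy table; a handle means something only relative to this table.
pub struct Table<T> {
    slots: Vec<Slot<T>>,
}

impl<T> Table<T> {
    pub fn with_capacity(n: usize) -> Self {
        let slots = (0..n)
            .map(|_| Slot { generation: 0, entry: None })
            .collect();
        Table { slots }
    }

    /// Stores `value` in the first free slot, or returns `None` when full.
    pub fn insert(&mut self, value: T) -> Option<Handle> {
        let index = self.slots.iter().position(|s| s.entry.is_none())?;
        let slot = &mut self.slots[index];
        slot.entry = Some(value);
        Some(Handle { index, generation: slot.generation })
    }

    /// Resolves a handle; a stale generation resolves to nothing.
    pub fn lookup(&self, h: Handle) -> Option<&T> {
        let slot = self.slots.get(h.index)?;
        if slot.generation != h.generation {
            return None;
        }
        slot.entry.as_ref()
    }

    /// Empties the slot and bumps its generation so old handles go stale.
    pub fn free_slot(&mut self, h: Handle) -> Option<T> {
        let slot = self.slots.get_mut(h.index)?;
        if slot.generation != h.generation {
            return None;
        }
        let value = slot.entry.take()?;
        slot.generation = slot.generation.wrapping_add(1);
        Some(value)
    }
}
```

In this shape, printing `index` reveals a slot position and nothing about rights or object contents, and a handle kept after `free_slot` fails `lookup` even when the slot has been reused.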
+- **The two new audit-log entries + the lifted status notes do not leak sensitive context.** OK — UNSAFE-2026-0027/0028's bodies describe operation primitives, frame-ownership-proof chains, and function/call-site coordinates (e.g. `__boot_pt_l0`, `from_existing_root`, the `copy_nonoverlapping` site) — all inferable from the open-source codebase, no memory-layout secrets an attacker couldn't read from the repo. The 0025/0026 Amendments record smoke-verification status (frame counts, banner text), not secret material. +- **The remediation's `Debug`/log surface is unchanged and clean.** OK — the master-review X1 Axis 6 confirmed production `panic!` sites (scheduler invariant strings, `irq_entry: unhandled IRQ {id}`, `panic_entry`'s `ESR_EL1` + class id) embed no capability state / table contents / frame contents, and `panic_entry` deliberately does not touch `GIC` or mid-transition statics. The forward-watch item (X1-F5: `Capability`'s derived `Debug` + `Message::default()` for the B5 ABI layer) is noted for B5 — confirm cap `Debug` output is never routed to a userspace-observable channel, and the Capability-Debug-redaction (K3-9) item is on the B5 Next list. No v1 leak. + +## 7. Dependencies + +- **Workspace remains zero-extern.** OK — `Cargo.lock` contains exactly four in-tree workspace crates (`tyrne-kernel`, `tyrne-hal`, `tyrne-test-hal`, `tyrne-bsp-qemu-virt`); no `source`/`checksum` lines. ADR-0006's stance is preserved; the master-review X1-P5 rated this "the strongest possible supply-chain position a kernel can hold." T-019 adds no crate (the rights bitfield stays hand-rolled to avoid even a `bitflags` dependency); ADR-0029 §Rejected-alternatives notes both ELF-subset and custom-format alternatives would also have stayed dependency-free. 
+- **The master-review remediation *hardened* the CI supply-chain hygiene that gates the first future dependency.** OK — `fbc3d3f` (MR-002/003/007/008/009) closed the X1-007 / X1-003 gate-integrity gaps: SHA-pinned third-party GitHub Actions (incl. `taiki-e/install-action`, which executes a downloaded binary — a tag repoint was arbitrary CI code execution), added a top-level `permissions:` block, dropped the `rustup default stable` lie (the pinned nightly shadowed it anyway), aligned the documented required-gate set with the enforced jobs, stopped the RUSTFLAGS clobber, and made Miri a blocking gate. These directly serve security-model adversary #3 (supply-chain tampering) + principle P11 (reproducibility). The dependency-onboarding skill's dead `.claude/skills/` links (X1-008) were swept to `.agents/skills/` (the 2026-05-14 migration target) by the link-sweep commit `4141158`. +- **No `add-dependency` skill invocation in the period.** OK — `cargo-vet`/`cargo-audit` remain dormant-because-no-ops (zero external crates); `infrastructure.md`'s dependency policy is ready-to-arm the moment the first external crate lands. + +## 8. Threat-model impact + +Adversarial frame: *does the B4 arc + the remediation reshape what the system defends against? Are all gaps reconciled with [`security-model.md`](../../../architecture/security-model.md) and honestly documented?* + +- **T-019 is infrastructure for a future security win, not a threat-model shift in itself.** OK — the task loader produces a capability-mediated, freshly-zeroed, per-AS image+stack layout — the substrate per-task isolation (the model's adversary #1, malicious/compromised userspace) will eventually rest on. But v1 enables none of that isolation yet: there is no userspace to isolate. The one concrete shift is that `load_image` makes the per-call `cap_map` page-table-walk path *runtime-reachable* (vs B3's host-test-only), giving the B5+ per-operation-rights ADR a concrete, exercised API to constrain. 
+- **The PMM zero-fill that the loader now exercises is a real, B5-relevant userspace-isolation defence.** OK — UNSAFE-2026-0026 zeroes every frame before the loader copies into it, so a future userspace task cannot read a previous owner's frame contents (the model's adversary #1 confidentiality property). The master-review X1 Axis 3 called this out specifically: "it zeroes frames before handing them out... the userspace-isolation defense against leaking a previous task's frame contents — directly relevant to the B5+ threat model." T-019 is the first to drive it at runtime. +- **W^X / per-section permissions remain explicitly deferred — and B4 sharpens *why* the deferral must close in B5.** **Flagged (forward; ADR-tracked; non-blocking).** ADR-0029 §Negative + T-019 map the whole image region `USER | EXECUTE` (no per-section `.text` RX / `.rodata` R / `.data` RW discrimination — raw flat carries no section metadata) and the kernel image is still mapped R/W/X (master-review X1-009, `mmu_bootstrap.rs:159`). Both are deferred to the **ADR-0034 (kernel-image section permissions)** placeholder, gated on the first attacker-observable EL0 execution (B5/B6). The positive W^X property that *is* enforced — DEVICE mappings forced non-executable, `DEVICE|EXECUTE` rejected (master-review X1 / C6-P1) — is unaffected. v1's threat model has no userspace, so "all image pages share one flag set" is non-exploitable today; this must close before B5+ runs an attacker-controlled image. +- **ADR-0036 corrects a security-model factual error: QEMU virt is GICv2 / no-IOMMU in v1 — the model's SMMUv3-CI-gate claim is now stale and should be reconciled.** **Flagged (doc-reconciliation; non-blocking; opened by the remediation, carries forward).** [ADR-0036](../../../decisions/0036-qemu-virt-gicv2-no-iommu-v1.md) (Accepted 2026-05-22 via the remediation) corrects ADR-0004/0006/0012's GICv3/SMMUv3 statements: the v1 QEMU virt target is **GICv2 and has no IOMMU**. 
[`security-model.md`](../../../architecture/security-model.md) §threat-model #7 (line 60: "QEMU `virt` can be launched with SMMUv3 and is used in CI to catch driver-side misbehaviour against SMMU semantics") and §Open questions (line 327: "QEMU `virt` has SMMUv3 and should be the CI gate") now contradict ADR-0036's accepted reality. This is a **documentation contradiction, not a v1 defect** — there is no bus-master driver in v1, so the DMA boundary (model boundary 7 / adversary #7) is N/A today (master-review X1 Axis 8: "DMA/IOMMU: N-A in v1; no bus-master driver exists yet"). But the model's stated CI defence does not exist, and the reconciliation (point security-model.md's SMMU references at ADR-0036, reframe the CI-SMMU-gate claim as a future-Jetson-Orin item) is a forward doc-Adjustment for the security model. Recorded here so the model-reconciliation is tracked. +- **The `d8`–`d15` context-switch contract gap (X1-001 / MR-005) is closed — a cross-board correctness defence the loader's B5+ successors will rely on.** OK — the master-review's most serious whole-tree security finding (the `ContextSwitch` `# Safety` contract + ADR-0020 under-specified the aarch64 callee-saved set, omitting the FP `d8`–`d15`; the shipping BSP saved them but a contract-literal second BSP would silently corrupt FP state across every yield) is remediated by `59f9309`: the `d8`–`d15` FP callee-saved enumeration is now in the `ContextSwitch` `# Safety` contract + an ADR-0020 rider. v1 was always sound (QEMU BSP saved them with a 168-byte compile-time size guard); the fix protects the Pi 4 / Jetson ports the HAL exists for. +- **Reconciliation against security-model.md §Invariants** — each re-confirmed for the B4 surface: + - "No privileged operation without the authorizing capability" — OK (§1: `load_image` composes the cap-gated `cap_create_address_space`/`cap_map`). 
+ - "No ambient authority / capabilities unforgeable, move with consent, narrow on derivation" — OK (§1; no-widening end-to-end; AS cap is a `cap_derive` child). + - "Revocation transitive within a table" — OK (the loaded AS cap cascades on parent revoke; the cross-table CDT open question is unchanged, B5+). + - "The kernel never dereferences raw userspace pointers" — OK (§2: no userspace exists; the only derefs are kernel-owned identity-mapped frames + Rust slices). + - "`unsafe` is audited" — OK (§3: 0027/0028 close the surface; X3 confirmed log in sync). + - "Bounded kernel state / no unbounded allocation" — OK (§4: typed `LoadError` on every path). + - "Fault containment does not leak authority" — partially exercised; v1's fault path is `panic_entry` → halt (no userspace task to suspend yet); the supervisor-endpoint `TaskFault` delivery is B5+ forward work. +- **Phase-deferred placeholders unchanged.** ADR-0033 (high-half migration; gates B5 per-task TTBR0_EL1 swap — and is the prerequisite for ever *running* the `LoadedImage`, since the userspace AS today has no kernel mappings) and ADR-0034 (kernel-image / per-section permissions; gates first attacker-observable EL0 execution) remain slot-reserved. ADR-0033/0034 stay named-but-unallocated placeholders; the Phase C–I ADR placeholders were renumbered to ≥ ADR-0037 off the live ceiling (MR-001). The forthcoming B5 ADRs (ADR-0030 syscall ABI incl. the K2-5 `IpcError::InvalidCapability` split into `StaleHandle`/`MissingRight`/`WrongObjectKind`; ADR-0031 initial syscall set) are tentative numbers per ADR-0013. + +--- + +## Verdict + +**Approve.** + +All eight axes pass. The B4 arc (T-019 / PR #31) lands the task loader — the first runtime consumer of the AddressSpace + PMM scaffolds — without introducing any new attack surface, capability widening, memory-safety hazard, or threat-model shift. 
`load_image` is capability-gated (a `parent_as_cap` with `DERIVE`, no-widening, a full leak-path-closure preflight chain that runs every rejectable check before the first irreversible `pmm.alloc_frame`); it produces a `LoadedImage` *descriptor* but does **not** mint a runnable task and does **not** execute userspace — so the EL0/syscall trust boundary remains closed until B5/B6. The two new `unsafe` entries are exemplary: UNSAFE-2026-0027 (`copy_nonoverlapping`) carries six enumerated invariants + four append-only boundary-hardening Amendments (the round-1 PA-overlap preflight converting a BSP-trust argument into a typed runtime rejection is a genuine soundness win — master-review X1-P2), and UNSAFE-2026-0028 (`from_existing_root`) closes the single audit gap the whole-tree X3 pass found, with a correctly-distinct never-zero-fill-the-live-root contract and the required second-reviewer sign-off. UNSAFE-2026-0025/0026's `Pending QEMU smoke verification` notes are honestly lifted — T-019 is the *genuine* first runtime exerciser of both, and the smoke trace reaches `tyrne: all tasks complete` with zero new fault classes. + +This closure reconciles cleanly with the 2026-05-22 master review: its dedicated X1-security pass returned **PASS** (0 security Blocker, 0 security Major in the shipping binary), its X3-unsafe-audit pass confirmed the log **fully in sync** (all entries resolve to live code, append-only intact), and the PR #32 remediation closed **23 of 24** Blocker+Major findings as confirmed-fixed under adversarial per-finding re-verification this session — including the security-relevant `d8`–`d15` cross-board contract gap (X1-001/MR-005), the missing `from_existing_root` entry (X1-001-adjacent / MR-011), and the CI supply-chain hygiene (X1-007/003). The audit log stands at 28 entries (0012 `Removed` → 27 Active). 
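
The "every rejectable check before the first irreversible `pmm.alloc_frame`" discipline described in the verdict can be pictured with a small host-side model. This is a minimal sketch under stated assumptions, not the kernel's `load_image`: `ToyPmm`, `load_image_sketch`, the `u64` address type, and the reduced `LoadError` subset are all hypothetical stand-ins chosen for illustration.

```rust
// Hypothetical, simplified model of a leak-path-closure preflight chain.
// None of these names are the kernel's real API; they only show the ordering.

const PAGE_SIZE: usize = 4096;

#[derive(Debug, PartialEq)]
enum LoadError {
    InvalidImage,
    InvalidStackSize,
    MisalignedImageBaseVa(u64), // the real variant carries a VirtAddr
    FrameBudgetExceeded { needed: usize, available: usize },
    OutOfFrames,
}

/// Toy frame allocator: counts frames instead of handing out addresses.
struct ToyPmm {
    free_frames: usize,
    allocated: usize,
}

impl ToyPmm {
    fn alloc_frame(&mut self) -> Option<()> {
        if self.free_frames == 0 {
            return None;
        }
        self.free_frames -= 1;
        self.allocated += 1;
        Some(())
    }
}

/// Returns the number of frames committed on success.
fn load_image_sketch(
    image: &[u8],
    pmm: &mut ToyPmm,
    image_base_va: u64,
    stack_pages: usize,
) -> Result<usize, LoadError> {
    // Preflight phase: pure checks only, nothing is allocated yet.
    if image_base_va % (PAGE_SIZE as u64) != 0 {
        return Err(LoadError::MisalignedImageBaseVa(image_base_va));
    }
    if image.is_empty() {
        return Err(LoadError::InvalidImage);
    }
    if stack_pages == 0 {
        return Err(LoadError::InvalidStackSize);
    }
    let image_pages = (image.len() + PAGE_SIZE - 1) / PAGE_SIZE;
    let needed = image_pages + stack_pages;
    if needed > pmm.free_frames {
        return Err(LoadError::FrameBudgetExceeded {
            needed,
            available: pmm.free_frames,
        });
    }

    // Commit phase: the first irreversible step happens only after every
    // rejectable input has been rejected above.
    for _ in 0..needed {
        pmm.alloc_frame().ok_or(LoadError::OutOfFrames)?;
    }
    Ok(needed)
}
```

In this model a rejected call leaves `ToyPmm::allocated` untouched, which is the property the preflight ordering is meant to guarantee: no error path needs to unwind a partially committed allocation.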
+ +### Adjustment (the single B4-closure remediation residual) + +- **MR-009 (Miri = Phase-B exit prerequisite) is the one partial.** Miri is now a *blocking* CI job (no `continue-on-error`) and a documented required gate in `infrastructure.md` — the mechanical half is closed. The remaining gap is purely in-tree doc text: **"Miri green = Phase-B exit prerequisite" is not yet written into the [`phase-b.md`](../../../roadmap/phases/phase-b.md) exit bar / a Phase-B exit checklist.** One-line, non-blocking (Miri *is* enforced), and recorded as this closure's Adjustment because Miri is the *only* mechanical catcher of a future `&mut`-escapes-its-block aliasing regression in the `sched`/`ipc` raw-pointer bridge. Mirrors the 2026-04-21 Phase-A-exit security review that made the aliasing discipline the #1 Phase-B blocker. + +### Forward-flagged items (carry-forward; non-blocking) + +- **`cap_map` / `cap_unmap` per-operation rights gap** — the kind-only check is the v1 design choice (the `CapRights` enum has no `MAP`/`UNMAP` bit to pass); documented inline at `resolve_address_space_cap`'s rustdoc + the T-019 BSP call-site comment. Trigger: B5+ ADR pairing `CapRights::{MAP, UNMAP, ACTIVATE}` with `CapKind::MemoryRegion` (the same ADR introducing `MemoryRegionCap` for frame-ownership tracking). Non-exploitable today (no userspace). +- **B5/B6 is where the first real userspace → kernel trust boundary opens** — the EL0→SVC dispatch + validated copy-from/to-user + the `task_create_from_image` wrapper (turning this `LoadedImage` into a runnable `CapHandle{CapObject::Task(...)}`) are the forthcoming attack surface. Prerequisite: kernel mappings in the userspace AS (ADR-0033 high-half migration, gated on per-task TTBR0_EL1 swap) — the userspace AS today holds only image + stack, so an EL1 exception while it is active would translation-fault on the vector fetch. 
+- **UNSAFE-2026-0019 / 0020 / 0021 still carry `Pending QEMU smoke verification`** for the IRQ-take / dispatch path — unchanged; gate on the first preemption-using task that arms a real deadline (B5+ `time_sleep_until`). The `cap_unmap` block-descriptor path is also still host-test-only (gate on first B5+ userspace teardown). +- **security-model.md SMMUv3-CI-gate claim is stale post-ADR-0036** — §threat-model #7 + §Open questions describe QEMU virt as having SMMUv3 in CI; ADR-0036 (Accepted 2026-05-22) corrects this to GICv2 / no-IOMMU in v1. Doc-reconciliation Adjustment for the security model (point the SMMU references at ADR-0036, reframe the CI-SMMU-gate as a future-Jetson-Orin item). N/A as a v1 defect (no bus-master driver exists). +- **ADR-0033 high-half migration + ADR-0034 kernel-image/per-section permissions placeholders unchanged** — ADR-0033 opens when B5 surfaces per-task TTBR0 swap (and is the prerequisite for ever running the `LoadedImage`); ADR-0034 opens with the first attacker-observable EL0 execution (W^X for the kernel image + per-section userspace permissions, likely B5/B6). +- **B-phase PL011 init BSP task still open (trigger-deferred)** — the 629 `-d guest_errors` events are still 100% PL011 "data written to disabled UART" noise (Tyrne rides QEMU reset state with `UARTCR.UARTEN=0`); zero security-relevant fault classes. Queued as a follow-on BSP task since B2. + +Next review trigger: the **B5 closure trio** (Syscall boundary — ADR-0030/0031, EL0→EL1 SVC dispatch, panic-free dispatcher, validated copy-from/to-user, Capability Debug redaction K3-9). 
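
The `cap_map` / `cap_unmap` per-operation rights gap listed among the forward-flagged items can be illustrated by contrasting a kind-only check with a kind-plus-right check. The sketch below is purely hypothetical: `Rights`, its `MAP` / `UNMAP` bits, `Cap`, `ResolveError`, and both resolver functions are assumed names for illustration, since the source states that no such rights bits exist in v1 and that the B5+ ADR introducing them is not yet written.

```rust
// Hypothetical model of the flagged gap; the kernel's real types differ.

#[derive(Clone, Copy, Debug, PartialEq)]
enum CapKind {
    AddressSpace,
    Endpoint,
}

#[derive(Clone, Copy, Debug, PartialEq)]
struct Rights(u8);

impl Rights {
    const DERIVE: Rights = Rights(0b001);
    const MAP: Rights = Rights(0b010); // assumed future bit, not in v1
    const UNMAP: Rights = Rights(0b100); // assumed future bit, not in v1

    const fn union(self, other: Rights) -> Rights {
        Rights(self.0 | other.0)
    }

    const fn contains(self, other: Rights) -> bool {
        self.0 & other.0 == other.0
    }
}

#[derive(Clone, Copy, Debug, PartialEq)]
struct Cap {
    kind: CapKind,
    rights: Rights,
}

#[derive(Debug, PartialEq)]
enum ResolveError {
    WrongKind,
    MissingRight,
}

/// v1-style check: any address-space capability is accepted.
fn resolve_kind_only(cap: Cap) -> Result<(), ResolveError> {
    if cap.kind != CapKind::AddressSpace {
        return Err(ResolveError::WrongKind);
    }
    Ok(())
}

/// Possible future check: the kind must match and the right must be held.
fn resolve_with_right(cap: Cap, required: Rights) -> Result<(), ResolveError> {
    resolve_kind_only(cap)?;
    if !cap.rights.contains(required) {
        return Err(ResolveError::MissingRight);
    }
    Ok(())
}
```

Under these assumptions, a capability holding only `DERIVE` passes the kind-only resolver but is rejected once a `MAP` right is required, which is the behavioural difference the deferred ADR would introduce.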
diff --git a/docs/analysis/reviews/security-reviews/README.md b/docs/analysis/reviews/security-reviews/README.md index bc1d53c..51755c8 100644 --- a/docs/analysis/reviews/security-reviews/README.md +++ b/docs/analysis/reviews/security-reviews/README.md @@ -37,3 +37,4 @@ A security review is a **separate pass** from the code review — it is performe | 2026-05-07 | B1 closure post-T-014 consolidated pass (T-014 + ADR-0026 idle-dispatch supersession of ADR-0022 §Decision-outcome Option A; UNSAFE-2026-0014 third Amendment for `register_idle`; UNSAFE-2026-0019/0020 partial-verification + post-T-014-smoke Amendments; UNSAFE-2026-0021 no-verification Amendment) | Approve — eight axes pass; no new attack surface; smoke trace clean for full ~6 ms boot-to-end run; pre-existing forward-flagged items unchanged | [2026-05-07-B1-closure.md](2026-05-07-B1-closure.md) | | 2026-05-09 | B2 closure consolidated pass (T-016 MMU activation + identity-mapped kernel + `MapperFlush` flush-token discipline; ADR-0027 + ADR-0009 §Revision rider; UNSAFE-2026-0022/0023/0024/0025 introduced + 0023/0024 bootstrap-Amendments + 0022/0023/0024 smoke-verification Amendments) | Approve — eight axes pass; MMU on with identity-only layout; one new forward-flagged item (UNSAFE-2026-0025 per-call `Mmu::map`/`unmap` smoke verification — gates on first B3+ post-bootstrap caller) | [2026-05-09-B2-closure.md](2026-05-09-B2-closure.md) | | 2026-05-14 | B3 closure consolidated pass (T-017 PMM + T-018 `AddressSpace` kernel object + cap-gated wrappers + activation-on-context-switch; ADR-0035 + ADR-0028; UNSAFE-2026-0026 introduced; UNSAFE-2026-0014 5th Amendment for activation hook + BSP closure; UNSAFE-2026-0025 body-correction Amendment for `MmuError::BlockMapped` variant split) | Approve — eight axes pass; AS kernel-object scaffold lands without widening attack surface; PR #28 five-round arc closed three load-bearing memory-safety items pre-merge; one forward-flagged item (`cap_map`/`cap_unmap` per-op rights 
gap — deferred to B5+ ADR pairing `CapRights::{MAP,UNMAP,ACTIVATE}` with `CapKind::MemoryRegion`) | [2026-05-14-B3-closure.md](2026-05-14-B3-closure.md) | +| 2026-05-28 | B4 closure consolidated pass (T-019 task loader + ADR-0029; master-review PR #32 remediation; UNSAFE-2026-0027 introduced + 4 boundary-hardening Amendments; UNSAFE-2026-0028 introduced via MR-011 / X3-001 audit-trail completion; UNSAFE-2026-0025/0026 `Pending QEMU smoke verification` notes lifted — T-019 first runtime exerciser; MR-005 d8–d15 `ContextSwitch` contract + ADR-0020 rider) | Approve — eight axes pass; capability-gated `load_image` produces a `LoadedImage` but mints no runnable task / opens no EL0 boundary (B5/B6); reconciles with master-review security PASS + audit-log-in-sync; one Adjustment (MR-009 Miri-as-Phase-B-exit-prerequisite is a blocking CI job but not yet written into the phase-b.md exit bar — one-line doc gap, non-blocking) | [2026-05-28-B4-closure.md](2026-05-28-B4-closure.md) | diff --git a/docs/roadmap/current.md b/docs/roadmap/current.md index b34fcf5..46ec344 100644 --- a/docs/roadmap/current.md +++ b/docs/roadmap/current.md @@ -4,6 +4,8 @@ A short pointer file updated as work progresses. For the full plan see [`phases/ --- +> **2026-05-28 update — B4 CLOSED via the closure trio; B5 (syscall boundary) is next.** The B4 milestone (Task loader) is now formally **Closed**. Its closure trio fired today: [business retrospective](../analysis/reviews/business-reviews/2026-05-28-B4-closure.md) + [consolidated security review](../analysis/reviews/security-reviews/2026-05-28-B4-closure.md) (verdict **Approve**, eight axes pass) + [performance baseline](../analysis/reviews/performance-optimization-reviews/2026-05-28-B4-closure.md) (re-baseline, no proposal). This banner supersedes the 2026-05-16 banner below (which recorded B4 as "implementation-complete; closure trio pending"). 
**The period under review included the 2026-05-22 full-tree [master review](../analysis/reviews/master-review/2026-05-22-152729/consolidated.md)** (verdict: APPROVE the shipped kernel — 0 code-correctness/security Blockers; issues clustered in CI/doc/ADR) **and its remediation PR #32**, which closed **23 of 24** verified Blocker+Major findings; the one residual (MR-009: Miri is a blocking CI gate but "Miri green = Phase-B exit prerequisite" was not yet written into the phase-b.md exit bar) was closed as a B4-closure follow-up action. **Gates at HEAD `3ab029f` (reproduced live 2026-05-28):** `cargo host-test` **286 / 286** (43 hal + 187 kernel + 53 test-hal + 3 doc-tests; the earlier 260 was the pre-remediation count), `fmt` / `host-clippy` / `kernel-clippy` / `kernel-build` clean; QEMU smoke runs the full demo through `tyrne: all tasks complete` with the `tyrne: image loaded (...)` line; `-d int,unimp,guest_errors` = **629 events, 100 % pre-existing PL011-disabled-UART noise, zero fault classes**. Release perf band p10/p50/p90 = **15.641 / 17.587 / 19.150 ms** (+5.3–5.7 ms vs B3 — one-time boot cost of the loader's first post-bootstrap `cap_map` walks under QEMU TCG; real-hardware projection ~40 µs). Audit log at **28** entries (UNSAFE-2026-0027 + 0028 added; 0025/0026 `Pending QEMU smoke verification` notes lifted — T-019 is their first runtime exerciser). **Next:** open B5 — ADR-0030 (syscall ABI + `IpcError` split) + ADR-0031 (initial syscall set), then EL0→EL1 SVC dispatch. The [B4 closure trio](../analysis/reviews/business-reviews/2026-05-28-B4-closure.md) is the canonical source for these metrics. +> > **2026-05-16 update — T-019 merged; B4 implementation-complete; closure trio pending.** PR #31 merged into `main` at commit `7f876af` ("Merge pull request #31 from cemililik/t-019-task-loader"), landing T-019 (task loader) on `main`. 
The branch arc continued past the review-round-4 commit named in the 2026-05-15 banner below with two further follow-up commits: `5078944` (review-round 5 — added one PMM host test, taking the suite to **260/260**) and `eb14c51` (review-round 6 — 5 valid findings). T-019 status flips `In Review → Done` (`date_done: 2026-05-16`). **Host-test count at HEAD: 260/260** (42 hal + 175 kernel + 43 test-hal); the 2026-05-15 banner's "259/259" was accurate when written, before the round-5 PMM test landed. B4 is now **implementation-complete**; the **B4 closure trio (business + security + performance reviews) has NOT yet fired** and is the next review trigger (the maintainer sequences it separately). This banner resolves the pre-merge "In Review" state recorded below — that banner is retained as a point-in-time record. > > **2026-05-15 update — T-019 implementation In Review on PR #31; branch `t-019-task-loader` (pre-merge snapshot; superseded by the 2026-05-16 banner above).** Implementation arc lands across **7 bisectable commits** ending at `95efd62` (review-round 4 follow-up — the final substantive commit of the arc; subsequent commits on the branch are post-review doc/style polish): (1) `911f2ad` — `task_loader` module skeleton; (2) `5711756` — `load_image` + UNSAFE-2026-0027; (3) `ae31bc8` — BSP wiring + arch doc + UNSAFE-2026-0025/0026 smoke-verification Amendments; (4) `196d3fb` — review-round 1 follow-up (PA-overlap preflight + `ImageOverlapsAllocatableMemory` variant + `mov w0, #42` placeholder bytes correction); (5) `164522d` — review-round 2 follow-up (`phys_frame_kernel_ptr` helper + VA-range preflight + `InvalidImageBaseVa` variant + BSP `CapRights::empty()` justification); (6) `5b1f153` — review-round 3 follow-up (`intermediate_frame_count` exact-count helper replaces off-by-one constant; VA-range check reordered before frame-budget; doc sync); (7) `95efd62` — review-round 4 follow-up (alignment preflight at row 1 + new `MisalignedImageBaseVa(VirtAddr)` 
variant closing a preventable root-frame-leak path on internal-API misuse; `accepts_image_disjoint_from_pmm_extent` test made deterministic via `.rodata` static; `FrameBudgetExceeded` variant doc-comment refreshed). The `LoadError` taxonomy is **10 variants**: `InvalidImage`, `InvalidStackSize`, `MisalignedImageBaseVa(VirtAddr)`, `InvalidImageBaseVa { base, end }`, `InvalidParentCap(CapError)`, `FrameBudgetExceeded { needed, available }`, `ImageOverlapsAllocatableMemory`, `AddressSpaceCreationFailed(AddressSpaceError)`, `OutOfFrames`, `MapFailed(AddressSpaceError)`. **First runtime exerciser of UNSAFE-2026-0025 (post-bootstrap `Mmu::map` page-table descriptor writes) + UNSAFE-2026-0026 (PMM `alloc_frame` zero-fill) + UNSAFE-2026-0027 (new — task-loader `copy_nonoverlapping`)**; all three smoke-verified at runtime via the `tyrne: image loaded (entry = 0x800000; sp = 0x802000; image bytes 8; stack bytes 4096; AS cap = idx 1)` boot line. New `docs/architecture/task-loader.md` chapter synthesises the loader sequence + rollback contract + v1 baseline leaks. Tests at HEAD: **259/259** host-test count (round-4 commit reframes the misaligned-VA test rather than adding new ones; the distinctness assertion gained 2 sub-cases but those land inside an existing test) — *correction: 259 was accurate at this commit; review-round 5 (`5078944`) later added one PMM test, so the count at the PR #31 merge is **260/260** per the 2026-05-16 banner above*. All gates clean: `cargo fmt --check`, `cargo host-test`, `cargo host-clippy -D warnings`, `cargo kernel-clippy -D warnings`, `cargo kernel-build`. Smoke trace byte-stable through full demo to `tyrne: all tasks complete`; `-d int,unimp,guest_errors` reports only the pre-existing 629 PL011-disabled-UART warnings. PR #31 awaiting reviewer pass; on merge, B4 implementation half closes and B5 (syscall ABI per ADR-0030) opens for the runnability prerequisites. @@ -52,14 +54,18 @@ A short pointer file updated as work progresses. 
For the full plan see [`phases/ --- - **Active phase:** B — opened 2026-04-21. **B0 closed 2026-04-27**; **B1 closed 2026-05-07**; **B2 closed 2026-05-09**; **B3 closed 2026-05-14** via PR #29's closure trio (business + security + performance baseline; merge commit `b425dc1`). All four closures lifted `Done` after a verbatim QEMU smoke trace + clean `-d guest_errors` count per the [business master-plan §Acceptance criteria](../analysis/reviews/business-reviews/master-plan.md#acceptance-criteria) rule. **The 2026-04-28 implementation-complete claim for B1 was rolled back on 2026-05-06 by the smoke regression and re-issued 2026-05-07 as a smoke-verified Done** — that remains the only re-open arc to date; B2 and B3 both closed cleanly on first attempt. -- **Active milestone:** **B4 — Task loader (implementation-complete 2026-05-16; closure trio pending).** Opened 2026-05-14 with the ADR-0029 propose commit; ADR-0029 `Accepted` 2026-05-14 (merged via PR #30); T-019 implementation merged to `main` 2026-05-16 via PR #31 (merge commit `7f876af`). [phase-b.md §B4](phases/phase-b.md#milestone-b4--task-loader): load a userspace binary into a fresh AS, set entry point + initial SP, and produce a `LoadedImage` descriptor (the `task_create_from_image` wrapper that turns it into a runnable `TaskCap` gates on B5/B6 per phase-b §B4 §Revision-notes). The binary is statically embedded in the kernel image (`include_bytes!`); filesystem / dynamic loading is Phase C / D. Per [phase-b plan §B4 §4](phases/phase-b.md#milestone-b4--task-loader), the loader produces a populated address space but does **not** run it — running gates on B6's syscall-ABI work via B5. **The B4 closure trio (business + security + performance reviews) has NOT yet fired** — it is the next review trigger, sequenced separately by the maintainer. 
-- **Active task:** **T-019 — Task loader: Done 2026-05-16** (merged to `main` via PR #31, merge commit `7f876af`; branch `t-019-task-loader` retired; ADR-0029 Accepted 2026-05-14 and merged via PR #30; implementation arc landed across the bisectable commit chain through review-round 6 — see the dated banner above for the chain summary, noting the pre-merge banner pre-dates review-rounds 5–6). Implementation scope per [T-019 §Acceptance criteria](../analysis/tasks/phase-b/T-019-task-loader.md#acceptance-criteria): `pub fn load_image(image, pmm, mmu, table, as_arena, parent_as_cap, new_rights, image_base_va, stack_size_pages) -> Result` lives in `kernel/src/obj/task_loader.rs`; returns a `LoadedImage { as_cap, entry_va, stack_top_va, image_bytes, stack_bytes }` opaque descriptor — **not** a `CapHandle{CapObject::Task(...)}` (runnability prerequisites — kernel mappings in userspace AS + EL0 context + syscall entry — gate on B5/B6 per phase-b §B4 §Revision-notes); leak-path-closure preflight discipline (every rejectable check before first `pmm.alloc_frame`; cap-side rollback uses `cap_drop(loaded_as_cap)`, not `cap_revoke`, for the freshly-minted leaf cap); typed **10-variant** `LoadError` enum (`InvalidImage` / `InvalidStackSize` / `MisalignedImageBaseVa(VirtAddr)` / `InvalidImageBaseVa { base, end }` / `InvalidParentCap(CapError)` / `FrameBudgetExceeded { needed, available }` / `ImageOverlapsAllocatableMemory` / `AddressSpaceCreationFailed(AddressSpaceError)` / `OutOfFrames` / `MapFailed(AddressSpaceError)`); host tests pin every row of the T-019 §Approach §Simulation table per the [`write-adr` skill row-to-verification mapping discipline](../../.agents/skills/write-adr/SKILL.md#procedure); smoke trace gains exactly one new banner line; no userspace execution (B6 trigger). 
UNSAFE-2026-0025 / 0026's `Pending QEMU smoke verification` notes lifted via 2026-05-14 Amendments (T-019 BSP wiring is the first runtime exerciser of both paths post-bootstrap); audit entry UNSAFE-2026-0027 opened standalone for the loader's `core::ptr::copy_nonoverlapping` byte-copy site, with 2026-05-15 Amendments recording: (a) the `Pmm::could_yield_pa_overlapping` PA-overlap preflight runtime-enforcing the non-overlap invariant (review-round 1), (b) the `crate::mm::phys_frame_kernel_ptr` helper centralising the identity-mapping invariant for future ADR-0033 high-half migration (review-round 2 F1), (c) the row-3 VA-range preflight + frame-budget reordering and the `intermediate_frame_count` exact-count helper (review-round 3 F1+F2), (d) the row-1 alignment preflight + `MisalignedImageBaseVa(VirtAddr)` variant closing the preventable root-frame-leak path on misaligned `image_base_va` (review-round 4 P2). -- **In review:** none. (T-019's PR #31 merged 2026-05-16; the B4 closure trio is pending but is a review *trigger*, not an in-flight task.) +- **Active milestone:** **B5 — Syscall boundary (opens next).** B4 (Task loader) was formally **Closed 2026-05-28** via its closure trio (see the top banner + the [B4 closure retrospective](../analysis/reviews/business-reviews/2026-05-28-B4-closure.md)). B5 per [phase-b.md §B5](phases/phase-b.md): ADR-0030 (syscall ABI + `IpcError::InvalidCapability` split into `StaleHandle` / `MissingRight` / `WrongObjectKind`) + ADR-0031 (initial syscall set: `send`, `recv`, `console_write`, `task_yield`, `task_exit`), then EL0→EL1 SVC dispatch, a panic-free syscall dispatcher, validated copy-from/to-user through the active AS, and `Capability` `Debug` redaction. B5 is the prerequisite for the deferred [`task_create_from_image`](phases/phase-b.md#milestone-b4--task-loader) wrapper (B4 §3) that turns a `LoadedImage` into a runnable `CapHandle{CapObject::Task(...)}`, then B6 (first userspace "hello"). 
+- **Active task:** none — B4 closed via the 2026-05-28 closure trio; B5 prep / first ADR (ADR-0030 syscall ABI) opens next. Per [ADR-0025 §Rule 1](../decisions/0025-adr-governance-amendments.md), the implementation task opens in the same commit as the first B5 ADR's *Dependency chain* section. **Last task Done: T-019 — Task loader, 2026-05-16** (PR #31, merge `7f876af`; branch `t-019-task-loader` retired): `pub fn load_image(...) -> Result` in `kernel/src/obj/task_loader.rs` produces a `LoadedImage { as_cap, entry_va, stack_top_va, image_bytes, stack_bytes }` of a freshly populated userspace AS — **not** a runnable `CapHandle{CapObject::Task(...)}` (runnability gates on B5/B6); 10-variant `LoadError`, leak-path-closure preflight chain, UNSAFE-2026-0027 byte-copy entry. DoD fully met (B5/B6 deferrals are §Out-of-scope, not unchecked DoD items). +- **In review:** none. (The B4 closure trio fired 2026-05-28 and is committed/awaiting maintainer review; it is not an in-flight task.) - **In progress:** none. -- **Working branch:** none / awaiting B4 closure trio. Development branches off `main` per the PR pattern; no task branch is currently active and no rebase is pending. -- **Last completed milestone:** **B4 — Task loader, implementation-complete 2026-05-16** via PR #31 merge (`7f876af`); the **B4 closure trio (business + security + performance) has NOT yet fired** — until it does, B4 is "implementation-complete", not formally "Closed" in the sense B0–B3 were. The closure trio is the next review trigger (see "Next review trigger" below); the maintainer sequences it separately. Required task Done: T-019 (Done 2026-05-16). Headline numbers at merge: **260 host tests** (42 hal + 175 kernel + 43 test-hal); QEMU smoke produces the full demo trace through `tyrne: all tasks complete` with the new `tyrne: image loaded (...)` line; miri 260/260 clean; `-d int,unimp,guest_errors` shows only the pre-existing 629 PL011-disabled-UART warnings. 
ADR-0029 Accepted (2026-05-14); UNSAFE-2026-0027 opened standalone for the loader byte-copy; UNSAFE-2026-0025 / 0026 lifted their `Pending QEMU smoke verification` notes via 2026-05-14 Amendments (T-019 is the first runtime exerciser of both). **Previous milestone closures (fully Closed with trios):** **B3 — Address space abstraction, closed 2026-05-14** via PR #29's closure trio (merge commit `b425dc1`); **B2 — MMU activation, closed 2026-05-09** via its closure trio; **B1 — Drop to EL1 + exception infrastructure, closed 2026-05-07** via PR #15 merge (`e9fa019`) + the PR #16 closure trio (`95b15aa`); **B0 — Phase A exit hygiene, closed 2026-04-27** via PR #9 merge (`9a66e8b`). +- **Working branch:** none / B5 prep opens next. Development branches off `main` per the PR pattern; no task branch is currently active and no rebase is pending. +- **Last completed milestone:** **B4 — Task loader, Closed 2026-05-28** via the closure trio ([business](../analysis/reviews/business-reviews/2026-05-28-B4-closure.md) + [security](../analysis/reviews/security-reviews/2026-05-28-B4-closure.md) Approve + [performance](../analysis/reviews/performance-optimization-reviews/2026-05-28-B4-closure.md) baseline). Required task Done: T-019 (2026-05-16, PR #31 `7f876af`). Closing numbers (canonical source = the trio): **286 host tests** (43 hal + 187 kernel + 53 test-hal + 3 doc-tests; was 260 at the T-019 merge, **+26** from the 2026-05-22 master-review remediation PR #32); QEMU smoke runs the full demo through `tyrne: all tasks complete` with the `tyrne: image loaded (...)` line; `-d int,unimp,guest_errors` = **629 events** (100 % pre-existing PL011 noise; zero fault classes); release perf band p10/p50/p90 = **15.641 / 17.587 / 19.150 ms**; audit log **28** entries (UNSAFE-2026-0027 + 0028 added; 0025/0026 `Pending QEMU smoke verification` notes lifted by T-019). 
The 2026-05-22 master review (APPROVE the kernel; CI/doc/ADR findings) + PR #32 remediation (23/24 closed) landed in this period. **Previous closures:** **B3** 2026-05-14 (PR #29 `b425dc1`); **B2** 2026-05-09; **B1** 2026-05-07 (PR #15 `e9fa019` + PR #16 `95b15aa`); **B0** 2026-04-27 (PR #9 `9a66e8b`). - **Last completed tasks:** **T-019 — Done 2026-05-16, merged to `main` via PR #31** (branch `t-019-task-loader`, merge commit `7f876af`) — Task loader: `load_image` produces a `LoadedImage` descriptor of a freshly populated userspace AS (10-variant `LoadError`, leak-path-closure preflight chain, UNSAFE-2026-0027 byte-copy entry); does **not** mint a runnable `TaskCap` (B5/B6 prerequisite). **Earlier:** **T-018 — Done 2026-05-11, live on `main` 2026-05-14 via PR #28** (branch `t-018-address-space-kernel-object`, merge commit `47b0a86`). T-018 implementation: [`AddressSpace`](../../kernel/src/mm/address_space.rs) kernel-object struct + per-type [`AddressSpaceArena`](../../kernel/src/mm/address_space.rs) (ADR-0016 pattern); `CapKind::AddressSpace` + `CapObject::AddressSpace(AddressSpaceHandle)` variants in [`kernel/src/cap/mod.rs`](../../kernel/src/cap/mod.rs); capability-gated wrappers `cap_create_address_space` / `cap_map` / `cap_unmap` with step-by-step preflights (DERIVE rights → no-widening → depth preflight → arena/cap-table capacity → PMM alloc → arena commit → `cap_derive` cap-table insert); `Task` struct extension with `address_space_handle`; activation-on-context-switch hook threaded through `yield_now` / `start` / `ipc_recv_and_yield` / `ipc_send_and_yield` (closure-as-parameter, fires only when outgoing and incoming task ASes differ — short-circuits in v1's bootstrap-shared topology); BSP wiring in [`bsp-qemu-virt/src/main.rs`](../../bsp-qemu-virt/src/main.rs) wraps the already-live bootstrap root via the new `QemuVirtAddressSpace::from_existing_root` `pub unsafe fn` companion. 
Cross-cutting additions during the review-round arc: `MmuError::BlockMapped` variant (commit `8b9f52e`) so unmap into a bootstrap block descriptor surfaces a distinct typed error from `AlreadyMapped`; `CapabilityTable::depth_of` `pub(crate)` preflight helper closing the PMM-leak path; UNSAFE-2026-0014 fifth Amendment scope-extends the umbrella to the activation hook + BSP-side activation closure (zero new audit entries — additive scope on the existing `&mut Scheduler` momentary-borrow umbrella). Smoke trace gains one new line `tyrne: address-space-arena ready (1 / 8 slots used; bootstrap AS root = 0x4008d000)` immediately after `tyrne: pmm initialized (...)` and before `tyrne: timer ready (...)`. Full demo runs to `tyrne: all tasks complete`; `-d int,unimp,guest_errors` reports only the pre-existing PL011-disabled-UART noise (unchanged baseline). **Earlier:** T-017 — Done 2026-05-10 (PR #27, branch `t-017-physical-memory-manager`) — Physical Memory Manager (`Pmm` bitmap allocator + `FrameProvider` trait + UNSAFE-2026-0026 zero-fill audit). **Earlier:** T-016 — Done 2026-05-08 (branch `t-016-mmu-activation`) — MMU activation, VMSAv8 descriptor encoders, `MapperFlush` flush-token, UNSAFE-2026-0022 / 0023 / 0024 / 0025 introduced. **Earlier:** T-015 — Done 2026-05-07 (PR #17, branch `t-015-endpoint-rollback-cancel-recv`) — `ipc_cancel_recv` recovery primitive + symmetric scheduler+endpoint rollback in `ipc_recv_and_yield`'s Phase 2 Deadlock branch (ADR-0032). **Earlier:** T-014 (2026-05-07 via PR #15), T-012 (2026-04-28 via PR #10), T-013 (2026-04-27 via PR #9). 
 - **Last reviews:**
+  - [B4 closure retrospective (2026-05-28)](../analysis/reviews/business-reviews/2026-05-28-B4-closure.md) — Task loader (T-019) + the 2026-05-22 master-review interlude + PR #32 remediation (23/24 verified findings closed)
+  - [B4 closure consolidated security review (2026-05-28)](../analysis/reviews/security-reviews/2026-05-28-B4-closure.md) — Approve, eight axes pass
+  - [B4 closure performance baseline (2026-05-28)](../analysis/reviews/performance-optimization-reviews/2026-05-28-B4-closure.md) — re-baseline; release band p10/p50/p90 = 15.641 / 17.587 / 19.150 ms
+  - [Full-tree master review (2026-05-22)](../analysis/reviews/master-review/2026-05-22-152729/consolidated.md) — 25-track audit; APPROVE the shipped kernel; 4 Blocker + 18 Major (all CI/doc/ADR, 0 kernel-correctness/security); remediated via PR #32
   - [B1 closure retrospective (2026-05-07)](../analysis/reviews/business-reviews/2026-05-07-B1-closure.md) — fresh closure trio replacing the 2026-04-28 trio's load-bearing role
   - [B1 closure consolidated security review (2026-05-07)](../analysis/reviews/security-reviews/2026-05-07-B1-closure.md) — Approve, eight axes pass
   - [B1 closure performance baseline (2026-05-07)](../analysis/reviews/performance-optimization-reviews/2026-05-07-B1-closure.md) — net footprint-neutral re-baseline
@@ -85,8 +91,8 @@ A short pointer file updated as work progresses. For the full plan see [`phases/
   - [ADR-0026 — Idle dispatch via separate fallback slot](../decisions/0026-idle-dispatch-fallback.md) — `Accepted` (2026-05-06). Supersedes ADR-0022's *idle-task-location* axis only (Option A → Option B: dedicated `Scheduler::idle: Option` slot, dispatched via `ready.dequeue().or(s.idle)` only when the ready queue is empty). ADR-0022's *typed-error* axis (Option G — `SchedError::Deadlock` + `IpcError::PendingAfterResume` + `start`'s panic) stands. Implemented by T-014 (Done 2026-05-07). Includes a queue-state simulation table that ADR-0022 lacked; this discipline (simulation tables on multi-step state-machine ADRs) is the central learning of the [B1 smoke-regression arc](../analysis/reviews/business-reviews/2026-05-06-B1-smoke-regression.md).
   - [ADR-0032 — Endpoint state rollback + `ipc_cancel_recv` primitive](../decisions/0032-endpoint-rollback-and-cancel-recv.md) — `Accepted` (2026-05-07). Adds a recovery primitive that reverses an `Idle → RecvWaiting` transition, called by `ipc_recv_and_yield`'s Phase 2 Deadlock branch so both *scheduler* and *endpoint* state restore to pre-call shape on `SchedError::Deadlock`. Kernel-internal in v1 (no userspace caller); future consumers are the userspace-driven endpoint destroy drain (B2+), multi-waiter wake (ADR-0019 §Open questions), and preemption-rollback (B5+). Implemented by T-015 (Done 2026-05-07). Includes a Phase-2 Deadlock simulation table; ADR-0017 §Revision notes rider records the additive recovery primitive (user-observable surface unchanged). The Accept commit is the first project-side application of [`write-adr` skill](../../.agents/skills/write-adr/SKILL.md) step 10's *careful re-read* discipline as a separate diff from the Propose commit.
   - [ADR-0027 — Kernel virtual memory layout (B2 — identity-mapped MMU activation)](../decisions/0027-kernel-virtual-memory-layout.md) — **`Accepted` (2026-05-08)**. B2 commits to identity-only mapping (kernel in `TTBR0_EL1`; `TTBR1_EL1` reserved with `EPD1=1` for future high-half ADR-0033 placeholder when B5 surfaces per-task `TTBR0_EL1` swap), 4 KiB granule + 48-bit VA + 4-level translation, MAIR indices 0/1 for device-nGnRnE / normal-cached, four bootstrap page-table frames in a new `.boot_pt` section, and a typed [`MapperFlush`](../../hal/src/mmu/mod.rs) flush-token discipline at the `Mmu` trait surface (additive change to `map`/`unmap` return types, recorded in ADR-0009 §Revision notes rider via T-016). Includes a five-row §Simulation table walking the SCTLR.M=1 transition (Steps 0–4). **First non-recovery-primitive state-machine ADR drafted under [`write-adr` skill §Simulation](../../.agents/skills/write-adr/SKILL.md) discipline** — ADR-0026's table was the empirical retro-source; ADR-0032's table was the first application but its subject is a recovery primitive; ADR-0027 is the first productive-design state machine to use the rule. Implementation: T-016 (Draft, opens with the Propose commit). Accept landed as a separate commit (`bb0a6ba`) per `write-adr` §10.
-- **Next task to open:** **B4 milestone closure trio (now due) + B5 syscall-ABI ADR pair (ADR-0030 + ADR-0031)**. T-019 merged 2026-05-16, closing B4's implementation half; the B4 closure trio (business retro + consolidated security review + performance baseline) is now due per the B3 closure precedent but has **not** yet fired (the maintainer sequences it separately). The next implementation thread is the [Phase B §B4 §3 `task_create_from_image`](phases/phase-b.md#milestone-b4--task-loader) surface (which currently gates on B5/B6 — `LoadedImage` exists but is not yet wrapped into a runnable `CapHandle{CapObject::Task(...)}`); the B5 syscall ABI per ADR-0030 is the prerequisite, followed by B6 (first userspace "hello"). The B-phase plan in [phase-b.md §B5](phases/phase-b.md) describes the milestone shape.
-- **Next review trigger:** **B4 milestone closure trio — now due (not yet fired).** T-019 reached `Done` and B4 reached "implementation-complete" at the 2026-05-16 merge, so the trio is now the active review trigger; it has **not** yet been produced. The trio shape mirrors the [2026-05-14 B3 closure](../analysis/reviews/business-reviews/2026-05-14-B3-closure.md): business retrospective + consolidated security review + performance baseline. Possible interim triggers: a mini-retro if a B4 follow-up surfaces a learning worth capturing; a maintainer-initiated review if a non-trivial follow-up surfaces (e.g., the deferred B5+ MemoryRegion cap variant + per-operation rights set extension ADR — the T-018 review-round arc's F2 carry-forward). Forward-flag audit notes: UNSAFE-2026-0025 / 0026's `Pending QEMU smoke verification` notes were lifted via Amendment by T-019 (first post-bootstrap `cap_map` / `cap_create_address_space` runtime exerciser); UNSAFE-2026-0019 / 0020 / 0021 continue to gate on the first deadline-arming caller (B5+).
+- **Next task to open:** **B5 — Syscall boundary: ADR-0030 (syscall ABI) + ADR-0031 (initial syscall set).** ADR-0030 settles the register calling convention + error-return convention + the K2-5 `IpcError::InvalidCapability` split (`StaleHandle` / `MissingRight` / `WrongObjectKind`); ADR-0031 settles the initial syscall set (`send`, `recv`, `console_write`, `task_yield`, `task_exit` — no more in v1). Then EL0→EL1 SVC dispatch + a panic-free dispatcher + validated copy-from/to-user + `Capability` `Debug` redaction (K3-9). ADR numbers tentative per [ADR-0013](../decisions/0013-roadmap-and-planning.md). B5 is the prerequisite for the deferred [`task_create_from_image`](phases/phase-b.md#milestone-b4--task-loader) wrapper, then B6 (first userspace "hello"). The [phase-b.md §B5](phases/phase-b.md) plan describes the milestone shape. **(The B4-closure §Adjustments — MR-009's "Miri green = Phase-B exit prerequisite" line in phase-b.md and the `current.md` 260→286 count fix — were applied as part of this closure.)**
+- **Next review trigger:** **B5 closure trio** — produced when the first B5 milestone reaches `In Review`. (The B4 closure trio fired 2026-05-28.) Possible interim triggers: a mini-retro if EL0/syscall bring-up surfaces a learning worth capturing mid-arc; a maintainer-initiated review or a second on-demand full-tree master review if the corpus drifts again before B5 closes. Forward-flag audit notes: UNSAFE-2026-0025 / 0026's `Pending QEMU smoke verification` notes were lifted by T-019 (first post-bootstrap `cap_map` / `cap_create_address_space` runtime exerciser); UNSAFE-2026-0019 / 0020 / 0021 continue to gate on the first deadline-arming caller (B5+).

 ## Notes

diff --git a/docs/roadmap/phases/phase-b.md b/docs/roadmap/phases/phase-b.md
index 78b0a7d..d93eae1 100644
--- a/docs/roadmap/phases/phase-b.md
+++ b/docs/roadmap/phases/phase-b.md
@@ -2,6 +2,8 @@

 **Exit bar:** A userspace task (a real separate binary, not a kernel-level stub) runs in its own address space, with its own capability table, and can make syscalls back into the kernel.

+**Exit-quality prerequisite — Miri.** Before Phase B is declared complete, **a green `cargo +nightly miri test` (Stacked Borrows) run over the host-test suite is a Phase-B exit prerequisite**, not merely a per-PR gate — with particular weight on `kernel/src/sched/**` and `kernel/src/ipc/**`, where the raw-pointer scheduler/IPC bridge trades the borrow checker's compile-time non-aliasing guarantee for the documented no-`&mut`-across-`context_switch` invariant (ADR-0021 / [UNSAFE-2026-0014](../../audits/unsafe-log.md)). Miri is the only mechanical verifier of that invariant; the 2026-04-21 security review made the aliasing discipline the #1 Phase-B blocker. The `miri` CI job is blocking-by-construction (no `continue-on-error`) per [`infrastructure.md` §"Miri as a blocking gate"](../../standards/infrastructure.md#miri-as-a-blocking-gate); this line records the milestone-exit requirement. (Closes master-review finding MR-009; recorded at B4 closure, 2026-05-28.)
+
 **Scope:** Land the Phase A exit-hygiene fixes surfaced by the 2026-04-21 reviews, drop to EL1, activate the MMU with a kernel mapping, introduce per-task address spaces, build a task loader, define the syscall entry / dispatch, run the first userspace "hello world" in EL0. Still single-core; Pi 4 is Phase D; drivers are Phase E.

 **Out of scope:** Multi-core, real hardware, userspace drivers, network, filesystem, cryptography.

From ad61156b747f9ab7afadd6e96444441cf78244c7 Mon Sep 17 00:00:00 2001
From: Cemil ILIK
Date: Fri, 29 May 2026 04:23:58 +0300
Subject: [PATCH 2/3] docs(roadmap,analysis): address PR #33 review feedback
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- security review: make the T-019 link direct (../../../analysis/tasks -> ../../tasks/phase-b/T-019-task-loader.md), matching the business/performance sibling reviews.
- MR-009 consistency: the Miri-as-Phase-B-exit-prerequisite item was closed in-branch (phase-b.md gained the "Exit-quality prerequisite -- Miri" paragraph), but several artifacts still described it as "not yet written" / a "standing residual". Reconcile every mention to the closed state -- business §What-we-learned + §Adjustments (and the section heading), security §4 + Verdict + Adjustment heading + the audit-log bullet, performance forward-flag, and both README index rows -- so future audits do not reopen it. The remediation now reads 24/24 throughout.
- current.md: trim the dense 2026-05-28 banner to a short summary that links the closure trio (the canonical metrics source) instead of reproducing every number, and trim the Last-completed-milestone bullet's metric reproduction to a one-line headline + canonical-source pointer. Reduces future drift risk (overall review comments 1 + 2).

Docs-only changeset; 305/305 relative links verified resolving.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 .../business-reviews/2026-05-28-B4-closure.md | 6 +++---
 docs/analysis/reviews/business-reviews/README.md | 2 +-
 .../2026-05-28-B4-closure.md | 2 +-
 .../security-reviews/2026-05-28-B4-closure.md | 12 ++++++------
 docs/analysis/reviews/security-reviews/README.md | 2 +-
 docs/roadmap/current.md | 4 ++--
 6 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/docs/analysis/reviews/business-reviews/2026-05-28-B4-closure.md b/docs/analysis/reviews/business-reviews/2026-05-28-B4-closure.md
index 9327c42..ddd7fc9 100644
--- a/docs/analysis/reviews/business-reviews/2026-05-28-B4-closure.md
+++ b/docs/analysis/reviews/business-reviews/2026-05-28-B4-closure.md
@@ -170,9 +170,9 @@ This is the central learning of the period. Tyrne's per-PR discipline is genuine

 The honest framing: the per-PR discipline is doing its job (the *code* is strong), but it has a blind spot for whole-tree consistency, gate honesty, and unexercised contracts. A periodic full-tree audit is the only instrument that sees those. The maintainer inserting one *before* declaring B4 Closed — rather than after — is the right ordering: it means B4 closes on a tree whose records match reality.

-### The remediation closed 23 of 24 verified Blocker+Major findings; the one residual is a one-line doc gap
+### The remediation closed 23 of 24 verified Blocker+Major findings; the 24th (MR-009) was closed in-branch at this closure

-The closure-trio session re-verified all 24 Blocker+Major findings adversarially against the live tree: **23 confirmed fixed, 1 partially fixed.** The single partial is **MR-009 (Miri-as-CI-gate):** Miri *is* now a blocking CI job (no `continue-on-error`) and *is* listed as a required gate in `infrastructure.md` — but the prescription "make Miri green a Phase-B exit prerequisite" is not yet written into [`phase-b.md`](../../../roadmap/phases/phase-b.md)'s §Exit bar or a Phase-B exit checklist. This is the cleanest single B4-closure Adjustment to record: a one-line in-tree doc gap, non-blocking, and the only open item from a 24-finding remediation. The signal is that the remediation was thorough — the residual is a documentation phrasing, not a missing mechanism.
+The closure-trio session re-verified all 24 Blocker+Major findings adversarially against the live tree: **23 confirmed fixed, 1 partially fixed.** The single partial is **MR-009 (Miri-as-CI-gate):** Miri *is* now a blocking CI job (no `continue-on-error`) and *is* listed as a required gate in `infrastructure.md` — but the prescription "make Miri green a Phase-B exit prerequisite" was the last open item of the 24-finding remediation — a one-line in-tree doc gap, non-blocking. It was **closed in-branch during this closure** (the §"Exit-quality prerequisite — Miri" paragraph added to [`phase-b.md`](../../../roadmap/phases/phase-b.md)), so the remediation is now 24/24. The signal is that the remediation was thorough — the last residual was a documentation phrasing, not a missing mechanism.

 ### The append-only ADR mechanism absorbed a frozen contradiction exactly as designed (ADR-0036)

@@ -203,7 +203,7 @@ Net: **1 of 6 closed** (B4 milestone via the loader arc); 5 trigger-deferred (2

 ## Adjustments

-- [x] **MR-009 residual — add "Miri green = Phase-B exit prerequisite" to the phase-b.md exit bar.** **✅ Closed in-branch 2026-05-28** — the §"Exit-quality prerequisite — Miri" paragraph was added to the top of [`phase-b.md`](../../../roadmap/phases/phase-b.md), stating a green `cargo +nightly miri test` run as a Phase-B *milestone-exit* prerequisite (with weight on `kernel/src/sched/**` + `kernel/src/ipc/**`) and linking [`infrastructure.md` §"Miri as a blocking gate"](../../../standards/infrastructure.md#miri-as-a-blocking-gate). The CI-gate half was already done by PR #32; MR-009 is now fully closed. The one open item from the 24-finding master-review remediation. Miri is already a blocking CI job + a listed required gate in `infrastructure.md`; the only gap is that [`phase-b.md`](../../../roadmap/phases/phase-b.md)'s §Exit bar (or a new Phase-B exit checklist) does not yet state that a green Miri run is an exit prerequisite — consistent with the 2026-04-21 security review that made the scheduler/IPC aliasing discipline the #1 Phase-B blocker. **Trigger:** a one-line edit to `phase-b.md` §Exit bar; cleanest to land as the first commit of the B5 prep arc (alongside the ADR-0030 propose commit) so the Phase-B exit bar is correct before Phase B's final milestones run. Pure documentation change; no code.
+- [x] **MR-009 — "Miri green = Phase-B exit prerequisite" written into the phase-b.md exit bar.** **✅ Closed in-branch 2026-05-28.** The §"Exit-quality prerequisite — Miri" paragraph was added to the top of [`phase-b.md`](../../../roadmap/phases/phase-b.md), stating a green `cargo +nightly miri test` run as a Phase-B *milestone-exit* prerequisite (weighted on `kernel/src/sched/**` + `kernel/src/ipc/**`) and linking [`infrastructure.md` §"Miri as a blocking gate"](../../../standards/infrastructure.md#miri-as-a-blocking-gate). The CI-gate half was already done by PR #32, so MR-009 — the last open item of the 24-finding master-review remediation — is now fully closed (24/24). Consistent with the 2026-04-21 security review that made the scheduler/IPC aliasing discipline the #1 Phase-B blocker.
 - [x] **`current.md` test-count drift 260 → 286.** **✅ Closed in-branch 2026-05-28** — `current.md` was refreshed for the closure: a new 2026-05-28 B4-closure top banner (carrying the 286 count + 629 guest-errors + the perf band), the Last-completed-milestone bullet flipped from "B4 implementation-complete (260 tests)" to "B4 Closed 2026-05-28 (286 tests)", and the four Pathfinder bullets (Active phase / milestone / task + Next review trigger) flipped to B5. The README was already fine (de-hardcoded to a link per MR-015).
 - [ ] **B5 opens — ADR-0030 (syscall ABI) + ADR-0031 (initial syscall set).** ADR-0030 settles the register calling convention + error-return convention + the K2-5 `IpcError::InvalidCapability` split into `StaleHandle` / `MissingRight` / `WrongObjectKind`; ADR-0031 settles the initial syscall set (`send`, `recv`, `console_write`, `task_yield`, `task_exit`) — no more in v1. Then EL0→EL1 SVC dispatch, a panic-free syscall dispatcher, validated copy-from/to-user through the active AS, and Capability Debug redaction (K3-9). ADR numbers tentative per [ADR-0013](../../../decisions/0013-roadmap-and-planning.md). **Trigger:** opens with the next ADR-prep arc — the planned slot is ADR-0030 paired with a T-NNN syscall-dispatch task, per [phase-b.md §B5](../../../roadmap/phases/phase-b.md). B5 is the prerequisite for the deferred `task_create_from_image` wrapper (phase-b §B4 §3) that turns a `LoadedImage` into a runnable `CapHandle{CapObject::Task(...)}`, then B6 (first userspace "hello").
 - [ ] **B5+ `MemoryRegion` cap variant + per-operation rights set extension.** Carry forward from B3 unchanged (the T-018 F2 rights gap). **Trigger:** the first B5 ADR that introduces a per-task AS cap not co-resident with the bootstrap-everything cap.
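
Reviewer note (illustrative only, not part of the patch): the `IpcError::InvalidCapability` split that the B5 item attributes to ADR-0030 can be pictured with a minimal sketch. Only the three variant names `StaleHandle`, `MissingRight`, and `WrongObjectKind` come from the text; the `Slot`, `Handle`, and `ObjectKind` types, the rights constants, the `check_cap` function, and the order of the checks are all hypothetical assumptions, since ADR-0030 is not yet written.

```rust
// Hypothetical sketch: only the three IpcError variant names appear in the source.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum IpcError {
    StaleHandle,     // handle names an empty slot or an older generation
    MissingRight,    // capability exists but lacks a required right
    WrongObjectKind, // capability exists but refers to another kind of object
}

// Assumed object kinds, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ObjectKind {
    Endpoint,
    Task,
}

// Assumed rights encoding: one bit per right.
const RIGHT_SEND: u8 = 0b01;
const RIGHT_RECV: u8 = 0b10;

// Assumed capability-table slot and handle shapes.
#[derive(Debug, Clone, Copy)]
struct Slot {
    generation: u32,
    kind: ObjectKind,
    rights: u8,
}

#[derive(Debug, Clone, Copy)]
struct Handle {
    index: usize,
    generation: u32,
}

/// Validate a handle, reporting which of the three distinct failures occurred
/// instead of one catch-all error. The check order is an assumption.
fn check_cap(
    table: &[Option<Slot>],
    handle: Handle,
    want: ObjectKind,
    need: u8,
) -> Result<(), IpcError> {
    // Out-of-range index, empty slot, or generation mismatch all read as stale.
    let slot = match table.get(handle.index).copied().flatten() {
        Some(s) if s.generation == handle.generation => s,
        _ => return Err(IpcError::StaleHandle),
    };
    if slot.kind != want {
        return Err(IpcError::WrongObjectKind);
    }
    if slot.rights & need != need {
        return Err(IpcError::MissingRight);
    }
    Ok(())
}
```

The point of the split, under these assumptions, is that a caller can tell a revoked handle apart from a rights or type mismatch without the kernel panicking on any of them.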
diff --git a/docs/analysis/reviews/business-reviews/README.md b/docs/analysis/reviews/business-reviews/README.md
index 3cd0f48..805e004 100644
--- a/docs/analysis/reviews/business-reviews/README.md
+++ b/docs/analysis/reviews/business-reviews/README.md
@@ -34,4 +34,4 @@ A business review may point at outcomes from those other reviews as part of "wha
 | 2026-05-07 | B1 closure retrospective (post-T-014) — fresh closure trio replacing the 2026-04-28 trio's load-bearing role; T-014 + ADR-0026 fixed the smoke regression; α/β/γ closed comprehensive-review Track-E/J/A/B/F/G/I non-blockers | [2026-05-07-B1-closure.md](2026-05-07-B1-closure.md) |
 | 2026-05-09 | B2 closure retrospective — MMU activation + kernel-half mapping (T-016); ADR-0027 + `MapperFlush` flush-token discipline; closed cleanly on first attempt (no smoke-regression arc) | [2026-05-09-B2-closure.md](2026-05-09-B2-closure.md) |
 | 2026-05-14 | B3 closure retrospective — Address-space abstraction (T-017 PMM + T-018 `AddressSpace` kernel object); ADR-0035 + ADR-0028; five-round PR #28 review arc + cross-cutting `MmuError::BlockMapped` + `.claude/skills/` → `.agents/skills/` migration | [2026-05-14-B3-closure.md](2026-05-14-B3-closure.md) |
-| 2026-05-28 | B4 closure retrospective — Task loader (T-019 `load_image` → `LoadedImage`); ADR-0029 + the 2026-05-22 master-review interlude (4 Blocker / 18 Major full-tree audit) + PR #32 remediation closing 23/24 verified findings; ADR-0036 supersession of GICv3/SMMUv3; UNSAFE-2026-0027 + 0028 added, 0025/0026 lifted; HodeTech org migration | [2026-05-28-B4-closure.md](2026-05-28-B4-closure.md) |
+| 2026-05-28 | B4 closure retrospective — Task loader (T-019 `load_image` → `LoadedImage`); ADR-0029 + the 2026-05-22 master-review interlude (4 Blocker / 18 Major full-tree audit) + PR #32 remediation closing all 24 verified findings (MR-009's exit-bar half closed in-branch here); ADR-0036 supersession of GICv3/SMMUv3; UNSAFE-2026-0027 + 0028 added, 0025/0026 lifted; HodeTech org migration | [2026-05-28-B4-closure.md](2026-05-28-B4-closure.md) |
diff --git a/docs/analysis/reviews/performance-optimization-reviews/2026-05-28-B4-closure.md b/docs/analysis/reviews/performance-optimization-reviews/2026-05-28-B4-closure.md
index efbf81b..114fbca 100644
--- a/docs/analysis/reviews/performance-optimization-reviews/2026-05-28-B4-closure.md
+++ b/docs/analysis/reviews/performance-optimization-reviews/2026-05-28-B4-closure.md
@@ -220,4 +220,4 @@ The [2026-05-14 B3 closure baseline §Forward-flagged items](2026-05-14-B3-closu

 - **Per-`cap_map` TLB-flush batching for the loader.** v1 flushes once per `cap_map` (correct conservative discipline). When `task_create_from_image` (B5/B6) makes the loader-produced AS live, a batched `TLBI VMALLE1` after the whole loader runs becomes a candidate optimisation. **Trigger:** opens with the first B5/B6 task that enters a loader-produced AS.
 - **Loader cost linear in image size.** v1's 8-byte image needs 6 frames; B6's real `hello` binary + larger B5+ images scale the `alloc_frame` (+ zero-fill) and `cap_map` (+ intermediate-frame) cost linearly, and a multi-2 MiB-L2-slot image is the first caller of the exact `intermediate_frame_count` path. **Trigger:** opens with B6's first real `userland/hello/` binary or any B5+ image larger than one L2 slot.
-- **Miri-as-Phase-B-exit-bar codification (one-line in-tree doc gap; carried from the master-review remediation).** MR-009 made Miri a blocking CI job (no `continue-on-error`) and listed it as a required gate in [`infrastructure.md`](../../../standards/infrastructure.md#miri-as-a-blocking-gate), but "Miri green = Phase-B exit prerequisite" is **not yet written into the [phase-b.md](../../../roadmap/phases/phase-b.md) exit bar / a Phase-B exit checklist**. This is non-blocking and is the cleanest single B4-closure Adjustment to record (hand off to the business retro). **Trigger:** the Phase-B exit checklist authoring, naturally folded into the B5 prep arc.
+- **Miri-as-Phase-B-exit-bar codification (MR-009) — closed at this closure.** MR-009's CI-gate half was done by PR #32 (Miri is a blocking job, no `continue-on-error`, listed required in [`infrastructure.md`](../../../standards/infrastructure.md#miri-as-a-blocking-gate)); the *documented* exit-bar half is now **closed in-branch** — [phase-b.md](../../../roadmap/phases/phase-b.md) gains the §"Exit-quality prerequisite — Miri" paragraph stating a green `cargo +nightly miri test` run is a Phase-B exit prerequisite. No longer forward-flagged; recorded here for the audit trail (the remediation is now 24/24).
diff --git a/docs/analysis/reviews/security-reviews/2026-05-28-B4-closure.md b/docs/analysis/reviews/security-reviews/2026-05-28-B4-closure.md
index 56be905..41ed91a 100644
--- a/docs/analysis/reviews/security-reviews/2026-05-28-B4-closure.md
+++ b/docs/analysis/reviews/security-reviews/2026-05-28-B4-closure.md
@@ -1,6 +1,6 @@
 # Security review 2026-05-28 — B4 closure consolidated pass (post-T-019 + master-review remediation)

-- **Change:** the B4 arc on `main` — [T-019 task loader](../../../analysis/tasks/phase-b/T-019-task-loader.md) merged via PR #31 ([merge `7f876af`](https://github.com/HodeTech/Tyrne/commit/7f876af); 7 bisectable commits `911f2ad`/`5711756`/`ae31bc8`/`196d3fb`/`164522d`/`5b1f153`/`95efd62` + doc/round-fix commits `74694d4`/`5078944`/`eb14c51`), preceded by [ADR-0029](../../../decisions/0029-initial-userspace-image-format.md) (Initial userspace image format, `Accepted` 2026-05-14, PR #30 [merge `e09755d`](https://github.com/HodeTech/Tyrne/commit/e09755d)) — *plus* the **master-review remediation** PR #32 ([merge `50bffe9`](https://github.com/HodeTech/Tyrne/commit/50bffe9)) that closed the 2026-05-22 full-tree review's Blocker+Major backlog (commits `a6e909d` MR-001 / `8063ee2` MR-006/005/019/020 + ADR-0036 / `fbc3d3f` MR-002/003/007/008/009 CI honesty / `59f9309` MR-005/011/017/018 / `57bc2e6` MR-010/018 / `348971e` MR-022/017/018 / `24530fb` MR-012/013/014 / `4e241d9` MR-016/019 / `4141158` MR-015/004 / `a2e7257` D3-005/006/007 + review-round commits `ae8fbd7`/`8ceb4fb`/`c843ecd`), the org migration `cd4cb6e` (cemililik/Tyrne → HodeTech/Tyrne), and the README clarity pass `3ab029f` (HEAD). Period under review: 2026-05-14 → 2026-05-28.
+- **Change:** the B4 arc on `main` — [T-019 task loader](../../tasks/phase-b/T-019-task-loader.md) merged via PR #31 ([merge `7f876af`](https://github.com/HodeTech/Tyrne/commit/7f876af); 7 bisectable commits `911f2ad`/`5711756`/`ae31bc8`/`196d3fb`/`164522d`/`5b1f153`/`95efd62` + doc/round-fix commits `74694d4`/`5078944`/`eb14c51`), preceded by [ADR-0029](../../../decisions/0029-initial-userspace-image-format.md) (Initial userspace image format, `Accepted` 2026-05-14, PR #30 [merge `e09755d`](https://github.com/HodeTech/Tyrne/commit/e09755d)) — *plus* the **master-review remediation** PR #32 ([merge `50bffe9`](https://github.com/HodeTech/Tyrne/commit/50bffe9)) that closed the 2026-05-22 full-tree review's Blocker+Major backlog (commits `a6e909d` MR-001 / `8063ee2` MR-006/005/019/020 + ADR-0036 / `fbc3d3f` MR-002/003/007/008/009 CI honesty / `59f9309` MR-005/011/017/018 / `57bc2e6` MR-010/018 / `348971e` MR-022/017/018 / `24530fb` MR-012/013/014 / `4e241d9` MR-016/019 / `4141158` MR-015/004 / `a2e7257` D3-005/006/007 + review-round commits `ae8fbd7`/`8ceb4fb`/`c843ecd`), the org migration `cd4cb6e` (cemililik/Tyrne → HodeTech/Tyrne), and the README clarity pass `3ab029f` (HEAD). Period under review: 2026-05-14 → 2026-05-28.
 - **Reviewer:** @cemililik (+ Claude Opus 4.8 (1M context) agent acting adversarial across the eight axes the [security-review master plan](master-plan.md) defines).
 - **Separation from code review:** standalone consolidated pass scoped to the post-B3 → post-T-019 + post-remediation surface. T-019 itself went through **six PR-rounds** on PR #31 (rounds 1–6; reviews #1–#6 — review #2 P1 surfaced the PA-overlap soundness gap, round 4 P2 surfaced the misaligned-base root-frame-leak path; see [business retrospective §"Review-round arc on PR #31"](../business-reviews/2026-05-28-B4-closure.md)) and the whole shipped tree went through the **2026-05-22 master review** (4 waves, 25 track agents, including a dedicated X1-security pass that returned PASS and an X3-unsafe-audit pass that confirmed the log fully in sync). Those passes were per-PR / whole-tree; this artefact is the *closure-trio* security pass scoped to the B4 milestone, performed with a fresh checklist after a deliberate context switch. Sibling trio legs: [business retrospective](../business-reviews/2026-05-28-B4-closure.md) and [performance baseline](../performance-optimization-reviews/2026-05-28-B4-closure.md).
 - **Unsafe audit cross-reference:** [UNSAFE-2026-0027](../../../audits/unsafe-log.md#unsafe-2026-0027--task-loader-frame-byte-copy-via-coreptrcopy_nonoverlapping-in-task_loaderload_image) (**new**; T-019 `task_loader::load_image` frame byte-copy via `core::ptr::copy_nonoverlapping`, with four post-introduction Amendments hardening the safe-API boundary — PA-overlap preflight, `phys_frame_kernel_ptr` helper + VA-range preflight, exact intermediate-frame count, and the misaligned-base alignment preflight); [UNSAFE-2026-0028](../../../audits/unsafe-log.md#unsafe-2026-0028--wrap-an-already-live-populated-vmsav8-l0-root-via-qemuvirtaddressspacefrom_existing_root) (**new**; `QemuVirtAddressSpace::from_existing_root` — audit-trail completion for a pre-existing `unsafe fn`, opened by the MR-011 / X3-001 remediation, second-reviewer-required per unsafe-policy §Review.4); [UNSAFE-2026-0025](../../../audits/unsafe-log.md#unsafe-2026-0025--qemuvirtmmumap--unmap-page-table-descriptor-writes) + [UNSAFE-2026-0026](../../../audits/unsafe-log.md#unsafe-2026-0026--pmm-frame-zeroing-via-coreptrwrite_bytes-in-pmmalloc_frame) (`Pending QEMU smoke verification` status notes **lifted** via 2026-05-14 Amendments — T-019 is the first runtime exerciser of both post-bootstrap `Mmu::map` and `Pmm::alloc_frame`'s zero-fill); the MR-005 `d8`–`d15` FP callee-saved enumeration added to the `ContextSwitch` `# Safety` contract + the ADR-0020 rider. Entries 0001..0028 (with 0012 `Removed`) re-verified against the post-PR-#32 source — append-only discipline holds, no in-place body edits (X3-unsafe-audit pass confirmed this whole-tree; the two new entries land under the Operation / Invariants / Rejected-alternatives shape).
@@ -42,7 +42,7 @@ Adversarial frame: *can a hostile caller, or an unexpected control-flow sequence
 - **UNSAFE-2026-0028 (`from_existing_root` wrap of a live non-zero L0 root) is policy-conformant and closes the one audit gap the master-review found.** OK — the entry covers constructing a `QemuVirtAddressSpace` naming the **already-live, already-populated** bootstrap `VMSAv8` L0 root **without** zero-filling it (a contract deliberately distinct from `Mmu::create_address_space`, which requires a zero-filled root). Five invariants: root is the live `__boot_pt_l0` frame `mmu_bootstrap` installed into `TTBR0_EL1`; caller runs strictly after `mmu_bootstrap`; exactly-one-wrapper alias-freedom; **no zero-fill is performed (and none must be — zero-filling a live root would unmap the running kernel out from under itself)**; subsequent `map`/`unmap` rely on the UNSAFE-2026-0025 walker invariants. Rejected alternatives: routing through `create_address_space` (would demand the kernel-erasing zero-fill), making it `safe` (the "live-and-populated" precondition is not type-expressible), attributing to 0010+0014 (the pre-fix mis-attribution — those cover the surrounding StaticCell/arena publish mechanics, not the wrap operation). **Adversarial probe — was this a behaviour change disguised as an audit fix?** No: X3-001 / the X3-unsafe-audit pass confirmed `mmu_bootstrap` populates the exact frame and `kernel_entry` runs post-bootstrap; the contract was always sound and the sole caller always honoured it. The fix is the audit *record* catching up to live code. Security-sensitive (boot + MMU root install) → second-reviewer required per [unsafe-policy §Review.4](../../../standards/unsafe-policy.md), satisfied.
 - **UNSAFE-2026-0025 / 0026 `Pending QEMU smoke verification` notes lifted — T-019 is the genuine first runtime exerciser.** OK — both notes lifted via 2026-05-14 Amendments. The B3 closure *predicted* T-018 might be the first `alloc_frame` exerciser; the Amendment honestly corrects this — T-018's bootstrap AS uses `from_existing_root`/`wrap_bootstrap` (it wraps the live `.boot_pt` L0 frame and does **not** call `alloc_frame` for an L0 root), so the **first** kernel-side `alloc_frame` for a translation table came when T-019 minted the *second* AS via `cap_create_address_space`. The smoke now runs `load_image` → `cap_create_address_space` (1 root frame) + per-image-page + per-stack-page + up-to-6 intermediate page-table frames, all through 0026's zero-fill and 0025's per-call `Mmu::map` page-table descriptor writes, with the trace reaching `tyrne: all tasks complete` and `-d int,unimp,guest_errors` showing zero new fault classes. The map-path invariants (root-frame validity inherited via `create_address_space`'s contract → induction through the table-descriptor chain; index bounds `[0,511]`; volatile discipline; leaf-written-last ordering; `MapperFlush` discharge via `cap_map`'s internal `token.flush(mmu)`; host-tested encoders) all hold under runtime evidence.
 - **The unmap path stays runtime-unexercised in v1 — host-tests + Miri are the evidence base.** OK (forward-flag, carried). v1 has no userspace caller of `cap_unmap`; the rollback paths inside `load_image` exercise the cap-side cleanup but the v1 demo never arms a block-descriptor unmap. Host-test coverage in `cap_unmap_returns_unmapped_frame` + the `task_loader::tests` rollback tests is the v1 evidence; first runtime exercise gates on B5+ userspace teardown. (Carries forward the B3 closure's "BSP host-test infra for block-descriptor unmap" Adjustment, still trigger-deferred.)
-- **The whole audit log is in sync at HEAD; the post-T-019 + post-remediation smoke produces no observable aliasing or UB.** OK — the X3-unsafe-audit master-review pass read all entries with their Amendment chains in full and confirmed **all entries map to real, current code sites; zero stale; zero append-only violations** (the *only* gap at that commit was the missing `from_existing_root` entry, now closed as UNSAFE-2026-0028). At HEAD `3ab029f` the count is 28 entries (0001..0028; 0012 `Removed` → 27 Active). Live gates reproduced this session (pinned nightly-2026-01-15): `cargo host-test` **286 passed / 0 failed**, `cargo fmt --check` clean, `cargo host-clippy` (`--all-targets -D warnings`) clean, `cargo kernel-clippy` clean, `cargo kernel-build` clean. **Miri** is the one verifier not run locally (not installed on the pinned toolchain on this host) — it is the CI gate, and at the master-review commit it passed 260/260 with zero Stacked-Borrows violations / zero detected UB (the +26 tests since are MR-010 PMM failure-path, MR-017 polarity, MR-018 rollback/fake-injection, MR-022 — all Miri-relevant raw-pointer-adjacent paths). The standing residual is X1-002 / MR-009 (Miri-as-CI-gate); see §4 and Verdict.
+- **The whole audit log is in sync at HEAD; the post-T-019 + post-remediation smoke produces no observable aliasing or UB.** OK — the X3-unsafe-audit master-review pass read all entries with their Amendment chains in full and confirmed **all entries map to real, current code sites; zero stale; zero append-only violations** (the *only* gap at that commit was the missing `from_existing_root` entry, now closed as UNSAFE-2026-0028). At HEAD `3ab029f` the count is 28 entries (0001..0028; 0012 `Removed` → 27 Active). Live gates reproduced this session (pinned nightly-2026-01-15): `cargo host-test` **286 passed / 0 failed**, `cargo fmt --check` clean, `cargo host-clippy` (`--all-targets -D warnings`) clean, `cargo kernel-clippy` clean, `cargo kernel-build` clean. **Miri** is the one verifier not run locally (not installed on the pinned toolchain on this host) — it is the CI gate, and at the master-review commit it passed 260/260 with zero Stacked-Borrows violations / zero detected UB (the +26 tests since are MR-010 PMM failure-path, MR-017 polarity, MR-018 rollback/fake-injection, MR-022 — all Miri-relevant raw-pointer-adjacent paths). The one partial finding was X1-002 / MR-009 (Miri-as-CI-gate), now closed in-branch at this closure; see §4 and Verdict.
 - **The master-review remediation introduced no new `unsafe`-widening site.** OK — `59f9309` (MR-005 `d8`–`d15` contract) is a **doc/contract** change to the `ContextSwitch` `# Safety` section + an ADR-0020 rider (the shipping BSP already saved `d8`–`d15`; the gap was contract *text* a second BSP author could implement wrong — see §8); MR-011's UNSAFE-2026-0028 is audit-trail-only; MR-010/022/017/018 are safe-Rust allocator/scheduler/test changes. No production `unsafe` block widened.

 ## 4. Kernel-mode discipline
@@ -53,7 +53,7 @@ Adversarial frame: *does any new T-019 or remediation code path violate the prin
 - **Every `load_image` failure path returns a typed `LoadError`, never a panic.** OK — the 10-variant `LoadError` (`InvalidImage`/`InvalidStackSize`/`MisalignedImageBaseVa`/`InvalidParentCap`/`FrameBudgetExceeded`/`InvalidImageBaseVa`/`ImageOverlapsAllocatableMemory`/`AddressSpaceCreationFailed`/`OutOfFrames`/`MapFailed`) covers every fallible step; `cargo kernel-clippy -D warnings` (enforcing `#![deny(clippy::panic)]`) is clean at HEAD. The exhaustiveness regression `load_error_variants_pattern_match_exhaustively` would compile-fail if a variant were silently removed. **Adversarial probe — does the rollback path panic on an "impossible" state?** No: rollback frees the leaf frames + reverses committed mappings + `cap_drop`s the leaf AS cap; the documented v1 baseline leaks (AS-arena slot, L0 root, intermediate L1/L2/L3 frames) are a *resource* trade-off, not a panic. The master-review (X1 Axis 4) confirmed the analogous `ipc_recv_and_yield` deadlock path "is handled, not hung" with symmetric ADR-0032 rollback returning a typed `SchedError::Deadlock`.
 - **The leak-path-closure discipline is preserved and extended: every fallible check runs before the first irreversible PMM commitment.** OK — T-019 inherits T-018's creation-side preflight (the depth check runs before `pmm.alloc_frame`) and *adds* the round-4 fix that moved the `image_base_va` alignment check into the **argument preflight (row 1)** before any `cap_create_address_space` call — closing a path where an unaligned base would surface from the first `cap_map` only *after* the root L0 frame was allocated, leaking it. The renamed test `rejects_misaligned_image_base_va_with_pmm_byte_stable` asserts `pmm.stats().free_frames == pmm_before`. This is exactly the "no irreversible commitment before the last fallible check" discipline the B3 closure verified, now extended to the loader's argument surface.
 - **No allocation in ISRs; bounded kernel resources hold; DAIF discipline preserved.** OK — T-019 runs at boot in `kernel_entry`, not in any interrupt context; it allocates only PMM frames (typed `OutOfFrames` on exhaustion, never a panic — matching the security-model "Bounded kernel state" invariant). The master-review re-verified the ISR (`irq_entry`) allocates nothing and the bounded-resource invariant holds tree-wide. The MR-022 centralised-`enqueue_ready` helper consolidates the scheduler's infallible-enqueue invariant into one auditable home (addressing X1-F4's "the invariant should have one home") — a defence-in-depth refactor, not a behaviour change.
-- **Standing residual: Miri — the only mechanical verifier of the raw-pointer aliasing discipline — is now a *blocking* CI job but not yet written into the Phase-B exit bar.** **Flagged (process; non-blocking for this closure; the cleanest single B4-closure Adjustment to record).** The master-review's MR-009 / X1-002 (Major-for-Phase-B-exit) observed Miri was a *manual* gate. The remediation (`fbc3d3f`) **closed the mechanical half**: Miri is now a blocking CI job (no `continue-on-error`) and is listed as a required gate in [`infrastructure.md`](../../../standards/infrastructure.md). Verified adversarially this session, MR-009 is the **one partial** of the 24 Blocker+Major findings (23 confirmed-fixed): "Miri green = Phase-B exit prerequisite" is **not yet written into the [`phase-b.md`](../../../roadmap/phases/phase-b.md) exit bar / a Phase-B exit checklist" (the exit bar at phase-b.md:3 names the userspace-task milestone but not the Miri gate). This is a one-line in-tree doc gap — non-blocking (Miri *is* enforced mechanically; the gap is the *documented* exit-bar text), and recorded as this closure's single Adjustment (§Forward-flagged items + business-retro §Adjustments). It matters because Miri is the *only* catcher of a future `&mut`-escapes-its-block aliasing regression in the `sched`/`ipc` raw-pointer bridge (the UNSAFE-2026-0012-class UB the bridge was built to remove) — host tests + clippy + the audit log cannot detect it.
+- **Miri — the only mechanical verifier of the raw-pointer aliasing discipline — is now a *blocking* CI job AND a documented Phase-B exit prerequisite.** **Resolved at this closure.** The master-review's MR-009 / X1-002 (Major-for-Phase-B-exit) observed Miri was a *manual* gate. The remediation (`fbc3d3f`) closed the mechanical half: Miri is now a blocking CI job (no `continue-on-error`) and a required gate in [`infrastructure.md`](../../../standards/infrastructure.md). MR-009 was the **one partial** of the 24 Blocker+Major findings at master-review time (the other 23 fixed by PR #32); its remaining *documented* half is now **closed in-branch at this closure** — the new §"Exit-quality prerequisite — Miri" paragraph in [`phase-b.md`](../../../roadmap/phases/phase-b.md) states a green `cargo +nightly miri test` run is a Phase-B exit prerequisite (weighted on `kernel/src/sched/**` + `kernel/src/ipc/**`), so **all 24 Blocker+Major findings are resolved**. This matters because Miri is the *only* catcher of a future `&mut`-escapes-its-block aliasing regression in the `sched`/`ipc` raw-pointer bridge (the UNSAFE-2026-0012-class UB the bridge was built to remove) — host tests + clippy + the audit log cannot detect it.

 ## 5. Cryptography

@@ -98,11 +98,11 @@ Adversarial frame: *does the B4 arc + the remediation reshape what the system de
 All eight axes pass.
The B4 arc (T-019 / PR #31) lands the task loader — the first runtime consumer of the AddressSpace + PMM scaffolds — without introducing any new attack surface, capability widening, memory-safety hazard, or threat-model shift. `load_image` is capability-gated (a `parent_as_cap` with `DERIVE`, no-widening, a full leak-path-closure preflight chain that runs every rejectable check before the first irreversible `pmm.alloc_frame`), it produces a `LoadedImage` *descriptor* but does **not** mint a runnable task and does **not** execute userspace — so the EL0/syscall trust boundary remains closed until B5/B6. The two new `unsafe` entries are exemplary: UNSAFE-2026-0027 (`copy_nonoverlapping`) carries six enumerated invariants + four append-only boundary-hardening Amendments (the round-1 PA-overlap preflight converting a BSP-trust argument into a typed runtime rejection is a genuine soundness win — master-review X1-P2), and UNSAFE-2026-0028 (`from_existing_root`) closes the single audit gap the whole-tree X3 pass found, with a correctly-distinct never-zero-fill-the-live-root contract and the required second-reviewer sign-off. UNSAFE-2026-0025/0026's `Pending QEMU smoke verification` notes are honestly lifted — T-019 is the *genuine* first runtime exerciser of both, and the smoke trace reaches `tyrne: all tasks complete` with zero new fault classes. 
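The preflight ordering described above can be sketched roughly as follows. This is an illustration only, not the kernel's `load_image`: `Pmm`, `load_image_sketch`, the 4 KiB `PAGE_SIZE`, and the stack-size rule are assumptions made for the example, and only three of the ten `LoadError` variants are reproduced. The point it shows is that every rejectable check returns before the first `alloc_frame` call, so a rejected request leaves the free-frame count untouched.

```rust
/// Hypothetical frame allocator that only counts free frames.
/// It stands in for the real PMM, which is not reproduced here.
struct Pmm {
    free_frames: usize,
}

impl Pmm {
    /// Takes one frame, or returns `None` when the pool is exhausted.
    fn alloc_frame(&mut self) -> Option<usize> {
        if self.free_frames == 0 {
            return None;
        }
        self.free_frames -= 1;
        Some(self.free_frames)
    }
}

/// A subset of the variant names listed in the review; the conditions
/// attached to them below are assumptions for this sketch.
#[derive(Debug, PartialEq, Eq)]
enum LoadError {
    MisalignedImageBaseVa,
    InvalidStackSize,
    OutOfFrames,
}

/// Assumed translation granule for the example.
const PAGE_SIZE: u64 = 4096;

/// Runs all argument checks first, then performs the first irreversible step.
fn load_image_sketch(
    pmm: &mut Pmm,
    image_base_va: u64,
    stack_size: u64,
) -> Result<usize, LoadError> {
    // Preflight: nothing has been allocated yet, so an early return leaks nothing.
    if image_base_va % PAGE_SIZE != 0 {
        return Err(LoadError::MisalignedImageBaseVa);
    }
    if stack_size == 0 || stack_size % PAGE_SIZE != 0 {
        return Err(LoadError::InvalidStackSize);
    }

    // First irreversible commitment: the root frame is taken only after
    // every check above has passed.
    pmm.alloc_frame().ok_or(LoadError::OutOfFrames)
}
```

A host test in the spirit of `rejects_misaligned_image_base_va_with_pmm_byte_stable` would call the function with an unaligned base and assert that the free-frame count equals its value before the call.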
-This closure reconciles cleanly with the 2026-05-22 master review: its dedicated X1-security pass returned **PASS** (0 security Blocker, 0 security Major in the shipping binary), its X3-unsafe-audit pass confirmed the log **fully in sync** (all entries resolve to live code, append-only intact), and the PR #32 remediation closed **23 of 24** Blocker+Major findings as confirmed-fixed under adversarial per-finding re-verification this session — including the security-relevant `d8`–`d15` cross-board contract gap (X1-001/MR-005), the missing `from_existing_root` entry (X1-001-adjacent / MR-011), and the CI supply-chain hygiene (X1-007/003). The audit log is fully in sync (28 entries; 0012 `Removed` → 27 Active; X3 confirmed all resolve to live code, append-only intact). +This closure reconciles cleanly with the 2026-05-22 master review: its dedicated X1-security pass returned **PASS** (0 security Blocker, 0 security Major in the shipping binary), its X3-unsafe-audit pass confirmed the log **fully in sync** (all entries resolve to live code, append-only intact), and the PR #32 remediation closed **23 of 24** Blocker+Major findings as confirmed-fixed under adversarial per-finding re-verification this session — including the security-relevant `d8`–`d15` cross-board contract gap (X1-001/MR-005), the missing `from_existing_root` entry (X1-001-adjacent / MR-011), and the CI supply-chain hygiene (X1-007/003); the 24th — MR-009's documented Phase-B-exit-bar half — was then closed in-branch at this closure (→ **24/24**; see the Adjustment below). The audit log is fully in sync (28 entries; 0012 `Removed` → 27 Active; X3 confirmed all resolve to live code, append-only intact). 
-### Adjustment (the single B4-closure remediation residual) +### Adjustment — MR-009 (closed in-branch at this closure) -- **MR-009 (Miri = Phase-B exit prerequisite) is the one partial.** Miri is now a *blocking* CI job (no `continue-on-error`) and a documented required gate in `infrastructure.md` — the mechanical half is closed. The remaining gap is purely in-tree doc text: **"Miri green = Phase-B exit prerequisite" is not yet written into the [`phase-b.md`](../../../roadmap/phases/phase-b.md) exit bar / a Phase-B exit checklist.** One-line, non-blocking (Miri *is* enforced), and recorded as this closure's Adjustment because Miri is the *only* mechanical catcher of a future `&mut`-escapes-its-block aliasing regression in the `sched`/`ipc` raw-pointer bridge. Mirrors the 2026-04-21 Phase-A-exit security review that made the aliasing discipline the #1 Phase-B blocker. +- **MR-009 (Miri = Phase-B exit prerequisite) — closed at this closure.** Miri is now a *blocking* CI job (no `continue-on-error`) and a documented required gate in `infrastructure.md` — the mechanical half was closed by PR #32. The remaining *documented* half is now **closed in-branch at this closure**: [`phase-b.md`](../../../roadmap/phases/phase-b.md) gains the §"Exit-quality prerequisite — Miri" paragraph stating a green `cargo +nightly miri test` run is a Phase-B exit prerequisite. MR-009 is therefore fully closed (the remediation is now 24/24). It mattered because Miri is the *only* mechanical catcher of a future `&mut`-escapes-its-block aliasing regression in the `sched`/`ipc` raw-pointer bridge; mirrors the 2026-04-21 Phase-A-exit security review that made the aliasing discipline the #1 Phase-B blocker. 
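To make the "`&mut`-escapes-its-block" concern concrete, here is a minimal sketch of the block-scoped borrow shape that a Stacked Borrows run checks. It is not the kernel's `sched`/`ipc` bridge: `Scheduler`, `SchedCell`, `enqueue`, and `yield_like` are hypothetical names, and the example assumes single-threaded use. Each `&mut` is created from the raw pointer and dies before the pointer is used again; a regression that kept the first reference alive across the `enqueue` call and used it afterwards would compile and pass ordinary host tests, and is the kind of fault only Miri would report.

```rust
use std::cell::UnsafeCell;

/// Hypothetical stand-in for scheduler state.
struct Scheduler {
    ready: u32,
}

/// Hypothetical cell that hands out a raw pointer rather than a reference.
struct SchedCell(UnsafeCell<Scheduler>);

impl SchedCell {
    fn as_ptr(&self) -> *mut Scheduler {
        self.0.get()
    }
}

/// Callee that forms its own momentary `&mut` from the raw pointer.
fn enqueue(ptr: *mut Scheduler) {
    // SAFETY (sketch assumption): single-threaded, and the caller holds no
    // live reference to the scheduler while this function runs.
    let sched = unsafe { &mut *ptr };
    sched.ready += 1;
}

/// Caller that keeps its `&mut` confined to an inner block.
fn yield_like(cell: &SchedCell) -> u32 {
    let ptr = cell.as_ptr();
    {
        // SAFETY (sketch assumption): no other reference exists in this block.
        let sched = unsafe { &mut *ptr };
        sched.ready += 1;
    } // the borrow ends here, before `enqueue` touches the same memory

    enqueue(ptr);

    // SAFETY (sketch assumption): both earlier borrows have ended.
    unsafe { (*ptr).ready }
}
```

The design choice illustrated is that the raw pointer, not a long-lived reference, is what crosses function boundaries.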
### Forward-flagged items (carry-forward; non-blocking) diff --git a/docs/analysis/reviews/security-reviews/README.md b/docs/analysis/reviews/security-reviews/README.md index 51755c8..17fd3f4 100644 --- a/docs/analysis/reviews/security-reviews/README.md +++ b/docs/analysis/reviews/security-reviews/README.md @@ -37,4 +37,4 @@ A security review is a **separate pass** from the code review — it is performe | 2026-05-07 | B1 closure post-T-014 consolidated pass (T-014 + ADR-0026 idle-dispatch supersession of ADR-0022 §Decision-outcome Option A; UNSAFE-2026-0014 third Amendment for `register_idle`; UNSAFE-2026-0019/0020 partial-verification + post-T-014-smoke Amendments; UNSAFE-2026-0021 no-verification Amendment) | Approve — eight axes pass; no new attack surface; smoke trace clean for full ~6 ms boot-to-end run; pre-existing forward-flagged items unchanged | [2026-05-07-B1-closure.md](2026-05-07-B1-closure.md) | | 2026-05-09 | B2 closure consolidated pass (T-016 MMU activation + identity-mapped kernel + `MapperFlush` flush-token discipline; ADR-0027 + ADR-0009 §Revision rider; UNSAFE-2026-0022/0023/0024/0025 introduced + 0023/0024 bootstrap-Amendments + 0022/0023/0024 smoke-verification Amendments) | Approve — eight axes pass; MMU on with identity-only layout; one new forward-flagged item (UNSAFE-2026-0025 per-call `Mmu::map`/`unmap` smoke verification — gates on first B3+ post-bootstrap caller) | [2026-05-09-B2-closure.md](2026-05-09-B2-closure.md) | | 2026-05-14 | B3 closure consolidated pass (T-017 PMM + T-018 `AddressSpace` kernel object + cap-gated wrappers + activation-on-context-switch; ADR-0035 + ADR-0028; UNSAFE-2026-0026 introduced; UNSAFE-2026-0014 5th Amendment for activation hook + BSP closure; UNSAFE-2026-0025 body-correction Amendment for `MmuError::BlockMapped` variant split) | Approve — eight axes pass; AS kernel-object scaffold lands without widening attack surface; PR #28 five-round arc closed three load-bearing memory-safety items pre-merge; one 
forward-flagged item (`cap_map`/`cap_unmap` per-op rights gap — deferred to B5+ ADR pairing `CapRights::{MAP,UNMAP,ACTIVATE}` with `CapKind::MemoryRegion`) | [2026-05-14-B3-closure.md](2026-05-14-B3-closure.md) | -| 2026-05-28 | B4 closure consolidated pass (T-019 task loader + ADR-0029; master-review PR #32 remediation; UNSAFE-2026-0027 introduced + 4 boundary-hardening Amendments; UNSAFE-2026-0028 introduced via MR-011 / X3-001 audit-trail completion; UNSAFE-2026-0025/0026 `Pending QEMU smoke verification` notes lifted — T-019 first runtime exerciser; MR-005 d8–d15 `ContextSwitch` contract + ADR-0020 rider) | Approve — eight axes pass; capability-gated `load_image` produces a `LoadedImage` but mints no runnable task / opens no EL0 boundary (B5/B6); reconciles with master-review security PASS + audit-log-in-sync; one Adjustment (MR-009 Miri-as-Phase-B-exit-prerequisite is a blocking CI job but not yet written into the phase-b.md exit bar — one-line doc gap, non-blocking) | [2026-05-28-B4-closure.md](2026-05-28-B4-closure.md) | +| 2026-05-28 | B4 closure consolidated pass (T-019 task loader + ADR-0029; master-review PR #32 remediation; UNSAFE-2026-0027 introduced + 4 boundary-hardening Amendments; UNSAFE-2026-0028 introduced via MR-011 / X3-001 audit-trail completion; UNSAFE-2026-0025/0026 `Pending QEMU smoke verification` notes lifted — T-019 first runtime exerciser; MR-005 d8–d15 `ContextSwitch` contract + ADR-0020 rider) | Approve — eight axes pass; capability-gated `load_image` produces a `LoadedImage` but mints no runnable task / opens no EL0 boundary (B5/B6); reconciles with master-review security PASS + audit-log-in-sync; MR-009 (Miri-as-Phase-B-exit-prerequisite) closed in-branch — Miri is a blocking CI job and is now written into the phase-b.md exit bar, so all 24 findings are resolved | [2026-05-28-B4-closure.md](2026-05-28-B4-closure.md) | diff --git a/docs/roadmap/current.md b/docs/roadmap/current.md index 46ec344..973f8c7 100644 --- 
a/docs/roadmap/current.md +++ b/docs/roadmap/current.md @@ -4,7 +4,7 @@ A short pointer file updated as work progresses. For the full plan see [`phases/ --- -> **2026-05-28 update — B4 CLOSED via the closure trio; B5 (syscall boundary) is next.** The B4 milestone (Task loader) is now formally **Closed**. Its closure trio fired today: [business retrospective](../analysis/reviews/business-reviews/2026-05-28-B4-closure.md) + [consolidated security review](../analysis/reviews/security-reviews/2026-05-28-B4-closure.md) (verdict **Approve**, eight axes pass) + [performance baseline](../analysis/reviews/performance-optimization-reviews/2026-05-28-B4-closure.md) (re-baseline, no proposal). This banner supersedes the 2026-05-16 banner below (which recorded B4 as "implementation-complete; closure trio pending"). **The period under review included the 2026-05-22 full-tree [master review](../analysis/reviews/master-review/2026-05-22-152729/consolidated.md)** (verdict: APPROVE the shipped kernel — 0 code-correctness/security Blockers; issues clustered in CI/doc/ADR) **and its remediation PR #32**, which closed **23 of 24** verified Blocker+Major findings; the one residual (MR-009: Miri is a blocking CI gate but "Miri green = Phase-B exit prerequisite" was not yet written into the phase-b.md exit bar) was closed as a B4-closure follow-up action. **Gates at HEAD `3ab029f` (reproduced live 2026-05-28):** `cargo host-test` **286 / 286** (43 hal + 187 kernel + 53 test-hal + 3 doc-tests; the earlier 260 was the pre-remediation count), `fmt` / `host-clippy` / `kernel-clippy` / `kernel-build` clean; QEMU smoke runs the full demo through `tyrne: all tasks complete` with the `tyrne: image loaded (...)` line; `-d int,unimp,guest_errors` = **629 events, 100 % pre-existing PL011-disabled-UART noise, zero fault classes**. 
Release perf band p10/p50/p90 = **15.641 / 17.587 / 19.150 ms** (+5.3–5.7 ms vs B3 — one-time boot cost of the loader's first post-bootstrap `cap_map` walks under QEMU TCG; real-hardware projection ~40 µs). Audit log at **28** entries (UNSAFE-2026-0027 + 0028 added; 0025/0026 `Pending QEMU smoke verification` notes lifted — T-019 is their first runtime exerciser). **Next:** open B5 — ADR-0030 (syscall ABI + `IpcError` split) + ADR-0031 (initial syscall set), then EL0→EL1 SVC dispatch. The [B4 closure trio](../analysis/reviews/business-reviews/2026-05-28-B4-closure.md) is the canonical source for these metrics. +> **2026-05-28 update — B4 CLOSED via the closure trio; B5 (syscall boundary) is next.** The B4 milestone (Task loader) is formally **Closed** via its closure trio — [business retrospective](../analysis/reviews/business-reviews/2026-05-28-B4-closure.md) + [security review](../analysis/reviews/security-reviews/2026-05-28-B4-closure.md) (**Approve**) + [performance baseline](../analysis/reviews/performance-optimization-reviews/2026-05-28-B4-closure.md) — **which is the canonical source for B4's closing metrics** (not duplicated here). The period also included the 2026-05-22 full-tree [master review](../analysis/reviews/master-review/2026-05-22-152729/consolidated.md) (APPROVE the kernel; findings clustered in CI/doc/ADR, 0 kernel-correctness/security Blockers) and remediation PR #32 which — with MR-009 closed at this closure — resolved **all 24** verified Blocker+Major findings. Headline: gates green at HEAD `3ab029f` (**286** host tests; QEMU smoke clean, 629 guest-errors all pre-existing PL011; release perf band 15.641 / 17.587 / 19.150 ms). This banner supersedes the 2026-05-16 banner below. **Next:** B5 — ADR-0030 (syscall ABI) + ADR-0031 (initial syscall set), then EL0→EL1 SVC dispatch. 
> > **2026-05-16 update — T-019 merged; B4 implementation-complete; closure trio pending.** PR #31 merged into `main` at commit `7f876af` ("Merge pull request #31 from cemililik/t-019-task-loader"), landing T-019 (task loader) on `main`. The branch arc continued past the review-round-4 commit named in the 2026-05-15 banner below with two further follow-up commits: `5078944` (review-round 5 — added one PMM host test, taking the suite to **260/260**) and `eb14c51` (review-round 6 — 5 valid findings). T-019 status flips `In Review → Done` (`date_done: 2026-05-16`). **Host-test count at HEAD: 260/260** (42 hal + 175 kernel + 43 test-hal); the 2026-05-15 banner's "259/259" was accurate when written, before the round-5 PMM test landed. B4 is now **implementation-complete**; the **B4 closure trio (business + security + performance reviews) has NOT yet fired** and is the next review trigger (the maintainer sequences it separately). This banner resolves the pre-merge "In Review" state recorded below — that banner is retained as a point-in-time record. > @@ -59,7 +59,7 @@ A short pointer file updated as work progresses. For the full plan see [`phases/ - **In review:** none. (The B4 closure trio fired 2026-05-28 and is committed/awaiting maintainer review; it is not an in-flight task.) - **In progress:** none. - **Working branch:** none / B5 prep opens next. Development branches off `main` per the PR pattern; no task branch is currently active and no rebase is pending. -- **Last completed milestone:** **B4 — Task loader, Closed 2026-05-28** via the closure trio ([business](../analysis/reviews/business-reviews/2026-05-28-B4-closure.md) + [security](../analysis/reviews/security-reviews/2026-05-28-B4-closure.md) Approve + [performance](../analysis/reviews/performance-optimization-reviews/2026-05-28-B4-closure.md) baseline). Required task Done: T-019 (2026-05-16, PR #31 `7f876af`). 
Closing numbers (canonical source = the trio): **286 host tests** (43 hal + 187 kernel + 53 test-hal + 3 doc-tests; was 260 at the T-019 merge, **+26** from the 2026-05-22 master-review remediation PR #32); QEMU smoke runs the full demo through `tyrne: all tasks complete` with the `tyrne: image loaded (...)` line; `-d int,unimp,guest_errors` = **629 events** (100 % pre-existing PL011 noise; zero fault classes); release perf band p10/p50/p90 = **15.641 / 17.587 / 19.150 ms**; audit log **28** entries (UNSAFE-2026-0027 + 0028 added; 0025/0026 `Pending QEMU smoke verification` notes lifted by T-019). The 2026-05-22 master review (APPROVE the kernel; CI/doc/ADR findings) + PR #32 remediation (23/24 closed) landed in this period. **Previous closures:** **B3** 2026-05-14 (PR #29 `b425dc1`); **B2** 2026-05-09; **B1** 2026-05-07 (PR #15 `e9fa019` + PR #16 `95b15aa`); **B0** 2026-04-27 (PR #9 `9a66e8b`). +- **Last completed milestone:** **B4 — Task loader, Closed 2026-05-28** via the closure trio ([business](../analysis/reviews/business-reviews/2026-05-28-B4-closure.md) + [security](../analysis/reviews/security-reviews/2026-05-28-B4-closure.md) Approve + [performance](../analysis/reviews/performance-optimization-reviews/2026-05-28-B4-closure.md) baseline). Required task Done: T-019 (2026-05-16, PR #31 `7f876af`). The trio is the **canonical source for B4's closing metrics**; headline: **286** host tests, QEMU smoke clean (629 guest-errors, all pre-existing PL011, zero fault classes), release perf band 15.641 / 17.587 / 19.150 ms, audit log 28 entries. The 2026-05-22 master review + PR #32 remediation (all 24 Blocker+Major findings resolved, MR-009 closed at this closure) landed in this period. **Previous closures:** **B3** 2026-05-14 (PR #29 `b425dc1`); **B2** 2026-05-09; **B1** 2026-05-07 (PR #15 `e9fa019` + PR #16 `95b15aa`); **B0** 2026-04-27 (PR #9 `9a66e8b`). 
- **Last completed tasks:** **T-019 — Done 2026-05-16, merged to `main` via PR #31** (branch `t-019-task-loader`, merge commit `7f876af`) — Task loader: `load_image` produces a `LoadedImage` descriptor of a freshly populated userspace AS (10-variant `LoadError`, leak-path-closure preflight chain, UNSAFE-2026-0027 byte-copy entry); does **not** mint a runnable `TaskCap` (B5/B6 prerequisite). **Earlier:** **T-018 — Done 2026-05-11, live on `main` 2026-05-14 via PR #28** (branch `t-018-address-space-kernel-object`, merge commit `47b0a86`). T-018 implementation: [`AddressSpace`](../../kernel/src/mm/address_space.rs) kernel-object struct + per-type [`AddressSpaceArena`](../../kernel/src/mm/address_space.rs) (ADR-0016 pattern); `CapKind::AddressSpace` + `CapObject::AddressSpace(AddressSpaceHandle)` variants in [`kernel/src/cap/mod.rs`](../../kernel/src/cap/mod.rs); capability-gated wrappers `cap_create_address_space` / `cap_map` / `cap_unmap` with step-by-step preflights (DERIVE rights → no-widening → depth preflight → arena/cap-table capacity → PMM alloc → arena commit → `cap_derive` cap-table insert); `Task` struct extension with `address_space_handle`; activation-on-context-switch hook threaded through `yield_now` / `start` / `ipc_recv_and_yield` / `ipc_send_and_yield` (closure-as-parameter, fires only when outgoing and incoming task ASes differ — short-circuits in v1's bootstrap-shared topology); BSP wiring in [`bsp-qemu-virt/src/main.rs`](../../bsp-qemu-virt/src/main.rs) wraps the already-live bootstrap root via the new `QemuVirtAddressSpace::from_existing_root` `pub unsafe fn` companion. 
Cross-cutting additions during the review-round arc: `MmuError::BlockMapped` variant (commit `8b9f52e`) so unmap into a bootstrap block descriptor surfaces a distinct typed error from `AlreadyMapped`; `CapabilityTable::depth_of` `pub(crate)` preflight helper closing the PMM-leak path; UNSAFE-2026-0014 fifth Amendment scope-extends the umbrella to the activation hook + BSP-side activation closure (zero new audit entries — additive scope on the existing `&mut Scheduler` momentary-borrow umbrella). Smoke trace gains one new line `tyrne: address-space-arena ready (1 / 8 slots used; bootstrap AS root = 0x4008d000)` immediately after `tyrne: pmm initialized (...)` and before `tyrne: timer ready (...)`. Full demo runs to `tyrne: all tasks complete`; `-d int,unimp,guest_errors` reports only the pre-existing PL011-disabled-UART noise (unchanged baseline). **Earlier:** T-017 — Done 2026-05-10 (PR #27, branch `t-017-physical-memory-manager`) — Physical Memory Manager (`Pmm` bitmap allocator + `FrameProvider` trait + UNSAFE-2026-0026 zero-fill audit). **Earlier:** T-016 — Done 2026-05-08 (branch `t-016-mmu-activation`) — MMU activation, VMSAv8 descriptor encoders, `MapperFlush` flush-token, UNSAFE-2026-0022 / 0023 / 0024 / 0025 introduced. **Earlier:** T-015 — Done 2026-05-07 (PR #17, branch `t-015-endpoint-rollback-cancel-recv`) — `ipc_cancel_recv` recovery primitive + symmetric scheduler+endpoint rollback in `ipc_recv_and_yield`'s Phase 2 Deadlock branch (ADR-0032). **Earlier:** T-014 (2026-05-07 via PR #15), T-012 (2026-04-28 via PR #10), T-013 (2026-04-27 via PR #9). 
- **Last reviews:** - [B4 closure retrospective (2026-05-28)](../analysis/reviews/business-reviews/2026-05-28-B4-closure.md) — Task loader (T-019) + the 2026-05-22 master-review interlude + PR #32 remediation (23/24 verified findings closed) From 823410024f9ef2ab21a830caab8cf9d21bbdbfbb Mon Sep 17 00:00:00 2001 From: Cemil ILIK Date: Fri, 29 May 2026 04:56:26 +0300 Subject: [PATCH 3/3] fix(docs): update documentation links and clarify milestone statuses in phase A and B --- .github/workflows/ci.yml | 6 ++++-- .../tasks/phase-a/T-003-ipc-primitives.md | 4 ++-- .../tasks/phase-a/T-004-cooperative-scheduler.md | 2 +- docs/roadmap/phases/phase-a.md | 4 ++-- docs/roadmap/phases/phase-b.md | 15 ++++++++------- 5 files changed, 17 insertions(+), 14 deletions(-) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index a14ca18..1918283 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -8,7 +8,8 @@ # - lint-and-host-test, kernel-build, host-stable-check (fast lane) # - miri (required, slow) # The `miri` job runs the host-test suite under Stacked Borrows; it is -# slower (~10–15 min) but a Miri regression is a hard stop. The +# slower (~1–2 min in practice on the current small suite; historically +# budgeted at ~10–15 min) but a Miri regression is a hard stop. The # `coverage` job is INFORMATIONAL only (it sets `continue-on-error: true`) # and must NOT be added to the required-checks list until the post-T-011 # flip removes that flag — see docs/guides/ci.md §"Branch protection". @@ -182,7 +183,8 @@ jobs: # ─── Miri: aliasing validation ────────────────────────────────────────── # Runs the full host-test suite under Miri's Stacked Borrows checker # (see ADR-0021 / UNSAFE-2026-0014). Slower than the fast lane - # (~10–15 min) and requires nightly, so it runs as its own job. A + # (~1–2 min in practice; historically budgeted ~10–15 min) and requires + # nightly, so it runs as its own job. A # Miri regression is a hard stop. 
miri: name: miri (Stacked Borrows) diff --git a/docs/analysis/tasks/phase-a/T-003-ipc-primitives.md b/docs/analysis/tasks/phase-a/T-003-ipc-primitives.md index c13c1a5..05d2d1b 100644 --- a/docs/analysis/tasks/phase-a/T-003-ipc-primitives.md +++ b/docs/analysis/tasks/phase-a/T-003-ipc-primitives.md @@ -19,7 +19,7 @@ As the Tyrne kernel, I want synchronous rendezvous `send` / `recv` operations on [T-002](T-002-kernel-object-storage.md) delivered the kernel-object storage layer: `Endpoint` and `Notification` live in bounded arenas, capabilities name them through typed handles, and the create / destroy lifecycle is explicit. What the A3 objects lack is *behaviour*: an `Endpoint` has waiter-queue placeholder fields but no `send` or `recv` logic; a `Notification` can set and consume bits but has no `notify` or `wait` path. -T-003 wires the behaviour. The design decisions — pure rendezvous vs. rendezvous + reply-recv fastpath, blocking semantics, message format, whether badges are needed in v1 — are delegated to [ADR-0017](../../../decisions/0017-ipc-primitive-set.md) (and conditionally [ADR-0018](../../../decisions/0018-badge-scheme.md)). ADR-0017 must be Accepted before any implementation code lands. +T-003 wires the behaviour. The design decisions — pure rendezvous vs. rendezvous + reply-recv fastpath, blocking semantics, message format, whether badges are needed in v1 — are delegated to [ADR-0017](../../../decisions/0017-ipc-primitive-set.md) (and conditionally [ADR-0018](../../../decisions/0018-badge-scheme-and-reply-recv-deferral.md)). ADR-0017 must be Accepted before any implementation code lands. **Scope constraint.** Phase A still has no real scheduler: tasks are kernel-level stubs without context switching. A4 must therefore implement IPC *without* a live scheduler: blocking is represented as a wait-queue that the scheduler (A5) will drain, but in A4 host tests the "block" state is exercised by constructing two task stubs that hand-deliver to each other. 
The scheduler integration happens in A5, not here. @@ -81,7 +81,7 @@ Design is pinned in ADR-0017. At a sketch level: - [ADR-0016: Kernel object storage](../../../decisions/0016-kernel-object-storage.md) — the A3 foundation this task extends. - [ADR-0017: IPC primitive set](../../../decisions/0017-ipc-primitive-set.md) — Accepted 2026-04-21. -- [ADR-0018: Badge scheme](../../../decisions/0018-badge-scheme.md) — deferred; see ADR-0017 §"Open questions" for the deferral rationale and revisit trigger. +- [ADR-0018: Badge scheme](../../../decisions/0018-badge-scheme-and-reply-recv-deferral.md) — deferred; see ADR-0017 §"Open questions" for the deferral rationale and revisit trigger. - [Phase A plan](../../../roadmap/phases/phase-a.md) — A4 sub-breakdown and acceptance criteria. - [T-002](T-002-kernel-object-storage.md) — delivers the `Endpoint` and `Notification` objects this task wires up. - seL4 IPC model — synchronous rendezvous with capability transfer (prior art; badge scheme not adopted in v1). diff --git a/docs/analysis/tasks/phase-a/T-004-cooperative-scheduler.md b/docs/analysis/tasks/phase-a/T-004-cooperative-scheduler.md index 8f0d9f5..a66831a 100644 --- a/docs/analysis/tasks/phase-a/T-004-cooperative-scheduler.md +++ b/docs/analysis/tasks/phase-a/T-004-cooperative-scheduler.md @@ -81,7 +81,7 @@ Design is delegated to ADR-0019 and ADR-0020. At a sketch level: - [ADR-0017: IPC primitive set](../../../decisions/0017-ipc-primitive-set.md) — Accepted; A5 wires its blocking semantics to the scheduler. - [ADR-0019: Scheduler shape](../../../decisions/0019-scheduler-shape.md) *(to be written before implementation)*. -- [ADR-0020: Cpu trait v2 / context-switch extension](../../../decisions/0020-cpu-trait-v2.md) *(to be written before context-switch code lands)*. +- [ADR-0020: Cpu trait v2 / context-switch extension](../../../decisions/0020-cpu-trait-v2-context-switch.md) *(to be written before context-switch code lands)*. 
 - [Phase A plan](../../../roadmap/phases/phase-a.md) — A5 sub-breakdown and acceptance criteria.
 - [T-003](T-003-ipc-primitives.md) — delivers the IPC waiter states this task wires to the scheduler.
 - seL4 scheduler model — priority-based, cooperative within a priority band (prior art; full model deferred).
diff --git a/docs/roadmap/phases/phase-a.md b/docs/roadmap/phases/phase-a.md
index f4eaeea..4337c26 100644
--- a/docs/roadmap/phases/phase-a.md
+++ b/docs/roadmap/phases/phase-a.md
@@ -92,7 +92,7 @@ Milestone A4 builds the actual IPC paths against the `Endpoint` and `Notificatio
 ---
-## Milestone A4 — IPC primitives
+## Milestone A4 — IPC primitives ✓ (done 2026-04-21)
 Synchronous rendezvous endpoints and asynchronous notifications. Capability transfer with a message is atomic with delivery.
@@ -158,7 +158,7 @@ Milestone A6 integrates IPC + scheduling to demonstrate Phase A end-to-end.
 ---
-## Milestone A6 — Two-task IPC demo
+## Milestone A6 — Two-task IPC demo ✓ (done 2026-04-21)
 Integration: the kernel runs a deterministic two-task scenario where Task A sends a capability-gated message to Task B through an endpoint, B replies, and both exit cleanly.
diff --git a/docs/roadmap/phases/phase-b.md b/docs/roadmap/phases/phase-b.md
index d93eae1..f754242 100644
--- a/docs/roadmap/phases/phase-b.md
+++ b/docs/roadmap/phases/phase-b.md
@@ -29,7 +29,7 @@ Cleans up the items the 2026-04-21 Phase-A code and security reviews surfaced. E
 4. **Architecture docs × 3** via the [`write-architecture-doc`](../../../.agents/skills/write-architecture-doc/SKILL.md) skill: `docs/architecture/kernel-objects.md` (ADR-0016 + Arena pattern), `docs/architecture/ipc.md` (ADR-0017 + ADR-0018 + state machine), `docs/architecture/scheduler.md` (ADR-0019 + ADR-0020 + IPC bridge + UNSAFE-2026-0008). Code review §Documentation follow-up #2.
 5. **Timer initialisation** — populate `QemuVirtCpu`'s `Timer` trait impl with `CNTVCT_EL0` (virtual counter, register-family-aligned with the deferred `CNTV_*` deadline-arming registers per ADR-0010) and `CNTFRQ_EL0` reads; wire a free-running counter so IPC round-trip latency and context-switch overhead can be measured. Unlocks the first hypothesis-driven performance-review cycle (baseline at [`2026-04-21-A6-baseline.md`](../../analysis/reviews/performance-optimization-reviews/2026-04-21-A6-baseline.md) is blocked on this). *Note: the original phase-plan wording said "CNTPCT_EL0"; T-009 second-read review surfaced the register-family mismatch and switched to `CNTVCT_EL0`.*
 6. **Scheduler / IPC hardening bundle.** Grouped in T-010 with ADR-0022's implementation:
- - `const { assert!(N > 0) }` on `SchedQueue::new` and `CapabilityTable::new` so zero-capacity constructions are a build-time error, matching `Arena::new`'s pattern.
+ - `const { assert!(N > 0) }` on `SchedQueue::new` so zero-capacity constructions are a build-time error, matching `Arena::new`'s pattern. (`CapabilityTable` is not generic over its capacity, so it carries the analogous compile-time guard in a different form — `const { assert!(CAP_TABLE_CAPACITY <= Index::MAX) }` — rather than an `N > 0` assertion.)
  - `debug_assert_ne!(current_idx, next_idx)` before the split-borrow `unsafe` blocks in `yield_now` / `ipc_recv_and_yield` to catch regressions that stop dequeuing the running task.
  - Replace `debug_assert!` in the resume path with a hard `Err(SchedError::Ipc(...))` return (see item 2 above).
 7. **`TaskArena` local → global StaticCell migration.** Bundled with T-006 (ADR-0021) to avoid two rounds of BSP static-cell churn. Brings `TaskArena` into the same reachability story as `EP_ARENA` / `TABLE_{A,B}` and satisfies the ADR-0016 "arenas belong to the kernel" framing. Post-A6 inline review feedback #15.
@@ -40,7 +40,7 @@ Cleans up the items the 2026-04-21 Phase-A code and security reviews surfaced. E
 ### Acceptance criteria
-- ✅ ADR-0021, ADR-0022 Accepted; ADR-0023 deferred (Phase B6+ revocation work; "accept-deferred" path per the original B0 plan).
+- ✅ ADR-0021, ADR-0022 Accepted at B0 (ADR-0022 was later partially superseded by ADR-0026 in B1 on its *idle-task-location* axis only; the *typed-error* axis — `SchedError::Deadlock` — that B0 depends on still stands); ADR-0023 deferred (Phase B6+ revocation work; "accept-deferred" path per the original B0 plan).
 - ✅ No `panic!(...)` remaining in `kernel/src/sched/mod.rs` reachable in production; the `start` / `start_prelude` "empty ready queue" panic survives as a kernel-programming-error guard rendered structurally unreachable by ADR-0022's idle-task-at-boot rule.
 - ✅ `docs/audits/unsafe-log.md` UNSAFE-2026-0012 entry marked `Removed` with the resolution commit (`f9b72f8` — T-006 / ADR-0021).
 - ✅ Two architecture docs committed (`scheduler.md` + `ipc.md`) with the `hal.md` Timer subsection update; linked from `docs/architecture/README.md` as Accepted. The originally-projected third doc was subsumed: `kernel-core.md` and `scheduling.md` were collapsed into `scheduler.md` + `ipc.md` per T-008's scope-discipline call.
@@ -76,7 +76,7 @@ B1 / B2 / B3 all depend on a panic-free scheduler and a non-UB aliasing story. B
 Extend the BSP reset stub so that when QEMU delivers us at EL2, we configure `HCR_EL2`, `SPSR_EL2`, `ELR_EL2`, and issue `ERET` to land in EL1. When QEMU delivers at EL1, the stub is a no-op on that axis.
-The scope of this milestone was extended on 2026-04-27 (after T-009 — the time-source half of `Timer` — landed in `In Review`) to include the *exception delivery infrastructure* that ADR-0022's first-rider sub-rider gated on. Concretely: **GICv2 distributor + CPU interface** configuration on QEMU virt (GICv2 has no redistributor — that is GICv3 terminology; QEMU virt defaults to GICv2 unless `-machine gic-version=3` is set), an EL1 exception vector table install at `VBAR_EL1`, a thin handler-dispatch loop, and the generic-timer-IRQ wiring that lets `Timer::arm_deadline` and `Timer::cancel_deadline` actually fire interrupts. Without this work, `arm_deadline` / `cancel_deadline` remain `unimplemented!()` and idle's body cannot move from `spin_loop` to `wfi`.
+The scope of this milestone was extended on 2026-04-27 (after T-009 — the time-source half of `Timer` — landed in `In Review`) to include the *exception delivery infrastructure* that ADR-0022's first-rider sub-rider gated on. Concretely: **GICv2 distributor + CPU interface** configuration on QEMU virt (GICv2 has no redistributor — that is GICv3 terminology; QEMU virt defaults to GICv2 unless `-machine gic-version=3` is set; this GICv2 / no-IOMMU reality was later ratified by [ADR-0036](../../decisions/0036-qemu-virt-gicv2-no-iommu-v1.md), which corrects the GICv3 / SMMUv3 statements in ADR-0004 / 0006 / 0012), an EL1 exception vector table install at `VBAR_EL1`, a thin handler-dispatch loop, and the generic-timer-IRQ wiring that lets `Timer::arm_deadline` and `Timer::cancel_deadline` actually fire interrupts. Without this work, `arm_deadline` / `cancel_deadline` remain `unimplemented!()` and idle's body cannot move from `spin_loop` to `wfi`.
 ### Sub-breakdown
@@ -143,7 +143,7 @@ B3 builds per-task address spaces on top of this. B5 (syscall trap) reuses the e
 Multiple per-task translation tables. Capability-gated map / unmap. Activation on context switch (tie-in to A5's context switch, now post-B0 with raw-pointer scheduler API).
-**Status: B3 implementation-complete 2026-05-11. Closure trio (business + security + performance) is the next review trigger.** B3 §1 closed 2026-05-10 (T-017 PMM); B3 §§2–7 closed 2026-05-11 (T-018 AddressSpace kernel object). [ADR-0035](../../decisions/0035-physical-memory-manager.md) settles the PMM design (bitmap allocator with hint pointer, 4 KiB metadata for QEMU virt's 32 K frames, reservation-list at init); T-017 implements it. [ADR-0028](../../decisions/0028-address-space-data-structure.md) settles the AddressSpace data-structure (Option A — generic `AddressSpace` wrapping `M::AddressSpace` inline; per-type arena); T-018 implements the kernel object + cap-gated wrappers + activation hook in 6 bisectable commits.
+**Status: B3 Closed 2026-05-14** via PR #29's closure trio (business + security + performance baseline; merge commit `b425dc1`). B3 §1 closed 2026-05-10 (T-017 PMM); B3 §§2–7 closed 2026-05-11 (T-018 AddressSpace kernel object). [ADR-0035](../../decisions/0035-physical-memory-manager.md) settles the PMM design (bitmap allocator with hint pointer, 4 KiB metadata for QEMU virt's 32 K frames, reservation-list at init); T-017 implements it. [ADR-0028](../../decisions/0028-address-space-data-structure.md) settles the AddressSpace data-structure (Option A — generic `AddressSpace` wrapping `M::AddressSpace` inline; per-type arena); T-018 implements the kernel object + cap-gated wrappers + activation hook in 6 bisectable commits.
 ### Sub-breakdown
@@ -153,7 +153,7 @@ Multiple per-task translation tables. Capability-gated map / unmap. Activation o
 4. ✅ **Map / unmap operations** — wrappers around [`Mmu::map`](../../../hal/src/mmu/mod.rs) / `Mmu::unmap` that validate the caller's capabilities. *T-018 commit 3 (2026-05-11) — `cap_create_address_space` / `cap_map` / `cap_unmap`.*
 5. ✅ **TLB invalidation on unmap** — single-core only; multi-core is Phase C. **Already implemented** in [T-016](../../analysis/tasks/phase-b/T-016-mmu-activation.md) at the HAL surface (`MapperFlush::flush(&mmu)` discharges the per-VA invalidate); B3 §4 wires this into the capability-gated unmap path. *T-018 commit 3 (`cap_unmap` discharges the flush token).*
 6. ✅ **Activation on context switch** — the context-switch path invokes [`Mmu::activate`](../../../hal/src/mmu/mod.rs) when crossing between tasks with different address spaces. *T-018 commit 4 (scheduler-side hook) + commit 5 (BSP-side activation closure). v1 demo: all tasks share `BOOTSTRAP_ADDRESS_SPACE_HANDLE`; hook short-circuits at runtime; host tests pin the fire path.*
-7. ✅ **Tests** — isolation between two address spaces (a map in AS-X is not visible in AS-Y); activation round-trip. *T-018 host tests in `kernel/src/mm/address_space.rs::tests` (17 tests) + scheduler activation-hook tests (3 tests). Workspace total 221/221.*
+7. ✅ **Tests** — address-space isolation (each `AddressSpace` gets a distinct arena slot + root and its own per-handle mapping store; the cross-AS "AS-X cannot read AS-Y" guarantee is the QEMU acceptance criterion below, verified *structurally* in the host tests rather than by a single named cross-visibility assertion); activation round-trip. *T-018 host tests in `kernel/src/mm/address_space.rs::tests` (17 at T-018 landing; 22 at HEAD after master-review remediation) + scheduler activation-hook tests (3 tests). Workspace total 221/221 at the 2026-05-11 landing snapshot.*
 ### Tasks under B3
@@ -177,7 +177,7 @@ Multiple per-task translation tables. Capability-gated map / unmap. Activation o
 Load a userspace binary into an address space. For B4 the binary is statically embedded in the kernel image (e.g., `include_bytes!`); the filesystem / dynamic loading comes later.
-**Status: B4 implementation-complete 2026-05-16 (T-019 merged via PR #31; closure trio pending).** ADR-0029 Accepted; T-019 (task loader) merged to `main`. The closure trio (business + security + performance) is the next review trigger, following the B3 precedent. The §Sub-breakdown / §Acceptance-criteria / §Revision-notes below stand as the design record; the past-tense framing reflects that the implementation has landed.
+**Status: B4 Closed 2026-05-28** via the closure trio ([business retrospective](../../analysis/reviews/business-reviews/2026-05-28-B4-closure.md) + [security review](../../analysis/reviews/security-reviews/2026-05-28-B4-closure.md) (Approve) + [performance baseline](../../analysis/reviews/performance-optimization-reviews/2026-05-28-B4-closure.md)), which is the canonical source for B4's closing metrics. ADR-0029 Accepted; T-019 (task loader) merged to `main` 2026-05-16 (PR #31). The §Sub-breakdown / §Acceptance-criteria / §Revision-notes below stand as the design record; the past-tense framing reflects that the implementation has landed.
 ### Sub-breakdown
@@ -280,6 +280,7 @@ When B6 is Done, run a business review. Phase C becomes active after that review
 | ADR-0033 | Kernel high-half migration | B5+ (placeholder; named-but-unallocated) | named in [ADR-0027](../../decisions/0027-kernel-virtual-memory-layout.md) §Decision outcome (Option D) as the future home of the `TTBR0_EL1`-swap discipline that arrives with userspace. No file today; opens with the first B5 task whose userspace requires per-task address-space switching. Mirrors the slot-naming pattern of ADR-0028 / 0029 / 0030 / 0031. |
 | ADR-0034 | Kernel-image section permissions (.text RX / .rodata R / .bss/.data RW) | B-late (placeholder; named-but-unallocated) | named in [ADR-0027 §Decision outcome (a)](../../decisions/0027-kernel-virtual-memory-layout.md) as the future home of finer-grained kernel-image permissions. v1 maps the entire 128 MiB RAM range as kernel R/W/X via 2 MiB blocks; T-016 §Out of scope and [`memory-management.md` §"v1 layout"](../../architecture/memory-management.md) defer the re-map. Opens with the first B-phase task whose threat model includes a kernel R/W of `.text` as a meaningful surface — likely paired with the B5+ first userspace destroy that introduces an attacker-controlled execution context. |
 | ADR-0035 | Physical Memory Manager (B3 prerequisite — bitmap allocator) | B3 (**Accepted 2026-05-09**) | new — drove the realisation that B3's "Address space abstraction" milestone has a foundational prerequisite (a real `FrameProvider` impl over physical RAM) which deserves its own ADR rather than being absorbed into ADR-0028 (address-space data structure). Drives [T-017 (Draft 2026-05-09; moves to In Progress with this Accept)](../../analysis/tasks/phase-b/T-017-physical-memory-manager.md). Bitmap allocator with hint pointer; 4 KiB metadata for QEMU virt's 32 K frames; reservation-list at init + cached for `free_frame` defensive validation per the §Simulation §Step 2 Critical row; forward-portable to high-half kernel without algorithm rewrite. Includes the §Simulation table walking init / alloc / free / exhaustion / recovery state transitions per [`write-adr` skill §Simulation](../../../.agents/skills/write-adr/SKILL.md). Accept landed as a separate commit per `write-adr` §10 after a careful re-read pass that surfaced and corrected three substantive drafting issues (broken anchor, safe-Rust-vs-`unsafe` zeroing contradiction, muddled "undefined-vs-error" wording in §Simulation row 2; the row-2 fix tightened the Pmm struct contract to add a cached reserved-range list for defensive `free_frame` validation, propagated to T-017). |
+| ADR-0036 | QEMU virt is GICv2 / no-IOMMU in v1 (corrects ADR-0004 / 0006 / 0012) | post-B1 (**Accepted 2026-05-22**) | new — surfaced by the [2026-05-22 full-tree master review](../../analysis/reviews/master-review/2026-05-22-152729/consolidated.md): the foundational ADRs carried GICv3 / SMMUv3 statements that do not match the GICv2, no-IOMMU reality of QEMU `virt` that B1's GIC work (above) actually assumed. **Corrects** (append-only redirect rider; does **not** supersede) [ADR-0004](../../decisions/0004-target-platforms.md) / [ADR-0006](../../decisions/0006-workspace-layout.md) / [ADR-0012](../../decisions/0012-boot-flow-qemu-virt.md). Ratifies the GICv2 fact stated in the B1 milestone. |
 Numbers are tentative. Final numbers are assigned when the ADR is actually written, per [ADR-0013](../../decisions/0013-roadmap-and-planning.md).
@@ -308,7 +309,7 @@ Numbers are tentative. Final numbers are assigned when the ADR is actually writt
 ## How to start Phase B
-> **Historical onboarding record (B0–B4 complete).** Phase B implementation is done: B0, B1, B2, and B3 are closed and B4 is implementation-complete (closure trio pending — see the B4 status line above). The active next steps are the B4 closure trio and then B5 (syscall boundary); the live operational pointer for what to open next is [`docs/roadmap/current.md`](../current.md), not this section. The numbered procedure below is preserved as the entry procedure that was followed when Phase B opened — it is a record of how B0 was started, not live instructions.
+> **Historical onboarding record (B0–B4 complete).** Phase B implementation is done: B0, B1, B2, B3, and B4 are all closed (B4 closed 2026-05-28 via its closure trio — see the B4 status line above). The active next step is B5 (syscall boundary); the live operational pointer for what to open next is [`docs/roadmap/current.md`](../current.md), not this section. The numbered procedure below is preserved as the entry procedure that was followed when Phase B opened — it is a record of how B0 was started, not live instructions.
 1. Open **T-006** (raw-pointer scheduler API refactor) via the [`start-task`](../../../.agents/skills/start-task/SKILL.md) skill. Writing ADR-0021 is the first step inside that task.
 2. After T-006 is In Progress, parallel work on **T-008** (architecture docs) is safe — they do not touch the same code.