Skip to content

B6 prerequisite: ADR-0033 high-half kernel migration + T-022 (boot-time kernel → TTBR1)#36

Merged
cemililik merged 7 commits into
mainfrom
t-022-high-half-kernel-mapping
May 30, 2026
Merged

B6 prerequisite: ADR-0033 high-half kernel migration + T-022 (boot-time kernel → TTBR1)#36
cemililik merged 7 commits into
mainfrom
t-022-high-half-kernel-mapping

Conversation

@cemililik
Copy link
Copy Markdown
Collaborator

@cemililik cemililik commented May 30, 2026

Opens Milestone B6's gating prerequisite: migrate the running kernel from the low identity (TTBR0_EL1) to the high half (TTBR1_EL1) at boot, freeing TTBR0_EL1 for per-task userspace. Without this a real EL0 task's SVC vector fetch + EL1 handler cannot translate (the kernel is absent from the task's TTBR0); nothing in B6 runs until it is solved.

Delivers ADR-0033 (Proposed → review-round → Accepted, separate commits per write-adr §10) + T-022 (the implementation, + a review-round addressing a careful code review).

What changed

  • Link-high / load-low: kernel linked at KBASE = 0xFFFF_FFFF_4008_0000, loaded at 0x4008_0000; ELF entry forced to _start's low PA (_start_phys).
  • Boot-time migration: kernel_entry (LOW) → mmu_bootstrap (low-identity MMU) → high_half_activate (build TTBR1 tables, EPD1 1→0) → migration trampoline (MSR VBAR-high + ISB; rebase SP; br into the high half) → kernel_main_high (frees TTBR0: null + EPD0=1 + TLBI), then the rest of bring-up at high-half addresses. New tyrne: high-half active boot marker.
  • One linear high-half offset KERNEL_HIGH_HALF_OFFSET = 0xFFFF_FFFF_0000_0000 (VA = OFFSET + PA); phys_to_kernel_va/kernel_va_to_phys (cfg all(target_arch="aarch64", target_os="none") / host-identity). Every addr_of!-as-PA + PA-deref site rebased project-wide.
  • TCR_EL1_VALUE_HIGH_HALF (EPD1 cleared; host test pins the 1-bit delta).

Two sound approach refinements vs ADR-0033's plan (both verified)

  • Whole-image-high-linked + forced-low entry instead of a separate low-VMA .idmap section — uniform high-linking makes early/low code resolve to LOW addresses via PC-relative adrp/adr. The ADR's Option-2 fallback was not needed.
  • One linear offset instead of two distinct PA↔VA offsets — they coincide by construction, so "wrong offset at a site" is impossible.

Verification — all green

340 host tests · host+kernel clippy -D warnings · cargo fmt --check · release build · cargo +nightly miri test (Stacked Borrows) · QEMU smoke boots to tyrne: all tasks complete with tyrne: high-half active · -d int,unimp shows exactly the 2 expected SVC exceptions and zero new Translation/Permission/Abort fault classes (fault-clean — ADR-0033 §Simulation row-4 abort gate). Adds tools/smoke.sh (gated, CI-usable).

Review hardening

  • Hardened against two adversarial verification passes during ADR drafting + a third multi-lens adversarial pass at implementation (4 findings, all handled).
  • A subsequent careful code review surfaced: activate not clearing EPD0 (fixed — completes the per-task swap), a latent host-portability cfg (fixedtarget_os="none"), bootstrap-root contract drift (fixed — "live"→"populated" + UNSAFE-2026-0028 Amendment), tools/smoke.sh gating (fixed), and the release console_write=0x1 being the intended debug-gate (ADR-0031), not a bug.

Audit

UNSAFE-2026-0031 (migration trampoline + TTBR0-free) + Amendments to 0022/0023/0024/0025/0026/0027/0028/0030.

⚠️ Security review requested

This changes the kernel's own translation regime + the kernel/user isolation boundary — among the highest-stakes changes in the project. T-022 is In Review pending explicit security review (Definition of done). It lands alone, before B6's EL0-entry work, per the ADR's staging discipline + CLAUDE.md §6. The remaining B6 tasks (EL0 context + ERET, task_create_from_image, the T-021 carry-forward gates — esp. gate #1 copy-user translation — tyrne-user + userland/hello) build on this settled high-half regime after the review confirms it.

🤖 Generated with Claude Code

Summary by Sourcery

Migrate the kernel from a low identity mapping to a high-half TTBR1_EL1 mapping at boot, freeing TTBR0_EL1 for per-task userspace, and document and audit this high-half regime as the B6 gating prerequisite.

Enhancements:

  • Introduce a high-half virtual memory regime with a single linear kernel VA offset and helper conversions, updating all kernel and BSP physical address dereferences accordingly.
  • Refactor the BSP boot flow to split low-half kernel_entry from high-half kernel_main_high, adding a boot-time migration trampoline and freeing TTBR0_EL1 for userspace.
  • Extend MMU bootstrap logic and linker script to support link-high/load-low images, high-half TTBR1_EL1 page tables, and a dedicated high-half TCR_EL1 configuration.
  • Tighten MMU, PMM, syscall, and address-space abstractions to operate correctly under the high-half direct map, including high-half MMIO access and page-table walks.
  • Expand architecture and roadmap documentation to describe the high-half migration, update memory-management docs, and accept ADR-0033 for kernel high-half migration.
  • Update unsafe-audit entries to cover the new migration trampoline, TTBR0_EL1 handling, and high-half table writes, clarifying contracts for existing MMU and PMM unsafe code.

Documentation:

  • Add ADR-0033 detailing the kernel high-half migration design, simulation, and decision rationale, and link it into the ADR index.
  • Update boot, memory-management, roadmap, and current-status documentation to reflect the high-half regime, B6 prerequisites, and the new boot-time migration flow.

Tests:

  • Add a non-interactive QEMU smoke script that runs the kernel to completion under a timeout, captures logs, optionally enables -d int,unimp, and gates on clean completion with no faults.
  • Extend host tests to pin the high-half TCR_EL1 configuration and validate the new high-half mappings and address translation helpers.

Summary by CodeRabbit

  • New Features

    • Kernel now performs a boot-time low→high-half migration and runs in a stable high-half address space with a post-migration entry.
    • Added high-half address translation helpers and made device/console access work both pre- and post-migration.
    • Syscall/user-window and memory allocator paths updated to use the high-half mapping.
  • Documentation

    • Added/updated ADRs, architecture, memory, boot, roadmap, audit, and task-analysis docs describing the migration, invariants, and verification history.
  • Tests

    • Added a non-interactive QEMU smoke-test runner for CI that validates boot markers and detects panics/faults.

Review Change Stack

cemililik and others added 5 commits May 29, 2026 20:48
Opens Milestone B6's gating prerequisite: making the kernel reachable from every
task's active translation so a real EL0 task's SVC vector fetch + the EL1 handler
translate (today the loader's userspace AS has only image+stack, no kernel
mappings — the EL0 vector fetch would translation-fault unrecoverably).

Decision: high-half migration (kernel -> TTBR1_EL1, EPD1 1->0; per-task userspace
-> TTBR0_EL1), the direction ADR-0027 already signposted and pre-paid for (the
single EPD1 flip; byte-stable high-half TCR fields) — extends, does NOT supersede,
ADR-0027. Two refinements make it safe + methodical: (1) BOOT-TIME (inside
mmu_bootstrap, before StaticCell init + GIC), where DAIF is masked and no live
low-VA pointer survives — removing the live-kernel bricking hazards; (2) STAGED as
a dedicated task (T-022) landed before B6's EL0 work, per CLAUDE.md #6.

The §Simulation (the boot-time low->high transition) was hardened against TWO
adversarial verification passes — the first caught an architecturally-impossible
trampoline ("mapped in both regimes", impossible with disjoint TTBR0/TTBR1 ranges
at T0SZ=T1SZ=16); the corrected mechanism has the PC physically cross low->high at
`br` with both TTBRs live. The passes also pinned: the br target must land in the
PXN=0 image window (not the physmap alias); an EPD1-cleared TCR constant with
byte-stable TTBR0 fields; ISB between the TTBR1 write and the EPD1 clear;
VBAR-high-before-br; the migration must complete before any DAIF unmask; and the
link-high/load-low + low-linked-PIC .idmap discipline + KERNEL_VA_OFFSET PA<->VA
helper that the early boot requires (the addr_of!-as-PA conflation, broken
project-wide). §Consequences is honest about the toolchain risk; Option B
(map-kernel-into-every-TTBR0) is the documented fallback if the link-split proves
intractable.

Opens T-022 (Draft) in the same commit per ADR-0025 §Rule 1; T-022's review-history
row will record the §Simulation row-to-verification mapping. Status: Proposed —
awaiting careful re-read (write-adr §10) + maintainer Accept (a separate commit).
Security-relevant (changes the kernel's own translation regime + the kernel/user
isolation boundary) — flagged for explicit security review.

Refs: ADR-0033, ADR-0027, ADR-0025
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…arly-symbol contract, AP pins

Maintainer careful-re-read (pre-Accept) on ADR-0033 + T-022; findings folded into
the Proposed draft. Each verified against the tree before fixing.

High:
- ASID policy pinned (H1): v1 keeps ASID=0 global + activate's existing
  TLBI-on-swap for correctness; the per-task ASID allocator (AddressSpace::asid +
  reuse/generation/exhaustion) is a TLB-flush-avoidance optimisation DEFERRED to a
  future task — not a B6 deliverable. (Aligns with the existing mmu.rs activate +
  the ADR-0028-deferred asid field; removes the "per-task ASID value assignment
  lands here" overclaim.)
- Dual PA↔VA offsets separated (H2): the kernel-image LINK offset
  (KBASE − KERNEL_IMAGE_PHYS_BASE; linker-symbol→PA) and the PHYSMAP offset
  (KERNEL_PHYSMAP_BASE; PA-frame deref, the existing phys_frame_kernel_ptr
  forward-flag) are distinct mappings — conflating them is a bug. §Dependency
  chain step 2 + T-022 criterion now define both + which-used-where.
- Link-high early-symbol contract closed at acceptance level (H3): T-022 now
  requires pre-jump symbols (__stack_top, __bss_*, .idmap + literal pool) to
  resolve LOW, kernel_entry as the HIGH br-target, and a relocation/linker-map
  "no pre-jump high-VMA = build failure" gate.

Medium:
- High-half layout table gained explicit AP=0b00 (EL1 RW, EL0 no-access) + SH
  columns + a boot-stack note — making "kernel not leakable to EL0" concrete
  (UXN=1 blocks execute only; AP[1]=0 is the read/write isolation).
- Row-0 barrier order corrected to DSB ISH (publish descriptors) BEFORE
  MSR TTBR1_EL1 → ISB, so no walk reads a stale descriptor.
- §Dependency chain pruned of downstream-consumer steps 7–8 (ADR-0025 §Rule 1:
  no ungrounded "Phase B6 (separate task)" forward-refs); they are now prose,
  pointing at phase-b §B6 opening sequence where they are grounded.
- M2: T-022 gained a §Review history recording the two adversarial verification
  passes + this re-read; the ADR §Context wording fixed to match (no longer
  claims a section that did not exist).

Low:
- T-022's ADR-0034 reference unlinked (was hyperlinking to the 0027 file; 0034 has
  no file — matches ADR-0027's placeholder pattern).
- ADR §Context notes the B5→B6 milestone shift (ADR-0027 anticipated "when B5
  surfaces"; B5 closed as the syscall ABI, the real EL0 task moved to B6).
- Added a VA-layout Mermaid (low-TTBR0 user / high-TTBR1 kernel + the two image
  aliases) per CLAUDE.md #4 / documentation-style.
- Softened a few charged phrases (measured-tone).

(L3 — the #### "Simulation row-to-verification mapping" heading — left as-is:
matches the established ADR-0027/0030/0031 pattern, a genuine sub-sub-section.)

Status unchanged: Proposed — awaiting careful re-read + maintainer Accept.

Refs: ADR-0033, ADR-0027, ADR-0025, ADR-0028
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Maintainer careful-re-read complete (review-round 15a6f23 folded the
findings into the Proposed draft); per write-adr §10 the Accept lands as
its own commit after the re-read.

- ADR-0033 Status: Proposed -> Accepted (Date 2026-05-30).
- docs/decisions/README.md index row + numbering-gaps note updated.
- phase-b ADR ledger row: B6 (Accepted 2026-05-30); drives T-022.
- T-022 Status: Draft -> In Progress (ADR-0033 Accepted satisfies its
  "ADR must be Accepted before code lands" dependency).

ADR-0033 extends ADR-0027 (consumes the reserved TTBR1/EPD1 + byte-stable
high-half TCR fields; no supersede). Implementation begins in T-022 (the
boot-time high-half migration, landed alone + reviewed before B6's EL0
work per the ADR's staging discipline + CLAUDE.md #6).

Refs: ADR-0033, ADR-0027, ADR-0025
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Migrate the running kernel from the low identity (TTBR0_EL1) to the high
half (TTBR1_EL1) at boot, freeing TTBR0_EL1 for per-task userspace — B6's
gating prerequisite (a real EL0 task's SVC vector fetch + EL1 handler must
translate; impossible while the kernel is identity-only in TTBR0).

Approach (two sound refinements vs ADR-0033's plan, both verified by a
multi-lens adversarial pass — the ADR's Option-2 fallback was NOT needed):
- Whole-image-high-linked + forced-low ELF entry, instead of a separate
  low-VMA .idmap section. The image links at KBASE=0xFFFF_FFFF_4008_0000
  with LMA low (AT(ADDR-OFFSET)); ENTRY is _start's low PA (_start_phys).
  Uniform high-linking makes early/low code resolve to LOW load addresses
  via PC-relative adrp/adr, so no identity-VMA section is required.
- One linear high-half offset KERNEL_HIGH_HALF_OFFSET=0xFFFF_FFFF_0000_0000
  (VA=OFFSET+PA), instead of two distinct PA<->VA offsets — they coincide
  by construction, so using the wrong offset at a site is impossible. Host
  builds define it as 0 (identity), so host tests are unaffected.

Boot flow: kernel_entry (LOW) enables the low-identity MMU (mmu_bootstrap),
builds the high-half TTBR1 tables + clears EPD1 (high_half_activate), then a
migration trampoline installs the high VBAR, rebases SP to the high stack
alias, and `br`s into kernel_main_high (HIGH). kernel_main_high frees TTBR0
(null + EPD0=1 + TLBI), prints the new `tyrne: high-half active` marker, and
runs the rest of bring-up (console/GIC at high MMIO; PMM/AS/loader/IPC/
scheduler) at high-half addresses. v1 maps the whole RAM window PXN=0
(RWX-equivalent, like the identity map it replaces; AP=0b00 keeps EL0 with
no access); the distinct PXN=1 physmap is per-section W^X hardening deferred
to ADR-0034.

HAL: TCR_EL1_VALUE_HIGH_HALF (EPD1 cleared; host test pins the 1-bit delta);
KERNEL_HIGH_HALF_OFFSET + phys_to_kernel_va/kernel_va_to_phys (cfg aarch64 /
host-identity). Every addr_of!-as-PA and PA-deref site rebased project-wide
(mmu.rs walk, pmm.rs zero-fill, phys_frame_kernel_ptr, task_loader overlap
preflight, syscall window, kernel_main_high __stack_top/__boot_pt_l0).

Verification (all green): 340 host tests; host+kernel clippy -D warnings;
cargo fmt --check; release build; cargo +nightly miri test (Stacked Borrows);
QEMU smoke boots to `tyrne: all tasks complete` with `tyrne: high-half
active`; `-d int,unimp` shows exactly the 2 expected SVC exceptions and ZERO
new Translation/Permission/Abort fault classes (fault-clean — ADR-0033
§Simulation row-4 abort gate). The pre-existing release-only console_write
status 0x1 quirk reproduces on parent bd39679 — not a T-022 regression.

Audit: UNSAFE-2026-0031 (migration trampoline + TTBR0-free) + Amendments to
0022/0023/0024 (high-half table writes / MSR sequence / post-migration TLBI)
+ 0025/0026/0027/0030 (physmap-rebase derefs).

Docs: boot.md §High-half migration; memory-management.md; phase-b §B6 step 1
+ ledger; current.md; T-022 acceptance + §Review-history (incl. the 3rd
adversarial pass + the 4 findings handled). Adds tools/smoke.sh (non-
interactive QEMU smoke for CI/agent use). T-022 -> In Review (security-
relevant; awaiting explicit security review).

Refs: T-022, ADR-0033, ADR-0027, UNSAFE-2026-0031
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ct/doc fixes

Addresses the careful review of 6c7502b (high-half migration). The verified
migration mechanism is unchanged; this completes the per-task-swap correctness
and fixes a latent host-portability cfg, contract-comment drift, tooling, docs.

- HIGH (completes T-022's "swap goes live"): QemuVirtMmu::activate now clears
  TCR_EL1.EPD0 (re-enables TTBR0 walks) in addition to writing TTBR0_EL1. The
  migration set EPD0=1 when freeing TTBR0, so without this the first real EL0
  task's lower-half access would translation-fault. (No v1 caller; fails closed
  today — the swap criterion is now actually functional for B6.)
- MEDIUM (latent host bug): KERNEL_HIGH_HALF_OFFSET's cfg changed from
  `target_arch="aarch64"` to `all(target_arch="aarch64", target_os="none")`
  (matching hal/src/cpu.rs). The bare form also matches an aarch64 HOST (Apple
  Silicon), where cargo test / Miri would deref phys_to_kernel_va of a real
  host pointer into wild memory. The kernel build (target_os=none) is unchanged.
- MEDIUM (contract drift): the bootstrap-root wrap runs AFTER kernel_main_high
  frees TTBR0, so the "currently-live in TTBR0" safety wording was false post-
  migration. from_existing_root + the wrap-site comments now say "valid +
  populated" (liveness-in-TTBR0 was never the soundness basis); UNSAFE-2026-0028
  gains an Amendment (body unchanged per unsafe-policy §3).
- LOW (tooling): tools/smoke.sh now exits non-zero on a missing completion
  marker or any panic/fault class (usable as a CI gate); --int drops
  guest_errors (uses -d int,unimp) so PL011 noise no longer interleaves.
- Docs: corrected the release console_write=0x1 note — it is the INTENDED
  debug-gate (abi.rs `5 if cfg!(debug_assertions)`, ADR-0031), not a bug or a
  T-022 effect (my earlier "B5 follow-up" framing was wrong; the review's
  abi.rs root-cause is correct). Sharpened the T-021 gate #1 copy-user
  precondition to a hard ordering requirement. boot.md sequence diagram updated
  to the split low->migration->high flow; boot_ns comments fixed to
  kernel_main_high + the metric-shift note. Pi-4 / 4 GiB forward-limit note
  added (HAL offset doc, linker.ld, boot.md). memory-management.md frame counts
  + .boot_pt frame count refreshed (T-022 +2 high-half root frames).

Gates re-run green: 340 host tests, host+kernel clippy -D warnings, fmt, kernel
build, QEMU smoke (gated, PASS) + -d int,unimp (2 SVC, zero faults).

Refs: T-022, ADR-0033, UNSAFE-2026-0028, UNSAFE-2026-0031
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@sourcery-ai
Copy link
Copy Markdown

sourcery-ai Bot commented May 30, 2026

Reviewer's Guide

Implements ADR-0033/T-022 by migrating the AArch64 kernel from a low identity mapping (TTBR0_EL1) to a high-half mapping (TTBR1_EL1) at boot, introducing a single linear high-half VA offset, updating MMU/PMM/syscall code and docs accordingly, and adding a non-interactive QEMU smoke harness.

File-Level Changes

Change Details Files
Boot-time high-half migration path: split kernel entry into low-half bootstrap and high-half main, implement migration trampoline, and free TTBR0_EL1 for per-task userspace.
  • Refactor bsp-qemu-virt kernel_entry into a low-half bootstrap that runs with MMU off/identity, sets up early console, installs vectors, calls mmu_bootstrap, then enables high-half tables via high_half_activate and jumps to kernel_main_high
  • Introduce kernel_main_high as the high-half main function that assumes execution via TTBR1_EL1, prints a new 'tyrne: high-half active' marker, frees TTBR0_EL1 (null + EPD0=1 + TLBI VMALLE1), and then performs all subsequent bring-up (PMM, AS arena, task loader, GIC, scheduler) at high addresses
  • Adjust boot_ns sampling to happen in kernel_main_high (post-migration) and update comments/docs accordingly
  • Update panic path to construct the PL011 UART via the high-half MMIO alias so panics after migration still print
  • Clarify and expand comments/docstrings in main.rs to capture the new boot flow and high-half invariants
bsp-qemu-virt/src/main.rs
docs/architecture/boot.md
Linker and page-table bootstrap changes to support link-high/load-low and a high-half TTBR1_EL1 mapping sharing L2 tables with the identity map.
  • Change linker.ld to link kernel at high KBASE while loading at physical 0x40080000 using AT(ADDR - KERNEL_HH_OFFSET) for text/rodata/data/bss, and force ELF entry to low _start_phys
  • Extend .boot_pt reservation from 4 to 6 frames, adding __boot_pt_l0_hh and __boot_pt_l1_hh for TTBR1 roots, and derive _start_phys symbol as _start - KERNEL_HH_OFFSET
  • Add a compile-time assert in bsp main.rs that KERNEL_HIGH_HALF_OFFSET matches the linker constant, failing the build on drift
  • In mmu_bootstrap.rs, declare the new high-half bootstrap frames and implement high_half_activate: build TTBR1 L0/L1 roots pointing to shared L2 tables, compute indices from KERNEL_HIGH_HALF_OFFSET, and enable TTBR1 by writing TTBR1_EL1 and TCR_EL1_VALUE_HIGH_HALF with proper barriers
bsp-qemu-virt/linker.ld
bsp-qemu-virt/src/mmu_bootstrap.rs
bsp-qemu-virt/src/main.rs
Introduce a single linear high-half VA offset and PA↔VA helpers, and rebase all kernel physical dereferences and page-table frame accesses to the high-half direct map.
  • Add KERNEL_HIGH_HALF_OFFSET, phys_to_kernel_va, and kernel_va_to_phys in hal::mmu, with cfg so host builds use offset 0, and re-export them from hal::lib
  • Change phys_frame_kernel_ptr to use phys_to_kernel_va(frame.as_usize()) for PMM frames instead of PA-as-VA identity
  • Update PMM zero-fill (Pmm::alloc_frame) to deref frames via phys_to_kernel_va and adjust safety comments to refer to the direct map instead of identity mapping
  • Change BSP page-table code (walk_or_alloc_table, walk_and_install_leaf) to obtain descriptor-frame pointers via phys_to_kernel_va instead of identity PAs
  • Convert bootstrap symbol usages that are logically physical (e.g., __stack_top, __boot_pt_l0) back to PA using kernel_va_to_phys from high-half execution
  • Adjust task_loader image-overlap preflight to rebase image.as_ptr() via kernel_va_to_phys before comparing with PMM physical extents
hal/src/mmu/mod.rs
hal/src/lib.rs
kernel/src/mm/mod.rs
kernel/src/mm/pmm.rs
bsp-qemu-virt/src/mmu.rs
kernel/src/obj/task_loader.rs
TCR/TTBR and MMU activation refinements for high-half: add high-half TCR constant, enable TTBR1_EL1 walks, and ensure per-task TTBR0_EL1 activation clears EPD0.
  • Add TCR_EL1_VALUE_HIGH_HALF in vmsav8, defined as TCR_EL1_VALUE with EPD1 cleared, and a host test asserting it differs only in bit 23
  • Use TCR_EL1_VALUE_HIGH_HALF in high_half_activate when enabling TTBR1_EL1 walks and keep TTBR0-related fields byte-stable
  • Update QemuVirtMmu::activate to also clear TCR_EL1.EPD0 when installing a per-task TTBR0 root, then do ISB/DSB/TLBI sequencing as before, so TTBR0 walks are re-enabled under the new scheme
  • Document that post-migration, the bootstrap root wrapped in from_existing_root is populated but no longer the live TTBR0, and update safety docs accordingly
hal/src/mmu/vmsav8.rs
bsp-qemu-virt/src/mmu.rs
bsp-qemu-virt/src/main.rs
Syscall and user-access window adjustments to work with the high-half kernel and direct map while preserving B5 semantics.
  • Change SYSCALL_USER_WINDOW_LEN semantics and comments to describe a window over the RAM extent as seen via the kernel's high-half direct map, not identity RAM, and make syscall_entry construct UserAccessWindow with a high-half base via phys_to_kernel_va(PMM_EXTENT_START)
  • Clarify UNSAFE-2026-0030 that in B5 the EL1 stub's user pointer is actually a valid kernel high VA, and that B6 must replace the direct deref with a per-page TTBR0-based translation when wired to real EL0
  • Ensure console_write still works in debug, and that the release build behaviour (console_write disabled via ABI decode) is documented as intended, not a regression of T-022
bsp-qemu-virt/src/syscall.rs
docs/roadmap/phases/phase-b.md
docs/audits/unsafe-log.md
Documentation and audit-log updates for ADR-0033/T-022 and new unsafe surfaces, plus a non-interactive QEMU smoke harness for CI.
  • Add ADR-0033 describing the kernel high-half migration decision, simulation, dependency chain, and option analysis
  • Update boot, memory-management, roadmap, decisions README, and current-phase docs to describe the high-half migration, new boot markers, and phase-B6 sequencing
  • Extend unsafe-log with UNSAFE-2026-0031 for the migration trampoline and TTBR0 free, and add Amendments to existing UNSAFE entries to cover high-half table writes, MSR/TLBI changes, physmap derefs, and syscall user-window semantics
  • Add tools/smoke.sh to run QEMU non-interactively with timeout, capture the full serial log, grep for markers and faults, and return CI-friendly pass/fail status
docs/decisions/0033-kernel-high-half-migration.md
docs/architecture/boot.md
docs/architecture/memory-management.md
docs/roadmap/phases/phase-b.md
docs/roadmap/current.md
docs/decisions/README.md
docs/audits/unsafe-log.md
docs/analysis/tasks/phase-b/T-022-high-half-kernel-mapping.md
tools/smoke.sh

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 30, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 04c205a4-119d-4c7c-8cbb-aee9bc79d9e0

📥 Commits

Reviewing files that changed from the base of the PR and between dfa601f and dfba2b6.

📒 Files selected for processing (6)
  • bsp-qemu-virt/src/main.rs
  • bsp-qemu-virt/src/mmu_bootstrap.rs
  • docs/architecture/boot.md
  • docs/decisions/0033-kernel-high-half-migration.md
  • hal/src/mmu/mod.rs
  • tools/smoke.sh
🚧 Files skipped from review as they are similar to previous changes (4)
  • docs/architecture/boot.md
  • bsp-qemu-virt/src/mmu_bootstrap.rs
  • docs/decisions/0033-kernel-high-half-migration.md
  • tools/smoke.sh

📝 Walkthrough

Walkthrough

Relinks the kernel high (link-high/load-low), reserves high-half bootstrap roots, builds/activates TTBR1_EL1, migrates execution to a high-half main via a trampoline, rebases runtime phys↔kernel pointer derivation to a high-half direct map, and updates docs, audits, and CI smoke tests.

Changes

High-half kernel migration (T-022)

Layer / File(s) Summary
Linker & HAL infrastructure
bsp-qemu-virt/linker.ld, hal/src/mmu/mod.rs, hal/src/mmu/vmsav8.rs, hal/src/lib.rs
Adds high-half linker constants and image layout, reserves high-half bootstrap roots, defines KERNEL_HIGH_HALF_OFFSET, phys_to_kernel_va / kernel_va_to_phys helpers, and TCR_EL1_VALUE_HIGH_HALF with unit test.
High-half table activation
bsp-qemu-virt/src/mmu_bootstrap.rs
Adds linker-symbols for high-half L0/L1 roots and high_half_activate() that publishes L0/L1 descriptors (pointing to shared L2) and programs TTBR1_EL1/TCR_EL1 with barriers.
Boot flow refactoring & migration
bsp-qemu-virt/src/main.rs
Splits boot into low kernel_entry and high kernel_main_high; uses identity UART pre-migration, runs mmu_bootstrap, activates high-half tables, performs inline-asm trampoline to high-half, frees TTBR0_EL1, captures BOOT_NS post-migration, and rebases GIC/MMIO and PMM/stack address computations.
Address-space runtime & pointer rebase
bsp-qemu-virt/src/mmu.rs, kernel/src/mm/mod.rs, kernel/src/mm/pmm.rs, kernel/src/obj/task_loader.rs, bsp-qemu-virt/src/syscall.rs
Page-table walk and descriptor slot accesses use phys_to_kernel_va; phys_frame_kernel_ptr and Pmm::alloc_frame zeroing use high-half direct-map; task-loader computes image PA via kernel_va_to_phys; syscall user-window base uses phys→kernel VA; QemuVirtMmu::activate clears EPD0 on TTBR0 install; from_existing_root contract adjusted.
QEMU smoke-test automation
tools/smoke.sh
Adds smoke-test runner with profile/timeout parsing, qemu invocation with wall-clock timeout fallback, trace capture, marker extraction, and CI pass/fail gating.
Architecture docs & task analysis
docs/architecture/boot.md, docs/architecture/memory-management.md, docs/analysis/tasks/phase-b/T-022-high-half-kernel-mapping.md
Documents low→high bootstrap/migration sequence, boot mermaid diagram, bootstrap page-table frame updates, PMM trace changes, task acceptance criteria, and review history.
Decision record & roadmap
docs/decisions/0033-kernel-high-half-migration.md, docs/decisions/README.md, docs/roadmap/current.md, docs/roadmap/phases/phase-b.md
Adds ADR-0033 describing the high-half migration decision, TTBR1 layout, boot-time simulation and dependencies; updates ADR index and roadmap active-task status.
Safety auditing (UNSAFE-2026-*)
docs/audits/unsafe-log.md
Appends and amends multiple UNSAFE entries to record high-half descriptor writes, EL1 register sequences, TLBI/barrier ordering, high-half pointer dereferences, copy_nonoverlapping destination derivation, from_existing_root contract changes, and a new UNSAFE entry for the trampoline.

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • HodeTech/Tyrne#26: Overlapping PMM alloc/zeroing changes and related alloc_frame behavior.
  • HodeTech/Tyrne#23: Prior MMU/bootstrap scaffolding and linker/mmu layout changes this builds upon.
  • HodeTech/Tyrne#10: Related EL1 interrupt/exception and VBAR/vector setup touched in the migration flow.

Poem

🐰 A kernel hops from low to high,

builds its roots and lifts SP high,
a trampoline flips VBAR and space,
TTBR1 waits to take its place,
TTBR0 freed — tasks find their base.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: implementing ADR-0033 and T-022 to migrate the kernel to high-half mapping (TTBR1_EL1) at boot, which is the central focus of this PR.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch t-022-high-half-kernel-mapping

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue, and left some high level feedback:

  • The migration trampoline currently hardcodes the 0xFFFF_FFFF mask and implicitly assumes the image PA is < 4 GiB; consider introducing a named constant/helper for this mask (and a debug assertion on the PA range) so future BSPs with >4 GiB RAM or high peripherals fail fast instead of silently miscomputing high_vbar/high_entry.
  • You encode/clear EPD0 and EPD1 using raw bit shifts in inline asm sites; it would be more robust to expose named masks/field helpers in vmsav8 (e.g. TCR_EL1_EPD0_BIT, TCR_EL1_EPD1_BIT) and reuse them in high_half_activate, kernel_main_high, and QemuVirtMmu::activate so any future TCR layout change is localized.
  • Given KERNEL_HIGH_HALF_OFFSET is chosen specifically so the QEMU virt PA range cannot overflow, it may be worth adding debug_assert!s in phys_to_kernel_va/kernel_va_to_phys (under the kernel cfg) to check that pa/va are within the expected low‑4 GiB window, which would catch accidental misuse early if these helpers get called on out‑of‑range addresses.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The migration trampoline currently hardcodes the `0xFFFF_FFFF` mask and implicitly assumes the image PA is < 4 GiB; consider introducing a named constant/helper for this mask (and a debug assertion on the PA range) so future BSPs with >4 GiB RAM or high peripherals fail fast instead of silently miscomputing `high_vbar`/`high_entry`.
- You encode/clear `EPD0` and `EPD1` using raw bit shifts in inline asm sites; it would be more robust to expose named masks/field helpers in `vmsav8` (e.g. `TCR_EL1_EPD0_BIT`, `TCR_EL1_EPD1_BIT`) and reuse them in `high_half_activate`, `kernel_main_high`, and `QemuVirtMmu::activate` so any future TCR layout change is localized.
- Given `KERNEL_HIGH_HALF_OFFSET` is chosen specifically so the QEMU virt PA range cannot overflow, it may be worth adding `debug_assert!`s in `phys_to_kernel_va`/`kernel_va_to_phys` (under the kernel `cfg`) to check that `pa`/`va` are within the expected low‑4 GiB window, which would catch accidental misuse early if these helpers get called on out‑of‑range addresses.

## Individual Comments

### Comment 1
<location path="tools/smoke.sh" line_range="43-56" />
<code_context>
+
+# perl alarm wrapper: fork QEMU, SIGTERM it after $TO seconds. QEMU inherits
+# the child's stdout/stderr (redirected to $LOG by the caller below).
+TO="$TO" perl -e '
+    my $pid = fork();
+    if ($pid == 0) { open(STDIN, "<", "/dev/null"); exec(@ARGV) or die "exec: $!"; }
+    $SIG{ALRM} = sub { kill("TERM", $pid); };
+    alarm($ENV{TO});
+    waitpid($pid, 0);
+' qemu-system-aarch64 -M virt -cpu cortex-a72 -m 128M -smp 1 \
+    -display none -serial stdio -monitor none \
+    "${INT_FLAGS[@]+"${INT_FLAGS[@]}"}" \
</code_context>
<issue_to_address>
**suggestion:** The Perl-based timeout wrapper introduces an extra dependency; consider checking for `perl` or using `timeout(1)` when available.

Using an inlined Perl script here introduces a hard Perl dependency for an otherwise lightweight harness, and some minimal CI images may not have Perl installed even though `timeout(1)` is available.

Consider:
- Prefer `timeout ${TO}s qemu-system-aarch64 ...` when `timeout` is present, with the Perl wrapper as a fallback, or
- At least `command -v perl` first and emit a clear error if it’s missing.

This improves portability and makes failures easier to diagnose in minimal environments.

```suggestion
echo "smoke: $KERNEL  (budget ${TO}s)  log -> $LOG" >&2

QEMU_ARGS=(
    -M virt
    -cpu cortex-a72
    -m 128M
    -smp 1
    -display none
    -serial stdio
    -monitor none
    "${INT_FLAGS[@]+"${INT_FLAGS[@]}"}"
    -kernel "$KERNEL"
)

if command -v timeout >/dev/null 2>&1; then
    # Prefer coreutils timeout when available
    timeout "${TO}s" qemu-system-aarch64 "${QEMU_ARGS[@]}" > "$LOG" 2>&1 || true
elif command -v perl >/dev/null 2>&1; then
    # Perl alarm wrapper: fork QEMU, SIGTERM it after $TO seconds. QEMU inherits
    # the child's stdout/stderr (redirected to $LOG by the caller below).
    TO="$TO" perl -e '
        my $pid = fork();
        if ($pid == 0) { open(STDIN, "<", "/dev/null"); exec(@ARGV) or die "exec: $!"; }
        $SIG{ALRM} = sub { kill("TERM", $pid); };
        alarm($ENV{TO});
        waitpid($pid, 0);
    ' qemu-system-aarch64 "${QEMU_ARGS[@]}" > "$LOG" 2>&1 || true
else
    echo "error: neither 'timeout(1)' nor 'perl' is available; cannot enforce ${TO}s timeout" >&2
    exit 1
fi
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread tools/smoke.sh Outdated
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements the boot-time high-half kernel migration (ADR-0033 / T-022), moving the kernel to TTBR1_EL1 and freeing TTBR0_EL1 for userspace. The changes include updating the linker script, implementing the migration trampoline and high-half initialization in main.rs, introducing PA-VA translation helpers, and updating relevant documentation and audit logs. The review feedback highlights two important improvements: first, guarding the compile-time assertion of KERNEL_HIGH_HALF_OFFSET with target-specific configuration flags to prevent build failures on hosted AArch64 environments or IDEs; second, dynamically detecting the active translation regime in the panic handler by checking the stack pointer to avoid silent translation-fault hangs during early boot panics.

Comment thread bsp-qemu-virt/src/main.rs
// A drift between the linker's hardcoded value and `tyrne_hal` would silently
// corrupt every high-half VA↔PA computation (ADR-0033 / T-022); fail the build
// instead.
const _: () = assert!(KERNEL_HIGH_HALF_OFFSET == 0xFFFF_FFFF_0000_0000);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If the workspace is checked or compiled on a hosted AArch64 platform (such as Apple Silicon) or analyzed by IDE tools without target-specific overrides, KERNEL_HIGH_HALF_OFFSET will evaluate to 0 (as defined in the hal crate fallback), causing this compile-time assertion to fail. Guarding the assertion with #[cfg(all(target_arch = "aarch64", target_os = "none"))] ensures that workspace-wide checks and developer tooling run cleanly on hosted environments.

Suggested change
const _: () = assert!(KERNEL_HIGH_HALF_OFFSET == 0xFFFF_FFFF_0000_0000);
#[cfg(all(target_arch = "aarch64", target_os = "none"))]
const _: () = assert!(KERNEL_HIGH_HALF_OFFSET == 0xFFFF_FFFF_0000_0000);

Comment thread bsp-qemu-virt/src/main.rs Outdated
Comment on lines +1562 to +1569
// SAFETY: constructing a fresh Pl011Uart in the panic path is
// best-effort diagnostic output. Writes may interleave if the original
// instance is still reachable — acceptable per the Console contract
// (ADR-0007). Audit: UNSAFE-2026-0002.
let console = unsafe { Pl011Uart::new(PL011_UART_BASE) };
// (ADR-0007). The HIGH device-MMIO alias is used because the kernel runs
// in the high half post-migration (ADR-0033); a panic in the brief
// pre-migration low window would not print, but that window is only the
// verified `mmu_bootstrap` / `high_half_activate` path. Audit: UNSAFE-2026-0002.
let console = unsafe { Pl011Uart::new(tyrne_hal::phys_to_kernel_va(PL011_UART_BASE)) };
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Unconditionally using the high-half virtual address for the UART base in the panic handler means any panic occurring in the early boot window (pre-migration, such as during mmu_bootstrap or high_half_activate) will trigger a translation fault and silently hang the system instead of printing the panic message. Dynamically detecting whether the kernel is running in the high half by checking the stack pointer allows the panic handler to fall back to the physical UART address in early boot, significantly improving debuggability.

Suggested change
// SAFETY: constructing a fresh Pl011Uart in the panic path is
// best-effort diagnostic output. Writes may interleave if the original
// instance is still reachable — acceptable per the Console contract
// (ADR-0007). Audit: UNSAFE-2026-0002.
let console = unsafe { Pl011Uart::new(PL011_UART_BASE) };
// (ADR-0007). The HIGH device-MMIO alias is used because the kernel runs
// in the high half post-migration (ADR-0033); a panic in the brief
// pre-migration low window would not print, but that window is only the
// verified `mmu_bootstrap` / `high_half_activate` path. Audit: UNSAFE-2026-0002.
let console = unsafe { Pl011Uart::new(tyrne_hal::phys_to_kernel_va(PL011_UART_BASE)) };
// SAFETY: constructing a fresh Pl011Uart in the panic path is
// best-effort diagnostic output. Writes may interleave if the original
// instance is still reachable — acceptable per the Console contract
// (ADR-0007). We dynamically detect if the kernel is running in the high
// half by checking the stack pointer, ensuring panics in the early boot
// low window can still print. Audit: UNSAFE-2026-0002.
let uart_base = {
let sp: usize;
unsafe { core::arch::asm!("mov {}, sp", out(reg) sp) };
if (sp & 0xFFFF_0000_0000_0000) != 0 {
tyrne_hal::phys_to_kernel_va(PL011_UART_BASE)
} else {
PL011_UART_BASE
}
};
let console = unsafe { Pl011Uart::new(uart_base) };

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (3)
tools/smoke.sh (1)

29-29: 💤 Low value

Consider validating that timeout is numeric.

If a non-numeric value is passed to --timeout, the Perl alarm() call on line 51 will fail with an unclear error message. Adding a simple numeric check would improve the user experience.

🛡️ Optional validation
-        --timeout) TO="$2"; shift 2 ;;
+        --timeout) 
+            if ! [[ "$2" =~ ^[0-9]+$ ]]; then
+                echo "error: --timeout requires a positive integer" >&2
+                exit 2
+            fi
+            TO="$2"; shift 2 ;;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/smoke.sh` at line 29, The --timeout parsing currently assigns TO
without validation which will cause alarm() to fail for non-numeric values;
after setting TO in the case branch (where --timeout) or immediately after
option parsing, validate TO with a numeric integer check (e.g. regex like
^[0-9]+$), ensure it is a non-negative integer, and if the check fails print a
clear error to stderr and exit non-zero so alarm() (and the rest of the script)
only receives valid integer seconds; refer to the TO variable and the alarm()
usage when adding this validation.
kernel/src/obj/task_loader.rs (1)

894-911: 💤 Low value

Miri strict-provenance warnings in test helpers.

The pipeline reports Miri warnings for integer-to-pointer casts at lines 899, 1416, and 1809. These are in test-only code and don't block CI, but you may want to address them for cleaner Miri runs under strict provenance mode.

The aligned_backing helper and similar patterns could use ptr::with_exposed_provenance (stable since Rust 1.84) or sptr crate to make the provenance explicit:

Example fix for aligned_backing (line 899)
 fn aligned_backing(frames: usize) -> (Vec<u8>, *mut u8) {
     let bytes = frames.checked_mul(PAGE_SIZE).expect("test math");
     let alloc = bytes.checked_add(PAGE_SIZE).expect("test math");
     let mut v: Vec<u8> = vec![0u8; alloc];
     let raw = v.as_mut_ptr();
-    let aligned = ((raw as usize + (PAGE_SIZE - 1)) & !(PAGE_SIZE - 1)) as *mut u8;
+    let offset = raw.align_offset(PAGE_SIZE);
+    let aligned = unsafe { raw.add(offset) };
     (v, aligned)
 }

Also applies to: 1414-1421, 1807-1824

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@kernel/src/obj/task_loader.rs` around lines 894 - 911, The test helpers
produce Miri strict-provenance warnings due to integer-to-pointer casts in
aligned_backing, pmm_over_backing and frame; replace the raw casts with
provenance-explicit pointer creation (e.g. use
core::ptr::with_exposed_provenance to obtain an aligned *mut u8 from the
computed usize or use an sptr helper that creates a provenance-carrying pointer
from usize) so the pointer origins are explicit; update aligned_backing to
compute the aligned address as a usize then call
with_exposed_provenance(aligned_usize, |p| p as *mut u8), and similarly convert
backing_ptr usages in pmm_over_backing and frame to use
with_exposed_provenance/from_usize helpers rather than direct as-casts.
docs/architecture/boot.md (1)

17-17: ⚡ Quick win

Consider restructuring this paragraph for clarity.

This 400+ word paragraph is difficult to parse due to:

  • Nested parentheticals about T-022 migration
  • Mixing steps that run in kernel_entry (low, before migration) with steps that run in kernel_main_high (high, after migration)
  • The final parenthetical clarifies the split, but it's buried at the end

Consider breaking this into two sub-bullets or adding a sentence break before the T-022 parenthetical to clearly delineate the pre-migration vs post-migration execution phases.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/architecture/boot.md` at line 17, The paragraph describing kernel_entry
mixes pre-migration and post-migration steps and buries the T-022/high-half
migration detail; split it into two clear parts: one bullet/sentence for actions
performed in kernel_entry (low physical alias, before
mmu_bootstrap/high_half_activate — e.g. console setup, EL1 vector install,
mmu_bootstrap call, initial load of embedded userspace via
task_loader::load_image) and a second bullet/sentence for the migration and
post-migration actions in kernel_main_high (T-022 / high-half migration,
high_half_activate building TTBR1_EL1, PMM initialization, address-space arena
setup, GIC, unmasking DAIF.I, kernel-object arenas, capability tables, IPC,
scheduler), and move the long T-022 parenthetical into the second paragraph so
the split between kernel_entry and kernel_main_high is explicit and easy to
scan; reference kernel_entry, kernel_main_high, mmu_bootstrap,
high_half_activate, and T-022/T-018/T-019 where relevant.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@docs/architecture/boot.md`:
- Line 17: The paragraph describing kernel_entry mixes pre-migration and
post-migration steps and buries the T-022/high-half migration detail; split it
into two clear parts: one bullet/sentence for actions performed in kernel_entry
(low physical alias, before mmu_bootstrap/high_half_activate — e.g. console
setup, EL1 vector install, mmu_bootstrap call, initial load of embedded
userspace via task_loader::load_image) and a second bullet/sentence for the
migration and post-migration actions in kernel_main_high (T-022 / high-half
migration, high_half_activate building TTBR1_EL1, PMM initialization,
address-space arena setup, GIC, unmasking DAIF.I, kernel-object arenas,
capability tables, IPC, scheduler), and move the long T-022 parenthetical into
the second paragraph so the split between kernel_entry and kernel_main_high is
explicit and easy to scan; reference kernel_entry, kernel_main_high,
mmu_bootstrap, high_half_activate, and T-022/T-018/T-019 where relevant.

In `@kernel/src/obj/task_loader.rs`:
- Around line 894-911: The test helpers produce Miri strict-provenance warnings
due to integer-to-pointer casts in aligned_backing, pmm_over_backing and frame;
replace the raw casts with provenance-explicit pointer creation (e.g. use
core::ptr::with_exposed_provenance to obtain an aligned *mut u8 from the
computed usize or use an sptr helper that creates a provenance-carrying pointer
from usize) so the pointer origins are explicit; update aligned_backing to
compute the aligned address as a usize then call
with_exposed_provenance(aligned_usize, |p| p as *mut u8), and similarly convert
backing_ptr usages in pmm_over_backing and frame to use
with_exposed_provenance/from_usize helpers rather than direct as-casts.

In `@tools/smoke.sh`:
- Line 29: The --timeout parsing currently assigns TO without validation which
will cause alarm() to fail for non-numeric values; after setting TO in the case
branch (where --timeout) or immediately after option parsing, validate TO with a
numeric integer check (e.g. regex like ^[0-9]+$), ensure it is a non-negative
integer, and if the check fails print a clear error to stderr and exit non-zero
so alarm() (and the rest of the script) only receives valid integer seconds;
refer to the TO variable and the alarm() usage when adding this validation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0efe4486-20eb-41aa-b105-4866197a37b6

📥 Commits

Reviewing files that changed from the base of the PR and between bd39679 and 3ecef5e.

📒 Files selected for processing (20)
  • bsp-qemu-virt/linker.ld
  • bsp-qemu-virt/src/main.rs
  • bsp-qemu-virt/src/mmu.rs
  • bsp-qemu-virt/src/mmu_bootstrap.rs
  • bsp-qemu-virt/src/syscall.rs
  • docs/analysis/tasks/phase-b/T-022-high-half-kernel-mapping.md
  • docs/architecture/boot.md
  • docs/architecture/memory-management.md
  • docs/audits/unsafe-log.md
  • docs/decisions/0033-kernel-high-half-migration.md
  • docs/decisions/README.md
  • docs/roadmap/current.md
  • docs/roadmap/phases/phase-b.md
  • hal/src/lib.rs
  • hal/src/mmu/mod.rs
  • hal/src/mmu/vmsav8.rs
  • kernel/src/mm/mod.rs
  • kernel/src/mm/pmm.rs
  • kernel/src/obj/task_loader.rs
  • tools/smoke.sh

…rts, panic-handler regime fallback

Addresses the PR #36 code review. The verified migration mechanism is
unchanged; this adds named-constant/robustness cleanups and fixes a real
early-boot panic-handler regression + a hosted-AArch64 build assertion.

- Named TCR field bits in vmsav8 (TCR_EL1_EPD0_BIT / TCR_EL1_EPD1_BIT), reused
  in TCR_EL1_VALUE_HIGH_HALF, QemuVirtMmu::activate (EPD0 clear) and
  kernel_main_high (EPD0 set) — localizes the TCR layout to one place.
- Migration trampoline: named KERNEL_IMAGE_PA_MASK (0xFFFF_FFFF) + a
  high_half_alias() helper with a debug_assert! that the symbol address is a
  low PA or its exact high-half alias (fails fast if a future image escapes the
  low-4 GiB window the mask assumes), replacing the inline masks.
- phys_to_kernel_va / kernel_va_to_phys: debug_assert! the PA/VA is inside the
  low-4 GiB direct-map window (kernel build only — short-circuits when the
  offset is 0 on host; uses wrapping_sub to avoid absurd_extreme_comparisons).
- Compile-time KERNEL_HIGH_HALF_OFFSET assert gated on
  all(target_arch="aarch64", target_os="none") — it spuriously failed under
  rust-analyzer / a hosted aarch64 cargo check (fallback offset = 0).
- Panic handler: select the UART MMIO alias by regime (SP >= offset -> high,
  else physical base) so an early-boot (pre-migration) panic prints instead of
  silently translation-faulting on the high alias. Real debuggability fix.
- tools/smoke.sh: prefer timeout(1), fall back to the Perl alarm wrapper, error
  if neither (portability); validate --timeout is a non-negative integer.
- Docs: boot.md Stage 3 split into kernel_entry(low) / kernel_main_high(high) +
  Stage 4 handoff reference corrected.

Skipped (with reason): the task_loader test-helper Miri strict-provenance
warnings are pre-existing (not touched by T-022; Miri passes — warnings, not
failures), out of scope for this PR; noted for a separate test-hygiene cleanup.

Gates green: 340 host tests, host+kernel clippy -D warnings, fmt, kernel build,
QEMU smoke (gated PASS), Miri (Stacked Borrows).

Refs: T-022, ADR-0033, PR #36
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@cemililik
Copy link
Copy Markdown
Collaborator Author

@coderabbitai review all

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 30, 2026

@cemililik Sure, I'll redo the full review of all the changes in PR #36!

✅ Actions performed

Full review triggered.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
bsp-qemu-virt/src/main.rs (1)

1418-1426: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update the timer-banner comment to the new BOOT_NS timing point.

This comment still says the baseline was captured pre-MMU and includes MMU activation, but BOOT_NS now gets written after the high-half migration. Leaving it as-is will misstate the meaning of the emitted metric.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@bsp-qemu-virt/src/main.rs` around lines 1418 - 1426, Update the timer-banner
comment to reflect that BOOT_NS is now written after the high-half migration
rather than pre-MMU: change the text around the QemuVirtCpu::new / Timer
description to state that the recorded baseline (BOOT_NS) is captured
post-high-half migration and therefore does not include MMU activation cost, and
keep references to the UART/device mapping (mmu_bootstrap) and that the timer
sampled CNTFRQ_EL0 are still accurate; ensure any mention that the baseline was
captured “pre-MMU” or “includes the MMU activation” is removed or corrected to
state “post-migration” semantics.
🧹 Nitpick comments (1)
hal/src/mmu/mod.rs (1)

98-102: ⚡ Quick win

Make the low-4 GiB window checks fail in release too.

These helpers sit directly on the raw-address translation path. With debug_assert! only, an out-of-window PA/VA silently wraps in a release kernel and the next dereference hits the wrong memory instead of failing fast.

Proposed change
 pub const fn phys_to_kernel_va(pa: usize) -> usize {
-    debug_assert!(
-        KERNEL_HIGH_HALF_OFFSET == 0 || pa < 0x1_0000_0000,
-        "phys_to_kernel_va: PA outside the low-4 GiB high-half direct map",
-    );
+    if KERNEL_HIGH_HALF_OFFSET != 0 {
+        assert!(
+            pa < 0x1_0000_0000,
+            "phys_to_kernel_va: PA outside the low-4 GiB high-half direct map",
+        );
+    }
     KERNEL_HIGH_HALF_OFFSET.wrapping_add(pa)
 }
@@
 pub const fn kernel_va_to_phys(va: usize) -> usize {
-    debug_assert!(
-        KERNEL_HIGH_HALF_OFFSET == 0 || va.wrapping_sub(KERNEL_HIGH_HALF_OFFSET) < 0x1_0000_0000,
-        "kernel_va_to_phys: VA outside the high-half direct-map window",
-    );
+    if KERNEL_HIGH_HALF_OFFSET != 0 {
+        assert!(
+            va.wrapping_sub(KERNEL_HIGH_HALF_OFFSET) < 0x1_0000_0000,
+            "kernel_va_to_phys: VA outside the high-half direct-map window",
+        );
+    }
     va.wrapping_sub(KERNEL_HIGH_HALF_OFFSET)
 }

Also applies to: 127-131

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@hal/src/mmu/mod.rs` around lines 98 - 102, The address-range checks currently
use debug_assert! so they are disabled in release builds; change those to
assert! (or otherwise ensure they panic in release) so invalid low-4 GiB PA/VA
translations fail fast; update the checks in the phys_to_kernel_va helper (the
KERNEL_HIGH_HALF_OFFSET.wrapping_add(pa) path) and the corresponding
kernel_va_to_phys helper (the inverse check around KERNEL_HIGH_HALF_OFFSET
subtraction) to use assert! instead of debug_assert! so release kernels do not
silently wrap addresses.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@bsp-qemu-virt/src/main.rs`:
- Around line 943-986: The new unsafe blocks (notably the call to
mmu_bootstrap::high_half_activate and the inline assembly block that writes
VBAR_EL1, rebases SP and branches to kernel_main_high using
tyrne_vectors/high_vbar/KERNEL_HIGH_HALF_OFFSET) are missing the repo-required
justification paragraph; update each unsafe block comment to include (a) why
unsafe is necessary (e.g., raw register writes / control-transfer across
translation regimes), (b) the exact invariants upheld (which
registers/memory/translation regimes must be valid, DAIF state, that TTBR0/TTBR1
mappings exist), and (c) why safer alternatives were rejected (e.g., no safe
wrapper can express atomic CPU register/state transition or absolute jump
semantics), and add the audit entry ID(s) referenced (e.g., UNSAFE-2026-0031 and
related audit IDs) per docs/audits/unsafe-log.md; do the same for the other
affected blocks around lines 1015-1085 and 1596-1620 to satisfy the coding
guideline.

In `@bsp-qemu-virt/src/mmu_bootstrap.rs`:
- Around line 332-365: In high_half_activate(), expand the two unsafe-block
comments to the three-part standard: for the table-write unsafe (the block
writing table_descriptor to l0_hh and l1_hh referencing l0_hh, l1_hh,
l1_idx_dev, l1_idx_ram, l2_low_pa, l2_high_pa, table_descriptor) state (a) why
unsafe is required (raw pointer writes to page tables that cannot be expressed
safely), (b) the precise invariants upheld (frames are page-aligned, exclusively
owned for this boot call, pre-zeroed, indices masked <512, descriptors use
physical addresses only), and (c) why safer alternatives were rejected (no
existing safe page-table API can perform these early-boot PA writes before MMU
is enabled); include the audit tag UNSAFE-2026-0022 and add an entry to
docs/audits/unsafe-log.md. For the register-write unsafe (the asm block that
sets TTBR1_EL1 and TCR_EL1 using l0_hh and TCR_EL1_VALUE_HIGH_HALF), similarly
document (a) why asm/unsafe is required (ordering and privileged system register
writes cannot be done safely), (b) invariants (DSB/ISB ordering guarantees
publication of descriptors, EPD1-cleared TCR value, TTBR1 points to prepared
L0_hh so no walker sees stale descriptors), and (c) why safer alternatives were
rejected (no higher-level API ensures required barrier/reg sequence and exact
ordering at EL1); include audit tag UNSAFE-2026-0023 and add its entry to
docs/audits/unsafe-log.md.

In `@docs/architecture/boot.md`:
- Around line 24-25: The doc currently contradicts itself about the ELF entry:
reconcile the "ELF entry forced to _start_phys" claim with the later
"ENTRY(_start)" statement by choosing one canonical description and updating
references; explicitly state whether the linker.ld uses ENTRY(_start) to set the
ELF entry to the virtual-symbol _start which is then aliased/translated to the
physical symbol _start_phys, or state that the linker forces ENTRY to
_start_phys and remove the ENTRY(_start) mention. Update mentions of _start,
_start_phys, ENTRY macro in the boot.md text and ensure they align with the
linker.ld behavior and the KERNEL_HIGH_HALF_OFFSET /
tyrne_hal::phys_to_kernel_va mapping described earlier.

In `@docs/decisions/0033-kernel-high-half-migration.md`:
- Line 16: ADR-0033 currently conflicts with the implemented migration contract:
it asserts a two-offset model and KBASE = 0xFFFF_FFFF_8008_0000 whereas T-022
and shipped objectives use a one-offset model and KBASE = 0xFFFF_FFFF_4008_0000;
update the ADR governance record by either superseding ADR-0033 or adding an
explicit amendment ADR that reconciles the models and sets the canonical KBASE
value to 0xFFFF_FFFF_4008_0000, reference T-022 and the implemented migration
contract in the new ADR text, and also reconcile the related ADRs called out
(items 67-69 and 109-123) to ensure they reference the same canonical decision
instead of leaving conflicting accepted documents.

In `@tools/smoke.sh`:
- Around line 38-41: The timeout validation currently accepts "0" which disables
timeouts; update the check around the TO variable so it requires a strictly
positive integer (e.g., replace the regex check or add an extra condition to
reject TO==0) and change the error message to say "--timeout must be a positive
integer (seconds); got '$TO'". Ensure the condition that now fails for
non-numeric or zero values is applied where TO is validated (the if ! [[ "$TO"
=~ ^[0-9]+$ ]]; then ... block).

---

Outside diff comments:
In `@bsp-qemu-virt/src/main.rs`:
- Around line 1418-1426: Update the timer-banner comment to reflect that BOOT_NS
is now written after the high-half migration rather than pre-MMU: change the
text around the QemuVirtCpu::new / Timer description to state that the recorded
baseline (BOOT_NS) is captured post-high-half migration and therefore does not
include MMU activation cost, and keep references to the UART/device mapping
(mmu_bootstrap) and that the timer sampled CNTFRQ_EL0 are still accurate; ensure
any mention that the baseline was captured “pre-MMU” or “includes the MMU
activation” is removed or corrected to state “post-migration” semantics.

---

Nitpick comments:
In `@hal/src/mmu/mod.rs`:
- Around line 98-102: The address-range checks currently use debug_assert! so
they are disabled in release builds; change those to assert! (or otherwise
ensure they panic in release) so invalid low-4 GiB PA/VA translations fail fast;
update the checks in the phys_to_kernel_va helper (the
KERNEL_HIGH_HALF_OFFSET.wrapping_add(pa) path) and the corresponding
kernel_va_to_phys helper (the inverse check around KERNEL_HIGH_HALF_OFFSET
subtraction) to use assert! instead of debug_assert! so release kernels do not
silently wrap addresses.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 09171078-072f-40e0-93e9-c834117a5734

📥 Commits

Reviewing files that changed from the base of the PR and between bd39679 and dfa601f.

📒 Files selected for processing (20)
  • bsp-qemu-virt/linker.ld
  • bsp-qemu-virt/src/main.rs
  • bsp-qemu-virt/src/mmu.rs
  • bsp-qemu-virt/src/mmu_bootstrap.rs
  • bsp-qemu-virt/src/syscall.rs
  • docs/analysis/tasks/phase-b/T-022-high-half-kernel-mapping.md
  • docs/architecture/boot.md
  • docs/architecture/memory-management.md
  • docs/audits/unsafe-log.md
  • docs/decisions/0033-kernel-high-half-migration.md
  • docs/decisions/README.md
  • docs/roadmap/current.md
  • docs/roadmap/phases/phase-b.md
  • hal/src/lib.rs
  • hal/src/mmu/mod.rs
  • hal/src/mmu/vmsav8.rs
  • kernel/src/mm/mod.rs
  • kernel/src/mm/pmm.rs
  • kernel/src/obj/task_loader.rs
  • tools/smoke.sh

Comment thread bsp-qemu-virt/src/main.rs
Comment on lines +943 to +986
// ── High-half table build — T-022 / ADR-0033 §Simulation rows 0-1 ─────────
//
// Build the TTBR1_EL1 tables and enable TTBR1 walks (EPD1 1→0). Both
// translation regimes are live on return; the kernel still executes low.
//
// SAFETY: called once, at EL1, after `mmu_bootstrap` (the shared L2 tables
// + the low-identity MMU are live) and before the migration trampoline.
// Audit: UNSAFE-2026-0022 / 0023 (Amendments).
unsafe {
mmu_bootstrap::high_half_activate();
}

// ── Boot-time high-half migration — T-022 / ADR-0033 §Simulation row 2 ────
//
// Install the high VBAR (so a fault on the first high fetch vectors to a
// mapped handler), rebase SP to the high stack alias, and branch the PC
// into the high-half image (`kernel_main_high`). The low identity stays
// live (TTBR0 is freed inside `kernel_main_high`), DAIF is masked (since
// `_start`), and no `StaticCell` holds a low VA yet, so the few pre-`br`
// instructions cannot brick. The high targets are derived by masking the
// (PC-relative-resolved) address to its physical part and OR-ing the
// high-half offset, so the computation is correct regardless of how the
// compiler materialises the symbol addresses.
let high_vbar = high_half_alias(core::ptr::addr_of!(tyrne_vectors) as usize);
let high_entry = high_half_alias(kernel_main_high as *const () as usize);
// SAFETY: the absolute-jump migration trampoline (ADR-0033 §Simulation
// row 2). `MSR VBAR_EL1` to the high vector base (mapped PXN=0 in TTBR1) +
// `ISB` so high vectors are live before the branch; `add sp, sp, off`
// rebases SP to the high alias of the same boot stack; `br` crosses the PC
// from the low identity to the high-half image alias. Both regimes are
// live across the branch and DAIF is masked, so the crossing cannot fault.
// `options(noreturn)`: control never returns (kernel_main_high is `-> !`),
// so changing SP here is sound. Audit: UNSAFE-2026-0031.
unsafe {
core::arch::asm!(
"msr vbar_el1, {vbar}",
"isb",
"add sp, sp, {off}",
"br {entry}",
vbar = in(reg) high_vbar,
off = in(reg) KERNEL_HIGH_HALF_OFFSET,
entry = in(reg) high_entry,
options(noreturn),
);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Bring the new unsafe sites up to the repo's required comment format.

Several newly added unsafe blocks in the migration and panic paths describe the invariants, but they still omit the required "why safer alternatives were rejected" rationale. Please make these comments complete before merge.

As per coding guidelines, "All kernel and userspace code must be written in Rust, with every unsafe block documented with comments explaining (a) why it is needed, (b) what invariants it upholds, and (c) why safer alternatives were rejected; audit entries must be logged in docs/audits/unsafe-log.md".

Also applies to: 1015-1085, 1596-1620

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@bsp-qemu-virt/src/main.rs` around lines 943 - 986, The new unsafe blocks
(notably the call to mmu_bootstrap::high_half_activate and the inline assembly
block that writes VBAR_EL1, rebases SP and branches to kernel_main_high using
tyrne_vectors/high_vbar/KERNEL_HIGH_HALF_OFFSET) are missing the repo-required
justification paragraph; update each unsafe block comment to include (a) why
unsafe is necessary (e.g., raw register writes / control-transfer across
translation regimes), (b) the exact invariants upheld (which
registers/memory/translation regimes must be valid, DAIF state, that TTBR0/TTBR1
mappings exist), and (c) why safer alternatives were rejected (e.g., no safe
wrapper can express atomic CPU register/state transition or absolute jump
semantics), and add the audit entry ID(s) referenced (e.g., UNSAFE-2026-0031 and
related audit IDs) per docs/audits/unsafe-log.md; do the same for the other
affected blocks around lines 1015-1085 and 1596-1620 to satisfy the coding
guideline.

Comment thread bsp-qemu-virt/src/mmu_bootstrap.rs
Comment thread docs/architecture/boot.md
Comment thread docs/decisions/0033-kernel-high-half-migration.md
Comment thread tools/smoke.sh Outdated
…t.md ENTRY/linker reconcile, hard direct-map asserts

Addresses the PR #36 round-3 code review. Each finding was verified against
current code; only still-valid issues changed, the rest skipped with reason.
The migration mechanism is unchanged.

- ADR-0033: append-only Revision-notes rider reconciling the SHIPPED single
  linear offset (KERNEL_HIGH_HALF_OFFSET = 0xFFFF_FFFF_0000_0000, canonical
  KBASE = 0xFFFF_FFFF_4008_0000) against the body's two-offset /
  0xFFFF_FFFF_8008_0000 model. Original body left intact per ADR-0025 §Rule 2;
  the rider explains why the two PA<->VA offsets collapse to one and names
  ADR-0027 Option C's prospective base as superseded.
- boot.md: ENTRY(_start_phys) reconciled (body previously contradicted itself
  with ENTRY(_start)); Stage-1 entry-point wording fixed; "Linker script
  responsibilities" rewritten to the real link-high/load-low form (KBASE + AT(),
  no MEMORY{} block, .text.vectors 2 KiB-aligned).
- hal/mmu: phys_to_kernel_va / kernel_va_to_phys now use assert! (was
  debug_assert!) so an out-of-window PA/VA fail-stops in release (CLAUDE.md #1)
  rather than wrapping to a wild pointer; added # Panics sections so
  clippy::missing_panics_doc stays green (assert! is not exempt; debug_assert!
  was). The check cannot fire in v1 (QEMU PA < 4 GiB).
- mmu_bootstrap: high_half_activate's two SAFETY comments gained explicit
  "Safer alternatives rejected" clauses, matching the sibling mmu_bootstrap()
  blocks. Audit entries UNSAFE-2026-0022 / 0023 already cover these writes via
  their T-022 Amendments (no audit-log change needed).
- main.rs: timer-banner comment corrected — BOOT_NS is captured POST high-half
  migration (excludes MMU-activation + migration cost), not pre-MMU.
- tools/smoke.sh: --timeout now requires a strictly positive integer; rejects
  "0", which disables timeout(1)/alarm() and would let the WFI-idling kernel
  hang the run forever.

Skipped (with reason): the migration trampoline / TTBR0-free / panic-handler
unsafe blocks in main.rs already carry conforming SAFETY comments + audit IDs
(incl. UNSAFE-2026-0031), and the referenced audit entries already exist with
T-022 Amendments — no change needed.

Gates green: cargo fmt, host + kernel clippy -D warnings, 340 host tests,
kernel build, QEMU smoke (gated PASS; -d int,unimp fault-clean, high-half
active -> all tasks complete).

Refs: T-022, ADR-0033, ADR-0025, PR #36
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@cemililik cemililik merged commit b6549d7 into main May 30, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant