mohnkhan/MyOS2026

MyOS2026 — VM-First Operating System in Rust

A modern, minimal, secure operating system designed specifically for virtual machines — fast boot, reproducible images, strong security defaults, and a full Unix utility layer. Written entirely in Rust.


At a Glance

| Target | Goal | Result |
|---|---|---|
| Boot time | < 2 s (BIOS, headless QEMU) | 1.79–1.93 s |
| Idle RAM | < 256 MB | < 256 MB ✅ |
| Base image | < 1 GB compressed QCOW2 | 12.5 MB |
| SSH ready | < 5 s after shell prompt | < 1 s ✅ |
| Integration tests | 100% pass | 10/10 smoke suite + 31-program syscall-diff corpus |
| Unit tests | All pass | 567/567 kernel + 632/632 mybox + 61/61 nsh + 17/17 dwarf-extractor + 134/134 mymc (cargo test --lib) ✅ |
| ABI-drift CI gate | Catch new SYS_* vs Linux x86_64 collisions on every PR | make abi-drift-check — 116 consts scanned, ≤ 30 ms; vendored snapshot + monthly upstream audit (Feature 171) |
| File manager | Dual-pane Rust TUI shipped on the image | /bin/mymc 2.05 MiB stripped — copy/move/delete/mkdir + resumable transfers + previewer + fuzzy filter + history (Feature 070) ✅ |
| /proc entries | Linux-compatible | 15 virtual files + /proc/audit/{data,stats} + per-PID /proc/<pid>/{trace,stack,wchan,status}; status shows real Uid:/Gid:/Groups: lines (Feature 071) ✅ |
| Credential audit log | Every privilege transition recorded | dmesg \| grep AUDT (Stage 1) + /proc/audit/data binary stream + /proc/audit/stats 6-line KV (Stage 2+3) — every set{u,g,res,fs}{uid,gid} syscall recorded; 112-byte AuditRecord with seq + drop counter; poll(2) + O_NONBLOCK + ≤5 ms p99 wake latency via kernel::sched::wait::WaitQueue (Features 072 + 073 + 074) ✅ |
| Memory-safety diagnostics | KASAN + FASAN | Both default-on for debug builds (heap redzones + frame poisoning/ownership) ✅ |
| Panic backtrace | Source-line annotated | In-kernel DWARF lookup — every frame shows at <file>:<line> (Feature 066) ✅ |
| Networking userland | DNS, HTTP, nc, ping | 5/5 tests pass |
| Reproducible builds | Identical SHA-256 | |
| Verified boot | BLAKE2b hash chain | |

All 11 success criteria (SC-001–SC-011) pass. See VALIDATION.md.


Demo

MyOS2026 shell demo

nsh$ prompt with mybox applets, pipe chains, and standard utilities — captured via make screenshot.

Animated terminal demo

Real nsh session over SSH — uname, /proc/meminfo, /proc/cpuinfo, ps, a base64 pipe, and the colored [1] prompt that appears after a failed command. Generated via make demo-gif (paramiko + compound nsh -c, cast trimmed to remove SSH-negotiation dead time). See Feature 062 below.


What's Inside

A complete, self-contained OS stack:

+-------------------------------------------------------+
|                    User Space                         |
|  init | nsh | myos-pkg | cloud-init | dropbear        |
|  mybox (97 applets) | sandbox | exploit-test          |
|  /proc/self/{maps,fd/,status,exe} | /proc/{cpuinfo,   |
|  uptime,net/dev,net/tcp} | /proc/[pid]/…             |
+-------------------------------------------------------+
|                  Security Layer                       |
|  Per-process syscall allowlist (SYS_SANDBOX_ENTER)    |
|  Capability tokens (CAP_FS_ADMIN, CAP_NET_BIND, …)    |
|  Verified boot (BLAKE2b → ed25519 attestation)        |
+-------------------------------------------------------+
|                  System Layer                         |
|  VFS (symlink-following) | Syscall dispatch | Pipes   |
|  IPC | MLFQ scheduler | Linux ELF binary compatibility|
+-------------------------------------------------------+
|                    Kernel                             |
|  MM (demand paging) | Interrupts (APIC/HPET)          |
|  smoltcp 0.11 | DHCP | ext2 (read/write) | firewall   |
|  procfs (12 virtual files, Linux-compatible)          |
+-------------------------------------------------------+
|                VirtIO Drivers                         |
|  blk | net | console | rng | scsi (VirtualBox compat) |
+-------------------------------------------------------+
|               Virtual Hardware                        |
|  QEMU q35 (primary) | VirtualBox (secondary)          |
+-------------------------------------------------------+

Quick Start

# Prerequisites
apt install qemu-system-x86 ovmf sgdisk mtools e2fsprogs qemu-utils nasm python3
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
rustup toolchain install nightly
rustup component add rust-src --toolchain nightly
rustup target add x86_64-unknown-linux-musl

# Build and boot (< 2s to shell prompt)
RELEASE=1 bash build/scripts/assemble-image.sh myos.qcow2
make qemu

💾 Spare your SSD: make tmpfs-setup redirects target/ and dist/ (the only large gitignored output trees) into /tmp/MyOS/<hash>/ so the write-heavy build cycle hits RAM. Reversible (make tmpfs-teardown); idempotent; opt-in; no-op on CI. See docs/dev-tmpfs.md.

Interactive session (recommended)

Boot the VM in a graphical window showing the kernel framebuffer terminal:

make qemu-sdl                    # opens SDL window; QEMU monitor via Ctrl+Alt+2

SSH in simultaneously on port 2222:

ssh -p 2222 -i tests/keys/test_id_ed25519 \
  -o StrictHostKeyChecking=no root@127.0.0.1

Headless SSH session

qemu-system-x86_64 -machine q35 -cpu qemu64 -smp 2 -m 256M \
  -drive if=pflash,format=raw,readonly=on,file=/usr/share/OVMF/OVMF_CODE_4M.fd \
  -drive file=myos.qcow2,format=qcow2,if=virtio \
  -netdev user,id=net0,hostfwd=tcp::2222-:22 \
  -device virtio-net-pci,netdev=net0 -device virtio-rng-pci \
  -serial stdio -display none &

ssh -p 2222 -i tests/keys/test_id_ed25519 \
  -o StrictHostKeyChecking=no root@127.0.0.1

Running in VirtualBox

MyOS2026 boots in VirtualBox using the default hardware profile — no custom settings required. The kernel includes a native LSI Logic MPT SCSI driver and an Intel E1000 NIC driver.

Import and boot:

# Convert QCOW2 → VDI (VirtualBox native format)
qemu-img convert -f qcow2 -O vdi myos.qcow2 myos.vdi

# Create VM via VBoxManage (or use the GUI)
VBoxManage createvm --name MyOS2026 --ostype Linux_64 --register
VBoxManage modifyvm MyOS2026 --memory 256 --cpus 1 --nic1 nat
VBoxManage modifyvm MyOS2026 --firmware efi --graphicscontroller vmsvga

# Add SCSI controller (LSI Logic — the VirtualBox default)
VBoxManage storagectl MyOS2026 --name SCSI --add scsi --controller LsiLogic
VBoxManage storageattach MyOS2026 --storagectl SCSI --port 0 --device 0 \
  --type hdd --medium myos.vdi

# Start (headless)
VBoxManage startvm MyOS2026 --type headless

# SSH in (once nsh$ appears in VBoxManage guestcontrol or serial log)
ssh -p 22 -i tests/keys/test_id_ed25519 -o StrictHostKeyChecking=no root@<guest-ip>

VirtualBox hardware mapping:

| VirtualBox device | Kernel driver | Notes |
|---|---|---|
| LsiLogic SCSI | drivers::lsi_scsi | MPT 1.x protocol; auto-detected |
| Intel E1000 (NIC1) | drivers::e1000 | Default VirtualBox NIC; MAC from RA[0] |
| EFI firmware | Limine UEFI loader | Use --firmware efi |

The kernel auto-detects which block device is present (root= cmdline override available for explicit selection: root=lsi-scsi, root=virtio-scsi, root=virtio-blk).
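As a rough illustration of the override described above, the sketch below parses a `root=` token from a kernel command line. The three accepted values come from the text; the `RootDevice` enum, the `parse_root` name, and the "first recognised token wins" rule are assumptions made for this example and are not taken from the kernel source.

```rust
/// Block devices the `root=` override can name (values taken from the text above).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RootDevice {
    LsiScsi,
    VirtioScsi,
    VirtioBlk,
}

/// Return the first recognised `root=` value on the command line.
/// `None` means "no usable override", leaving the choice to auto-detection.
/// (Hypothetical helper; the real parser may behave differently.)
fn parse_root(cmdline: &str) -> Option<RootDevice> {
    cmdline
        .split_whitespace()
        .filter_map(|token| token.strip_prefix("root="))
        .find_map(|value| match value {
            "lsi-scsi" => Some(RootDevice::LsiScsi),
            "virtio-scsi" => Some(RootDevice::VirtioScsi),
            "virtio-blk" => Some(RootDevice::VirtioBlk),
            _ => None,
        })
}
```

Calling `parse_root("quiet root=virtio-blk")` yields `Some(RootDevice::VirtioBlk)`, while a command line without a recognised `root=` token yields `None`.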

Generate documentation artifacts

# Styled PNG screenshot → docs/screenshots/demo.png
# Requires: cargo install silicon
make screenshot

# Animated GIF demo → docs/demo.gif
# Requires: pip install asciinema && (download agg from github.com/asciinema/agg/releases)
make demo-gif

Once logged in, the full Unix utility set is available via mybox:

nsh$ ls /bin | wc -l        # 91+ binaries (mybox applets + sh + test binaries)
nsh$ cat /etc/hostname | grep -c .
1
nsh$ ps | head -5
nsh$ echo hello | grep hello
hello

Feature Highlights

mybox — Busybox-in-Rust (91 applets)

A multi-call binary providing 91 Unix applets via symlinks in /bin. Dispatch is purely by argv[0] basename — no runtime overhead per applet.

| Category | Applets |
|---|---|
| File ops | cat, cp, mv, rm, ln, mkdir, rmdir, touch, chmod, chown, chgrp, install, truncate |
| Text | grep, sed, awk, cut, head, tail, sort, uniq, wc, tr, comm, diff, patch, tee |
| Filesystem | ls, find, du, df, stat, file, readlink, realpath, basename, dirname, pathchk |
| Process | ps, kill, killall, nice, nohup, timeout, watch, pgrep, pkill, wait |
| System | uname, hostname, dmesg, uptime, free, sysctl, env, printenv, nproc |
| Archive | tar, gzip, gunzip, zcat, bzip2, bzcat, xz, unxz |
| Shell utils | echo, printf, test, true, false, yes, seq, sleep, date, expr, xargs |
| Misc | od, xxd, base64, md5sum, sha256sum, cmp, strings, stty |

Networking applets (nslookup, wget, nc, ping) are included — shipped in Feature 025. DNS resolution, HTTP fetch, TCP netcat, and ICMP ping all pass integration tests.

nsh$ /bin/grep -i root /etc/passwd
root:x:0:0:root:/root:/bin/sh
nsh$ /bin/ls -la /bin/ls
lrwxrwxrwx        10 ls -> /bin/mybox
nsh$ mybox --list | wc -l
91
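The argv[0]-based dispatch described in this section can be sketched roughly as follows. This is a minimal illustration, not mybox's actual code: the `Applet` type, the two toy applets, and the exit code 127 for an unknown name are all assumptions chosen for the example.

```rust
use std::path::Path;

/// Signature every applet shares in this sketch: arguments in, exit code out.
type Applet = fn(&[String]) -> i32;

// Two toy applets (hypothetical stand-ins for real mybox applets).
fn applet_true(_args: &[String]) -> i32 {
    0
}

fn applet_false(_args: &[String]) -> i32 {
    1
}

/// Map an applet name to its entry point.
fn lookup(name: &str) -> Option<Applet> {
    match name {
        "true" => Some(applet_true),
        "false" => Some(applet_false),
        _ => None,
    }
}

/// Reduce argv[0] to its basename, so "/bin/true" selects "true".
fn applet_name(argv0: &str) -> &str {
    Path::new(argv0)
        .file_name()
        .and_then(|name| name.to_str())
        .unwrap_or(argv0)
}

/// Dispatch on argv[0]; returns 127 (an assumed convention) when nothing matches.
fn dispatch(argv: &[String]) -> i32 {
    let Some(argv0) = argv.first() else { return 127 };
    match lookup(applet_name(argv0)) {
        Some(applet) => applet(&argv[1..]),
        None => 127,
    }
}
```

Because the lookup is a single `match` on a string, a symlink such as /bin/ls pointing at /bin/mybox only changes what argv[0] contains; the binary itself is the same for every applet.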

Linux ELF Binary Compatibility

Statically-linked musl ELF binaries compiled on Linux run directly on MyOS2026 without modification or recompilation:

# On a Linux host:
musl-gcc -static -o hello hello.c
cargo build --target x86_64-unknown-linux-musl --release

# Copy to MyOS2026 (scp or baked into the disk image) and run:
nsh$ /bin/hello
Hello, World!
nsh$ /bin/mybox-linux echo "from linux musl"
from linux musl

What works: ELF64 static executables (ET_EXEC), full System V AMD64 ABI initial stack layout (argc/argv/envp/auxv), all musl startup syscalls (set_tid_address, arch_prctl(ARCH_SET_FS), prlimit64, getrandom, rt_sigprocmask), anonymous mmap, brk, /proc/self/exe, /proc/self/maps. Invalid accesses deliver SIGSEGV (no kernel panic); stack overflows are caught at the stack bottom guard and deliver SIGSEGV to the process.

Out of scope: dynamic linking (PT_INTERP), 32-bit ELF, kernel modules.
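To make the supported subset concrete, here is a small sketch of the kind of header check a static ELF64 loader might perform before mapping anything. The field offsets and constant values follow the ELF64 specification; the `ElfReject` enum and `check_header` function are hypothetical names for this example, and the PT_INTERP (dynamic linking) check is not covered because it needs the program headers.

```rust
/// Reasons this sketch refuses an image (hypothetical names).
#[derive(Debug, PartialEq)]
enum ElfReject {
    TooShort,
    BadMagic,
    Not64Bit,
    NotExec,
    WrongMachine,
}

const ELFCLASS64: u8 = 2; // e_ident[EI_CLASS] value for 64-bit objects
const ET_EXEC: u16 = 2; // e_type value for fixed-address executables
const EM_X86_64: u16 = 62; // e_machine value for x86-64
const EHDR_SIZE: usize = 64; // size of an ELF64 file header

/// Validate the fixed part of an ELF64 header (little-endian fields assumed).
fn check_header(image: &[u8]) -> Result<(), ElfReject> {
    if image.len() < EHDR_SIZE {
        return Err(ElfReject::TooShort);
    }
    if !image.starts_with(b"\x7fELF") {
        return Err(ElfReject::BadMagic);
    }
    if image[4] != ELFCLASS64 {
        return Err(ElfReject::Not64Bit);
    }
    // e_type lives at offset 16 and e_machine at offset 18.
    let e_type = u16::from_le_bytes([image[16], image[17]]);
    let e_machine = u16::from_le_bytes([image[18], image[19]]);
    if e_type != ET_EXEC {
        return Err(ElfReject::NotExec);
    }
    if e_machine != EM_X86_64 {
        return Err(ElfReject::WrongMachine);
    }
    Ok(())
}
```

A 32-bit image fails at the class check, which matches the "32-bit ELF" entry in the out-of-scope list.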

Kernel Symlink Resolution

The VFS resolve() function follows symlinks after every directory lookup (depth-capped at 8 to prevent loops). EXT2 fast symlinks (≤60 bytes stored directly in i_block[]) are supported without data block allocation. This enables execve("/bin/cat") to transparently dispatch through /bin/mybox.
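A toy model of this behaviour is sketched below: a path is walked one component at a time, a symlink hit re-queues the link target, and a counter caps the number of links followed at 8. The in-memory `HashMap` filesystem, the `Node` and `ResolveError` types, and the restriction to absolute link targets are simplifications assumed for the example; the kernel's real `resolve()` works on ext2 inodes.

```rust
use std::collections::HashMap;

/// Depth cap quoted in the text above.
const MAX_SYMLINK_DEPTH: usize = 8;

#[derive(Debug, PartialEq)]
enum ResolveError {
    NotFound,
    TooManyLinks,
}

/// Toy inode kinds; symlinks hold an absolute target path in this sketch.
enum Node {
    Dir,
    File,
    Symlink(String),
}

/// Resolve `path` against a map keyed by absolute path, following symlinks
/// after every component lookup.
fn resolve(fs: &HashMap<String, Node>, path: &str) -> Result<String, ResolveError> {
    // Components still to be walked, stored reversed so `pop` yields the next one.
    let mut pending: Vec<String> = path
        .split('/')
        .filter(|c| !c.is_empty())
        .rev()
        .map(String::from)
        .collect();
    let mut current = String::new(); // "" stands for the root directory
    let mut depth = 0;

    while let Some(component) = pending.pop() {
        let candidate = format!("{current}/{component}");
        match fs.get(&candidate) {
            None => return Err(ResolveError::NotFound),
            Some(Node::Symlink(target)) => {
                depth += 1;
                if depth > MAX_SYMLINK_DEPTH {
                    return Err(ResolveError::TooManyLinks);
                }
                // Restart from the root with the target's components queued first.
                for c in target.split('/').filter(|c| !c.is_empty()).rev() {
                    pending.push(c.to_string());
                }
                current.clear();
            }
            Some(_) => current = candidate,
        }
    }
    Ok(if current.is_empty() { "/".to_string() } else { current })
}
```

With /bin/cat stored as a symlink to /bin/mybox, resolving "/bin/cat" returns "/bin/mybox", which is the path an exec would then load. Two links that point at each other end in `TooManyLinks` instead of looping forever.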

Security: Per-Process Syscall Sandbox

nsh$ sandbox --allow=read,write,exit /usr/bin/exploit-test
BLOCKED (errno=1)      ← mount(2) blocked by kernel allowlist

The kernel enforces a deny-by-default syscall filter per process, installed via SYS_SANDBOX_ENTER (nr=999). Filters survive execve and are independent across processes.
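One way to picture a deny-by-default filter is a per-process bitmap with one bit per syscall number, as in the sketch below. This is an assumption about the shape of such a filter, not the kernel's data structure: `SyscallFilter`, its methods, and the 1024-entry bound are hypothetical. The syscall numbers used in the usage note (read=0, write=1, exit=60, mount=165) are the Linux x86_64 values, and EPERM is errno 1, matching the `BLOCKED (errno=1)` output above.

```rust
const EPERM: i64 = 1; // errno reported for a blocked syscall
const MAX_SYSCALL: usize = 1024; // assumed upper bound on syscall numbers

/// One bit per syscall number; a clear bit means "denied".
struct SyscallFilter {
    allowed: [u64; MAX_SYSCALL / 64],
}

impl SyscallFilter {
    /// Start with every syscall denied.
    fn deny_all() -> Self {
        Self { allowed: [0; MAX_SYSCALL / 64] }
    }

    /// Permit one syscall number; out-of-range numbers are ignored.
    fn allow(&mut self, nr: usize) {
        if nr < MAX_SYSCALL {
            self.allowed[nr / 64] |= 1u64 << (nr % 64);
        }
    }

    /// Check a syscall number at dispatch time.
    fn check(&self, nr: usize) -> Result<(), i64> {
        let permitted =
            nr < MAX_SYSCALL && (self.allowed[nr / 64] & (1u64 << (nr % 64))) != 0;
        if permitted { Ok(()) } else { Err(EPERM) }
    }
}
```

A filter built with `allow(0)`, `allow(1)`, and `allow(60)` accepts read, write, and exit, while `check(165)` (mount) returns `Err(EPERM)`. Keeping the bitmap inside the process record rather than the loaded image is one plausible way for a filter to survive execve.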

Verified Boot

Every RELEASE build embeds a BLAKE2b hash chain:

UEFI → Limine (config hash enrolled) → kernel.elf (BLAKE2b verified)
     → kernel_main ([vboot] ACTIVE  pubkey: be5f7844108bcdd1)

Tampering with any binary in the chain is detected before a single kernel instruction executes and causes an immediate boot abort (hash_mismatch_panic: yes).

cloud-init

Boots with a cidata ISO and applies provisioning automatically:

  • Sets hostname via sethostname(2)
  • Writes SSH authorized keys to /root/.ssh/authorized_keys
  • Runs runcmd entries (e.g. chmod 700 /root/.ssh)

Reproducible Builds

Two independent builds from identical source produce byte-identical QCOW2.

RELEASE=1 bash build/scripts/assemble-image.sh build-a.qcow2
RELEASE=1 bash build/scripts/assemble-image.sh build-b.qcow2
sha256sum build-a.qcow2 build-b.qcow2
# a3e64333...  build-a.qcow2
# a3e64333...  build-b.qcow2  ← identical ✓

Achieved via SOURCE_DATE_EPOCH, pinned GPT/FAT UUIDs, and build/scripts/fix-ext2-timestamps.py.


Integration Tests

| Test | Suite | Description | Status |
|---|---|---|---|
| test_boot.py | integration | 9-phase boot sequence → nsh$ | ✅ PASS |
| test_ssh.py | integration | SSH login (Dropbear, key auth) | ✅ PASS |
| test_shell.py | integration | nsh pipes, redirects, builtins | ✅ PASS |
| test_cloud_init.py | integration | cidata: hostname + SSH key injection | ✅ PASS |
| test_sandbox.py | integration | Sandbox blocks mount(2) | ✅ PASS |
| test_rollback.sh | integration | QCOW2 snapshot / rollback | ✅ PASS |
| test_lsi_scsi.py | vbox | mptsas1068 detection + CD-ROM coexistence | ✅ PASS |
| test_e1000_ssh.py | vbox | E1000 NIC + SSH login | ✅ PASS |
| test_vbox_combined.py | vbox | LSI SCSI + E1000 together | ✅ PASS |
| test_dual_nic.py | vbox | virtio-net + E1000 dual NIC | ✅ PASS |
| test_signal.py | syscalls | rt_sigaction delivery + sigreturn | ✅ PASS |
| test_nanosleep.py | syscalls | nanosleep blocks correct duration | ✅ PASS |
| test_futex.py | syscalls | futex WAIT/WAKE under concurrent load | ✅ PASS |
| test_misc_posix.py | syscalls | uname, getcwd/chdir, TIOCGWINSZ | ✅ PASS |
| test_debug_mode.py | misc | kdebug subsystem log tags | ✅ PASS |
| test_scheduler.py | scheduler | MLFQ: CPU demotion, I/O boost, starvation, nice | ✅ PASS |
| test_linux_elf.py | elf-compat | 16 scenarios: musl C/Rust ELF, SIGSEGV, stack overflow, /proc | ✅ PASS |
| test_reproducible.sh | misc | Two builds produce identical SHA-256 | ✅ PASS |

Run all suites:

make test-all QCOW2=dist/myos2026.qcow2

# Or individually:
make test-unit                                    # 567 kernel unit tests, no QEMU
make test-integration QCOW2=dist/myos2026.qcow2  # boot, SSH, shell, cloud-init, sandbox
make test-vbox        QCOW2=dist/myos2026.qcow2  # LSI SCSI, E1000, dual-NIC
make test-syscalls    QCOW2=dist/myos2026.qcow2  # signal, nanosleep, futex, misc-posix
python3 tests/boot/test_linux_elf.py dist/myos2026.qcow2  # Linux ELF compat (16 scenarios)

CI before push (Feature 035)

Every PR is gated by GitHub Actions; the ci check runs clippy, unit tests, and the integration suite under smp ∈ {1, 2} matrix axes. To run the same pipeline locally before pushing:

make ci-local       # ~15 min; same step order, same per-step timeouts as remote CI

Failed local runs leave artifacts under dist/ci-artifacts/ (QEMU serial log, test stdout/stderr, optional kernel-panic excerpt). The CI gate is documented in detail in specs/035-ci-pr-gate/quickstart.md.


What Is Implemented

Kernel

  • Boot: Limine (BIOS + UEFI), x86-64 entry, GDT/IDT, APIC/HPET timer
  • Memory: bitmap physical allocator, 4-level page table, demand paging, kernel heap
  • Drivers: UART, virtio-blk, virtio-net, virtio-console (bidirectional: TX output + RX input via poll), virtio-rng, virtio-scsi, LSI Logic MPT SCSI (VirtualBox), Intel E1000 NIC (VirtualBox), framebuffer
  • Filesystem: ext2 (superblock, block groups, inodes, directories, read/write, symlinks), VFS with symlink-following resolve(), 64-slot LRU write-back block cache
  • Linux ELF compat: ELF64 static loader, full System V AMD64 ABI initial stack (auxv), /proc/self/exe|maps, mmap_min_addr guard, stack-bottom guard → SIGSEGV
  • Process: PCB (with nice: i8, stack_bottom), fork, exec, wait, exit, FD table, dup/dup2, pipes
  • Scheduler: 3-level MLFQ — quanta P0=1/P1=2/P2=4 ticks, priority decay on quantum exhaustion, I/O boost (wake at P0), starvation prevention (boost every 100 ticks), nice-based base priority, fork inherits nice
  • Syscalls: Linux x86_64 syscall ABI (SYSCALL/SYSRET), covering the calls listed here. Process: fork (CoW lazy frame sharing), execve, waitpid, exit, getpid, getppid, clone (CLONE_THREAD), set_tid_address, exit_group. Files: open, close, read, write, lseek, dup, dup2, pipe, fcntl, stat, fstat, lstat, getdents64. Memory: brk, mmap (anon+file-backed, MAP_SHARED/PRIVATE), munmap, mprotect. Signals: kill, rt_sigaction, rt_sigprocmask, rt_sigreturn, alarm. Time: clock_gettime, nanosleep, clock_nanosleep, gettimeofday. Threading: futex (WAIT/WAKE). Sockets: socket (TCP/UDP/ICMP-raw), bind, listen, accept, connect, send, recv, sendto, recvfrom, sendmsg, recvmsg, setsockopt, getsockopt, shutdown. Priority: nice (nr=996; moved from slot 34 in Feature 171), getpriority (nr=140), setpriority (nr=141). Misc: uname, getcwd, chdir, ioctl (TIOCGWINSZ), sysinfo, gettid. Custom (reserved 990–999 range): kmsg-read (nr=997), vboot-status (nr=998), sandbox-enter (nr=999). The former gethostname (nr=125) was removed in Feature 169; hostname lookups now go through uname (nr=63).
  • Network: smoltcp 0.11, virtio-net device, DHCP, TCP/UDP/ICMP sockets, sendmsg/recvmsg, packet firewall (default-deny, allow TCP/22 + ICMP + UDP)
  • /proc filesystem (Feature 024): /proc/self/exe, /proc/self/maps (dynamic load+heap+stack ranges), /proc/self/fd/<N> (symlinks to open file paths), /proc/self/status (Name/Pid/VmRSS/Threads), /proc/<pid>/exe, /proc/<pid>/maps, /proc/cpuinfo (model name, cores), /proc/uptime, /proc/net/dev (rx/tx per NIC), /proc/net/tcp (TCP socket table); fully compatible with musl readlink("/proc/self/exe"), /proc/self/maps mmap parsing, and standard Unix tools
  • Fix: cargo test --lib no longer SIGSEGVs under default parallel execution (Feature 172 — closes #329). Pre-F172, contributors running the muscle-memory default cargo test --lib (no --test-threads flag) hit a 100%-reproducible SIGSEGV partway through the 567-test suite on master and on every feature branch since F166. CI rollup was unaffected because scripts/ci/ci-local.sh pinned --test-threads=1; the bite was confined to local dev. Root cause: test scaffolding in multiple modules wrote to process-global statics that production code serialised via PTABLE.lock() but tests bypassed for speed. F172 lands three fix shapes across six modules — (a) per-instance counter in audit::ring::Ring (new #[cfg(test)] push_wake_count: AtomicU64 field replaces the global PUSH_WAKE_COUNT; production fn push_wake() signature stays zero-arg so SC-005 byte-identical holds), (b) per-module TEST_LOCK (OnceLock<StdMutex<()>> + lock() helper, acquired at the top of every state-touching test) in proc::pipe and drivers::uart, (c) cross-module shared lock mm::MM_TEST_LOCK used by mm::{fasan,phys,virt} test modules because phys::alloc_frame → fasan::on_alloc dereferences HHDM_OFFSET that fasan tests swap to a sandbox Vec via HhdmSandbox (concurrent phys::alloc_frame against a swapped HHDM = wild-pointer SIGSEGV). New CI step kernel-tests-parallel in scripts/ci/ci-local.sh runs cargo test --lib (no flag) BEFORE the sequential step so any future global-static regression fails fast. Verified 5-of-5 consecutive parallel runs green, 567/567 each. SC-005 verified — size target/kernel.elf and nm --size-sort both byte-identical to master. SC-007 verified — parallel wall-clock 0.77× sequential (0.31s vs 0.40s, well under the 1.5× budget). Kernel suite 567/567 passing (test count unchanged — pure test-rewrite). Zero new BSS. Zero production-binary delta. Spec: specs/172-parallel-test-sigsegv-fix/.
  • Feature: ABI-drift CI gate — mechanically catches what F164–F170's hand-audits missed (Feature 171 — closes the entire audit-sweep-by-hand pattern). New make abi-drift-check runs on every PR that touches kernel/src/proc/**: scans every [pub] const SYS_*: u64 = N; declaration, cross-references against a vendored copy of upstream arch/x86/entry/syscalls/syscall_64.tbl (5-line provenance header, os.replace-atomic refresh path), and emits NDJSON findings for linux_collision, myos_custom_out_of_range, intra_kernel_duplicate, linux_gap_warning, and upstream_drift classes. The gate caught five pre-existing collisions on its first run: four historical-name aliases (SYS_WAITPID=61 / SYS_READDIR=78 / SYS_SEND=44 / SYS_RECV=45 — all matched on slot to the Linux name under a different label, so codified in allowlist.toml::[[linux_alias]] with semantic-equivalence justifications) AND one real semantic collision: SYS_NICE was at slot 34, where Linux owns pause(2) — a static-musl binary calling pause() would have misrouted into MyOS nice() and silently consumed its first arg as a nice-increment. Fix landed in the same PR: renumbered SYS_NICE 34 → 996 (joins SYS_VBOOT_STATUS=998 / SYS_SANDBOX_ENTER=999 / SYS_KMSG_READ=997 in the reserved 990–999 range). Per-PR step is path-conditional via git diff --name-only master...HEAD -- kernel/src/proc/ — skips with ≤ 100 ms wall-clock on PRs that don't touch the dispatcher. Monthly abi-drift-audit against the live upstream tree (wrapped by tests/abi/audit-runner.sh, dedupes against open issues, auto-files with abi-drift + audit-compliance labels) covers the inverse drift direction — when Linux claims a slot MyOS already owns. 
Net-new artifacts: tests/abi/abi_drift.py (Python 3.11+ stdlib only — no jsonschema, no requests), tests/abi/test_abi_drift.py (16 unit tests, Constitution II RED-before-GREEN), tests/abi/linux-syscall_64.tbl (vendored snapshot at upstream commit 6779b50f), tests/abi/allowlist.toml, tests/abi/README.md (full refresh recipe + 990–999 rationale), 3 new Makefile targets, 3 new kernel abi_pin_tests (all_myos_custom_consts_in_990_999_range + nice_slot_34_unbound_now_in_myos_range + no_intra_kernel_syscall_number_duplicates). Kernel suite 567/567 passing (+3 from F170 baseline). Zero new BSS. Gate wall-clock ≤ 30 ms on a clean tree.
  • Fix: renumber SYS_KMSG_READ from 440 → 997 — Linux 5.10 added process_madvise(2) at slot 440 (Feature 170 — closes #325). Third hit in the F168/F169 syscall-table audit sweep. kernel/src/proc/syscall.rs:141 bound SYS_KMSG_READ = 440 (Feature 038's structured kmsg read); slot 440 was unused upstream at allocation time, but Linux 5.10 (2020) subsequently assigned it to process_madvise(2) (per-process advice on memory ranges; 5 args). A static-musl binary built against glibc ≥ 2.36 / musl ≥ 1.2.4 that calls process_madvise() would have landed on sys_kmsg_read (3-arg handler) — corrupting the caller's iovec array (treated as a kmsg buffer) and writing garbage into the vlen slot (treated as max_entries). Zero live callers in MyOS today (CLAUDE.md already noted "Zero current userspace consumers"), but a forward-blocker for any third-party Linux binary doing memory advice. Fix: moved the const value from 440 → 997, matching MyOS's reserved 9XX range alongside SYS_VBOOT_STATUS = 998 and SYS_SANDBOX_ENTER = 999. Updated docs (CLAUDE.md "kmsg + GDB workflow" section; mybox strace.rs comment) to track the new number. Net-new regression test process_madvise_slot_440_unbound_kmsg_read_in_myos_range pins both that slot 440 is no longer bound here AND that every MyOS-custom syscall (SYS_KMSG_READ, SYS_VBOOT_STATUS, SYS_SANDBOX_ENTER) lives in the 990+ range — so any future drift gets caught at the cargo test --lib gate. Kernel suite 564/564 passing. Zero new BSS. Zero userspace ABI break.
  • Fix: remove SYS_GETHOSTNAME = 125 — Linux x86_64 reserves slot 125 for capget(2); mybox hostname now routes through uname (63) (Feature 169 — closes #323). Same pattern as F168, caught the next morning of the same audit sweep: kernel/src/proc/syscall.rs bound SYS_GETHOSTNAME = 125, but Linux x86_64 reserves slot 125 for capget and 126 for capset (there is no standalone gethostname syscall — glibc/musl implement gethostname(3) via uname(2) syscall 63 and copy out the nodename). Unlike F168, this collision had a live caller: userland/mybox/src/applets/hostname.rs hard-coded syscall(125, ...) (with a comment claiming "musl's gethostname() uses SYS_uname which this kernel does not implement" — but sys_uname had since been implemented at slot 63, so the comment was stale). The collision was a forward-blocker for the capability model (#142) which would need slot 125 for capget. Fix: (a) delete const SYS_GETHOSTNAME + its dispatch arm in syscall.rs; raw 125 falls through to _ => ENOSYS. (b) Rewrite mybox hostname's kernel_gethostname() to issue raw syscall(63, &uts) with a 390-byte stack-allocated utsname buffer, then extract nodename (offset 65, NUL-terminated). No new mybox dep; binary works on both MyOS and host Linux (syscall(63) is uname on both). proc_sys::sys_gethostname Rust function is kept intact for future internal callers. Net-new regression test capget_capset_slots_125_126_are_unbound pins LINUX_CAPGET_NR=125 / LINUX_CAPSET_NR=126 as distinct from MyOS's existing hostname + credential consts. Kernel suite 563/563 passing, mybox suite 632/632 passing (no net-new tests; existing hostname tests still pass via host-side uname when run under cargo test). Zero new BSS.
  • Fix: remove SYS_SETEUID = 113 / SYS_SETEGID = 114 constants — Linux x86_64 reserves those slots for setreuid/setregid (Feature 168 — closes #321). Discovered while triaging F164's pin of the Linux x86_64 syscall table: kernel/src/proc/syscall.rs bound SYS_SETEUID = 113 and SYS_SETEGID = 114, but arch/x86/entry/syscalls/syscall_64.tbl lists those numbers as setreuid (113) and setregid (114). There is no standalone seteuid/setegid syscall on Linux x86_64: musl libc implements them as setresuid(-1, X, -1) / setresgid(-1, X, -1) (handled by F167's classifier on syscalls 117/119). The shadow was latent because no userspace currently issues raw syscall(113, ...) (musl wraps every credential call through setres*), but any future test or 3rd-party static binary that did would have been misrouted to the 1-arg sys_seteuid handler — silently dropping the second argument and emitting the wrong audit record. Fix: delete both const SYS_SETEUID/SETEGID definitions and their dispatch arms; raw syscall 113/114 now falls through to the _ => ENOSYS arm (the correct response until real setreuid/setregid handlers land). The io::sys_seteuid / io::sys_setegid Rust functions are kept intact as internal helpers — F167's classify-by-intent path still routes through them when musl's setresuid(-1, X, -1) is recognized as a single-effective-set. Net-new regression test setreuid_setregid_slots_113_114_are_unbound pins LINUX_SETREUID_NR=113/LINUX_SETREGID_NR=114 as distinct from all four SYS_SET*UID/GID/RESUID/RESGID constants. Kernel suite 562/562 passing. Zero new BSS. Spec: specs/168-syseuid-abi-collision/ (none — three-file scoped fix).
  • Fix: setresuid(-1, X, -1) now audits as AuditOp::SetEuid, not SetResuid (and gid mirror) (Feature 167 — closes #319, addresses the SetEuid/SetEgid half of #309). musl libc has no dedicated seteuid(2) syscall: seteuid(x) becomes setresuid(-1, x, -1) and setegid(x) becomes setresgid(-1, x, -1). Pre-F167, the kernel handlers emitted AuditOp::SetResuid (op=2) / SetResgid (op=5) unconditionally — every userland seteuid() showed up in /proc/audit/data as a setresuid, leaving ops 1 (SetEuid) and 4 (SetEgid) silently uncovered. Fix is a 12-line pure helper classify_setres(r,e,s,single_op,triple_op) invoked from both sys_setresuid and sys_setresgid: the (-1, X, -1) shape (with X != -1) routes to the effective-only op with AuditTarget::Single(X); every other shape (including the no-op (-1, -1, -1)) keeps the triple op + AuditTarget::Triple. Linux's audit subsystem performs the equivalent classification by caller intent; MyOS now matches. 7 net-new unit tests pin the full truth table (uid + gid mirrors, all-no-op preserves triple, real-uid-set preserves triple). Kernel suite 561/561 passing. Zero new BSS.
  • mybox chmod POSIX symbolic mode (Feature 163 — closes #209). chmod now accepts the symbolic grammar [ugoa]*[+-=][rwxXstugo]* in addition to octal: chmod +x file, chmod u+w,g-r file, chmod o=u file, chmod g+s file, chmod +X dir all work. Routing chooses by the first char of the mode arg (digit → existing octal path, else symbolic). Symbolic clauses are applied per-file against that file's current mode so chmod -R u+w is meaningful even when files have differing starting modes. Supports +/-/=, copy-from-class (u/g/o inside perm), X (execute-if-dir-or-already-exec), s (setuid when u in who, setgid when g), t (sticky), multi-clause comma lists, and chained op-perm pairs (u+w-x). = clears the who-selected bits before setting (matches GNU coreutils — u=rw on a setuid file does drop setuid). The -r recursive alias was removed (it collided with symbolic r); only -R remains. 21 net-new tests bring mybox suite to 632/632 passing.
  • Fix: test_credentials.py S5 scans for helper's record instead of asserting on the first (Feature 165 — closes #313). /proc/audit/data streams from the OLDEST surviving record on each open. The pre-fix S5 used dd bs=112 count=1 and asserted the result was the helper's setuid(1000) call — but by the time the test runs, an early-init setresgid(-1, -1, -1) from another process (typically PID 8) has already populated the ring. The decoded record had op=5, target_payload[0]=0xFFFFFFFF — both technically correct encodings of the wrong record (the encoder pins u32::MAX as the verbatim representation of -1 as an i32; see encode_triple_sentinel unit test). Fix: read 32 records (3,584 bytes — same shape as S5b coverage scan) and scan for the matching op=SetUid + post=(1000,1000,1000,1000) record. Test-only change; no kernel touched; encoder behavior is correct as documented.
  • Fix: SYS_SETFSUID and SYS_SETFSGID syscall numbers off by one (Feature 164 — closes #309 partial). The kernel dispatcher had SYS_SETFSUID = 123 and SYS_SETFSGID = 124, but the Linux x86_64 ABI (arch/x86/entry/syscalls/syscall_64.tbl) specifies 122 and 123. musl's setfsuid(2)/setfsgid(2) wrappers issue syscall 122/123; the kernel left both unrouted (fell into the _ => ENOSYS arm), so no AuditOp::SetFsuid (=7) or AuditOp::SetFsgid (=8) record was ever emitted and the F072/F073 9-op audit coverage helper was silently missing two ops. Fix is a one-line constant change in kernel/src/proc/syscall.rs. Added abi_pin_tests::credential_syscall_numbers_match_linux_x86_64 regression test pinning all 11 credential syscall numbers (SYS_SETUID/SETGID/SETPGID/SETSID/SETGROUPS/SETRESUID/GETRESUID/SETRESGID/GETRESGID/SETFSUID/SETFSGID) against the upstream table — silent drift here means userland binaries land on the wrong handler. Kernel tests 553/553 passing. SetEuid/SetEgid coverage gaps remain (musl wraps seteuid/setegid through setresuid/setresgid, so they dispatch to ops 2/5, not 1/4 — separate work). Spec: specs/164-setfsuid-syscall-nr/.
  • Fix: ELF AT_PHDR auxv entry now passes phdr virtual address, not file offset — closes Bug B of #175 + #178 (Feature 162). kernel/src/proc/elf.rs:376 was writing AT_PHDR = hdr.e_phoff directly into the new process's auxv. e_phoff is the FILE offset of the program-header table within the ELF image; the correct AT_PHDR value is the VIRTUAL address at which the phdrs are mapped at runtime. For a static-musl binary linked at the SysV-ABI default 0x400000, e_phoff = 0x40 but the phdrs map at vaddr 0x400040 (LOAD[0].p_vaddr + e_phoff). Pre-fix, musl's __init_tls dereferenced 0x40 (NULL+0x40) to walk PT_TLS/PT_DYNAMIC, faulted instantly, and the kernel reaped the process with SIGSEGV (exit code=-11) BEFORE a single syscall ran — so the symptom was "execve succeeds, then process exits silently with zero stdout". Every static-musl C helper failed this way; Rust mybox (no AT_PHDR consumption) and bare inline-asm binaries (skip libc init) were unaffected, which is why the bug stayed invisible for the whole tests/credentials/*.c corpus regression window. The fix is a 1-line site change + a 15-line resolve_phdr_vaddr(data, hdr) helper that walks PT_LOAD segments to find the one covering e_phoff and returns p_vaddr + (e_phoff - p_offset); falls back to bare e_phoff if no LOAD covers it (preserves prior behavior for malformed binaries — no regression). Diagnosis path used --features debug-syscall (existing flag, gets a new make image-debug-syscall target) + direct kprint! instrumentation in sys_write/writev/execve/exit_group; bisect was rejected because backward git-bisect breaks SSH at F088 (per F093 research). The page-fault address 0x40 + rip in musl's __init_tls+0x55 (mov (%rax),%edx) gave away the root cause in one read. 2 net-new kernel regression tests pin the invariant (synthesized ET_EXEC + assert resolve_phdr_vaddr returns vaddr, not file offset) bringing kernel suite to 552/552 passing. 
Live verification: test_hello_smoke.py PASS (WRITE + PRINTF both observed); test_suid_elevation.py PASS (RESULT: PASS × 2 + 2 audit op=9 records in delta — also closes #175 Bug A's still-deferred live SC since F091/F092 + this fix); test_credentials.py now runs (no 240 s hang) but exposes pre-existing F072/F073/F074 audit-emission bugs that are independent of Bug B (to be filed as follow-ups). Zero new BSS (logic-only fix; SC-006 met). Spec: specs/162-musl-stdio-fix/.
  • Fix: kernel sys_write to a pipe now blocks instead of dropping data past the 4 KiB ring (closes #106). Both sys_write call sites for PIPE_FS_ID (stdout/stderr fast path at line ~322; general-fd path at line ~456) now route through a new pipe_write_blocking helper that loops over the kernel pipe primitives, yielding when the ring is full and any reader is still attached, returning EPIPE if all readers close before any byte was written. Before this fix, a single write(fd, buf, N) for N > 4096 returned only what fit in the ring (pipe::write caps at BUF_SZ), and the rest was silently dropped — dmesg | tail -5 and ls /bin | head -12 both returned empty chunks because dmesg's first write filled the ring and musl's stdio gave up. O_NONBLOCK still returns EAGAIN immediately. 7 new unit tests in kernel/src/proc/pipe.rs document the kernel pipe contract (write caps at BUF_SZ, write_space tracks outstanding bytes, read_ready flips on writer-close, has_readers is the EPIPE signal, ring wraparound preserves bytes across (head + count) % BUF_SZ). New pipe::write_space(idx) accessor for callers that need to size their reads/writes. Live end-to-end verification (running dmesg | tail -5 over SSH) is currently blocked by #105 — the kernel [BAD-RET] scheduling panic fires during dropbear's SSH-handshake handling before any pipe activity from a test command can complete; will resolve automatically when #105 is fixed.
  • nsh stderr redirects (2>, 2>>, 2>&1) (Feature 116 — closes #224). Pre-fix, cmd 2>&1 was a parse error and cmd 2> file silently mis-parsed as cmd 2 > file (the 2 became a literal arg, stdout went to file, stderr still hit the terminal). Post-fix: tokenizer intercepts 2-followed-by-> BEFORE the digit falls into the word loop. Three new tokens (RedirErrOut, RedirErrAppend, RedirErrToOut) and three new Redir::Err* variants. External command 2>&1 uses unsafe pre_exec { dup2(1, 2) } so the merge happens at exec time regardless of how stdout was set up; builtins do the same via in-process dup2. 4 net-new tests bring nsh suite to 61/61 passing. Spec: specs/116-nsh-stderr-redirect/.
  • nsh ~ tilde expansion (Feature 115 — closes #222). Natural F107 follow-up: leading ~ (followed by / or end-of-arg) now expands to $HOME. Applied per-arg AND per-redirect-target inside expand_pipeline. POSIX-leaning rules: only leading ~, only when followed by / or end-of-arg; ~user, ~+, ~- out of scope; HOME-unset falls through to literal ~. 5 net-new tests bring nsh suite to 57/57 passing. Spec: specs/115-nsh-tilde-expansion/.
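The tilde rules above fit in a few lines. The function below is an illustrative sketch, not nsh's code: the name `expand_tilde` and passing `$HOME` as an `Option<&str>` are assumptions chosen so the example stays free of process-global state.

```rust
/// Expand a leading `~` to `home`, but only when the `~` is followed by `/`
/// or ends the argument. `~user`, `~+`, `~-`, and a mid-word `~` are left
/// untouched, and an unset HOME (`None`) leaves the literal `~` in place.
pub fn expand_tilde(arg: &str, home: Option<&str>) -> String {
    let home = match home {
        Some(h) => h,
        None => return arg.to_string(),
    };
    match arg.strip_prefix('~') {
        Some(rest) if rest.is_empty() || rest.starts_with('/') => {
            format!("{}{}", home, rest)
        }
        _ => arg.to_string(),
    }
}
```

In this model the same function would be applied once per argument and once per redirect target.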
  • nsh cd - + cd → $HOME + PWD/OLDPWD (Feature 107 — closes #205). Fixes two POSIX shortcomings: cd - now toggles to $OLDPWD (was cd: -: No such file or directory); bare cd now defaults to $HOME (was /). Every successful cd updates the PWD + OLDPWD env vars so the prompt, scripts, and external tools can see the change. First-ever cd - returns rc=1 with cd: OLDPWD not set. Test serialization via Mutex<()> since builtin_cd mutates process-global state (CWD + env vars). 4 net-new tests bring nsh suite to 52/52 passing. Spec: specs/107-nsh-cd-dash-home/.
  • nsh || run-on-failure operator (Feature 106 — closes #203). The classic POSIX cmd && ok || fail idiom now works. || tokenizes as a single operator (greedy match before single |), and Script.seps was widened from Vec<bool> to Vec<Sep> where Sep ∈ { Always, And, Or }. Mixed chains evaluate left-to-right per POSIX. The run loop in userland/shell/src/main.rs skips a pipeline when the sep is And and the prior rc != 0, OR when the sep is Or and the prior rc == 0. 8 net-new tests bring nsh suite to 48/48 passing. Spec: specs/106-nsh-or-operator/.
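The skip rule in the run loop can be illustrated with a small model. The `Sep` variants come from the text; `should_skip` and `run_chain` are hypothetical names, and each pipeline is reduced to a fixed exit code so the example runs without spawning processes.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Sep {
    Always, // `;` or newline
    And,    // `&&`
    Or,     // `||`
}

/// True when the pipeline that follows `sep` must be skipped,
/// given the exit code of the last pipeline that actually ran.
pub fn should_skip(sep: Sep, prev_rc: i32) -> bool {
    match sep {
        Sep::Always => false,
        Sep::And => prev_rc != 0,
        Sep::Or => prev_rc == 0,
    }
}

/// Evaluate a chain left to right. `exit_codes[i]` is what pipeline `i`
/// would return if it ran; `seps[i]` sits between pipelines `i` and `i + 1`.
/// Returns the final exit code and the indices of the pipelines that ran.
pub fn run_chain(exit_codes: &[i32], seps: &[Sep]) -> (i32, Vec<usize>) {
    let mut rc = 0;
    let mut ran = Vec::new();
    for (i, &code) in exit_codes.iter().enumerate() {
        if i > 0 {
            let sep = seps.get(i - 1).copied().unwrap_or(Sep::Always);
            if should_skip(sep, rc) {
                continue; // rc is left unchanged by a skipped pipeline
            }
        }
        rc = code;
        ran.push(i);
    }
    (rc, ran)
}
```

For `cmd && ok || fail` with a failing `cmd`, the model runs pipelines 0 and 2; with a succeeding `cmd`, it runs 0 and 1.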
  • mybox uname POSIX defaults + bundling + -p/-i/-o (Feature 118, closes #228). Pre-fix, bare uname printed -a-style multi-field output (a POSIX violation; it should print the sysname only); uname -ZZZ silently accepted unknown flags; -p/-i/-o were missing. Post-fix: bare uname → Linux; flag bundling works (uname -srm); unknown flags return rc=1; -o prints GNU/Linux; -p and -i print unknown (matches GNU coreutils uname when processor info is unavailable). 8 net-new tests bring mybox suite to 461/461 passing. Spec: specs/118-mybox-uname-posix/.
  • mybox basename -a + -s SUFFIX (Feature 117 — closes #226). Pre-fix basename -a /a/foo /b/bar /c/baz silently mis-ran — the literal -a became the path, the actual paths became suffixes to strip, output was -a and rc=0. Post-fix: -a/--multiple switches to multi-operand mode (one basename per line); -s SUFFIX / --suffix=SUFFIX strips SUFFIX from each result (implies -a). The 1-arg and 2-arg POSIX forms are preserved unchanged. 6 net-new tests bring mybox suite to 453/453 passing. Spec: specs/117-mybox-basename-multi/.
  • mybox date strftime (Feature 114 — closes #220). Stub replaced with real strftime. Supported tokens: %Y %y %m %d %H %M %S %j %s %n %t %% %T %D %F %R. Unknown %X pass through literally. -u accepted as no-op (output already UTC); -d returns rc=1 with deferred-feature error. No-arg path preserved. 12 net-new tests bring mybox suite to 447/447 passing. Spec: specs/114-mybox-date-strftime/.
  • mybox env POSIX semantics (Feature 113 — closes #218). Big behavioral fix: the pre-fix env was a stub that printed std::env::vars() and ignored everything else — including assignments AND any command name. Post-fix supports the full POSIX form env [-i] [-u NAME]... [NAME=VALUE]... [COMMAND [ARG]...]: assignments are applied, -u unsets, -i clears, and if a COMMAND is given it execs with the modified env (rc propagated from child; rc=127 when not found). Output is sorted for deterministic testability. 8 net-new tests bring mybox suite to 435/435 passing. Spec: specs/113-mybox-env-posix/.
  • mybox cut -c CHARS (Feature 112 — closes #216). POSIX byte-position extraction was missing entirely: cut -c 2-4 was rejected with unknown option. Implementation reuses the existing -f list parser (was parse_fields, now parse_list since both modes share the LIST grammar N, N-M, N-, -M, comma-combos). New Mode { Fields, Chars } enum keeps the two paths cleanly separate. Mutual exclusion enforced: -f X -c Y → rc=1. 4 net-new tests bring mybox suite to 427/427 passing. Spec: specs/112-mybox-cut-c/.
  • mybox seq -w + unknown-flag rejection (Feature 111 — closes #214). Two bugs in one shot: (1) seq -w 8 12 now produces 08 09 10 11 12 per GNU (was producing 1\n9 — silently mis-running because -w was parsed as a malformed numeric arg and the remaining tokens filled INCR + LAST); (2) unknown flags like -z now return rc=1 with descriptive error (was silently coerced into nonsense numeric output). Negative-arg path preserved (seq -3 -1 still works). 5 net-new tests bring mybox suite to 423/423 passing. Spec: specs/111-mybox-seq-w-and-unknown/.
  • mybox echo -e (interpret escapes) (Feature 110 — closes #212). Adds GNU/BusyBox-style -e (and -E for explicit-off) flags to echo. Recognized escapes: \\ \n \t \r \a \b \v \f \e \0NNN \xHH \c (where \c stops output and suppresses the trailing newline). Flag bundling works in any order (-ne, -en, -Ene); rightmost e/E wins. Unrecognized escapes pass through as \z per GNU. 9 net-new tests bring mybox suite to 418/418 passing. Spec: specs/110-mybox-echo-e/.
  • mybox wc -L (longest line length) (Feature 109 — closes #210). POSIX 2024 / GNU --max-line-length flag. wc -L FILE reports the byte length of the longest line; bundling (-lL, -wL) works. count() extended from 3-tuple to 4-tuple (lines, words, bytes, max); the byte-streaming pass already touches every byte once, so adding max-line tracking is essentially free. 4 net-new tests bring mybox suite to 409/409 passing. Spec: specs/109-mybox-wc-L/.
  • mybox printf format-string reuse (Feature 108 — closes #207). Real POSIX bug fix: printf '[%s] ' a b c was outputting [a] instead of [a] [b] [c] . Implementation adds count_conversions(fmt) (skipping %%) and format_output_repeated(fmt, args) that loops the format string per round of args. Edge case: a format with zero conversion specifiers is emitted exactly once, even if extra args are present (POSIX-required). 5 net-new tests bring mybox suite to 405/405 passing. Spec: specs/108-mybox-printf-reuse/.
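The format-reuse loop can be sketched as follows. The function names `count_conversions` and `format_output_repeated` come from the text, but the bodies are assumptions: this simplified version understands only `%s` and `%%`, and `format_once` is a hypothetical helper.

```rust
/// Count `%s` conversions in `fmt`, skipping the literal `%%`.
pub fn count_conversions(fmt: &str) -> usize {
    let mut n = 0;
    let mut chars = fmt.chars();
    while let Some(c) = chars.next() {
        if c == '%' {
            // Consume the character after '%', so "%%" is never counted.
            if let Some('s') = chars.next() {
                n += 1;
            }
        }
    }
    n
}

/// Render `fmt` once, consuming arguments left to right.
/// Missing arguments render as the empty string.
fn format_once(fmt: &str, args: &[&str]) -> String {
    let mut out = String::new();
    let mut next_arg = args.iter();
    let mut chars = fmt.chars();
    while let Some(c) = chars.next() {
        if c != '%' {
            out.push(c);
            continue;
        }
        match chars.next() {
            Some('s') => out.push_str(next_arg.next().copied().unwrap_or("")),
            Some('%') => out.push('%'),
            Some(other) => {
                out.push('%');
                out.push(other);
            }
            None => out.push('%'),
        }
    }
    out
}

/// Reuse `fmt` until all arguments are consumed. A format with zero
/// conversions is emitted exactly once, even if extra arguments exist.
pub fn format_output_repeated(fmt: &str, args: &[&str]) -> String {
    let per_round = count_conversions(fmt);
    let mut out = String::new();
    let mut rest = args;
    loop {
        let take = per_round.min(rest.len());
        let (round, tail) = rest.split_at(take);
        out.push_str(&format_once(fmt, round));
        rest = tail;
        if per_round == 0 || rest.is_empty() {
            break;
        }
    }
    out
}
```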
  • mybox wc -c byte-accurate (Feature 105, closes #201). Real bug fix: pre-fix, mybox wc -c over-counted bytes by 1 on every file without a trailing newline (the per-line +1 for the newline was always added, even on the final unterminated line). Pre-fix, printf 'abc' | mybox wc -c → 4; post-fix → 3. Implementation switches count() from BufRead::lines() (UTF-8-restricted, lossy on binary input) to read_to_end() over raw bytes; counts lines as b'\n' occurrences and words via an in-word/out-word state machine on ASCII whitespace. Side benefit: binary input no longer errors. 3 net-new regression tests bring mybox suite to 400/400 passing. Spec: specs/105-mybox-wc-c-fix/.
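A byte-oriented counter along the lines described above might look like this. It is a sketch, not mybox's `count()`: the 3-tuple return and the use of `u8::is_ascii_whitespace` as the word delimiter are assumptions of the example.

```rust
/// Count (lines, words, bytes) over raw bytes.
/// Lines are occurrences of b'\n'; words are maximal runs of bytes that
/// are not ASCII whitespace; bytes is simply the input length.
pub fn count(data: &[u8]) -> (usize, usize, usize) {
    let mut lines = 0;
    let mut words = 0;
    let mut in_word = false;
    for &b in data {
        if b == b'\n' {
            lines += 1;
        }
        if b.is_ascii_whitespace() {
            in_word = false;
        } else if !in_word {
            // Transition from whitespace (or start of input) into a word.
            in_word = true;
            words += 1;
        }
    }
    (lines, words, data.len())
}
```

Because nothing here decodes UTF-8, an unterminated final line contributes no phantom newline byte and binary input is counted without error.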
  • mybox head -c BYTES + tail -c BYTES (Feature 104 — closes #199). POSIX byte-count flag for both applets. head -c 5 emits the first 5 bytes; tail -c 4 emits the last 4 bytes; oversize N returns the whole file; -c 0 emits nothing. Implementation uses a Mode { Lines, Bytes } enum so the existing line-mode path is unchanged. Last-flag-wins matches GNU coreutils. 8 net-new tests (4 head + 4 tail) bring mybox suite to 397/397 passing. Spec: specs/104-mybox-head-tail-c/.
  • mybox cat -n (Feature 103 — closes #197). Adds POSIX line-numbering to cat. -n prefixes each output line with right-aligned 6-col count + tab; numbering continues across multiple file arguments and works on stdin. No-flag path unchanged (still byte-perfect chunked copy — important for binary files post-F095). 4 net-new tests bring mybox suite to 389/389 passing. Spec: specs/103-mybox-cat-n/.
  • mybox sort bundled short flags (Feature 102 — closes #195). Fourth application of the F097/F098/F101 bundled-flag pattern. sort -rn, -ru, -nu, -nr (order-independent) all parse per POSIX. Char set: r n u. 4 net-new tests bring mybox suite to 385/385 passing. Spec: specs/102-mybox-sort-bundled-flags/.
  • mybox uniq bundled short flags (Feature 101 — closes #193). uniq -cd, -cu, -cdu, -dc (order-independent) all parse correctly. Same fix pattern as F097 (wc) and F098 (grep), adapted for uniq's char set (c d u). 4 net-new tests bring mybox suite to 381/381 passing. Spec: specs/101-mybox-uniq-bundled-flags/.
  • nsh $VAR + ${VAR} env-var expansion (Feature 100 — closes #191). nsh now expands $NAME and ${NAME} via env::var(name), with empty-on-unset (POSIX default). Extends F096's $? + $$ expander from a two-fixed-string str.replace() to a proper char-walker over the input, dispatching per-char on what follows $. NAME is a POSIX identifier ([a-zA-Z_][a-zA-Z0-9_]*); the walker stops at the first non-identifier char so $HOME/foo correctly expands to <HOME>/foo. Braced form ${NAME} behaves identically. Tokens like $0, $@, ${} pass through as literal (positional args + special params deferred to v3). Defers ${VAR:-default}, ${VAR:+alt}, $(cmd) command substitution, and POSIX-correct quoting (single-quote should NOT expand, double should — v1 is permissive across the board). 6 net-new tests (and one F096 test updated to reflect superseded behavior) bring nsh suite to 40/40 passing. Spec: specs/100-nsh-env-var-expansion/.
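The char-walker can be approximated as below. This is an illustrative sketch: `expand_vars` and its helpers are hypothetical names, the environment is injected as a `lookup` closure instead of `env::var` so the example is deterministic, and `$?` / `$$` handling from F096 is left out.

```rust
/// POSIX identifier character test: [a-zA-Z_] first, then [a-zA-Z0-9_].
fn is_ident_char(c: char, first: bool) -> bool {
    c == '_' || c.is_ascii_alphabetic() || (!first && c.is_ascii_digit())
}

fn is_identifier(name: &str) -> bool {
    !name.is_empty() && name.chars().enumerate().all(|(k, c)| is_ident_char(c, k == 0))
}

/// Expand `$NAME` and `${NAME}`; unset variables expand to the empty string.
/// Anything else after `$` (for example `$0`, `$@`, `${}`) stays literal.
pub fn expand_vars<F: Fn(&str) -> Option<String>>(input: &str, lookup: F) -> String {
    let chars: Vec<char> = input.chars().collect();
    let mut out = String::new();
    let mut i = 0;
    while i < chars.len() {
        if chars[i] != '$' {
            out.push(chars[i]);
            i += 1;
            continue;
        }
        if chars.get(i + 1) == Some(&'{') {
            // Braced form: find the closing brace and validate the name.
            if let Some(len) = chars[i + 2..].iter().position(|&c| c == '}') {
                let name: String = chars[i + 2..i + 2 + len].iter().collect();
                if is_identifier(&name) {
                    out.push_str(&lookup(&name).unwrap_or_default());
                    i += len + 3; // skip `${`, the name, and `}`
                    continue;
                }
            }
        } else {
            // Bare form: stop at the first non-identifier character.
            let start = i + 1;
            let mut end = start;
            while end < chars.len() && is_ident_char(chars[end], end == start) {
                end += 1;
            }
            if end > start {
                let name: String = chars[start..end].iter().collect();
                out.push_str(&lookup(&name).unwrap_or_default());
                i = end;
                continue;
            }
        }
        // Not a recognized expansion: keep the `$` literally.
        out.push('$');
        i += 1;
    }
    out
}
```

Like the v1 behavior described above, this sketch ignores quoting entirely.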
  • mybox grep bundled short flags (Feature 098 — closes #188). grep -rn, -vi, -vc, -cn, -rl, etc. now parse correctly per POSIX. Same fix pattern as F097/wc, adapted for grep's char set (c i v n r l) and exit-code convention (rc=2 for usage errors, not 1). 4 net-new tests bring mybox suite to 377/377 passing. Spec: specs/098-mybox-grep-bundled-flags/.
  • mybox wc bundled short flags (Feature 097 — closes #186). wc -lw, -lwc, -wc, -wl (order-independent) all parse correctly. Single bundled-flag arm added to wc's option-parse loop, mirroring the pattern mybox ls already uses for -la / -al. Unknown chars in a bundle return unknown option '-X' and rc=1. 4 net-new tests bring mybox suite to 373/373 passing. Spec: specs/097-mybox-wc-bundled-flags/.
  • nsh $? + $$ variable expansion (Feature 096 — closes #184). nsh now expands $? (last command exit code) and $$ (current shell PID) in argv elements and redirect paths. Pre-F096, true; echo $? printed literal $?; every shell exit-code-checking idiom (if [ $? -ne 0 ]) was silently broken. Implementation is two helpers in userland/shell/src/main.rs: expand_dollar_vars(s, last_exit, pid) does the string substitution; expand_pipeline() walks the pipeline's argv + redirs and mutates in place. Called per-pipeline so each pipeline sees the PREVIOUS pipeline's exit code (POSIX-faithful evaluation order). v1 is permissive — uses str.replace() which expands inside single quotes too; proper quoting semantics is a v2 follow-up. Defers $VAR env-var expansion, ${VAR} braced syntax, ${VAR:-default} default-value, and $(cmd) command substitution. 6 net-new tests bring nsh suite to 34/34 passing. Spec: specs/096-nsh-exit-pid-expansion/.
  • nsh built-in cat — binary-safe (Feature 095 — closes #182). nsh's builtin_cat previously used fs::read_to_string() which REQUIRES valid UTF-8 — every binary file (ELF, qcow2, image, .o) triggered cat: <path>: stream did not contain valid UTF-8. Fix switches to byte-level I/O: fs::read() for files, io::stdin().read_to_end() for stdin, io::stdout().lock().write_all() for output. BrokenPipe on stdout exits 0 (POSIX-faithful). Brings nsh's built-in to parity with mybox cat (which was already byte-safe). 3 net-new tests bring nsh suite to 28/28 passing. Spec: specs/095-nsh-cat-binary-safe/.
  • mybox tail / head POSIX -N shorthand (Feature 094 — closes #180). tail -20 (≡ tail -n 20) and head -3 (≡ head -n 3) now work — the POSIX/GNU shorthand. Pre-F094, single-dash-number tokens fell through to the catchall and got mybox: tail: unknown option '-20'. 8 net-new tests (4 per applet) bring mybox suite to 369/369 passing. Spec: specs/094-mybox-tail-head-shorthand/.
  • Bug B of #175 diagnosis + harness scaffolding (Feature 093 — fix deferred to #178/F094). Ships the diagnostic infrastructure for the second bug in #175 (static-musl helpers produce zero stdout under SSH): (a) tests/credentials/hello_smoke.c — a 10-LOC C source that does write(1, "WRITE\n", 6); printf("PRINTF\n"); fflush; exit(0), the simplest possible probe for stdio-via-musl regressions; (b) tests/boot/test_hello_smoke.py (~200 LOC) — paramiko + SCP + scrape harness that boots the image, compiles the helper on demand (musl-gcc), transfers it, runs via SSH, and exit-coded based on whether both expected lines appear; (c) a documented Phase A2 differential at HEAD that proved the bug is in musl's __libc_start_main (or fd-state-at-entry), NOT in the kernel I/O path itself — a bare-syscall inline-asm binary (no libc) prints correctly via the same SSH session, while every musl-static binary fails silently. Bisect across F074-F092 was attempted but blocked by SSH-config drift backward in history (P2 at F088 couldn't even establish SSH). No kernel changes — kernel suite stays at 550/550 passing (post-F092 baseline). The actual fix needs --features debug-syscall rebuild + full syscall-chain capture to identify the musl-init failure point; tracked at #178. Spec: specs/093-musl-stdio-bisect/.
  • chmod(2) / fchmod(2) / fchmodat(2) syscalls — Bug A of #175 closed (Feature 092). Replaces the pre-existing no-op stubs at kernel/src/proc/syscall.rs:327-328 (SYS_CHMOD => 0, SYS_FCHMOD => 0) with real implementations that actually update the inode's i_mode on disk. Adds a new SYS_FCHMODAT (= 268) dispatch arm (was falling into _ => ENOSYS). The three handlers live in a new module kernel/src/proc/syscalls/chmod.rs (~280 LOC incl. 16 unit tests + cfg(test) stubs that exercise the truth-table logic without booting QEMU). The kernel walks the VFS to locate the inode, enforces a Linux-faithful owner-or-root permission gate (FR-005 — caller's euid == 0 || euid == file_uid; otherwise EPERM), then delegates to a new Filesystem::chmod(inode, mode) trait method (default-impl Err("chmod not supported"); overridden by Ext2Filesystem::chmod to do the actual on-disk mutation). The ext2 driver applies the file-type-bit-preservation mask (existing & 0o170000) | (mode & 0o7777) per FR-004 — caller's mode argument is silently masked to the bottom 12 bits (rwx + setuid + setgid + sticky); the existing inode's file-type bits (S_IFREG 0o100000, S_IFDIR 0o040000, etc.) are always preserved. i_ctime is updated per POSIX (chmod changes inode metadata); i_mtime is NOT touched (chmod doesn't change file content). Per Q1 clarification, fchmodat(dirfd, ...) returns ENOTSUP (-95) for any dirfd != AT_FDCWD (-100) — full dirfd-relative path resolution is deferred to a v2 alongside openat/mkdirat. flags != 0 (e.g., AT_SYMLINK_NOFOLLOW) returns EINVAL rather than silently treating as flags=0. Unblocks #175 Bug A: every SCP+chmod+exec userland flow that pre-F092 silently failed (because chmod 755 /tmp/foo returned 0 but the file stayed mode 100644) now actually works. Bug B remains open (static-musl helpers produce no stdout under SSH — separate root cause; F093). 
Zero new BSS (SC-006 — size target/kernel.elf .bss actually shrank by 16 bytes post-feature; the cfg(test)/cfg(not(test)) split eliminates more dead code than the new handlers add). 16 net-new kernel unit tests bring the suite from 534 → 550 passing. New corpus C helper tests/credentials/chmod_basic.c (chmod + fchmod + fchmodat round-trip, runtime use blocked by Bug B until #175 fully closes). Spec: specs/092-chmod-syscall/.
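The permission gate and the mode-merge mask can be written out as a small pure function. This is a sketch of the rules stated above, not the kernel handler: `merge_mode` and `chmod_model` are hypothetical names, and the error is returned as a positive errno inside `Err` purely for illustration.

```rust
pub const S_IFMT: u16 = 0o170000; // file-type bits, always preserved
pub const PERM_MASK: u32 = 0o7777; // rwx + setuid + setgid + sticky
pub const EPERM: i32 = 1;

/// Combine the inode's existing file-type bits with the caller's
/// permission bits; anything above the bottom 12 bits is masked off.
pub fn merge_mode(existing: u16, requested: u32) -> u16 {
    (existing & S_IFMT) | ((requested & PERM_MASK) as u16)
}

/// Owner-or-root gate followed by the merge. Returns the new `i_mode`.
pub fn chmod_model(euid: u32, file_uid: u32, existing: u16, requested: u32) -> Result<u16, i32> {
    if euid != 0 && euid != file_uid {
        return Err(EPERM);
    }
    Ok(merge_mode(existing, requested))
}
```

For the SCP + chmod flow in the text, a regular file at mode 100644 given `chmod 755` ends up at 100755, with the S_IFREG bits untouched.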
  • Live S_ISUID corpus integration scaffolding (Feature 091 — closes #173). Ships the test-harness scaffolding that turns F090's two corpus C helpers (tests/credentials/suid_elevation.c + suid_dropelev_roundtrip.c) into image-installed runnable binaries: (a) a reusable install_suid_binary shell helper in build/scripts/assemble-image.sh that pre-chmods the host file + pins host mtime to SOURCE_DATE_EPOCH (SC-006 reproducibility) + emits 4 debugfs directives (write + 3× sif for mode / uid / gid) so a file lands in the ext2 rootfs with explicit mode and root ownership regardless of the host build-user uid; (b) a static-musl launcher tests/credentials/suid_launcher.c (~25 LOC) that setuid(1000) + execve(argv[1], argv+1, envp) to bridge the root-only SSH login surface to a non-root exec context — drops privilege at the kernel boundary so F090's apply_exec_credentials truth table fires under realistic conditions; (c) the orchestrator tests/boot/test_suid_elevation.py (~250 LOC) that boots the image, captures pre/post audit-ring deltas (per Q1 delta-scoping clarification), runs both helpers via the launcher, scrapes RESULT: PASS, and confirms ≥ 1 audit record with op = 9 (AuditOp::ExecSuid) in the delta window. Three new image artifacts ship with the install: /usr/bin/suid_elevation (mode 0o104755, uid=0, ~37 KiB), /usr/bin/suid_dropelev_roundtrip (same shape, ~44 KiB), /root/secret (mode 0o100600, uid=0, 11 bytes — fixed payload "top secret\n" for SC-006 reproducibility). Verified statically via debugfs -R 'stat ...' dist/myos2026.qcow2 showing all three with correct mode + owner. SC-006 reproducibility preserved (two consecutive make image produce byte-identical qcow2 — verified). Zero kernel changes; kernel suite stays at 534/534 passing (SC-004). FR-010: NOT added to make ci-local's required matrix in v1 — lives as opt-in standalone (python3 tests/boot/test_suid_elevation.py dist/myos2026.qcow2); CI promotion is a separate follow-up after stability data. 
Live runtime verification (SC-001 + SC-003) DEFERRED to #175 — SCP'd static-musl helpers currently produce zero stdout under SSH on this dev box (affects test_credentials.py too, which times out at 240s); root cause is likely in the same area as the long-standing #50 dropbear flake. The harness scaffolding will work end-to-end once #175 is resolved — no F091 code change needed. Spec: specs/091-suid-live-integration/.
  • S_ISUID / S_ISGID semantics on execve(2) — closes #141 (Feature 090). Wires sys_execve to honor the executable file's S_ISUID (0o4000) / S_ISGID (0o2000) mode bits per Linux semantics: a non-root caller (euid != 0) execve()-ing a binary owned by root with S_ISUID set obtains euid = file_owner_uid + suid = file_owner_uid + fsuid = file_owner_uid (canonical sudo/passwd/ping pattern); ruid and supplementary groups remain unchanged. S_ISGID lifts egid/sgid/fsgid identically. Root-special-case (FR-005): euid == 0 caller is a no-op — applying suid-from-non-root to root would be a privilege DROP, which the mechanism is not designed for (Linux faithful). FR-008 failure-isolation: the credential transition runs the SINGLE line after table::execve() Ok-return, so any earlier bail (open/stat/read/load) returns BEFORE any cred change — a failed exec leaves the caller's credentials byte-identical. The transition is encapsulated in a new pure Credentials::apply_exec_suid(file_mode, file_uid, file_gid) -> bool (truth-table-as-function pattern from F071 — unit-testable without booting QEMU) plus a thin proc::credentials::apply_exec_credentials(pid, ...) wrapper that takes PTABLE.lock once and emits the audit record. Two ABI extensions (both append-only): (1) AuditOp::ExecSuid = 9 in F073's enum — every credential-changing exec emits one 112-byte audit record discriminating user-driven setuid syscalls (op 0-8) from binary-driven elevation (op 9); pre-F090 parsers see "unknown op" and log-and-skip (no crash, no misclassification). (2) AT_SECURE = 1 in the new process's auxv after a transition fires (else 0) — musl/glibc read this and enable secure-execution mode (strip LD_PRELOAD / LD_LIBRARY_PATH / LD_AUDIT, ignore LD_* debug knobs) closing the env-injection attack vector on suid binaries. 
The auxv slot for AT_SECURE was already wired by load_elf at u64 slot 25/26 with initial value 0; F090 adds a secure_exec: bool param to write_argv_stack that walks the saved auxv pairs and overwrites AT_SECURE's value (no const bumps, no new entries). Zero new BSS (SC-006 — every storage element is Pcb.creds field, .rodata enum variant, or transient user-stack auxv). 11 net-new kernel unit tests bring the suite to 523 → 534 passing (FR-002 isuid lifts euid, FR-004 isgid lifts egid, FR-005 root no-op, FR-007 no-bits no-op, FR-006 drop-and-re-elevate via seteuid round-trip, both-bits, op=9 ABI pin, encode_exec_suid shape, AT_SECURE=1 / =0 / missing-no-op). The drop-and-re-elevate test (T009) caught a subtle Linux-semantics gotcha: setuid(2) as root resets ALL three (uid, euid, suid) so the drop idiom MUST use seteuid() to preserve the saved-set; the test now asserts the right call. Out of scope: MS_NOSUID mount flag (deferred to mount-model feature), file capabilities (xattrs / CAP_* — separate F071 follow-up #142), live image-installed setuid corpus run (deferred to follow-up — needs debugfs sif-for-uid + chmod-on-host-temp-for-mode + non-root invocation path in the test harness). Spec: specs/090-execve-suid/.
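The exec-time truth table lends itself to a pure function, as the text notes. The sketch below is an assumption-laden model rather than the kernel's `Credentials`: the field layout and the `for_user` constructor are hypothetical, and "returns true when a set-id bit was applied" is this example's reading of when a transition fires.

```rust
pub const S_ISUID: u32 = 0o4000;
pub const S_ISGID: u32 = 0o2000;

#[derive(Clone, Debug, PartialEq)]
pub struct Credentials {
    pub ruid: u32,
    pub euid: u32,
    pub suid: u32,
    pub fsuid: u32,
    pub rgid: u32,
    pub egid: u32,
    pub sgid: u32,
    pub fsgid: u32,
}

impl Credentials {
    /// Hypothetical constructor: all uid fields = `uid`, all gid fields = `gid`.
    pub fn for_user(uid: u32, gid: u32) -> Self {
        Credentials {
            ruid: uid, euid: uid, suid: uid, fsuid: uid,
            rgid: gid, egid: gid, sgid: gid, fsgid: gid,
        }
    }

    /// Apply set-id bits from the executable's mode. Returns true when a
    /// transition happened (the caller would then set AT_SECURE = 1 and
    /// emit the audit record). ruid/rgid are never modified.
    pub fn apply_exec_suid(&mut self, file_mode: u32, file_uid: u32, file_gid: u32) -> bool {
        if self.euid == 0 {
            return false; // root caller: no-op
        }
        let mut changed = false;
        if file_mode & S_ISUID != 0 {
            self.euid = file_uid;
            self.suid = file_uid;
            self.fsuid = file_uid;
            changed = true;
        }
        if file_mode & S_ISGID != 0 {
            self.egid = file_gid;
            self.sgid = file_gid;
            self.fsgid = file_gid;
            changed = true;
        }
        changed
    }
}
```

Keeping the logic free of locks and I/O is what makes it testable without booting QEMU; the wrapper that takes the process-table lock and emits the audit record stays thin.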
  • /proc/waitq-fallback-total — cross-subsystem WaitQueue saturation observability (Feature 089 — closes #157). Single-line procfs file at /proc/waitq-fallback-total whose read returns kernel::sched::wait::TOTAL_FALLBACKS (the global counter populated by every WaitQueue saturation event across the 7 #155-migrated subsystems: pipe/socket-recv/UNIX-accept/futex/pty/tty/kbd) as a decimal ASCII u64 + \n. Operators poll-and-diff the file every ~10s to detect any kernel waitq saturation event without parsing the kmsg ring (parser-fragile + bounded by 256 KiB ring retention). Format mirrors /proc/uptime shape: single-line, world-readable (0o100444), 32-byte fixed reported size, side-effect-free read. Inode INODE_WAITQ_FALLBACK = 18 (next free after F073's audit stats). Path location chosen per Q1 clarification: top-level /proc/waitq-fallback-total instead of Linux-faithful /proc/sys/kernel/... (would need new dir-level inode infrastructure; not justified for one counter). Zero new BSS — the feature is a pure read-accessor over the existing F075 static. 3 net-new kernel unit tests (waitq_fallback_total_renders_current_value, ..._advances_with_counter, ..._read_is_side_effect_free) bring the suite to 520 → 523 passing. The side-effect-free test (FR-005 / SC-005) is the load-bearing one — it reads the file 1000× in a loop and asserts TOTAL_FALLBACKS is unchanged, preempting any future refactor that accidentally taps a wait queue from the read path (which would be a self-saturation feedback loop). Spec: specs/089-waitq-fallback-procfs/.
  • WaitQueue migration for kbd — final #155 slice, closes #155 (Feature 088). Migrates kernel/src/drivers/kbd.rs from F073-era sleep_until(u64::MAX) + single-waiter KB_WAITING_PID: AtomicU32 to F074's WaitQueue<8> primitive in a single per-system static KB_WAITQ (kbd is a single-consumer system-wide device, mirrors F087 TTY shape). 1 wait site migrated (read) using an atomic-load predicate (WRITE_IDX != READ_IDX); 1 wake site augmented (process_byte keeps legacy wake_pid(KB_WAITING_PID.swap) path AND adds post_kbd_wake() per R6 augment-not-replace risk insurance). Per FR-007 (mirrors F086/F087): EINTR-as-EOF preserves the usize return signature. Per R6 (mirrors F083/F086/F087): legacy KB_WAITING_PID field RETAINED — kbd has no correctness bug, pure latency migration. BSS delta +72 bytes (1 × WaitQueue<8> + alignment; 1.8× under the 128-byte SC-003 budget). 4 net-new kernel unit tests bring the suite to 516 → 520 passing. This PR closes #155 — the WaitQueue migration arc (started post-F074) is COMPLETE: pipe (F075), TCP/UDP/ICMP + UNIX recv (F076), UNIX accept (F083), futex (F084), pty (F086), tty (F087), kbd (F088) — all seven subsystems migrated. Total #155 BSS cost: ~18 KiB across 7 features. Spec: specs/088-kbd-waitqueue/.
  • WaitQueue migration for tty — #155 sixth slice (Feature 087). Migrates kernel/src/drivers/tty.rs from F073-era sleep_until(u64::MAX) busy-blocks + single-waiter TTY_WAITING_PID / HVC0_WAITING_PID: AtomicU32 fields to F074's WaitQueue<8> primitive in two per-path statics: TTY_WAITQ (kbd-fed VT path) + HVC0_WAITQ (virtio-console path). 4 wait sites migrated: read_line, read_raw, read_line_hvc0, read_raw_hvc0 — each replaces a sleep_until(u64::MAX) loop with WaitQueue::wait_until against an atomic-load predicate (LINE_READY for canonical, RAW_WRITE_IDX != RAW_READ_IDX for raw mode). Two wake sites (wake_waiter, wake_waiter_hvc0) now ALSO call the new wake wrappers — they wake the legacy single-PID via wake_pid AND fire WaitQueue::wake_one (latter is a ~5 ns no-op when no reader is parked). No nested-lock concern (tty.rs uses no shared mutex on the hot path — all predicates are pure atomic reads) so per-path WaitQueue placement is unconstrained. TTY and HVC0 paths are guaranteed independent via tty_and_hvc0_waitqs_are_independent unit test. Per FR-007 (mirrors F086): EINTR-as-EOF preserves the usize return signature; real -EINTR widening is a follow-up. Per R6 (mirrors F083/F086): legacy TTY_WAITING_PID/HVC0_WAITING_PID fields RETAINED — tty.rs has no correctness bug, pure latency migration. BSS delta +96 bytes (2 × WaitQueue<8>, exactly matches prediction; 2.6× under the 256-byte SC-003 budget). 5 net-new kernel unit tests bring the suite to 511 → 516 passing. After F087, #155 has only kbd remaining (1 subsystem). Spec: specs/087-tty-waitqueue/.
  • WaitQueue migration for pty, #155 fifth slice (Feature 086). Migrates kernel/src/drivers/pty.rs from the F073-era sleep_until(u64::MAX) busy-block + single-waiter master_wait/slave_wait: u32 fields to F074's WaitQueue<8> primitive in a new parallel static PTY_WAITQS: [PtyWaitQs; 16] where each PtyWaitQs carries TWO independent waitqs (master_recv + slave_recv) for directional independence (FR-011). BSS cost: 16 × 2 × ~48 B ≈ 1,536 bytes (matches the predicted SC-003 ≤2 KiB budget at the byte). Wake-call sites: master_write → slave_recv.wake_one(), slave_write → master_recv.wake_one(), close() → wake_all() on BOTH directions (EOF semantics for parked readers). All three wake sites fire AFTER TABLE.lock is dropped per R2 (mirrors F076 unix_write / F083 unix_connect / F084 futex_wake); the runtime-verified pty_wake_is_after_lock_drop test sets a LAST_PTY_WAKE_LOCK_FREE: AtomicBool flag inside the cfg(test) wake wrapper to catch any future refactor that moves the wake call inside the held lock. close() follows the F083 R4 analyze-corrected ordering: publish in_use = false INSIDE the held lock, drop the lock, THEN wake_all(); woken readers re-check the predicate (master_buf has data || !in_use), observe !in_use, and return 0 (EOF) per FR-005. Per R6 (mirrors F083, NOT F084): legacy master_wait/slave_wait: u32 fields are retained for one revision as risk insurance, because the pre-feature flow is correct (just slow) and there is no correctness bug to delete. Signal-interrupt path: per FR-007, the existing usize return signature is preserved by treating Err(Interrupted) as return 0 (same shape as the EOF return; userland reads it as EOF); a future feature can widen the signature to surface real -EINTR. 8 net-new kernel unit tests bring the suite from 503 to 511 passing (covers FR-001 BSS init, FR-003 per-direction wake counters × 2, FR-004 close-time wake_all, FR-003 R2 runtime check, FR-009 fast-path no-touch, FR-011 inter-direction independence, FR-010 slot-reuse invariant). 
After F086, #155 still has tty + kbd remaining (2 subsystems). Spec: specs/086-pty-waitqueue/.
  • FUTEX_WAIT Linux-fidelity timer-expiry return (Feature 085, closes #162). Completes the F084 Q1 deferral: futex_wait now returns -ETIMEDOUT (-110) on timer expiry instead of conflating it with a successful wake. Pre-feature, both paths returned 0 and userland was expected to re-derive the case via clock_gettime (non-Linux-faithful). The kernel atomically distinguishes wake from timer by calling FUTEX_WAITQS[bucket_idx].unregister(pid) post-sleep: true means our slot was still registered → no wake_one reached us → timer (or signal); false means a waker cleared our slot → success. Signal-pending takes precedence over both wake and timer per Linux convention (preserves F084 FR-008 EINTR semantics). F074 surface change: WaitQueue::unregister signature widened from fn unregister(&self, pid: u32) to pub(crate) fn unregister(&self, pid: u32) -> bool so callers can do the wake/timer classification (1-line change; existing F074 callers that discard the return value compile unchanged). 1 net-new kernel unit test (futex_wait_classifies_timer_expiry_as_etimedout) covers 4 scenarios: timer/wake/signal precedence + the FR-005 "no spurious ETIMEDOUT on infinite-wait" invariant, taking the kernel suite from 502/502 to 503/503. Zero BSS growth. Spec: specs/085-futex-etimedout/.
  • WaitQueue migration for futex, F083b / fourth slice of #155 (Feature 084). Migrates kernel/src/proc/futex.rs from the hand-rolled per-bucket waiters: [u32; 16] array + naked add_waiter → sleep_until(wake_tick) flow (which has a classic missed-wakeup race) to F074's WaitQueue<16> primitive in a new parallel static FUTEX_WAITQS: [WaitQueue<16>; 64] (~5 KiB BSS replaces the ~4.5 KiB pre-feature waiter storage; net delta ~512 B, well under the SC-003 ≤6 KiB budget with 12× headroom). Closes the missed-wakeup race via F074's register-then-recheck protocol: futex_wait now re-reads *addr == val AFTER registering on the bucket's wait queue. If the userland-side compare-and-update raced past our register, the recheck observes the published *addr flip and returns 0 immediately without sleeping (pre-feature, this scenario would hang forever absent a timeout). futex_wake calls wake_one ENTIRELY outside FUTEX_TABLE.lock per R2 wake-after-lock-drop discipline (mirrors F076's unix_write / F083's unix_connect). Wake-call latency drops from F073-era ~10 ms scheduler-tick worst case to ≤5 ms p99 (matches F075/F076/F083 targets); wake_one fast path on a no-waiter bucket is now a single AtomicU8::load + branch (~5 ns) instead of the pre-feature FUTEX_TABLE.lock() + scan + drain (~200 ns), a ~40× speedup on the wake path. Open-codes the register/recheck/sleep dance in futex_wait (research R4) rather than extending F074's wait_until with a timeout variant: futex needs sleep_until(wake_tick) for the struct timespec relative-timeout case, which F074's wait_until can't accommodate; pty/tty/kbd (remaining #155 work) don't need timeouts, so F074's surface stays frozen until a second timeout-needing consumer appears. 
Deletes legacy code cleanly (R6) — add_waiter, drain_waiters, FutexBucket.waiters, FutexBucket.waiter_count, MAX_WAITERS_PER_BUCKET all removed (NOT retained for one revision as F083 did for Listener.waiter); the pre-feature code has a real correctness bug, so a parallel old-path would dilute the fix and risk silent re-introduction. Per Q1 clarification, the migration preserves the pre-feature behavior of returning 0 from BOTH successful-wake AND timer-expiry; the Linux-faithful 0-from-wake / -ETIMEDOUT-from-timer split is OUT OF SCOPE and tracked as a follow-up. 9 net-new kernel unit tests bring the suite to 502/502 (incl. the dedicated SC-005 missed-wakeup-race-close test futex_wait_recheck_after_register_observes_addr_flip, the R2-discipline runtime check futex_wake_is_after_lock_drop via LAST_FUTEX_WAKE_LOCK_FREE: AtomicBool, and the FR-008 EINTR test via test-only signal-injection). New integration helper tests/credentials/futex_wake_latency.c (pthread-mutex contention) + scenario tests/boot/test_futex_wake_latency.py S1 measures p99 latency across 100 trials against the SC-001 ≤5 ms budget. Surface change in F074: WaitQueue::try_register promoted from private to pub(crate) (1-line F074 change) so futex can open-code the register dance. After F084, #155 still has pty / tty / kbd remaining (3 subsystems × ~3 days each). Spec: specs/084-futex-waitqueue/.
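The register-then-recheck protocol, together with the `unregister`-based wake/timer classification that F085 later layered on top, can be modeled in user space. Everything below is a simplified sketch under stated assumptions: this `WaitQueue` is a toy (pid 0 marks an empty slot, there is no saturation fallback counter), `WaitOutcome` is a hypothetical enum, and the `sleep` closure stands in for `sleep_until(wake_tick)`.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Toy fixed-capacity wait queue; a slot holding 0 is free.
pub struct WaitQueue<const N: usize> {
    slots: [AtomicU32; N],
}

impl<const N: usize> WaitQueue<N> {
    pub fn new() -> Self {
        WaitQueue { slots: std::array::from_fn(|_| AtomicU32::new(0)) }
    }

    /// Claim the first free slot for `pid`; false if the queue is full.
    pub fn try_register(&self, pid: u32) -> bool {
        self.slots.iter().any(|s| {
            s.compare_exchange(0, pid, Ordering::AcqRel, Ordering::Acquire).is_ok()
        })
    }

    /// Remove `pid`; true means the slot was still ours (nobody woke us).
    pub fn unregister(&self, pid: u32) -> bool {
        self.slots.iter().any(|s| {
            s.compare_exchange(pid, 0, Ordering::AcqRel, Ordering::Acquire).is_ok()
        })
    }

    /// Clear one occupied slot and return the pid that should be woken.
    pub fn wake_one(&self) -> Option<u32> {
        for s in &self.slots {
            let pid = s.swap(0, Ordering::AcqRel);
            if pid != 0 {
                return Some(pid);
            }
        }
        None
    }
}

#[derive(Debug, PartialEq)]
pub enum WaitOutcome {
    ValueChanged, // recheck saw the flip: return without sleeping
    Woken,        // a waker cleared our slot
    TimedOut,     // slot still registered after the sleep
    Saturated,    // no free slot (the real kernel has a fallback path)
}

pub fn futex_wait<const N: usize>(
    q: &WaitQueue<N>,
    addr: &AtomicU32,
    expected: u32,
    pid: u32,
    sleep: impl FnOnce(),
) -> WaitOutcome {
    if !q.try_register(pid) {
        return WaitOutcome::Saturated;
    }
    // Re-read AFTER registering: a wake that raced past the registration
    // has already published the new value, so it is observed here.
    if addr.load(Ordering::Acquire) != expected {
        q.unregister(pid);
        return WaitOutcome::ValueChanged;
    }
    sleep();
    if q.unregister(pid) { WaitOutcome::TimedOut } else { WaitOutcome::Woken }
}
```

The ordering is the point of the sketch: because the waiter is already visible in the queue when it rechecks the word, a waker either finds the slot or the waiter sees the updated value, so the wakeup cannot be lost between the two steps.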
  • WaitQueue migration for UNIX accept(): F076b / third slice of #155 (Feature 083; closes #160). Migrates the two sleep_until(u64::MAX) busy-block sites in kernel/src/net/unix.rs, unix_accept (path-keyed) and unix_accept_by_slot (slot-keyed), to F074's WaitQueue::wait_until against a new per-listener parallel-array LISTENER_WAITQS: [WaitQueue<8>; MAX_LISTENERS] (16 slots × ~48 B ≈ 768 B BSS, within the SC-003 ≤1.5 KiB budget with 2× headroom). Wake-one fires from unix_connect after the TABLE.lock is dropped (research R2 mirrors F076's unix_write discipline); wake-all fires from unix_unbind after the listener slot has been zeroed inside the held lock and the lock dropped. Woken accepters re-check the predicate unix_listener_has_pending(li) || !unix_listener_in_use(li), observe the cleared in_use flag, and the calling unix_accept returns EBADF per Q1 clarification (matches Linux's fd-vanishes-mid-syscall convention). Non-blocking accept (O_NONBLOCK listener / accept4(SOCK_NONBLOCK)) never registers on the wait queue per Q2; it returns EAGAIN from the fast path (FR-011), preserving the 8-slot capacity for genuine blockers. Capability upgrade as a side effect: the existing code's single Listener.waiter: u32 field supported exactly ONE parked accepter per listener; F083 raises that to 8 (the WaitQueue capacity), with the legacy field retained as a free 9th slot per research R6. Wake-after-lock-drop discipline (R2) is verified at runtime by a new LAST_CONNECT_WAKE_LOCK_FREE: AtomicBool flag set inside the #[cfg(test)] wake wrapper: if a future refactor moves the wake call back inside the held-lock block, unix_connect_wake_is_after_lock_drop fails loudly. 12 new kernel unit tests bring the kernel suite to 493/493. New integration helper tests/credentials/unix_accept_wake_latency.c + scenario tests/boot/test_unix_accept_wake_latency.py S1 measures p99 latency across 100 trials against the SC-001 ≤5 ms budget (matches F076's UNIX recv target). 
The unix_accept_by_slot first-in-use-listener heuristic is a pre-existing semantic gap (research R3 — not introduced by F083), preserved verbatim; a follow-up to store listener_idx per fd is its own ticket. After F083, #155 still has futex / pty / tty / kbd remaining (4 subsystems × ~3 days each). Spec: specs/083-unix-accept-waitqueue/.
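The accept outcomes described in the entry above (pending connection, unbound listener, non-blocking fast path) can be summarised as a small decision function. This is a sketch under assumptions: `ListenerState`, `AcceptStep`, and `accept_step` are hypothetical names, and only the predicate shape `has_pending || !in_use` comes from the text.

```rust
/// Hypothetical snapshot of one listener slot.
pub struct ListenerState {
    pub in_use: bool,
    pub pending: usize,
}

/// Hypothetical outcome of one pass through the accept path.
#[derive(Debug, PartialEq)]
pub enum AcceptStep {
    Accept, // a connection is queued
    Ebadf,  // the listener vanished (unbind zeroed the slot)
    Eagain, // non-blocking caller, nothing queued, never registers
    Park,   // blocking caller registers on the wait queue
}

/// The wake predicate quoted in the text: stop waiting when there is work
/// to take OR when the listener is gone.
pub fn wake_predicate(l: &ListenerState) -> bool {
    l.pending > 0 || !l.in_use
}

pub fn accept_step(l: &ListenerState, nonblocking: bool) -> AcceptStep {
    if !l.in_use {
        return AcceptStep::Ebadf;
    }
    if l.pending > 0 {
        return AcceptStep::Accept;
    }
    if nonblocking {
        AcceptStep::Eagain
    } else {
        AcceptStep::Park
    }
}
```

A parked accepter woken by wake-all after an unbind would run `accept_step` again, see `in_use == false`, and take the `Ebadf` branch.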
  • WaitQueue migration for sockets — second slice of #155 (Feature 076). Migrates kernel/src/net/unix.rs (UNIX recv) and kernel/src/proc/syscalls/net.rs (TCP/UDP/ICMP recv/send/accept across 8 sched_yield sites at lines 564/772/879/912/1366/1410/1537/1633) from F073-era busy-yield loops to F074's WaitQueue<8> primitive. UNIX sockets get per-pair recv_waitq in a parallel static UNIX_WAITQS: [UnixWaitQs; 32] outside the existing Mutex<TABLE> (research R1 — avoids nested-lock deadlock). unix_write calls wake_one after appending bytes; unix_close calls wake_all so blocked readers see EOF (FR-012). UNIX wake latency target: ≤ 5 ms p99 (matches F075). TCP/UDP/ICMP get per-slot SocketWaitQs { recv, send, accept } in SOCKET_WAITQS: [SocketWaitQs; 64]; kernel::net::stack::with_sockets invokes post_poll_wake_iteration() AFTER the post-poll Interface::poll and AFTER dropping the NET_STACK guard (research R2 wake-after-lock-drop) — every blocked socket waiter is woken so their wait_until predicate re-checks via the non-polling check_sockets path (no self-wake race). v1 wakes blindly per poll; spurious wakes self-correct via the is_readable/is_writable predicate at ~1 µs/poll cost (#157 follow-up territory). TCP wake latency target: ≤ 15 ms p99 (bounded by smoltcp poll interval — interrupt-driven smoltcp would unblock the 5 ms target as a separate feature). The pipe-via-socketpair branch in sys_recv now routes through F075's pipe::wait_for_read instead of busy-yield. unix_accept migration descoped to F076b (listeners aren't pair-indexed; needs a separate parallel array). BSS cost: ~10 KiB (64 × 3 × ~48 B sockets + 32 × ~16 B unix). 5 new kernel unit tests bring the kernel suite to 481/481 (T010 socket_waitqs_start_empty, T011 post_poll_wake_iteration_no_waiters_is_safe, T012 wait_for_socket_ready_nonblock_returns_eagain, T013 wait_for_socket_accept_nonblock_returns_eagain, T014 wait_for_socket_ready_out_of_range_slot). 
Defensive bounds-check added to get_slot so out-of-range slots return None rather than panic. Four subsystems remain in #155 after this (futex, pty, tty, kbd); each is its own ~3-day NNN- feature. Spec: specs/076-socket-waitqueue-migration/.
  • WaitQueue migration for pipes (Feature 075, first slice of #155). Migrates kernel/src/proc/pipe.rs from F073-era while !pipe::ready(idx) { sched_yield(); } busy-yield loops to F074's explicit-wake WaitQueue<8> primitive. Pipe blocking-read/write wake latency drops from ~10 ms (one scheduler tick) to ≤5 ms p99 (matches the F074 audit-ring improvement); shell pipelines (cmd1 | cmd2) and bulk transfers (cat large.bin | ssh remote 'cat > x') benefit invisibly without userland changes. Each of the 64 pipe slots gains TWO wait queues in a parallel static array PIPE_WAITQS: [PipeWaitQs; 64] declared OUTSIDE the existing Mutex<PipeTable> (per research R1: placing them inside would deadlock against the readiness-predicate lock acquisition). Wake fires AFTER TABLE.lock is dropped (research R2: shorter critical section; single-CPU serial ordering preserves state-publish-before-wake). wake_one for data flow (FR-010/011); wake_all for close-event broadcasts (FR-012 EOF; FR-013 EPIPE). New pipe::has_writers(idx) accessor symmetric to existing has_readers makes EOF detection composable in wait_until predicates. Partial-write semantics: per clarification Q1, a write(2) interrupted by signal after pushing N≥1 bytes now returns N (matches Linux pipe semantics; pre-F075 returned -EINTR universally, which caused duplicate-write bugs in daemons that retried the full payload). New global kernel::sched::wait::TOTAL_FALLBACKS: AtomicU64 counter + 1-line dmesg log per fallback event gives operators a single grep target (dmesg | grep waitq-fallback) across all kernel waitq users (audit + pipe + future subsystems). BSS cost: ~6 KiB. 9 new kernel unit tests (476/476 total). 5 new integration scenarios in tests/boot/test_pipe_wake_latency.py (S1 read wake, S2 write wake, S3 EOF wake, S4 EPIPE wake, S5 100/100 push/read correspondence per SC-007). Per-pipe-slot procfs stats deferred as #157 (operators can detect saturation via dmesg + the global counter for v1). 
Five subsystems remain in #155 (TCP, futex, pty, tty, kbd) — each is its own ~3-day NNN- feature. Spec: specs/075-pipe-waitqueue-migration/.
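The partial-write rule in the pipe entry above reduces to one branch. The sketch below is illustrative only: `interrupted_write_result` is a hypothetical helper name, and the errno value is the standard Linux x86_64 one.

```rust
/// Linux x86_64 errno for an interrupted system call.
pub const EINTR: isize = 4;

/// Return value of a write(2) that a signal interrupted after
/// `bytes_pushed` bytes had already entered the pipe.
/// N >= 1 bytes pushed: report N so the caller resumes after them.
/// Nothing pushed: report -EINTR, and the caller may retry the whole buffer.
pub fn interrupted_write_result(bytes_pushed: usize) -> isize {
    if bytes_pushed >= 1 {
        bytes_pushed as isize
    } else {
        -EINTR
    }
}
```

Under the pre-F075 behaviour described above, the first branch did not exist, so a daemon that retried the full payload after -EINTR re-sent bytes the reader had already received.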
  • WaitQueue primitive + poll(2) on /proc/audit/data (Feature 074 — closes #152). Introduces kernel::sched::wait::WaitQueue<const N: usize> — a generic explicit-wake primitive backed by the existing sched::sleep_until + sched::wake_pid mechanism. Each instance is a fixed-size [AtomicU32; N] waiter list with FIFO ordering; the missed-wakeup race is closed by a 3-step register-then-recheck protocol (research R3). First consumer is audit::RING_WAITQ: WaitQueue<16>: Ring::push now calls wake_one() after every record so blocked read(2) callers wake within ≤5 ms p99 (vs F073's ~10 ms scheduler-tick worst case). O_NONBLOCK short-circuits read(2) to return -EAGAIN (11) immediately on an empty ring (FR-020); the fcntl(F_SETFL, O_NONBLOCK) runtime toggle now works on procfs FDs via a 1-line thread-through in sys_open (research R6). The existing sys_poll dispatch gains an INODE_AUDIT_DATA arm so daemons can multiplex audit + control FDs in one syscall (FR-030; closes #152's headline use case). Blocked reads + polls return -EINTR on any deliverable signal per clarification Q3 (FR-041; no SA_RESTART). The 17th-waiter fallback (FR-008) silently degrades to F073's sched_yield loop — a waitq_fallback_total counter exposed via /proc/audit/stats lets operators detect saturation. /proc/audit/stats body grew from 4 lines to 6 (waiters_now + waitq_fallback_total appended per F073 contract § 4 append-only ABI; F073 consumers parsing 4 lines continue to work). 11 new kernel unit tests bring the kernel suite to 467/467. 5 new live VM scenarios in tests/boot/test_credentials.py (S10 poll-multiplex, S11 EAGAIN, S12 ≤5 ms p99 wake latency across 100 trials, S13 SC-007 push/read correspondence, S14 exit-during-block stale cleanup). make bench-setuid-stage3 enforces ≤50 ns delta vs the post-F073 baseline. Two follow-ups filed for incremental work: #154 epoll(7), #155 WaitQueue migration for pipe/TCP/futex/pty/tty. Spec: specs/074-waitqueue-poll-audit/.
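A fixed-size atomic waiter list of the kind described above might look like the following host-side sketch. It is not the kernel's `WaitQueue`: the type name, the `EMPTY` sentinel (assuming PID 0 never waits), and the lowest-slot-first wake order are all assumptions, and this simplified version does not preserve the FIFO ordering the real primitive is described as having.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Assumed sentinel for a free slot (PID 0 never parks in this sketch).
pub const EMPTY: u32 = 0;

pub struct WaitQueueSketch<const N: usize> {
    slots: [AtomicU32; N],
}

impl<const N: usize> WaitQueueSketch<N> {
    pub fn new() -> Self {
        Self { slots: std::array::from_fn(|_| AtomicU32::new(EMPTY)) }
    }

    /// Claim a free slot for `pid`. Returns false when all N slots are
    /// taken, which is where a caller would fall back to yielding.
    pub fn try_register(&self, pid: u32) -> bool {
        self.slots.iter().any(|slot| {
            slot.compare_exchange(EMPTY, pid, Ordering::AcqRel, Ordering::Acquire)
                .is_ok()
        })
    }

    /// Remove and return one registered PID, if any.
    /// The real queue would pass this PID to the scheduler's wake routine.
    pub fn wake_one(&self) -> Option<u32> {
        for slot in &self.slots {
            let pid = slot.swap(EMPTY, Ordering::AcqRel);
            if pid != EMPTY {
                return Some(pid);
            }
        }
        None
    }
}
```

With `N = 16`, the seventeenth concurrent `try_register` returns false, which corresponds to the fallback case the entry describes.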
  • Dedicated kernel audit ring — Stage 2 (Feature 073 — Stage 2 of #144). Adds a 4096-slot, BSS-resident, SeqLock-protected ring of fixed 112-byte binary AuditRecords alongside Stage 1's kmsg-ring ASCII path. Each credential-changing syscall now pushes into BOTH transports inside the same PTABLE.lock acquisition (FR-011 atomicity): the Stage 1 dmesg | grep AUDT line for ad-hoc operator grep, AND a Stage 2 binary record for programmatic streaming consumers. Userland reads via /proc/audit/data (binary, 112-byte aligned; cursor model: oldest_seq + offset / 112; blocking-read via sched_yield per FR-022; root-only open(2) via explicit gate in sys_open) and /proc/audit/stats (4-line ASCII KV: seq_oldest/seq_newest/dropped_total/bytes_written; idempotent — reads don't consume records; root-only). Per-slot AtomicU32 SeqLock counters give lock-free reader correctness without taking a writer lock (PTABLE.lock already serialises the F072 hook). Drop counter is monotonically exact (SC-006): if a slow consumer falls behind, the seq returned jumps and dropped_total accounts for every overwritten record. Record discriminants pinned in contracts/audit-ring-format.md: op (0–8, 9 values), result (0–2), target_kind (0=Single, 1=Triple-for-setresuid/setresgid with u32::MAX sentinel, 2=Groups). 22 new kernel unit tests (5 layout/encoder + 8 ring SeqLock + 3 stats + 4 read_data + 2 other) bring the kernel suite to 456/456 passing. 5 new integration scenarios in tests/boot/test_credentials.py (S5 streaming + S5b 9-op coverage + S6 stats + S7 EACCES non-root + S8 SC-006 overflow + S9 bytes_written stability). make bench-setuid-stage2 enforces ≤100 ns Stage 2 delta vs the post-F072 baseline. Userland audit daemon (drain to ext2) deferred — needs ext2 write path (#73); follow-up issue + ROADMAP row filed. Spec: specs/073-credential-audit-ring/.
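The cursor model for /proc/audit/data (oldest_seq + offset / 112) and the drop accounting can be worked through with two small helpers. The function names are hypothetical, and rejecting unaligned offsets with `None` is an assumption; the entry only states that reads are 112-byte aligned.

```rust
/// Size of one binary audit record, from the entry above.
pub const RECORD_SIZE: u64 = 112;

/// Sequence number of the record at a given file offset.
/// Returns None for offsets that are not a multiple of the record size
/// (assumed behaviour for this sketch).
pub fn seq_at_offset(oldest_seq: u64, offset: u64) -> Option<u64> {
    if offset % RECORD_SIZE != 0 {
        return None;
    }
    Some(oldest_seq + offset / RECORD_SIZE)
}

/// How many records a slow consumer lost: it expected `expected_seq` next,
/// but the oldest record still in the ring is `oldest_seq`.
pub fn records_missed(expected_seq: u64, oldest_seq: u64) -> u64 {
    oldest_seq.saturating_sub(expected_seq)
}
```

For example, with `oldest_seq = 100`, offset 224 addresses record 102; a reader that expected sequence 10 next and finds the ring starting at 25 has missed 15 records.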
  • Kernel audit log of credential transitions — Stage 1 (Feature 072 — Stage 1 of #144; Stage 2 shipped as Feature 073 above). Every credential-changing syscall — setuid/seteuid/setresuid/setgid/setegid/setresgid/setgroups/setfsuid/setfsgid — now emits a structured single-line audit record into the existing kmsg ring (Feature 038) under a new Level::Audit = 6 (tag AUDT). Records carry the operation name, caller PID, pre-call credential tuple, target argument, post-call credential tuple, and the result (ok | EPERM | EINVAL). Both successful changes AND denials are recorded (FR-012) — an attacker probing for privilege escalation leaves a trace whether they succeed or not. The hook fires INSIDE proc::table::with_creds_mut's PTABLE.lock acquisition so pre/post snapshots are atomic with the mutation (FR-010 — same lock the mutation already holds). The setfsuid/setfsgid Linux quirk (return value is OLD fsuid even on EPERM) is preserved at the syscall ABI; the audit record reports the LOGICAL outcome (pre.fsuid != post.fsuid ? ok : EPERM). The (uid_t)-1 sentinel for setresuid/setresgid renders literally as -1 (not 4294967295). Default-on, no compile flag, no per-PID gate — by design, you cannot opt out of audit on your own PID. Read records with dmesg | grep AUDT or programmatically via SYS_KMSG_READ (440) filtering on level == 6. 15 audit format unit tests + 2 Level::Audit ABI tests in kernel/src/proc/audit.rs::tests; integration scenarios in tests/boot/test_credentials.py cover success (US1) + EPERM/setfsuid quirk (US2) + zero-bytes-for-non-cred (NFR-002 / SC-004); make bench-setuid enforces the ≤200 ns regression budget vs the pre-F072 baseline (NFR-001 / SC-003). Stage 2 (dedicated /proc/audit ring with drop counter + sequence number + userland daemon draining to ext2) is deferred per #147. Spec: specs/072-credential-audit-log/.
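Two of the formatting rules in the entry above are easy to misread, so here is a minimal sketch of both. `render_id` and `setfsuid_logical_result` are hypothetical helper names; the rules themselves (the `-1` rendering and the `pre.fsuid != post.fsuid ? ok : EPERM` outcome) are taken from the text.

```rust
/// Render an ID argument for an audit line: the (uid_t)-1 "leave unchanged"
/// sentinel prints as -1 rather than 4294967295.
pub fn render_id(id: u32) -> String {
    if id == u32::MAX {
        "-1".to_string()
    } else {
        id.to_string()
    }
}

/// Logical outcome recorded for setfsuid/setfsgid. The syscall itself always
/// returns the OLD value, so the audit record compares the fsuid before and
/// after the call instead.
pub fn setfsuid_logical_result(pre_fsuid: u32, post_fsuid: u32) -> &'static str {
    if pre_fsuid != post_fsuid {
        "ok"
    } else {
        "EPERM"
    }
}
```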
  • Real UID / credential model on Pcb (Feature 071; closes #74). Replaces the 11 hardcoded 0 UID/GID syscall arms in kernel/src/proc/syscall.rs (lines 315-318 + 340-350) with reads/writes through a new Credentials sub-struct on Pcb: 8 × u32 scalars (uid/euid/suid/fsuid + GID equivalents) + a fixed [u32; 16] supplementary-group array + ngroups: u8 = exactly 100 bytes per Pcb (within NFR-001 ≤112 B budget). All 16 set/get UID/GID syscalls land in kernel/src/proc/syscalls/io.rs: getuid/geteuid/getgid/getegid/getresuid/getresgid/getgroups (7 reads) + setuid/seteuid/setresuid/setgid/setegid/setresgid/setgroups/setfsuid/setfsgid (9 writes), each implementing the full Linux EPERM truth tables verbatim, with all-or-nothing atomicity on setresuid/setresgid (no partial field writes on EPERM) and the Linux setfsuid "returns OLD value even on permission denial" quirk preserved. /proc/<pid>/status grows three new tab-separated lines (Uid:\t<r>\t<e>\t<s>\t<f>\n, Gid:, Groups:\t<space-separated>\n) matching Linux byte-for-byte; the old hardcoded "Uid:\t0 0 0 0" space-separated form is replaced. Feature 045's /proc/<pid>/trace permission rule upgrades from structural (writer == target || writer == PID 1) to standard own-effective-UID-or-effective-root (writer.creds.is_root() || writer.creds.euid == target.creds.euid), closing the F045 spec TODO. Default boot state preserved (every process starts Credentials::root() = all zeros), so existing v0 binaries see identical values: the change is plumbing, not policy. 31 EPERM-matrix kernel unit tests in kernel/src/proc/credentials.rs::tests cover the truth tables (setuid_unprivileged_to_real/saved_ok, setresuid_atomicity, setfsuid_returns_old_value_on_perm_check, setgroups_size_over_16_einval, trace allow/deny 4-cell matrix, etc.). 
3 single-process corpus programs (credentials_round_trip, getgroups_query, setresuid_root_nop) ride make syscall-diff for live VM coverage; the multi-PID setuid scenarios are covered by unit tests because live multi-user shells aren't shipped today. SC-005 audit grep returns zero matches post-merge. Deferrals (4 GitHub issues + ROADMAP rows): S_ISUID exec semantics (#141), capabilities(7) (#142), user namespaces (#143), audit log (#144). Spec: specs/071-pcb-credentials/.
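The 100-byte layout and the all-or-nothing setresuid rule from the entry above can be checked with a host-side sketch. `CredentialsSketch` is a hypothetical mirror of the described fields, not the kernel's struct, and treating `euid == 0` as "privileged" is a simplifying assumption (Linux itself checks CAP_SETUID).

```rust
pub const KEEP: u32 = u32::MAX; // (uid_t)-1: leave this field unchanged
pub const EPERM: i32 = 1;

/// 8 x u32 scalars (32 B) + 16 x u32 groups (64 B) + 1 B count,
/// padded to 4-byte alignment: 100 bytes.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct CredentialsSketch {
    pub uid: u32, pub euid: u32, pub suid: u32, pub fsuid: u32,
    pub gid: u32, pub egid: u32, pub sgid: u32, pub fsgid: u32,
    pub groups: [u32; 16],
    pub ngroups: u8,
}

impl CredentialsSketch {
    pub fn root() -> Self {
        Self { uid: 0, euid: 0, suid: 0, fsuid: 0, gid: 0, egid: 0, sgid: 0,
               fsgid: 0, groups: [0; 16], ngroups: 0 }
    }

    /// Validate all three arguments first, then commit in one assignment,
    /// so an EPERM never leaves a partially updated tuple behind.
    pub fn setresuid(&mut self, r: u32, e: u32, s: u32) -> Result<(), i32> {
        let privileged = self.euid == 0; // assumption: stand-in for CAP_SETUID
        let allowed = |id: u32| {
            id == KEEP || privileged
                || id == self.uid || id == self.euid || id == self.suid
        };
        if !(allowed(r) && allowed(e) && allowed(s)) {
            return Err(EPERM);
        }
        let mut next = *self;
        if r != KEEP { next.uid = r; }
        if e != KEEP { next.euid = e; }
        if s != KEEP { next.suid = s; }
        next.fsuid = next.euid; // fsuid follows the (possibly new) euid
        *self = next;
        Ok(())
    }
}
```

Building the new tuple in a copy and assigning it once is one simple way to get the no-partial-writes property; the kernel may achieve it differently.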
  • mymc: dual-pane file manager in Rust userland (Feature 070). Trimmed from upstream Cargonaut to fit MyOS2026's image constraints: 2.05 MiB stripped musl release binary (NFR-001 budget: 2.5 MiB), no async runtime (std::thread + std::sync::mpsc), no wasmtime plugin host, no SFTP/S3 backends, no OS keychain dependency. Phase A ships dual-pane LocalFs navigation (j/k/Enter/Backspace + Tab focus + Alt-1/2 jump), resumable copy/move engine (8 MiB chunks fsync'd + CRC-chained .mymc-transfer-*.json checkpoints; SIGKILL → scan_resumable picks up at last fsync), confirm/input/resume/tasks/picker dialog system (F5 copy/F6 rename/F7 mkdir/F8 delete + F12 in-flight jobs + launch-time resume prompt), text previewer (F3; plain text, no syntect to fit NFR-001), hand-rolled fuzzy filter (<; nucleo-matcher would have added 300 KiB of Unicode tables), per-pane directory history (Alt-Shift-h / Alt-y / Alt-u), quick-cd popup (Alt-c), panel filter (Alt-!), sync-other-panel (Alt-i/Alt-o), toggle-hidden / split-orient / recursive-dir-size (Alt-./Alt-,/Ctrl-Space), and the mymc.sh cd-on-exit shell wrapper (FR-017; MYMC_EXIT_CWD_FILE). Configuration is layered TOML (/etc/mymc/config.toml → ~/.config/mymc/config.toml → --config). Keymap is data-driven (/etc/mymc/keymap.toml) with 7 dialog kinds and Mode/Key/Mods abstraction. Four CI jobs feed ci rollup: mymc-size (NFR-001 + NFR-007 unsafe-grep), mymc-coverage (NFR-006 tarpaulin ≥ 80%), mymc-startup (SC-004 + FR-010 ≤ 150 ms), mymc-boot (T_A.07 image install + non-interactive CLI). 134 unit tests pass; clippy -D warnings clean; zero unsafe blocks in userland/mymc/src/**. Deferrals (5 GitHub issues + ROADMAP rows): editor handoff (#129; needs MyOS2026 editor), archive-as-VFS (#130; needs tar/zip workspace deps), HMAC audit (#131; needs OS keychain), seccomp hardening (#132), io_uring (#133). Spec: specs/070-mymc-userland-fm/. Upstream: github.com/mohnkhan/cargonaut.
  • DWARF CFI-based frame-pointer-free unwinding (Feature 068 — closes #112). Off-by-default kernel-side stack unwinder that walks past frames with broken or zeroed frame pointers via DWARF .debug_frame CFI rules. Build-time dwarf-extractor --cfi walks every FDE via gimli's UnwindContext, emits a sorted 16-byte CfiEntry rule table to kernel/src/debug/cfi_generated.rs (currently 55,893 rows; 100% of 13,379 FDEs fit the 4-tuple shape). Runtime kernel::debug::unwind::next_frame is pure binary search + 4-tuple arithmetic with kernel-VA bounds checking — panic-safe by construction, reuses Feature 066's recursion guard. Panic-handler integration is purely ADDITIVE: existing FP backtrace runs as today; when the FP walker truncates (typically RBP=0 from an asm trampoline), the handler re-walks the rbp chain to find the broken frame, then calls next_frame repeatedly to extend the backtrace past it. Each CFI-recovered frame is prefixed by a cfi> rip=<hex> marker so operators can distinguish FP-walked vs CFI-walked frames at a glance. SC-002 (zero CFI bytes when feature OFF) enforced by scripts/ci/check-cfi-zero-cost.sh; SC-003 (≤20% loaded-image growth when ON) enforced by scripts/ci/check-cfi-size-budget.sh. Verified end-to-end via image-cfi-test which boots a chain cfi_test_outer → cfi_test_middle → asm-trampoline-zeroes-rbp → panic; the CFI walker bridges past the broken frame as designed. Quickstart: specs/068-dwarf-cfi-unwinding/quickstart.md.
  • Idle-flush UART decoupling for kprint! (Feature 067 — closes #69). Boot with uart_async=1 on the kernel cmdline; from that point forward, kprint! pushes to the kmsg ring and returns immediately — the UART catches up via a 4-line-per-pass drain called from the idle loop. Measured speedup: 813× (7.2 ms → 8.9 µs per kprint! call under QEMU TCG; production should be ~3 µs). Default sync behavior preserved (byte-identical boot log under no opt-in). The framebuffer terminal write is gated on the same flag so async mode is genuinely fire-and-forget for the issuing CPU. set_uart_mirror(true) runtime flip synchronously flushes any backlog before returning (FR-007a — preserves UART monotonic-by-seq invariant). Ring overflow during async mode emits a single [uart: NN lines dropped (ring overflow)] marker per skip event so silent gaps are impossible (FR-008a). Panic-time emission unaffected — write_direct bypass means uart_async=1 panic_now=1 produces a complete panic block on UART (FR-009 + SC-004). Quickstart: specs/067-uart-async-flush/quickstart.md.
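The bounded idle-loop drain described above (at most four lines per pass) can be sketched on the host with a queue standing in for the kmsg backlog. `drain_pass`, the `VecDeque` backlog, and the `Vec` that plays the UART are all hypothetical stand-ins for illustration.

```rust
use std::collections::VecDeque;

/// Per-pass budget from the entry above.
pub const LINES_PER_PASS: usize = 4;

/// Move up to LINES_PER_PASS lines from the backlog to the UART,
/// oldest first, and report how many were written.
/// Bounding the pass keeps each idle-loop iteration short.
pub fn drain_pass(backlog: &mut VecDeque<String>, uart: &mut Vec<String>) -> usize {
    let mut written = 0;
    while written < LINES_PER_PASS {
        match backlog.pop_front() {
            Some(line) => {
                uart.push(line);
                written += 1;
            }
            None => break,
        }
    }
    written
}
```

With six queued lines, the first pass writes four, the second writes two, and later passes write none until new lines arrive.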
  • CI gate for named-root-cause discipline in PR bodies (closes #72). Feature 047 FR-011 / US4 require every feature PR body to name the concrete root cause (file:line + what was wrong + the fix + why it works); until now this was enforced only by self-policing + reviewer eyes. scripts/ci/check-pr-body.sh is a 100-line bash gate that runs alongside docs-gate on every PR: requires a ## Summary or ## Root cause heading; if the body uses ## Root cause, additionally requires at least one file.ext:line reference. Same [no-docs] bypass token as docs-gate (one knob for both). 9-case self-test (scripts/ci/test-check-pr-body.sh) covers summary-only, root-cause-with-fileref, root-cause-without-fileref, empty-body, no-required-heading, case-insensitive, both-headings, fileref-anywhere, and gh-fails-skip cases.
  • DWARF inlined-function chain expansion in panic backtrace (closes #111). Feature 066's panic-handler emission loop now displays the full inlined-function chain beneath each primary frame as indented inline> <name> at <file>:<line> lines. Critical for kernel debugging because rustc aggressively inlines generic methods (Option::unwrap, Mutex::lock, Result::expect) — before this, a panic that fired inside such a method would just show the OUTER function with a confusing source location (e.g., kernel_main+0x109a at <rust>/core/src/option.rs:1015); now it ALSO shows inline> <core::option::Option<u32>>::unwrap at kernel/src/lib.rs:136 naming the exact call site. The build-time dwarf-extractor tool (userland/tools/dwarf-extractor) walks every CU's DIE tree, captures DW_TAG_inlined_subroutine entries, resolves names through the DW_AT_abstract_origin → DW_AT_specification chain, and emits INLINED_CHAINS / INLINE_OVERFLOW static tables. Sweep-line bucketing produces non-overlapping per-RIP-range chains with outermost-to-innermost ordering; the 8-level cap (FR-003) records elided counts in INLINE_OVERFLOW. Measured: 6,169 chains captured for the current kernel ELF; total DWARF table grew from 9.29% → 10.34% of loaded image (still well under FR-011's 20% ceiling). Closes US2 of Feature 066, which deferred this work for time-budget reasons.
  • #105 BAD-RET scheduling panic — root cause fixed. Three months of intermittent kernel panics during dropbear SSH handshake under multi-execve load ([BAD-RET] about to switch to pid=N new_rsp=... ret_addr=0x0 — halting) traced to a 128-KiB struct-assignment overflowing the kernel stack. In kernel/src/net/unix.rs:alloc_pair, the line t.pairs[i] = PairEntry::new() was compiled as "construct on caller's kstack, then move to static slot"; PairEntry contains two [u8; 65536] ring buffers, so 128 KiB landed on a 256 KiB kstack — and overflowed into PHYSICALLY ADJACENT kstack frames, silently corrupting OTHER processes' saved-context area (PCB.kernel_rsp + 56). The all-zeros pattern in the BAD-RET dump was the implicit memset of the ring data arrays. Fix: in-place field reset (no construct), one struct field at a time. Regression test pins both invariants. Found via the sticky HW watchpoint instrumented for #120 — first reproduction after the false-positive filter caught the corruption with a 10-frame backtrace from compiler_builtins::memset through unix::alloc_pair to sys_socket. Verified across 572 execves + 250 zombie-reaps on a 5-minute reproducer with zero panics.
  • DWARF-based in-kernel stack unwinder (Feature 066 — closes #67). Every panic backtrace now ends each line with at <file>:<line> — sourced from a build-time KALLSYMS-style lookup table compiled from the kernel ELF's DWARF (userland/tools/dwarf-extractor is a new host-target Rust workspace member using gimli). The kernel itself has no DWARF parser; runtime is just binary search on &'static arrays — panic-safe by construction. Path normalization at build time keeps the embedded strings short: kernel/-relative for first-party (kernel/src/mm/phys.rs), <cargo>/-prefixed for crates (<cargo>/spin-0.9.8/src/mutex.rs), <rust>/-prefixed for stdlib (<rust>/core/src/option.rs) — no developer-host paths leak. SC-001 acceptance: 3/3 kernel frames in the panic backtrace now show file+line (4th is the _start asm trampoline). SC-002 measured: 14% loaded-image growth — the embedded table is ~1.5 MB for 125k line entries; well under FR-011's 20% hard ceiling but above the original 5% soft target (spec updated to reflect measured reality). The infrastructure for inlined-function chain expansion (US2) ships as empty stubs in v1 — kernel accessors + panic-handler iteration are wired and zero-cost when tables are empty; a future feature populates them. CFI walking (US3) similarly deferred. Quickstart: specs/066-dwarf-unwinder/quickstart.md.
  • FASAN — Frame Allocator SANitizer (Feature 065). Per-frame physical-memory poisoning + ownership tracking + diagnostic accessors layered on kernel/src/mm/phys.rs. Every allocated frame is filled with 0xAABBCCDDAABBCCDD (alloc-pattern sentinel); every freed frame is filled with 0xDEADBEEFDEADBEEF (free-poison sentinel). A 1-byte-per-frame shadow array (1 MiB BSS) records the current owner (one of FREE/KSTACK/HEAP/USERPAGE/PAGETABLE/DEVICE/BOOT/UNKNOWN). The BAD-RET handler in the scheduler now emits three [FASAN] frame=... owner=... sample=[...] lines per panic (failing frame + page±1 neighbours), so issue #105's "kstack contents are all zero, no idea why" mystery becomes a one-repro diagnosis — the developer sees whether bytes are the alloc pattern ("allocated but never written"), free poison ("freed, then handed back without re-init"), or true zeros (something actively wrote 0), plus the owners of the neighbouring frames so corruption from above/below is visible. The kernel panic handler and kassert! path additionally emit a --- FASAN per-PID kstack summary --- block, one line per live PID with the top-of-kstack frame's owner + 4-word sample. The allocator emits [FASAN-XSTATE] warnings on illegal owner transitions (double-free, alloc-over-non-FREE) — non-halting per FR-011. SC-002 budget: ≤10% binary growth — measured at 0.04% (text+data delta 4,512 bytes; the 1 MiB SHADOW is BSS-only). SC-004 budget: ≤10% boot-time hit — tests/boot/test_boot.py confirms all 9 phases still pass with FASAN on. Feature-flagged frame-poison in kernel/Cargo.toml (default-on for debug + KASAN; off via --no-default-features for release; OFF stubs preserve every consumer's signature per contract §1.4). 26 production alloc_frame() call sites migrated to alloc_frame_owned(<owner>) covering kstack/heap/userpage/pagetable/device buckets. Quickstart: specs/065-frame-poisoning/quickstart.md.
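The three-way diagnosis that a FASAN sample enables (alloc pattern, free poison, or true zeros) can be written as a small classifier. The two sentinel values come from the entry above; `SampleKind` and `classify` are hypothetical names, and the `Mixed` bucket is an assumption for samples that match none of the three.

```rust
pub const ALLOC_PATTERN: u64 = 0xAABB_CCDD_AABB_CCDD;
pub const FREE_POISON: u64 = 0xDEAD_BEEF_DEAD_BEEF;

#[derive(Debug, PartialEq)]
pub enum SampleKind {
    AllocatedNeverWritten, // every word is still the alloc sentinel
    FreedNotReinitialised, // every word is the free poison
    Zeroed,                // something actively wrote zeros
    Mixed,                 // real data, or a partially overwritten frame
}

/// Classify the words sampled from a frame in a [FASAN] dump line.
pub fn classify(sample: &[u64]) -> SampleKind {
    let all = |pat: u64| !sample.is_empty() && sample.iter().all(|&w| w == pat);
    if all(ALLOC_PATTERN) {
        SampleKind::AllocatedNeverWritten
    } else if all(FREE_POISON) {
        SampleKind::FreedNotReinitialised
    } else if all(0) {
        SampleKind::Zeroed
    } else {
        SampleKind::Mixed
    }
}
```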
  • Wiki-ready demo GIF + nsh non-TTY fallback (Feature 062). make demo-gif now produces a real recording of nsh over SSH — docs/demo.gif, 416 KB, ~15 s playback — usable directly in a wiki article or README. Three pieces: (1) nsh's main loop falls back to a line-buffered prompt → read → run loop when Editor::with_config returns ENOTTY, so scripted demos no longer panic with "Not a tty" (userland/shell/src/main.rs:108); (2) build/scripts/capture-demo-gif.sh switched from the broken nsh$-on-serial wait + paramiko-with-tight-timeouts pattern to a sshd started-marker wait + raw-socket banner probe, and runs all demo commands as a single semicolon-chained nsh -c "cmd1; cmd2; ..." via paramiko (per-command exec_command was flaky against MyOS dropbear, ~7 s per cmd and the 3rd hung); (3) cast post-processing trims ~30 s of SSH-negotiation dead time so the GIF starts immediately. tests/demo/demo-commands.sh was rewritten in nsh-compatible syntax (one command per line, # comments stripped by the driver). Colored prompt is rendered host-side using the same ANSI escapes as userland/shell/src/prompt.rs so the GIF matches what a real TTY would show. Discoveries documented in Learnings.MD include: dropbear in MyOS does not allocate PTYs; a multi-line stdin pipe through nsh's new fallback path triggers a kernel [BAD-RET] scheduling panic (separate issue, not blocking demo); nsh pipes drop output for large left-hand sides (dmesg | tail -5 returns empty, echo small | base64 works fine).
  • TSC-resolution timestamps in kmsg ring (Feature 061 — closes #68). Replaces the kmsg ring's 32-bit scheduler-tick timestamp source (10 ms granularity) with a 64-bit nanosecond timestamp derived from the x86 TSC. /proc/dmesg's text format switches from [<integer-ticks>] to Linux dmesg-style [<seconds>.<microseconds>] — sub-microsecond precision visible in every dmesg line, panic-tail block, and per-PID syscall trace record (Feature 045 inherits the new resolution transparently). /proc/uptime's first field also switches to the new TSC-derived source. Per-entry layout grew from 256 to 264 bytes (the Entry header widened from 16 to 24 bytes; the 240-byte message payload is unchanged) — an 8 KiB BSS bump on the always-on ring. SYS_KMSG_READ(440)'s binary struct ABI changed (v1 → v2: see specs/061-tsc-kmsg-timestamps/contracts/kmsg-entry-format.md); zero current userspace consumers are affected because the text-format consumers (mybox dmesg, mybox strace) parse /proc/dmesg not the struct. New monotonic_ns() accessor in kernel/src/arch/x86_64/timer.rs is the canonical TSC-derived nanosecond clock for future kernel features. Quickstart: specs/061-tsc-kmsg-timestamps/quickstart.md.
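The [<seconds>.<microseconds>] rendering of a nanosecond timestamp, and the BSS arithmetic quoted above, can be checked with a short sketch. `format_stamp` and `ring_growth_bytes` are hypothetical names, and the absence of left-padding on the seconds field is an assumption (Linux dmesg pads it; the entry does not say whether MyOS2026 does).

```rust
/// Render a monotonic nanosecond timestamp as [<seconds>.<microseconds>].
/// Assumption: seconds are not padded; microseconds are zero-padded to 6 digits.
pub fn format_stamp(ns: u64) -> String {
    let secs = ns / 1_000_000_000;
    let micros = (ns % 1_000_000_000) / 1_000;
    format!("[{}.{:06}]", secs, micros)
}

/// BSS growth of the ring when each entry widens from `old_entry` to
/// `new_entry` bytes.
pub fn ring_growth_bytes(entries: usize, old_entry: usize, new_entry: usize) -> usize {
    entries * (new_entry - old_entry)
}
```

For the figures in the entry, 1024 entries growing from 256 to 264 bytes gives 8192 bytes, which is the stated 8 KiB bump.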
  • In-kernel dmesg ring buffer + GDB workflow (Feature 038): 1024-entry × 256-byte ring in BSS captures every kprint!/debug! line via dual-write at the UART driver; readable from userland via /proc/dmesg or syscall 440 (SYS_KMSG_READ); cleared by writing to /proc/dmesg-clear; the panic handler dumps the most recent 256 entries to UART with --- dmesg tail --- markers so the QEMU serial log preserves pre-panic context; make debug and make debug-kasan launch the kernel under QEMU paused at entry with gdbserver on :1234 (see specs/038-dmesg-gdb-stub/quickstart.md). Note: per-entry size and format updated by Feature 061 — see that entry above.
  • Live-verified: kernel-side strace decoder works end-to-end (issue #91 — PR #92 follow-up). New strace_test=1 cmdline trigger calls crate::proc::strace::emit directly with 7 synthetic records (covering path-decode arms, integer-arg arms, and the catchall) then panics — the panic handler's --- dmesg tail --- block surfaces all 7 records on serial output. New make image-strace-test target builds the QCOW2 with the trigger baked in; new tests/boot/test_strace_kernel.py integration test scrapes the captured serial and asserts each expected record's exact format. Proves the kernel decoder + kmsg-write path + panic-dump surface are correct in isolation from any userland glue. The remaining userland-applet integration (strace ./prog round-trip via hvc0/nsh) can now be diagnosed cleanly — anything misbehaving from here is in the userland polling loop, not the kernel emission.
  • Strace-format syscall trace + userland strace binary (issues #70 + #78; Feature 045 follow-ups). The per-PID trace gate in syscall_dispatch now emits decoded records like [S 7] open("/etc/passwd", 0x80002, 0o0), [S 7] write(1, 0x20001030, 51), [S 7] execve("/bin/echo", 0x..., 0x...) instead of the v0 [S pid/nr] format. Decoder lives in kernel/src/proc/strace.rs and covers the ~10 most-used syscalls (read/write/open/openat/close/mmap/brk/execve/exit_group/fork); any syscall not in the decode table falls through to the v0 catchall, so coverage is purely additive. Path arguments are read from userland via the same *const u8 + null-scan pattern as sys_open; flags are emitted as hex (symbolic decoding deferred to v2). New strace mybox applet at userland/mybox/src/applets/strace.rs: strace ./prog [args] forks, enables trace on the child via /proc/<pid>/trace in pre_exec, execs, polls /proc/dmesg filtering for the child's [S <pid>] records, and streams to stderr. v1 supports the fork-then-exec mode only; -p <pid> attach mode and return-value capture are tracked as follow-ups.
  • Per-CPU current-PCB cache (issue #66 — Feature 045 R3 Option B). sched::current_pid() was taking SCHED.lock() on every call — once per syscall via syscall_dispatch, adding ~50–100 ns of uncontended-mutex overhead per syscall. Replaced with a lock-free AtomicU32 cache (kernel/src/sched/percpu.rs) updated at every context-switch commit point (schedule_to_next + exit_current). current_pid() and current_pid_try() now read the atomic in single-digit ns. The panic path's current_pid_try no longer needs the try_lock fallback (which could return None on contention and force pid=unknown in the dump). Single-AtomicU32 today; the shape transparently becomes a per-CPU-array indexed by APIC ID when SMP arrives (#75).
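The lock-free cache described above amounts to one atomic that is written at context-switch commit points and read everywhere else. The sketch below uses hypothetical names (`CURRENT_PID`, `commit_switch`) and runs on the host; the memory orderings are an assumption, not taken from the kernel source.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Single-CPU cache of the running PID (hypothetical name).
static CURRENT_PID: AtomicU32 = AtomicU32::new(0);

/// Called at each context-switch commit point with the PID that is
/// about to run.
pub fn commit_switch(next_pid: u32) {
    CURRENT_PID.store(next_pid, Ordering::Release);
}

/// Hot-path read: one atomic load, no scheduler lock taken.
pub fn current_pid() -> u32 {
    CURRENT_PID.load(Ordering::Acquire)
}
```

Because the read never takes a lock, a panic-path caller cannot fail on contention the way a try_lock-based lookup could.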
  • Panic dump: register snapshot + page-table chain walk (issues #64 + #65 — Feature 046 follow-ups). The #[panic_handler] now emits two new blocks alongside the existing dmesg-tail / backtrace / pcb-dump. --- registers --- captures all 16 GPRs + RFLAGS + RIP + CR0/CR2/CR3/CR4 via a single inline-asm spill to a stack buffer (snapshots panic_handler entry state, not the panic site — the actual fault RIP is in the backtrace block). --- page table chain (VA=CR2, CR3=...) --- walks PML4 → PDPT → PD → PT for the value in CR2 (the faulting VA on page-fault panics, informational otherwise), showing the raw PTE + present/not-present at each level and halting at the first missing level. Lock-free, panic-safe, follows the existing DirectWriter pattern. New walk_page_table_chain(va) -> [PageTableLevel; 4] accessor in kernel/src/mm/virt.rs is reusable from fault handlers, future kassert!s, or any other diagnostic that needs to show the translation chain.
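The PML4 → PDPT → PD → PT walk indexes each level with nine bits of the virtual address, and stops at the first entry whose present bit (bit 0) is clear. The sketch below shows only that arithmetic; `table_indices` and `first_missing_level` are hypothetical helpers, not the kernel's `walk_page_table_chain`.

```rust
/// Split a 4-level x86_64 virtual address into its table indices,
/// ordered PML4, PDPT, PD, PT (9 bits each, above the 12-bit page offset).
pub fn table_indices(va: u64) -> [usize; 4] {
    [
        ((va >> 39) & 0x1ff) as usize,
        ((va >> 30) & 0x1ff) as usize,
        ((va >> 21) & 0x1ff) as usize,
        ((va >> 12) & 0x1ff) as usize,
    ]
}

/// Given the raw entry read at each level, report the first level whose
/// present bit is clear, which is where a chain dump would halt.
pub fn first_missing_level(entries: [u64; 4]) -> Option<usize> {
    entries.iter().position(|e| e & 1 == 0)
}
```

For instance, the address 0xFFFF_8000_0000_0000 selects PML4 slot 256 and slot 0 at every lower level.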
  • Fix: per-channel TTY input buffer — /dev/hvc0 no longer races /dev/tty0/1/2/3 for host-sent bytes (issue #82, closes #63). Before this change, every byte arriving on the virtio-console RX queue landed in the same PS/2-fed shared line buffer that all other VT readers were sleeping on; whichever nsh was first off sched::sleep_until consumed the bytes, so host input to /dev/hvc0 almost never reached the nsh actually listening there. Added a dedicated HVC0_* set of input statics (canonical line buffer + raw ring + waiter PID), a push_byte_hvc0() producer entry point called from virtio::console::poll_rx, and a read_hvc0() consumer entry point dispatched from both branches of sys_read (redirected fd 0/1/2 + general fd ≥ 3). Echo on hvc0 routes through virtio::console::write back to the host (not the framebuffer). The previously-failing tests/boot/test_hvc0_rx.py (HELLO_FROM_HOST round-trip) now passes in ~0.00s after socket connect.
  • Fix: nsh on /dev/hvc0 no longer SIGABRTs at startup (issue #63 partial): two surgical kernel fixes. (1) PTY_MASTER_FS_ID had silently shared 0xF5 with HVC0_FS_ID, so every write to /dev/hvc0 was being routed into pty::master_write(0) (slot 0 never in_use → returns 0 → Rust stdio panics with WriteZero → SIGABRT). Moved PTY_MASTER_FS_ID to 0xF2. (2) sys_read's redirected-fd branch (used when init dup2s /dev/hvc0 onto fd 0/1/2 before exec'ing nsh) was missing an HVC0_FS_ID handler and fell through to vfs::read, which returned EIO. Added the missing branch. Banner now flows host-bound through the virtio-console socket. Full host-input echo is gated on a follow-up architectural change (per-channel TTY input buffers).
  • Per-PID /proc/<pid>/{stack,wchan} + kernel symbol-table accessor + symbolized panic backtrace (Feature 049): two new procfs files answer the classic "where is this process blocked?" question without a debugger. /proc/<pid>/stack walks the saved RBP chain of a sleeping task and emits up to 16 RIPs as 0x<hex>\n per line; /proc/<pid>/wchan emits a one-line <name>+0x<offset>\n for the top frame, or [running] / [zombie] / [never scheduled] for non-walkable states. Backed by a new build-time symbol-table extractor (build/scripts/gen-symtab.{sh,awk}) that pipes nm --demangle target/kernel.elf through a two-pass kernel link (Linux KALLSYMS pattern) so the table is byte-deterministic and embedded inside the same kernel.elf it describes — measured 6.21% .text growth for 3,183 symbols, well under the SC-005 ≤10% budget. Same accessor symbolizes Feature 046's panic-backtrace block (raw RIPs → 0x<rip> <name>+0x<offset>), turning a hex dump into something readable at a glance. New make image-wchan-test target boots the kernel with wchan_test=1 for the end-to-end integration test. Quickstart: specs/049-stack-wchan-symbols/quickstart.md.
  • Structured kernel panic + kassert! with PCB context (Feature 046): every panic!() and every failed kassert!() emits a self-contained, line-oriented block on serial output for post-mortem diagnosis. Panic emission order: existing PANIC <file>:<line> <message> line → existing --- dmesg tail (N entries) --- block (Feature 038) → new --- backtrace (M frames) --- block (frame-pointer walker, max 16 frames) → new --- pcb dump (K live pids) --- block (one line per live PID with PID/PPID/state/last_syscall_nr/CR3). kassert!(cond, msg) is a drop-in replacement for assert!() that on failure emits --- kassert FAILED --- plus message, file:line, current PID, CR3, kernel RSP, last_syscall_nr, and the most-recent 16 kmsg ring entries — then halts. Zero-cost when the condition is true. New Pcb.last_syscall_nr field is written at every syscall entry and feeds both dump paths. Frame-pointer flag (-C force-frame-pointers=yes) promoted from KASAN-only to ALL kernel builds; binary cost measured at −1.54% (text section actually shrinks slightly at debug-profile, well under SC-005's ≤5% budget — see specs/046-structured-panic-kassert/research.md R9). New make image-kassert target for the integration-test trigger. Quickstart: specs/046-structured-panic-kassert/quickstart.md.
  • Per-PID syscall trace toggle (Feature 045): /proc/<pid>/trace exposes a runtime per-task flag. Writing 1 enables [S pid/nr] (and [S blocked pid/nr]) records into the kmsg ring (Feature 038) for syscalls issued by that PID; writing 0 disables. Output reaches the same surfaces as any other dmesg line — dmesg from userland, /proc/dmesg direct read, or SYS_KMSG_READ (440). Trace records bypass the UART mirror so a heavily traced process does not stall on serial output. Inherited at fork (FR-010), preserved across execve (FR-011). v1 permission model: writer must be the target task or PID 1 (research R1 — degrades from standard own-UID-or-root because MyOS does not yet model UID per task). Replaces the prior compile-time debug-proc Cargo feature for the syscall-entry trace point. Quickstart: specs/045-per-pid-syscall-trace/quickstart.md.
  • Differential syscall harness vs Linux (Feature 040 + transport rework in Feature 041): a corpus of 31 small static-musl C programs (tests/syscall_diff/corpus/) runs on both the host Linux kernel and inside MyOS2026 via QEMU; the harness diffs observable outputs (exit code, exit signal, stdout, stderr, normalized) and reports per-test PASS / FAIL / KNOWN / SKIP. A TOML allowlist (known-divergences.toml) suppresses by-design divergences with mandatory justifications. Five syscall families are covered (file-I/O, process, signal, memory, time). Invoke with make syscall-diff for the full corpus, make syscall-diff CASE=<name> to debug one program, make syscall-diff TRANSPORT=ssh for the legacy paramiko/dropbear path. Wired into the ci rollup as the syscall-diff job — PRs that introduce POSIX-deviation regressions are blocked from merge once Actions is re-enabled at the repo level. The default transport since Feature 041 is virtio-console over a UNIX socket (no dropbear) — code complete + unit-tested; live integration blocked on a kernel-side hvc0/socket bug (#54). The legacy SSH transport remains available via --transport ssh (still hit by #50 intermittently). Operations guide: docs/syscall-diff.md (architecture, CLI reference, troubleshooting, known limitations, Transports section).
  • Security: per-process syscall allowlist, capability bitmask, verified boot attestation
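
The decoded record format from the strace bullet above can be illustrated with a small host-side sketch. This is not the code in kernel/src/proc/strace.rs: `format_record` and its argument layout are hypothetical, and the syscall numbers are the Linux x86_64 values, assumed here only for illustration. It shows a decode table for three calls, with everything else falling through to the v0 `[S pid/nr]` catchall.

```rust
// Hypothetical sketch of a decode-table-with-fallback formatter.
// Syscall numbers are the Linux x86_64 values (an assumption for this example).
const SYS_WRITE: u64 = 1;
const SYS_OPEN: u64 = 2;
const SYS_CLOSE: u64 = 3;

/// Formats one trace record. `path` stands in for a string the kernel has
/// already copied from userland; `args` holds the raw register arguments.
pub fn format_record(pid: u32, nr: u64, args: [u64; 3], path: Option<&str>) -> String {
    match (nr, path) {
        // open(path, flags, mode): flags as hex, mode as octal.
        (SYS_OPEN, Some(p)) => {
            format!("[S {pid}] open(\"{p}\", {:#x}, 0o{:o})", args[1], args[2])
        }
        // write(fd, buf, len): the buffer pointer is shown as hex.
        (SYS_WRITE, _) => {
            format!("[S {pid}] write({}, {:#x}, {})", args[0], args[1], args[2])
        }
        (SYS_CLOSE, _) => format!("[S {pid}] close({})", args[0]),
        // Anything not in the table keeps the v0 format, so adding a
        // decoder never changes output for undecoded syscalls.
        _ => format!("[S {pid}/{nr}]"),
    }
}
```

Because the fallback arm is the old format, a new decoder can be added one syscall at a time without touching existing consumers of the undecoded records.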
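
The lock-free current-PID cache from the per-CPU bullet above follows a simple pattern: store at the context-switch commit point, load everywhere else. The sketch below is a minimal host-side illustration, not the contents of kernel/src/sched/percpu.rs; the names `commit_switch` and `CURRENT_PID` and the memory orderings are assumptions. A `no_std` kernel would import the same type from `core::sync::atomic`.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Single cache slot, matching the "Single-AtomicU32 today" description.
// Hypothetical name; 0 means "no task has been scheduled yet" in this sketch.
static CURRENT_PID: AtomicU32 = AtomicU32::new(0);

/// Called at the context-switch commit point (hypothetical helper).
pub fn commit_switch(next_pid: u32) {
    CURRENT_PID.store(next_pid, Ordering::Release);
}

/// Lock-free read: no scheduler mutex is taken, so this is also safe to
/// call from a panic path while another context holds the scheduler lock.
pub fn current_pid() -> u32 {
    CURRENT_PID.load(Ordering::Acquire)
}
```

The reader never blocks, which is why a `try_lock` fallback that can fail under contention is no longer needed on the panic path.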
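
The `<name>+0x<offset>` output described in the /proc/<pid>/wchan bullet above comes down to finding the last symbol whose address is at or below the RIP. The sketch below assumes a table sorted by address; `Symbol` and `symbolize` are hypothetical names, and the real embedded table format produced by gen-symtab is not shown in this README.

```rust
/// Hypothetical symbol-table entry; the table must be sorted by `addr`.
pub struct Symbol {
    pub addr: u64,
    pub name: &'static str,
}

/// Resolves `rip` to "name+0xoffset", or None if it lies below the first symbol.
pub fn symbolize(table: &[Symbol], rip: u64) -> Option<String> {
    // Number of symbols starting at or before `rip` (binary search).
    let count = table.partition_point(|s| s.addr <= rip);
    let sym = table.get(count.checked_sub(1)?)?;
    // Symbol sizes are not tracked in this sketch, so an address past the
    // end of the last function still resolves to that function.
    Some(format!("{}+{:#x}", sym.name, rip - sym.addr))
}
```

A lookup is O(log n) with no allocation beyond the output string, so the same routine can serve both procfs reads and a panic backtrace.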

Userland (all statically linked, musl)

| Binary | Description |
|---|---|
| init | Stage 1–3: mount root, spawn cloud-init, sshd, nsh |
| nsh | Shell: rustyline REPL (↑/↓ history, ← → editing, Ctrl-R search, Tab completion), ANSI color prompt, persistent /root/.nsh_history, pipes, redirects, &&, banner, help |
| mybox | 97 Unix applets via multi-call binary (stripped) |
| cloud-init | cidata provisioning: hostname, SSH keys, runcmd |
| dropbear | SSH daemon (cross-compiled C, key auth, port 22) |
| myos-pkg | Package manager: install, remove, list, verify (signed tar.gz) |
| sandbox | Installs per-process syscall allowlist then exec's target |
| exploit-test | Calls mount(2) via raw syscall; used by T057 regression test |
| vboot-check | Queries SYS_VBOOT_STATUS; prints key fingerprint + chain |

Build System

make all                          # kernel + userland + disk image
make test-unit                    # 567 kernel unit tests (no QEMU)
make test-smoke QCOW2=dist/myos2026.qcow2  # curated 10-test smoke suite (no KVM required)
make test-slow QCOW2=dist/myos2026.qcow2   # timing-sensitive tests (requires /dev/kvm)
cargo clippy -p mybox --target x86_64-unknown-linux-musl -- -D warnings

Architecture Details

Design Principles

| Principle | Choice |
|---|---|
| Kernel type | Minimal monolithic (Rust, no_std) |
| Bootloader | Limine v8.x (BIOS + UEFI, single config) |
| I/O model | virtio-only (blk / net / console / rng / scsi) |
| Network | smoltcp 0.11 (pure Rust, no_std) |
| Filesystem | ext2 (custom pure-Rust read/write driver) |
| SSH | Dropbear (userspace, cross-compiled for musl) |
| Userland | Rust + statically linked musl |
| Assembly | ~170 LOC total (entry stub, ISR trampoline, context-switch) |

Repository Layout

kernel/          Rust kernel (no_std)
  src/
    arch/x86_64/ Entry stub, GDT/IDT, APIC, timer, syscall setup
    mm/          Physical + virtual memory, heap, demand paging
    drivers/     UART, virtio (blk/net/console/rng/scsi), LSI Logic MPT SCSI,
                 Intel E1000 NIC, generic PCI scanner, framebuffer
    fs/          ext2 + VFS (symlink-following) + 64-slot LRU block cache
    net/         smoltcp, DHCP, firewall, ethernet dispatch
    proc/        Process table, ELF loader, syscall handlers, capabilities
    sched/       MLFQ scheduler (3-level, decay, I/O boost, starvation prevention, nice)
    ipc/         Pipes

userland/        Userspace crates (musl-static)
  init/          Stage 1–3 init
  shell/         nsh minimal shell
  mybox/         97-applet multi-call binary (Busybox-in-Rust)
    src/applets/ cat, chmod, chown, cp, cut, date, echo, env, false,
                 grep, head, hostname, kill, ls, mkdir, mv, ps, pwd,
                 rm, sleep, sort, sed, awk, find, tar, gzip,
                 nslookup, wget, nc, ping, …(97 total)
  pkg/           myos-pkg package manager
  cloud-init/    Provisioning agent (hostname, SSH keys, runcmd)
  sshd/          Dropbear SSH daemon + host keys
  tools/         sandbox, exploit-test, vboot-check, network utilities

bootloader/      Limine config + vendored binaries
build/           Makefile, image assembly scripts, CI helpers
  scripts/       assemble-image.sh, fix-ext2-timestamps.py,
                 setup-verified-boot.sh, sign-release.sh, convert-vdi.sh
tests/           Integration and benchmark test suites
  boot/          test_boot.py, test_ssh.py, test_shell.py, test_cloud_init.py,
                 test_sandbox.py, test_debug_mode.py, test_lsi_scsi.py,
                 test_e1000_ssh.py, test_vbox_combined.py, test_dual_nic.py,
                 test_signal.py, test_nanosleep.py, test_futex.py,
                 test_misc_posix.py, test_scheduler.py, test_linux_elf.py,
                 test_reproducible.sh
  snapshot/      test_rollback.sh
  bench/         boot_time.sh, bench_boot_time.py
  keys/          Test SSH keypair (test_id_ed25519)
specs/           Feature specs, implementation plans, contracts
  001-vm-optimized-os/   Core OS: kernel, userland, verified boot, security
  002-rust-busybox/      mybox 30-applet multi-call binary
  003-shell-screenshots-demo/  Shell banner + demo artifacts
  004-kdebug-copyright/  Compile-time debug feature flags + copyright
  005-scsi-e1000-drivers/  LSI Logic MPT SCSI + Intel E1000 NIC drivers
  006-syscall-coverage/  Full POSIX syscall layer (60 tasks)
  009-mlfq-scheduler/    MLFQ priority scheduler + nice syscalls
  010-mybox-core-utils/  mybox expanded to 91 applets (full POSIX core utility set)
  011-linux-elf-compat/  Linux ELF binary compatibility (static musl binaries)
  022-file-backed-mmap/  File-backed mmap (MAP_SHARED/MAP_PRIVATE, demand paging)
  023-cow-fork/          Copy-on-Write fork (lazy frame sharing, CoW fault handler)
  024-proc-fs-expansion/ /proc filesystem: self/{fd/,maps,status,exe}, cpuinfo, uptime, net/*
  025-networking-userland/ DNS resolver, wget (HTTP+HTTPS), nc, ping — 5/5 integration tests
  026-nsh-rustyline/     rustyline REPL for nsh: interactive editing, history, ANSI prompt, Tab completion
  027-interactive-console/ Console shell wired to TTY, password SSH, 3 virtual terminals, TUI demo shell
  028-cmos-rtc-wallclock/  CMOS RTC driver; clock_gettime(CLOCK_REALTIME) and gettimeofday() return correct UTC epoch
  032-virtio-console-rx/  virtio-console RX: receiveq allocated, pre-filled, polled every 10 ms — console now bidirectional
  033-console-demo-smoke-tests/ Interactive console wired to qemu-sdl; curated 9-test smoke suite; test_shell.py fixed; KVM tests quarantined
  034-terminal-multiplexer/  myscreen: screen-style terminal multiplexer; PTY kernel subsystem; AF_UNIX sockets; detach/reattach; 5 integration tests
  035-console-hvc0-fix/     Fix /dev/hvc0 routing: was assigned NULL_FS_ID; now HVC0_FS_ID routes reads→tty::read() and writes→virtio::console::write()
  036-socketpair-cloexec-fix/ sys_socketpair(AF_UNIX) backed by pipes; CLOEXEC honoured in sys_pipe2; recv/send pipe fallback; OVMF VARS fix for reliable UEFI boot

Boot Log

[ 0.20s] Booting from Hard Disk   ← Limine BIOS stage 2
[ 0.69s] MyOS2026 v0.1.0          ← kernel_main
[ 0.89s] [1] UART ok
         [2] mm ok
         [3] interrupts ok
         [4] drivers ok            (virtio-net, virtio-blk, virtio-rng)
         [4b] rtc ok               (epoch=<unix-timestamp> from CMOS RTC)
         [5] fs ok                 (ext2 mounted at /)
         [6a] firewall ok
         [6b] net stack ok         (DHCP → 10.0.2.15/24)
         [7] userspace: launching init
[ 4.46s] → dropbear listening :22
[ 4.83s] MyOS2026 v0.1.0 — type 'help' for built-in commands
         nsh$


Kernel Address Sanitizer (Feature 037)

The kernel ships with an opt-in KASAN-equivalent for catching memory-safety bugs at the corruption site. Build with make image-kasan to produce a sanitized disk image; run python3 tests/boot/test_kasan.py dist/myos2026-kasan.qcow2 for the 7-scenario integration test (5 violations + 2 negatives). The CI kasan-test job runs this on every PR; failures block merges. See specs/037-kernel-kasan/quickstart.md for the full developer guide.

The default build (no --features kasan) is unchanged: zero runtime cost, kernel.elf binary size within 1 % of pre-feature baseline.

What Is Not Yet Implemented

  • Dynamic linking — only statically-linked ELF binaries run; PT_INTERP (glibc/musl dynamic linker) is rejected with ENOEXEC
  • HTTPS wget — TLS (wget https://...) is compiled in (rustls + webpki-roots) but certificate verification against public CAs requires certificate pinning or a local CA bundle; HTTP (wget http://...) works fully
  • Package management: myos-pkg installs from signed tar.gz; repo tooling and signing pipeline deferred
  • initrd/initramfs — no preloaded ramdisk support; all binaries live on the ext2 partition
  • Loadable kernel modules (monolithic; in-VM module builds deferred)
  • GPG-signed release artifacts (persistent ed25519 key used; GPG pipeline not wired up)
  • strace -p <pid> attach mode and return-value capture (the v1 strace applet supports fork-then-exec only)
  • POSIX lstat() does not distinguish final symlink component (stat() and lstat() both follow symlinks)
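
For reference, the lstat() gap in the list above is the difference between following and not following the final symlink component. The sketch below shows the behaviour a POSIX-conforming system is expected to give, using Rust's standard library on a Unix host; `stat_vs_lstat` is a hypothetical helper written for this example, not part of the MyOS2026 test suite.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::symlink;
use std::path::Path;

/// Creates `dir/target.txt` and a symlink `dir/link` pointing at it, then
/// reports whether stat() and lstat() each see a symlink.
/// On a conforming system the result is (false, true).
pub fn stat_vs_lstat(dir: &Path) -> io::Result<(bool, bool)> {
    let target = dir.join("target.txt");
    let link = dir.join("link");
    fs::write(&target, b"x")?;
    symlink(&target, &link)?;

    // fs::metadata follows the link (stat semantics).
    let via_stat = fs::metadata(&link)?.file_type().is_symlink();
    // fs::symlink_metadata does not follow it (lstat semantics).
    let via_lstat = fs::symlink_metadata(&link)?.file_type().is_symlink();

    fs::remove_file(&link)?;
    fs::remove_file(&target)?;
    Ok((via_stat, via_lstat))
}
```

A kernel where both calls follow symlinks would report (false, false) here, which is the divergence the bullet describes.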

Use Cases

  • OS learning platform — every subsystem fits in your head, written in safe Rust
  • Secure ephemeral VMs — sandbox + verified boot + fast teardown via snapshot/rollback
  • CI/CD throwaway environments — < 2s boot, 12.5 MB image, SSH ready in < 5s
  • Kernel and systems programming research — modify kernel, rebuild, boot in < 2 minutes

Contributing

  1. Fork the repository
  2. Read specs/001-vm-optimized-os/plan.md for the core OS architecture
  3. Read specs/006-syscall-coverage/plan.md for the syscall layer design
  4. Run make test-unit — kernel unit tests, no QEMU needed
  5. Boot: make qemu (< 2s to shell prompt)
  6. Open a PR

Good first issues:

  • POSIX lstat() that does not follow the final symlink component
  • strace follow-ups: -p <pid> attach mode and return-value capture for the mybox strace applet
  • Dynamic ELF loader (PT_INTERP support) — enables glibc-linked binaries
  • GPG signing pipeline for release artifacts

License

Mozilla Public License 2.0
