diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md index 2a62ff8..83fe791 100644 --- a/ARCHITECTURE.md +++ b/ARCHITECTURE.md @@ -258,11 +258,14 @@ region. first, here's what they wrote, you decide." The library never throws on schema grounds. -Version bumped `3 → 4` because the `Header` binary layout grew by -the two new fields: one cache line for `schema_state` (a -`uint32_t` atomic padded to 64 B by `alignas(CACHE_LINE)`) plus -eight cache lines for `schema_data` (512 B). Pre-v4 binaries are -rejected at `open()` by the existing version check. +Version bumped `4 → 5` for the cross-process runtime counters +added to `SubRingHeader` (`dropped_count`, `lost_count` — see the +Subscriber Ring section for their layout and semantics). The +fields fit inside the existing `write_pos` cache-line padding, so +`sizeof(SubRingHeader)` is unchanged, but the in-memory meaning of +the bytes following `has_waiter` changed — v4 and v5 cannot share +a region. Pre-v5 binaries are rejected at `open()` by the existing +version check. ## Treiber Free Stack @@ -325,12 +328,29 @@ Ring[0] - **has_waiter** (atomic uint32): set by the subscriber before blocking on `futex_wait`, cleared after. Publishers skip the `futex_wake_all` syscall when no subscriber is sleeping. +- **dropped_count** (atomic uint64): cumulative count of publisher + delivery drops on this ring — incremented whenever the two-phase + commit CAS lock fails and the publisher falls through to the self- + repair / drop path. Cross-process visible; the per-Publisher + `dropped()` accessor reports the same events but counted per + instance. +- **lost_count** (atomic uint64): cumulative count of subscriber + losses on this ring — bumped on every `++lost_` site (drain-ahead + skip, stale-overwrite, invalid-slot, unpinnable, seqlock miss). + Cross-process visible; `Subscriber::lost()` reports the same events + per instance. 
- **Sequence number** is monotonically increasing (`pos + 1`), used as a seqlock for data consistency validation and as a commit barrier between publishers (see Publish Flow below). - Stale entries (sequence < subscriber's expected) are detected and reported as lost messages. +`write_pos`, `has_waiter`, `dropped_count`, `lost_count` all share one +cache line: the hot path already owns this line when incrementing +`write_pos`, so bumping the counters on a rare drop/loss adds no new +coherency traffic. Different rings target different lines (128 B +stride), so publishers/subscribers on distinct rings never contend. + ### Subscriber join and visibility window A subscriber joins by CAS-ing a Free ring to Live. The CAS expects @@ -1201,6 +1221,157 @@ Mailbox paths include the owner's node name because they are personal reply channels -- the sender must know who to reply to. +## Registry & Discovery + +Kickmsg's core IPC path is decentralized: no broker, no daemon, no +central registry. Channels are plain shared-memory regions that +producers and consumers find by name. That works well until an +operator asks *"what's running on this box?"* — which is the job of +the **participant registry**. + +### What it is + +A single dedicated shared-memory region per namespace, at +`/{namespace}_registry`, holding a fixed-size array of participant +entries. Each entry records one `(Node, channel, role)` membership. 
+All scalar fields are `std::atomic` so that the seqlock protocol +between the writer and a concurrent new-tenant claim never involves +plain non-atomic accesses on the same bytes: + +- `state` — atomic `Free` / `Claiming` / `Active` / `Reclaiming` +- `generation` — atomic counter bumped on every claim and release (seqlock) +- `pid` — atomic; OS process id of the owner +- `pid_starttime` — atomic; opaque OS-specific process start time +- `channel_type` — atomic; PubSub / Broadcast +- `role` — atomic; Publisher / Subscriber / Both +- `kind` — atomic; Pubsub / Broadcast / Mailbox (registry-level intent tag) +- `topic_name` — user-facing path (leading `/`, interior `/` preserved) +- `shm_name` — implementation-level POSIX path +- `node_name` — the `Node` that registered +- `created_at_ns` — atomic; registration timestamp + +A `Node` lazily opens-or-creates the registry on its first +`advertise` / `subscribe` / `join_broadcast` / `create_mailbox` / +`open_mailbox` call, scans for the first `Free` slot, CAS-claims it +(`Free → Claiming`), writes the fields, then release-stores +`Active`. The `Node`'s destructor deregisters every slot it claimed. + +The key property is **cross-platform parity**: Linux shared memory is +filesystem-visible under `/dev/shm`, but on macOS and Windows it is +not, so we can't use `readdir` to list topics there. Routing discovery +through a regular `SharedMemory` region means one code path, identical +behaviour on all three targets. + +### State machine + +``` +Free (0) ── CAS ──► Claiming (1) ── release-store ──► Active (2) + ▲ │ + │ │ + ├────────── deregister: store-release ◄──────────────┤ + │ │ + │ sweep_stale: │ + └── CAS(Reclaiming → Free) ◄── CAS(Active | Claiming → Reclaiming) +``` + +Snapshots acquire-load `state` per slot; only `Active` entries are +returned. Fields are written while the slot is `Claiming`, and the +release-store of `Active` publishes them: a reader observing `Active` +is guaranteed to see every field write that happened-before that store. 
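The claim-then-publish sequence described above can be sketched in isolation. This is a minimal illustration, not Kickmsg's `Registry` code: `Slot`, `View`, `try_claim` and `read_if_active` are hypothetical names, the slot carries only two fields, and the seqlock `generation` recheck is deliberately left out.

```cpp
#include <atomic>
#include <cstdint>
#include <optional>

// Hypothetical, simplified slot. NOT the real ParticipantEntry layout.
enum SlotState : uint32_t { Free = 0, Claiming = 1, Active = 2, Reclaiming = 3 };

struct Slot
{
    std::atomic<uint32_t> state{Free};
    std::atomic<uint64_t> pid{0};
    std::atomic<uint32_t> role{0};
};

// Plain copy handed back to a reader (illustrative type).
struct View
{
    uint64_t pid;
    uint32_t role;
};

// Writer side: Free -> Claiming via CAS, write the fields, then publish
// them with a single release-store of Active.
inline bool try_claim(Slot& s, uint64_t pid, uint32_t role)
{
    uint32_t expected = Free;
    if (!s.state.compare_exchange_strong(expected, Claiming,
                                         std::memory_order_acq_rel,
                                         std::memory_order_relaxed))
    {
        return false; // slot is owned by someone else
    }

    // pid gets its own release-store so a sweeper could acquire-load it
    // while the slot is still Claiming.
    s.pid.store(pid, std::memory_order_release);
    s.role.store(role, std::memory_order_relaxed);

    // Publication point: everything written above becomes visible to any
    // reader whose acquire-load observes Active.
    s.state.store(Active, std::memory_order_release);
    return true;
}

// Reader side: only Active slots are returned. The acquire-load pairs
// with the writer's release-store.
inline std::optional<View> read_if_active(Slot const& s)
{
    if (s.state.load(std::memory_order_acquire) != Active)
    {
        return std::nullopt;
    }
    return View{s.pid.load(std::memory_order_relaxed),
                s.role.load(std::memory_order_relaxed)};
}
```

A second `try_claim` on the same slot fails because the CAS only succeeds from `Free`. A real snapshot would additionally re-read `generation` and `state` after copying the fields, and discard the copy if either changed.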
+ +`Reclaiming` is the exclusive lock held by `sweep_stale` while it +verifies the dead-pid condition and finalizes the slot to `Free`. +It blocks concurrent registrants (they need `Free → Claiming`) and +prevents ABA on the state CAS: without it, a full `dereg + register` +cycle on another CPU could restore the slot to `Active` between the +sweeper's pid check and its CAS, causing the sweeper to stomp a live +tenant's registration. `sweep_stale` releases `Reclaiming` back to +the pre-CAS value if the re-verified pid differs from what it +observed (ABA detected), so the live tenant is restored. + +### Role upgrade + +A `Node` that both advertises and subscribes to the same topic is +recorded as a single entry with `role = Both` rather than two +separate entries. `Node::touch_registry` detects the existing slot +and upgrades via deregister + re-register. Upgrades happen at +connect time only — zero hot-path cost. + +### Liveness + +The registry does not track heartbeats. A crashed process that +never ran its `Node` destructor leaves its entries stuck at +`Active` (or `Claiming`, if it died mid-register) until reclaimed. +Two recovery paths: + +- **Query-time filter**: diagnostic tools probe each entry's pid via + `process_exists()` and hide dead entries from the user without + touching the registry. Non-invasive; safe under live traffic. +- **`Registry::sweep_stale()`**: CAS-resets any `Active` or `Claiming` + slot whose `pid` is dead. Opt-in cleanup for an operator or + supervisor sweep; also called opportunistically from + `register_participant` when the registry is full, so long-running + deployments don't silently drop new registrations as crashed-process + residue accumulates. + +The `Claiming` reclaim branch skips slots where `pid == 0`: that state +is the brief window between the `Free→Claiming` CAS and the +registrant's first field store. Claiming the slot in that window +would stomp a live registrant. 
Cost: an early crash (before the pid +store) leaks one slot until the region is unlinked. + +**PID-reuse mitigation.** `pid_starttime` is captured at register +time: `/proc//stat` on Linux, `sysctl(KERN_PROC_PID)` on Darwin, +`GetProcessTimes()` on Windows. Sweep compares the stored starttime +against the live process's starttime; a PID-reuse after wraparound +yields a different value and we reclaim. If the OS returns 0 (the +process is gone by the time we query), the check degrades to +PID-alone. + +### Sizing + +Default capacity is 4096 slots × 512 B = 2 MB per namespace. A robot +telemetry Node typically holds a few hundred topics; the registry is +sized for an order of magnitude of headroom above that. Exhaustion +is non-fatal: `register_participant` returns `INVALID_SLOT` and the +Node continues to work without a discovery row — registration is a +diagnostic nicety, not a correctness dependency. + +### Implicit invariants + +Load-bearing assumptions for anyone editing the registry: + +- **Field order is ABI.** `sizeof(ParticipantEntry) == 512` and + `offsetof(…, _padding) == 368` are statically asserted; any field + reorder or resize must update the padding and bump `registry::VERSION`. +- **State publication fence.** `state.store(Active, release)` in + `register_participant` is the one fence that publishes all field + writes that preceded it. `pid` has its own earlier release-store so + `sweep_stale` can acquire-load it while state is still `Claiming` + (before the Active fence). Any new field that needs to be visible + during `Claiming` must use its own release/acquire pair. +- **Generation bump on every mutation.** `generation` is bumped by + `register_participant` *and* by `deregister` *and* by + `sweep_stale`'s reclaim path. A snapshot's seqlock recheck detects + only mutations that bump gen — adding a future write-path that + modifies fields without bumping gen will cause torn reads. 
+- **`touch_registry` must never throw.** `Node::advertise` and friends + treat registration as best-effort. A failure is logged once per + `Node` (latched via `registry_disabled_`) and subsequent calls + become no-ops. Don't change this contract without also changing + `Node::advertise`'s error handling. +- **Role upgrade has a brief visibility gap.** `touch_registry` + upgrades Publisher/Subscriber → Both via `deregister` + re-register + (no atomic "change role in place" primitive). A concurrent + `snapshot()` may see this Node absent for ~1 µs during the swap. + Callers treating an empty `list_topics` result as "definitely + absent" are wrong; the right reading is "absent or in a brief gap." +- **`Kind` and ring state enums are stringified in two places.** The + native binding's `__repr__` methods and the Python `_KIND_NAME` / + ring-state maps in `diagnostics.py` must stay in sync when a new + enum value is added. + + ## Design Tradeoffs ### Silent data loss on slow subscribers diff --git a/CMakeLists.txt b/CMakeLists.txt index 5176465..2d7d318 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -23,22 +23,34 @@ if(WIN32) src/os/windows/Time.cc src/os/windows/Futex.cc src/os/windows/SharedMemory.cc + src/os/windows/Process.cc ) set(OS_LIBRARIES synchronization) -elseif(APPLE) - set(OS_SOURCES - src/os/darwin/Time.cc - src/os/darwin/Futex.cc - src/os/darwin/SharedMemory.cc - ) - set(OS_LIBRARIES pthread) else() + # Linux and macOS share most of the POSIX OS layer; only sleep(), + # create(), and the futex backend diverge. 
set(OS_SOURCES - src/os/linux/Time.cc - src/os/linux/Futex.cc - src/os/linux/SharedMemory.cc + src/os/posix/Time.cc + src/os/posix/SharedMemory.cc + src/os/posix/Process.cc ) - set(OS_LIBRARIES rt pthread) + if(APPLE) + list(APPEND OS_SOURCES + src/os/darwin/Time.cc + src/os/darwin/SharedMemory.cc + src/os/darwin/Futex.cc + src/os/darwin/Process.cc + ) + set(OS_LIBRARIES pthread) + else() + list(APPEND OS_SOURCES + src/os/linux/Time.cc + src/os/linux/SharedMemory.cc + src/os/linux/Futex.cc + src/os/linux/Process.cc + ) + set(OS_LIBRARIES rt pthread) + endif() endif() # --- Library --- @@ -49,6 +61,7 @@ add_library(kickmsg STATIC src/Publisher.cc src/Subscriber.cc src/Region.cc + src/Registry.cc src/Node.cc ${OS_SOURCES} ) diff --git a/README.md b/README.md index 6931b2a..06f54d8 100644 --- a/README.md +++ b/README.md @@ -116,6 +116,49 @@ region.reset_retired_rings(); region.reclaim_orphaned_slots(); ``` +## CLI (`kickmsg`) + +Installing the Python wheel puts a `kickmsg` command on `$PATH` that +inspects running channels via a shared participant registry (one per +namespace, backed by a SHM region at `/{namespace}_registry`). Works +identically on Linux, macOS, and Windows — no `/dev/shm` filesystem +walk required. + +```bash +kickmsg list # topic-centric enumeration +kickmsg list -o name,pub,sub,stall # ps-style column selection +kickmsg info # static header metadata +kickmsg stats # runtime counters (write_pos / dropped / lost) +kickmsg watch # top-like live view with msg/s rates +kickmsg diagnose # wraps SharedRegion::diagnose() +kickmsg repair [--locked] # run repair primitives +kickmsg schema # focused schema descriptor view +kickmsg schema-diff # field-by-field schema comparison +``` + +All subcommands accept `--json` for scripting. 
+ +### Programmatic use (GUIs, exporters) + +The same data the CLI renders is available as typed dataclasses through +`kickmsg.diagnostics`, so a GUI can consume it without shelling out: + +```python +from kickmsg import diagnostics as diag + +for topic in diag.list_topics(namespace="kickmsg"): + print(topic.shm_name, len(topic.producers), len(topic.consumers)) + +stats = diag.stats("/kickmsg_telemetry") +for ring in stats.rings: + if ring.state == "live": + print(ring.write_pos, ring.dropped_count, ring.lost_count) + +# Live updates (generator — caller drives the loop) +for frame in diag.watch("/kickmsg_telemetry", interval=1.0): + gui.update(frame.stats, frame.rates_msg_per_sec) +``` + ## Building ### Prerequisites diff --git a/benchmarks/bench.cc b/benchmarks/bench.cc index 1ed48e9..3c242b3 100644 --- a/benchmarks/bench.cc +++ b/benchmarks/bench.cc @@ -139,7 +139,7 @@ static void run_latency(BenchConfig const& bc, bool zerocopy) while (pub.send(payload.data(), payload.size()) < 0) { - kickmsg::sleep(0ns); + kickmsg::yield(); } if (zerocopy) diff --git a/examples/hello_pubsub.cc b/examples/hello_pubsub.cc index 8fbd13b..361810d 100644 --- a/examples/hello_pubsub.cc +++ b/examples/hello_pubsub.cc @@ -6,7 +6,7 @@ /// - "display" subscribes to "temperature" and prints them /// /// Single-process for simplicity; in production, each node lives in -/// its own process sharing the same prefix. +/// its own process sharing the same namespace. #include #include diff --git a/examples/python/cli_playground.py b/examples/python/cli_playground.py new file mode 100644 index 0000000..504e93a --- /dev/null +++ b/examples/python/cli_playground.py @@ -0,0 +1,117 @@ +"""Long-running pub/sub demo for poking at the `kickmsg` CLI. + +Starts one node that advertises a topic and a second node that subscribes +to it, then publishes forever at a configurable rate. Ctrl-C to stop — +teardown deregisters both participants from the registry. 
+ +While it's running, try: + + kickmsg ls # default namespace "demo" + kickmsg ls -o all + kickmsg stats /demo_telemetry + kickmsg watch /demo_telemetry -i 0.5 + kickmsg diag /demo_telemetry + kickmsg schema /demo_telemetry +""" + +from __future__ import annotations + +import argparse +import struct +import sys +import time + +import kickmsg + + +def main() -> int: + ap = argparse.ArgumentParser(description=__doc__, + formatter_class=argparse.RawDescriptionHelpFormatter) + ap.add_argument("-n", "--namespace", default="demo") + ap.add_argument("-t", "--topic", default="telemetry") + ap.add_argument("-r", "--rate-hz", type=float, default=10.0, + help="publish frequency (default: 10 Hz)") + ap.add_argument("-c", "--consumers", type=int, default=1, + help="number of subscriber nodes (default: 1)") + ap.add_argument("--payload-size", type=int, default=64, + help="bytes per message (default: 64)") + args = ap.parse_args() + + if args.consumers < 1: + ap.error("--consumers must be >= 1") + + cfg = kickmsg.Config() + cfg.max_subscribers = max(args.consumers, 4) + cfg.sub_ring_capacity = 64 + cfg.pool_size = 128 + cfg.max_payload_size = max(args.payload_size, 16) + + # Clean any leftover SHM from a crashed prior run so startup is deterministic. 
+ cleanup = kickmsg.Node("cleanup", namespace=args.namespace) + cleanup.unlink_topic(args.topic) + del cleanup + + sensor = kickmsg.Node("sensor", namespace=args.namespace) + pub = sensor.advertise(args.topic, cfg) + + loggers = [kickmsg.Node(f"logger_{i}", namespace=args.namespace) + for i in range(args.consumers)] + subs = [n.subscribe(args.topic) for n in loggers] + + shm_name = f"/{args.namespace}_{args.topic}" + period = 1.0 / args.rate_hz + + print(f"publishing to {shm_name} at {args.rate_hz:g} Hz " + f"({args.payload_size} B per message)") + print(f"nodes: sensor (publisher) logger_0..{args.consumers - 1} " + f"({args.consumers} subscriber{'s' if args.consumers != 1 else ''})") + print(f"try: kickmsg ls --namespace {args.namespace}") + print(f" kickmsg watch {shm_name}") + print("Ctrl-C to stop.") + + seq = 0 + received = 0 + t_start = time.monotonic() + pad = b"\x00" * max(0, args.payload_size - 12) + next_tick = t_start + + try: + while True: + # Payload: (uint32 seq, uint64 ns_since_start, padding…) + elapsed_ns = int((time.monotonic() - t_start) * 1e9) + pub.send(struct.pack("= next_tick + period: + # Large stall (GC, suspend) — skip ahead instead of + # bursting to catch up. 
+ next_tick = now + period + else: + delay = next_tick - now + if delay > 0: + time.sleep(delay) + except KeyboardInterrupt: + print() # newline after ^C + elapsed = time.monotonic() - t_start + total_lost = sum(s.lost for s in subs) + print(f"sent {seq} messages, received {received} " + f"(across {args.consumers} subscriber" + f"{'s' if args.consumers != 1 else ''}), " + f"over {elapsed:.1f}s " + f"(~{seq / max(elapsed, 0.001):.0f} msg/s)") + print(f"pub.dropped = {pub.dropped}, sub.lost total = {total_lost}") + return 0 + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/include/kickmsg/Hash.h b/include/kickmsg/Hash.h index ed2be79..ed8edfc 100644 --- a/include/kickmsg/Hash.h +++ b/include/kickmsg/Hash.h @@ -5,6 +5,7 @@ #include #include #include +#include namespace kickmsg { @@ -39,6 +40,24 @@ namespace kickmsg /// descriptor-string hashing is by far the most common use. uint64_t fnv1a_64(std::string_view s) noexcept; + /// 64-bit FNV-1a of a trivially-copyable scalar or POD. Lets + /// callers chain fields without spelling out `&v, sizeof(v)`: + /// h = fnv1a_64(cfg.max_subscribers, h); + /// h = fnv1a_64(cfg.pool_size, h); + /// Pointers, nullptr, and string_view are excluded so the + /// dedicated overloads keep their priority. + template + auto fnv1a_64(T const& value, uint64_t seed = FNV1A_64_OFFSET_BASIS) noexcept + -> std::enable_if_t< + std::is_trivially_copyable_v + and not std::is_pointer_v + and not std::is_null_pointer_v + and not std::is_same_v, + uint64_t> + { + return fnv1a_64(&value, sizeof(T), seed); + } + /// Convenience: pack a 64-bit FNV-1a of `descriptor` into the /// leading eight bytes of a 64-byte identity slot, zero-padding /// the remaining 56 bytes. 
Intended as a drop-in for filling diff --git a/include/kickmsg/Node.h b/include/kickmsg/Node.h index c12e86d..6139837 100644 --- a/include/kickmsg/Node.h +++ b/include/kickmsg/Node.h @@ -8,6 +8,7 @@ #include "kickmsg/Region.h" #include "kickmsg/Publisher.h" #include "kickmsg/Subscriber.h" +#include "kickmsg/Registry.h" namespace kickmsg { @@ -28,15 +29,19 @@ namespace kickmsg class Node { public: - // Name components (node name, namespace/prefix, topic, channel, + // Name components (node name, namespace, topic, channel, // owner, tag) are sanitized into a POSIX-shm-compatible form: // leading '/' is stripped, interior '/' becomes '.', and any char // outside [A-Za-z0-9._-] becomes '_'. This lets callers pass // ROS-style paths like "/robot/arm/joint1" directly — the region - // ends up at "/_robot.arm.joint1" in /dev/shm, still + // ends up at "/_robot.arm.joint1" in /dev/shm, still // human-readable (no hashing). A component that sanitizes to the // empty string throws std::invalid_argument. - Node(std::string const& name, std::string const& prefix = "kickmsg"); + Node(std::string const& name, std::string const& kmsg_namespace = "kickmsg"); + + /// Deregisters every participant entry this Node holds in the + /// namespace's registry. + ~Node(); // Explicit non-copyable / move-only. Node already holds SharedRegion // values (move-only), so it's non-copyable de facto; declaring it @@ -123,8 +128,8 @@ namespace kickmsg /// else got there first (read back with topic_schema()). 
bool try_claim_topic_schema(char const* topic, SchemaInfo const& info); - std::string const& name() const { return name_; } - std::string const& prefix() const { return prefix_; } + std::string const& name() const { return name_; } + std::string const& kmsg_namespace() const { return namespace_; } private: std::string make_topic_name(char const* topic) const; @@ -137,8 +142,19 @@ namespace kickmsg SharedRegion* find_region(std::string const& shm_name); SharedRegion const* find_region(std::string const& shm_name) const; + Registry& lazy_registry(); + + /// Register `shm_name` with `role`, or upgrade the existing entry + /// to `Both` if this Node already has one with the complementary + /// role. Registry failures are logged and swallowed. + void touch_registry(std::string const& shm_name, + std::string const& topic_name, + channel::Type channel_type, + registry::Kind kind, + registry::Role role); + std::string name_; - std::string prefix_; + std::string namespace_; // Keyed by SHM name for O(1) lookup. A telemetry node on a // humanoid robot can easily hold 100-300 topics (joints × (meas, // target) + cameras + IMUs + force sensors + hands), so O(N) @@ -148,6 +164,12 @@ namespace kickmsg // guarantees reference stability for elements (the mmap addresses // used by Publisher/Subscriber don't move on rehash). std::unordered_map regions_; + + struct RegistrySlot { uint32_t slot_index; registry::Role role; }; + std::unordered_map registry_slots_; + std::optional registry_; + bool registry_disabled_ = false; ///< latched on first registry failure + bool registry_full_warned_ = false; ///< latched after first INVALID_SLOT log }; } diff --git a/include/kickmsg/Region.h b/include/kickmsg/Region.h index 72f3dea..e58c91a 100644 --- a/include/kickmsg/Region.h +++ b/include/kickmsg/Region.h @@ -3,12 +3,61 @@ #include #include +#include #include "kickmsg/types.h" #include "kickmsg/os/SharedMemory.h" namespace kickmsg { + /// Runtime snapshot of a single subscriber ring. 
+ /// Values are relaxed/acquire-loaded, so the snapshot is internally + /// consistent per-ring but may race mildly across rings — fine for a + /// diagnostic view; not intended as a strongly-consistent read. + struct RingStats + { + uint32_t state; ///< ring::State as a raw int (0=Free, 1=Live, 2=Draining) + uint32_t in_flight; ///< Publishers currently admitted to this ring + uint64_t write_pos; ///< Monotonic claim counter (rough throughput proxy) + uint64_t dropped_count; ///< Cumulative publisher drops on this ring + uint64_t lost_count; ///< Cumulative subscriber losses on this ring + }; + + /// Aggregate region snapshot returned by SharedRegion::stats(). + /// Safe to call under live traffic: all reads are relaxed/acquire, + /// no writes. + struct RegionStats + { + std::vector rings; ///< One entry per subscriber-ring slot (length == max_subs) + uint64_t total_writes; ///< Max of write_pos across all rings: publish events observed by the channel, monotonic across subscriber churn + uint64_t total_drops; ///< Sum of dropped_count across all rings + uint64_t total_losses; ///< Sum of lost_count across all rings + uint64_t live_rings; ///< Number of rings currently Live + uint64_t pool_free; ///< Approximate free-slot count (walks Treiber stack — racy under churn) + uint64_t pool_size; ///< Total pool capacity (static) + }; + + /// Static header metadata returned by SharedRegion::info(). + /// All fields are written once at creation and never mutated, so this + /// read is a plain copy of stable bytes. 
+ struct RegionInfo + { + std::string shm_name; + channel::Type channel_type; + uint32_t version; + uint64_t config_hash; + uint64_t total_size; + uint64_t max_subs; + uint64_t sub_ring_capacity; + uint64_t pool_size; + uint64_t max_payload_size; + uint64_t commit_timeout_us; + uint64_t creator_pid; + uint64_t created_at_ns; + std::string creator_name; + }; + + class SharedRegion { public: @@ -134,6 +183,25 @@ namespace kickmsg /// Returns the number of rings reset. std::size_t reset_retired_rings(); + /// Runtime counter snapshot — safe under live traffic. + /// + /// Reads the cross-process per-ring counters (`write_pos`, + /// `dropped_count`, `lost_count`) plus ring state and an approximate + /// pool-free count. Intended for external monitoring and the CLI's + /// `stats` / `watch` subcommands. + /// + /// Cheap (no syscalls, no locks, a handful of atomic loads) but not a + /// strongly-consistent view: individual per-ring values are consistent + /// with themselves (sequential loads on one variable), but different + /// rings may be read at slightly different instants. The free-stack + /// walk for `pool_free` is bounded by `pool_size` so it can't loop + /// forever under racing pushes/pops. + RegionStats stats() const; + + /// Static header snapshot — geometry + creator metadata. All + /// fields are written once at creation, so this is a plain copy. + RegionInfo info() const; + /// Reclaim orphaned slots (refcount > 0 but not referenced by any ring entry). /// These are caused by publisher crashes between allocate and publish, or by /// skipped drain on subscriber teardown timeout. 
diff --git a/include/kickmsg/Registry.h b/include/kickmsg/Registry.h new file mode 100644 index 0000000..f204a8d --- /dev/null +++ b/include/kickmsg/Registry.h @@ -0,0 +1,193 @@ +#ifndef KICKMSG_REGISTRY_H +#define KICKMSG_REGISTRY_H + +#include +#include +#include +#include +#include +#include + +#include "kickmsg/types.h" +#include "kickmsg/os/SharedMemory.h" + +namespace kickmsg +{ + namespace registry + { + constexpr uint32_t VERSION = 3; + constexpr uint64_t MAGIC = 0x214745524B43494BULL; // "KICKREG!" + // Supports up to ~200-400 topics with a few participants each, + // plus headroom for transient tasks. 4096 × 512 B = 2 MB per + // namespace. + constexpr uint32_t DEFAULT_CAPACITY = 4096; + constexpr std::size_t SHM_NAME_MAX = 128; + constexpr std::size_t TOPIC_NAME_MAX = 128; + constexpr std::size_t NODE_NAME_MAX = 64; + + enum Role : uint32_t + { + Publisher = 1, + Subscriber = 2, + Both = 3, ///< Node is both producer and consumer on this channel + }; + + /// What the channel is used for, from the user-facing API's point + /// of view. channel_type (in types.h) is the low-level ring + /// geometry (PubSub vs Broadcast); Kind distinguishes Mailbox + /// from PubSub even though both share channel::PubSub geometry. + enum Kind : uint32_t + { + Pubsub = 1, + Broadcast = 2, + Mailbox = 3, + }; + + /// Only `Active` slots are visible to snapshot readers. + /// `Reclaiming` is the exclusive lock held by `sweep_stale` to + /// prevent ABA on the state CAS. + enum SlotState : uint32_t + { + Free = 0, + Claiming = 1, + Active = 2, + Reclaiming = 3, + }; + } + + /// In-SHM entry, 512 B. Readers go through `Registry::snapshot()`. + /// Scalar fields are atomic so a snapshot reader racing with a new- + /// tenant writer never hits a C++ data race; the seqlock (generation + /// + state) discards torn copies. Do not reorder fields without + /// bumping `registry::VERSION`. 
+ struct ParticipantEntry + { + std::atomic state; + std::atomic channel_type; + std::atomic role; + std::atomic kind; + std::atomic generation; ///< seqlock version, bumped on every mutation + std::atomic pid; ///< release/acquire-accessed; inspected while state==Claiming + std::atomic pid_starttime; ///< OS-reported start time, or 0 if unavailable + std::atomic created_at_ns; + char shm_name[registry::SHM_NAME_MAX]; + char topic_name[registry::TOPIC_NAME_MAX]; + char node_name[registry::NODE_NAME_MAX]; + uint8_t _padding[144]; + }; + static_assert(sizeof(ParticipantEntry) == 512, + "ParticipantEntry layout is part of the registry ABI"); + static_assert(offsetof(ParticipantEntry, _padding) == 368, + "ParticipantEntry field offsets must match expected 368 B prefix"); + + /// Plain copyable snapshot of one participant. + struct Participant + { + uint64_t pid; + uint64_t pid_starttime; ///< OS-reported boot-relative start time (0 if unknown) + uint64_t created_at_ns; + uint32_t channel_type; + uint32_t role; + uint32_t kind; ///< registry::Kind + std::string shm_name; ///< POSIX SHM path — implementation detail + std::string topic_name; ///< User-facing logical path, leading '/' + std::string node_name; + }; + + /// Topic-centric grouping of registry entries: all participants on + /// one shm_name, split by role (producer / consumer) and by pid + /// liveness (alive / stall). A Role::Both participant appears in + /// both producers and consumers. 
+ struct TopicSummary + { + std::string shm_name; + std::string topic_name; ///< User-facing logical path + uint32_t channel_type; ///< channel::Type + uint32_t kind; ///< registry::Kind + std::vector producers; + std::vector consumers; + std::vector stall_producers; + std::vector stall_consumers; + }; + + struct RegistryHeader + { + std::atomic magic; + uint32_t version; + uint32_t capacity; + uint8_t _padding[48]; + }; + static_assert(sizeof(RegistryHeader) == CACHE_LINE, + "RegistryHeader must be exactly one cache line"); + + /// Shared-memory participant registry. One per namespace at + /// `/{namespace}_registry`. Persists beyond any single process; + /// remove with `unlink()`. + class Registry + { + public: + Registry() = default; + ~Registry() = default; + Registry(Registry const&) = delete; + Registry& operator=(Registry const&) = delete; + Registry(Registry&&) noexcept = default; + Registry& operator=(Registry&&) noexcept = default; + + /// `capacity` is only used on the create branch; an existing + /// registry keeps its creator's capacity. + static Registry open_or_create(std::string const& kmsg_namespace, + uint32_t capacity = registry::DEFAULT_CAPACITY); + + /// Returns nullopt if the region doesn't exist. For read-only + /// tools that must not create a 2 MB SHM as a side effect of + /// inspection. Throws on version mismatch. + static std::optional try_open(std::string const& kmsg_namespace); + + static void unlink(std::string const& kmsg_namespace); + + /// Returns the claimed slot index, or `INVALID_SLOT` if the + /// registry is full. + uint32_t register_participant(std::string const& shm_name, + std::string const& topic_name, + channel::Type channel_type, + registry::Kind kind, + registry::Role role, + std::string const& node_name); + + /// Idempotent — `INVALID_SLOT` or already-Free slots are no-ops. + void deregister(uint32_t slot_index); + + /// Copy of all `Active` entries. 
Does not filter by process + /// liveness — callers use `process_exists()` if they need that. + std::vector snapshot() const; + + /// Topic-centric view of the snapshot: groups participants by + /// shm_name and splits each group into producer/consumer and + /// alive/stall lists (alive checked via `process_exists()`). + /// Results are sorted by shm_name for stable output. + std::vector list_topics() const; + + /// CAS-resets `Active` or `Claiming` slots whose `pid` no longer exists. + /// Returns the number of slots freed. + uint32_t sweep_stale(); + + std::string const& name() const { return name_; } + uint32_t capacity() const; + + private: + static std::size_t region_size(uint32_t capacity); + static std::string make_shm_name(std::string const& kmsg_namespace); + static std::optional spin_open(std::string const& name); + void init_as_creator(uint32_t capacity); + + RegistryHeader* header(); + RegistryHeader const* header() const; + ParticipantEntry* entries(); + ParticipantEntry const* entries() const; + + SharedMemory shm_; + std::string name_; + }; +} + +#endif diff --git a/include/kickmsg/os/Process.h b/include/kickmsg/os/Process.h new file mode 100644 index 0000000..b3dcc2b --- /dev/null +++ b/include/kickmsg/os/Process.h @@ -0,0 +1,28 @@ +#ifndef KICKMSG_OS_PROCESS_H +#define KICKMSG_OS_PROCESS_H + +#include + +namespace kickmsg +{ + /// PID of the current process. + uint64_t current_pid() noexcept; + + /// Return true if a process with \p pid currently exists on this host. + /// Inherently racy: the process may exit between the probe and any + /// action taken on the result. + bool process_exists(uint64_t pid) noexcept; + + /// Opaque start time of \p pid, or 0 if unavailable. The value is + /// only meaningful for equality: two reads of the same live process + /// return the same value, and a PID-reuse after wraparound almost + /// always yields a different one. Used by sweep_stale as a PID- + /// reuse guard. 
+ /// + /// Linux: clock ticks since boot (/proc/<pid>/stat field 22). + /// Darwin: microseconds since epoch (sysctl kinfo_proc.p_starttime). + /// Windows: 100-ns intervals since 1601 (GetProcessTimes creation). + uint64_t process_starttime(uint64_t pid) noexcept; +} + +#endif diff --git a/include/kickmsg/os/Time.h b/include/kickmsg/os/Time.h index 9cdec5c..f5041d0 100644 --- a/include/kickmsg/os/Time.h +++ b/include/kickmsg/os/Time.h @@ -9,6 +9,18 @@ namespace kickmsg void sleep(nanoseconds ns); + /// Release the current timeslice back to the scheduler. + void yield(); + + /// Monotonic time since an unspecified origin (typically boot). + /// Use for duration measurements — never leaks forward across + /// clock adjustments. NOT suitable for display timestamps: see + /// since_epoch(). + nanoseconds monotonic_ns(); + + /// Wall-clock time since 1970-01-01 UTC (CLOCK_REALTIME). Use + /// for human-facing timestamps. Subject to jumps on NTP + /// adjustment — do NOT use for timeouts or duration math. nanoseconds since_epoch(); nanoseconds elapsed_time(nanoseconds start); diff --git a/include/kickmsg/types.h b/include/kickmsg/types.h index 59a3b18..e9933b4 100644 --- a/include/kickmsg/types.h +++ b/include/kickmsg/types.h @@ -20,7 +20,7 @@ namespace kickmsg "Kickmsg requires lock-free 32-bit atomics."); constexpr uint64_t MAGIC = 0x4B49434B4D534721ULL; // "KICKMSG!" - constexpr uint32_t VERSION = 4; + constexpr uint32_t VERSION = 5; constexpr uint32_t INVALID_SLOT = UINT32_MAX; constexpr uint64_t LOCKED_SEQUENCE = UINT64_MAX; constexpr std::size_t CACHE_LINE = 64; @@ -240,15 +240,23 @@ namespace kickmsg /// Per-subscriber ring header in shared memory. /// state_flight packs ring state and in_flight publisher count into one /// atomic, enabling single-CAS admission without cross-variable fences. 
- /// write_pos and has_waiter share a cache line: the publisher writes - /// write_pos then reads has_waiter, the subscriber reads write_pos then - /// writes has_waiter — both access the same line, one cache miss each. + /// write_pos, has_waiter, dropped_count, lost_count share a cache line: + /// the hot path already owns this line when incrementing write_pos, so + /// the extra fetch_add on a drop/loss path introduces no new cache- + /// coherency traffic. Writers on different rings target different + /// lines (128 B stride), so no cross-ring false sharing either. struct SubRingHeader { alignas(CACHE_LINE) std::atomic state_flight; ///< Packed [in_flight:30 | state:2] alignas(CACHE_LINE) std::atomic write_pos; ///< Monotonically increasing position counter std::atomic has_waiter; ///< Set by subscriber before futex_wait + std::atomic dropped_count; ///< Cumulative publisher drops on this ring (all publishers) + std::atomic lost_count; ///< Cumulative subscriber losses on this ring (all subscribers) }; + static_assert(sizeof(SubRingHeader) == 2 * CACHE_LINE, + "SubRingHeader must stay 2 cache lines — the counter fields fit in " + "the existing write_pos line padding; expanding this struct requires " + "reconsidering ring-stride math in Region.cc"); /// Slot header: prepended to each payload buffer in the pool. /// Packed to guarantee binary layout across compilers. diff --git a/py_bindings/CMakeLists.txt b/py_bindings/CMakeLists.txt index eab85b4..337616f 100644 --- a/py_bindings/CMakeLists.txt +++ b/py_bindings/CMakeLists.txt @@ -22,7 +22,37 @@ if (SKBUILD) $<$:-Wl,--gc-sections -Wl,--strip-all>) endif() - # Output name is 'kickmsg' so Python imports as `import kickmsg`. - set_target_properties(kickmsg_py PROPERTIES OUTPUT_NAME "kickmsg") - install(TARGETS kickmsg_py DESTINATION .) 
+ # Source layout: + # py_bindings/src/kickmsg_py.cc C++ binding (compiled to _native.so) + # python/kickmsg/ pure-Python package + # + # At install / editable-build time both land in the same package dir. + set_target_properties(kickmsg_py PROPERTIES + OUTPUT_NAME "_native" + LIBRARY_OUTPUT_DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}/kickmsg") + + # Stage pure-Python sources alongside the native module for + # PYTHONPATH-based dev iteration. Symlinks so .py edits don't + # require a rebuild. + set(PY_PACKAGE_SRC ${CMAKE_SOURCE_DIR}/python/kickmsg) + set(PY_PACKAGE_DST ${CMAKE_CURRENT_BINARY_DIR}/kickmsg) + file(GLOB PY_PACKAGE_FILES + "${PY_PACKAGE_SRC}/*.py" + "${PY_PACKAGE_SRC}/*.pyi" + "${PY_PACKAGE_SRC}/py.typed") + foreach(_src ${PY_PACKAGE_FILES}) + get_filename_component(_name ${_src} NAME) + file(CREATE_LINK ${_src} ${PY_PACKAGE_DST}/${_name} + SYMBOLIC COPY_ON_ERROR) + endforeach() + + install(TARGETS kickmsg_py DESTINATION kickmsg) + install( + DIRECTORY ${PY_PACKAGE_SRC}/ + DESTINATION kickmsg + FILES_MATCHING + PATTERN "*.py" + PATTERN "*.pyi" + PATTERN "py.typed" + ) endif() diff --git a/py_bindings/src/kickmsg_py.cc b/py_bindings/src/kickmsg_py.cc index 877c2ca..4f02fd8 100644 --- a/py_bindings/src/kickmsg_py.cc +++ b/py_bindings/src/kickmsg_py.cc @@ -7,12 +7,16 @@ /// Config — channel::Config /// SchemaInfo — payload schema descriptor /// HealthReport — SharedRegion::diagnose() result -/// SharedRegion — factory methods + schema/health/repair +/// RingStats / RegionStats — SharedRegion::stats() result +/// SharedRegion — factory methods + schema/health/repair/stats /// Publisher — send(bytes) + allocate() → AllocatedSlot /// AllocatedSlot — writable zero-copy handle + .publish() /// Subscriber — try_receive / receive (GIL release) / *_view /// SampleView — read-only zero-copy sample (buffer protocol) /// BroadcastHandle — NamedTuple-like (pub, sub) +/// Role — registry::Role enum (Publisher/Subscriber/Both) +/// Participant — registry snapshot entry +/// 
Registry — per-namespace participant discovery /// Node — high-level topic / broadcast / mailbox /// schema (submodule) /// Diff — enum (bitmask) @@ -64,12 +68,15 @@ #include #include #include +#include #include "kickmsg/Node.h" #include "kickmsg/Publisher.h" #include "kickmsg/Region.h" +#include "kickmsg/Registry.h" #include "kickmsg/Subscriber.h" #include "kickmsg/Hash.h" +#include "kickmsg/os/Process.h" #include "kickmsg/types.h" namespace nb = nanobind; @@ -208,9 +215,12 @@ namespace namespace kickmsg { - NB_MODULE(kickmsg, m) + // Native module name is `_native`; the outer `kickmsg/__init__.py` does + // `from ._native import *` so user-visible import paths (kickmsg.Publisher, + // kickmsg.Node, …) are unchanged. + NB_MODULE(_native, m) { - m.doc() = "Kickmsg — lock-free shared-memory IPC"; + m.doc() = "Kickmsg — lock-free shared-memory IPC (native bindings)"; // ------------------------------------------------------------------- // Enums & simple types @@ -339,6 +349,73 @@ namespace kickmsg ", schema_stuck=" + (r.schema_stuck ? 
"True" : "False") + ")"; }); + // ------------------------------------------------------------------- + // RingStats / RegionStats — runtime counter snapshot via stats() + // ------------------------------------------------------------------- + + nb::class_(m, "RingStats") + .def_ro("state", &RingStats::state) + .def_ro("in_flight", &RingStats::in_flight) + .def_ro("write_pos", &RingStats::write_pos) + .def_ro("dropped_count", &RingStats::dropped_count) + .def_ro("lost_count", &RingStats::lost_count) + .def("__repr__", [](RingStats const& r) + { + char const* state_name = "?"; + switch (r.state) + { + case ring::Free: state_name = "Free"; break; + case ring::Live: state_name = "Live"; break; + case ring::Draining: state_name = "Draining"; break; + } + return std::string{"RingStats(state="} + state_name + + ", in_flight=" + std::to_string(r.in_flight) + + ", write_pos=" + std::to_string(r.write_pos) + + ", dropped=" + std::to_string(r.dropped_count) + + ", lost=" + std::to_string(r.lost_count) + ")"; + }); + + nb::class_(m, "RegionStats") + .def_ro("rings", &RegionStats::rings) + .def_ro("total_writes", &RegionStats::total_writes) + .def_ro("total_drops", &RegionStats::total_drops) + .def_ro("total_losses", &RegionStats::total_losses) + .def_ro("live_rings", &RegionStats::live_rings) + .def_ro("pool_free", &RegionStats::pool_free) + .def_ro("pool_size", &RegionStats::pool_size) + .def("__repr__", [](RegionStats const& s) + { + return std::string{"RegionStats(live_rings="} + + std::to_string(s.live_rings) + + ", total_writes=" + std::to_string(s.total_writes) + + ", total_drops=" + std::to_string(s.total_drops) + + ", total_losses=" + std::to_string(s.total_losses) + + ", pool_free=" + std::to_string(s.pool_free) + + "/" + std::to_string(s.pool_size) + ")"; + }); + + nb::class_(m, "RegionInfo") + .def_ro("shm_name", &RegionInfo::shm_name) + .def_ro("channel_type", &RegionInfo::channel_type) + .def_ro("version", &RegionInfo::version) + .def_ro("config_hash", 
&RegionInfo::config_hash) + .def_ro("total_size", &RegionInfo::total_size) + .def_ro("max_subs", &RegionInfo::max_subs) + .def_ro("sub_ring_capacity", &RegionInfo::sub_ring_capacity) + .def_ro("pool_size", &RegionInfo::pool_size) + .def_ro("max_payload_size", &RegionInfo::max_payload_size) + .def_ro("commit_timeout_us", &RegionInfo::commit_timeout_us) + .def_ro("creator_pid", &RegionInfo::creator_pid) + .def_ro("creator_name", &RegionInfo::creator_name) + .def_ro("created_at_ns", &RegionInfo::created_at_ns) + .def("__repr__", [](RegionInfo const& i) + { + return std::string{"RegionInfo(shm='"} + i.shm_name + + "', version=" + std::to_string(i.version) + + ", creator_pid=" + std::to_string(i.creator_pid) + + ", creator='" + i.creator_name + "')"; + }); + // ------------------------------------------------------------------- // SharedRegion // ------------------------------------------------------------------- @@ -364,6 +441,11 @@ namespace kickmsg .def("try_claim_schema", &SharedRegion::try_claim_schema, "info"_a) .def("reset_schema_claim", &SharedRegion::reset_schema_claim) .def("diagnose", &SharedRegion::diagnose) + .def("stats", &SharedRegion::stats, + "Runtime counter snapshot (per-ring + aggregate). 
" + "Safe under live traffic.") + .def("info", &SharedRegion::info, + "Static header metadata: geometry, creator, version.") .def("repair_locked_entries", &SharedRegion::repair_locked_entries) .def("reset_retired_rings", &SharedRegion::reset_retired_rings) .def("reclaim_orphaned_slots",&SharedRegion::reclaim_orphaned_slots) @@ -379,6 +461,96 @@ namespace kickmsg m.def("unlink_shm", [](std::string const& name) { SharedMemory::unlink(name); }, "name"_a, "Unlink a shared-memory entry by name (no-op if absent)."); + // ------------------------------------------------------------------- + // Registry — per-namespace participant directory + // ------------------------------------------------------------------- + + nb::enum_(m, "Role") + .value("Publisher", registry::Publisher) + .value("Subscriber", registry::Subscriber) + .value("Both", registry::Both); + + nb::enum_(m, "Kind") + .value("Pubsub", registry::Pubsub) + .value("Broadcast", registry::Broadcast) + .value("Mailbox", registry::Mailbox); + + nb::class_(m, "Participant") + .def_ro("pid", &Participant::pid) + .def_ro("pid_starttime", &Participant::pid_starttime) + .def_ro("created_at_ns", &Participant::created_at_ns) + .def_ro("channel_type", &Participant::channel_type) + .def_ro("role", &Participant::role) + .def_ro("kind", &Participant::kind) + .def_ro("shm_name", &Participant::shm_name) + .def_ro("topic_name", &Participant::topic_name) + .def_ro("node_name", &Participant::node_name) + .def("__repr__", [](Participant const& p) + { + char const* role_name = "?"; + switch (p.role) + { + case registry::Publisher: role_name = "Publisher"; break; + case registry::Subscriber: role_name = "Subscriber"; break; + case registry::Both: role_name = "Both"; break; + } + return std::string{"Participant(topic='"} + p.topic_name + + "', node='" + p.node_name + + "', pid=" + std::to_string(p.pid) + + ", role=" + role_name + ")"; + }); + + nb::class_(m, "TopicSummary") + .def_ro("shm_name", &TopicSummary::shm_name) + 
.def_ro("topic_name", &TopicSummary::topic_name) + .def_ro("channel_type", &TopicSummary::channel_type) + .def_ro("kind", &TopicSummary::kind) + .def_ro("producers", &TopicSummary::producers) + .def_ro("consumers", &TopicSummary::consumers) + .def_ro("stall_producers", &TopicSummary::stall_producers) + .def_ro("stall_consumers", &TopicSummary::stall_consumers) + .def("__repr__", [](TopicSummary const& t) + { + return std::string{"TopicSummary(topic='"} + t.topic_name + + "', producers=" + std::to_string(t.producers.size()) + + ", consumers=" + std::to_string(t.consumers.size()) + + ", stalled=" + + std::to_string(t.stall_producers.size() + + t.stall_consumers.size()) + ")"; + }); + + nb::class_<Registry>(m, "Registry") + .def_static("open_or_create", &Registry::open_or_create, + "namespace"_a, "capacity"_a = registry::DEFAULT_CAPACITY, + nb::rv_policy::move, + "Open the registry SHM for `namespace`, creating it if absent.") + .def_static("try_open", &Registry::try_open, "namespace"_a, + nb::rv_policy::move, + "Open an existing registry; returns None if none exists.") + .def_static("unlink", &Registry::unlink, "namespace"_a, + "Remove the registry SHM for `namespace` from the filesystem.") + .def("snapshot", &Registry::snapshot, + "Copy all currently Active participant entries. Does not " + "filter by process liveness.") + .def("list_topics", &Registry::list_topics, + "Topic-centric view: groups participants by shm_name and " + "splits them into producer/consumer × alive/stall lanes.") + .def("sweep_stale", &Registry::sweep_stale, + "Reclaim slots owned by processes that no longer exist. 
" + "Returns the number of slots freed.") + .def_prop_ro("name", &Registry::name) + .def_prop_ro("capacity", &Registry::capacity) + .def("__repr__", [](Registry const& r) + { + return std::string{"Registry(name='"} + r.name() + + "', capacity=" + std::to_string(r.capacity()) + ")"; + }); + + m.def("process_exists", &process_exists, "pid"_a, + "Return True if a process with `pid` exists on this host."); + m.def("current_pid", &current_pid, + "Return the PID of the current process."); + // SampleRef (the C++ byte-copy sample) is not bound directly — // try_receive() / receive() auto-convert it to `bytes` at the // Python boundary. Users who want ring-position information @@ -683,7 +855,7 @@ namespace kickmsg // keep_alive<0, 1>: the return value (0) pins the Node (1 = self). nb::class_<Node>(m, "Node") .def(nb::init(), - "name"_a, "prefix"_a = std::string{"kickmsg"}) + "name"_a, "namespace"_a = std::string{"kickmsg"}) .def("advertise", [](Node& n, char const* topic, channel::Config const& cfg) { return n.advertise(topic, cfg); }, @@ -726,12 +898,12 @@ namespace kickmsg .def("topic_schema", &Node::topic_schema, "topic"_a) .def("try_claim_topic_schema", &Node::try_claim_topic_schema, "topic"_a, "info"_a) - .def_prop_ro("name", &Node::name) - .def_prop_ro("prefix", &Node::prefix) + .def_prop_ro("name", &Node::name) + .def_prop_ro("namespace", &Node::kmsg_namespace) .def("__repr__", [](Node const& n) { return std::string{"Node(name='"} + n.name() + - "', prefix='" + n.prefix() + "')"; + "', namespace='" + n.kmsg_namespace() + "')"; }); } } diff --git a/pyproject.toml b/pyproject.toml index ce60d84..8cee6d2 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -12,6 +12,9 @@ maintainers = [ {name = "Philippe Leduc", email = "philippe.leduc@mailfence.com"} ] +[project.scripts] +kickmsg = "kickmsg.cli:main" + [tool.scikit-build] minimum-version = "0.11" cmake.version = ">=3.30" diff --git a/python/kickmsg/__init__.py b/python/kickmsg/__init__.py new file mode 100644 index 
0000000..3e6ab8f --- /dev/null +++ b/python/kickmsg/__init__.py @@ -0,0 +1,61 @@ +"""Kickmsg — lock-free shared-memory IPC.""" + +# Explicit re-exports from the native bindings. Using an enumerated +# list (rather than `from ._native import *`) makes the public API +# visible here and avoids future name collisions if we add pure-Python +# symbols that share a name with a native one. +from ._native import ( + AllocatedSlot, + BroadcastHandle, + ChannelType, + Config, + HealthReport, + Kind, + Node, + Participant, + Publisher, + RegionStats, + Registry, + RingStats, + Role, + SampleView, + SchemaInfo, + SharedRegion, + Subscriber, + TopicSummary, + current_pid, + hash, + process_exists, + schema, + unlink_shm, +) + +# Diagnostics submodule (typed dataclass API for CLI + GUI). +from . import diagnostics + +__all__ = [ + "AllocatedSlot", + "BroadcastHandle", + "ChannelType", + "Config", + "HealthReport", + "Kind", + "Node", + "Participant", + "Publisher", + "RegionStats", + "Registry", + "RingStats", + "Role", + "SampleView", + "SchemaInfo", + "SharedRegion", + "Subscriber", + "TopicSummary", + "current_pid", + "diagnostics", + "hash", + "process_exists", + "schema", + "unlink_shm", +] diff --git a/py_bindings/kickmsg.pyi b/python/kickmsg/_native.pyi similarity index 66% rename from py_bindings/kickmsg.pyi rename to python/kickmsg/_native.pyi index fd6d000..5722613 100644 --- a/py_bindings/kickmsg.pyi +++ b/python/kickmsg/_native.pyi @@ -118,6 +118,79 @@ class HealthReport: def __repr__(self) -> str: ... +class RingStats: + @property + def state(self) -> int: + """Ring state as raw int: 0=Free, 1=Live, 2=Draining.""" + + @property + def in_flight(self) -> int: ... + + @property + def write_pos(self) -> int: ... + + @property + def dropped_count(self) -> int: ... + + @property + def lost_count(self) -> int: ... + + def __repr__(self) -> str: ... + +class RegionStats: + @property + def rings(self) -> list[RingStats]: ... + + @property + def total_writes(self) -> int: ... 
+ + @property + def total_drops(self) -> int: ... + + @property + def total_losses(self) -> int: ... + + @property + def live_rings(self) -> int: ... + + @property + def pool_free(self) -> int: ... + + @property + def pool_size(self) -> int: ... + + def __repr__(self) -> str: ... + +class RegionInfo: + @property + def shm_name(self) -> str: ... + @property + def channel_type(self) -> ChannelType: ... + @property + def version(self) -> int: ... + @property + def config_hash(self) -> int: ... + @property + def total_size(self) -> int: ... + @property + def max_subs(self) -> int: ... + @property + def sub_ring_capacity(self) -> int: ... + @property + def pool_size(self) -> int: ... + @property + def max_payload_size(self) -> int: ... + @property + def commit_timeout_us(self) -> int: ... + @property + def creator_pid(self) -> int: ... + @property + def creator_name(self) -> str: ... + @property + def created_at_ns(self) -> int: ... + + def __repr__(self) -> str: ... + class SharedRegion: @staticmethod def create(name: str, type: ChannelType, cfg: Config, creator: str = '') -> SharedRegion: ... @@ -142,6 +215,12 @@ class SharedRegion: def diagnose(self) -> HealthReport: ... + def stats(self) -> RegionStats: + """Runtime counter snapshot (per-ring + aggregate). Safe under live traffic.""" + + def info(self) -> RegionInfo: + """Static header metadata: geometry, creator, version.""" + def repair_locked_entries(self) -> int: ... def reset_retired_rings(self) -> int: ... @@ -155,6 +234,107 @@ class SharedRegion: def unlink_shm(name: str) -> None: """Unlink a shared-memory entry by name (no-op if absent).""" +class Role(enum.Enum): + Publisher = 1 + + Subscriber = 2 + + Both = 3 + +class Kind(enum.Enum): + Pubsub = 1 + + Broadcast = 2 + + Mailbox = 3 + +class Participant: + @property + def pid(self) -> int: ... + + @property + def pid_starttime(self) -> int: ... + + @property + def created_at_ns(self) -> int: ... + + @property + def channel_type(self) -> int: ... 
+ + @property + def role(self) -> int: ... + + @property + def kind(self) -> int: ... + + @property + def shm_name(self) -> str: ... + + @property + def topic_name(self) -> str: ... + + @property + def node_name(self) -> str: ... + + def __repr__(self) -> str: ... + +class TopicSummary: + @property + def shm_name(self) -> str: ... + + @property + def topic_name(self) -> str: ... + + @property + def channel_type(self) -> int: ... + + @property + def kind(self) -> int: ... + + @property + def producers(self) -> list[Participant]: ... + + @property + def consumers(self) -> list[Participant]: ... + + @property + def stall_producers(self) -> list[Participant]: ... + + @property + def stall_consumers(self) -> list[Participant]: ... + + def __repr__(self) -> str: ... + +class Registry: + @staticmethod + def open_or_create(namespace: str, capacity: int = ...) -> Registry: + """Open the registry SHM for `namespace`, creating it if absent.""" + + @staticmethod + def try_open(namespace: str) -> Registry | None: + """Open an existing registry; returns None if none exists.""" + + @staticmethod + def unlink(namespace: str) -> None: + """Remove the registry SHM for `namespace` from the filesystem.""" + + def snapshot(self) -> list[Participant]: + """Copy all currently Active participant entries. Does not filter by process liveness.""" + + def list_topics(self) -> list[TopicSummary]: + """Topic-centric view: groups participants by shm_name and splits them into producer/consumer × alive/stall lanes.""" + + def sweep_stale(self) -> int: + """Reclaim slots owned by processes that no longer exist. Returns the number of slots freed.""" + + @property + def name(self) -> str: ... + + @property + def capacity(self) -> int: ... + + def __repr__(self) -> str: ... + class SampleView: def __buffer__(self, flags, /): """ @@ -278,7 +458,7 @@ class BroadcastHandle: def __repr__(self) -> str: ... class Node: - def __init__(self, name: str, prefix: str = 'kickmsg') -> None: ... 
+ def __init__(self, name: str, namespace: str = 'kickmsg') -> None: ... def advertise(self, topic: str, cfg: Config = ...) -> Publisher: ... @@ -308,6 +488,6 @@ class Node: def name(self) -> str: ... @property - def prefix(self) -> str: ... + def namespace(self) -> str: ... def __repr__(self) -> str: ... diff --git a/python/kickmsg/cli.py b/python/kickmsg/cli.py new file mode 100644 index 0000000..5d01bb3 --- /dev/null +++ b/python/kickmsg/cli.py @@ -0,0 +1,556 @@ +"""Argparse entry point — thin formatting layer over `kickmsg.diagnostics`. + +Every data-fetching path lives in `kickmsg.diagnostics`; this module only +renders results. +""" + +from __future__ import annotations + +import argparse +import json +import os +import sys +from dataclasses import asdict +from typing import Sequence + +from . import _native +from . import diagnostics as diag + + +# Env default stays CLI-only so the core library remains env-agnostic. +_ENV_VAR = "KICKMSG_NAMESPACE" + + +def _default_namespace() -> str: + value = os.environ.get(_ENV_VAR) + if value: + return value + return "kickmsg" + + +def _normalize_shm_name(name: str) -> str: + """POSIX shm_open requires a leading '/'; accept names without it.""" + if name.startswith("/"): + return name + return "/" + name + + +def _normalize_topic(name: str) -> str: + if not name.startswith("/"): + return "/" + name + return name + + +def _add_region_target(sp: argparse.ArgumentParser) -> None: + """Attach the (topic | --shm) target selector.""" + sp.add_argument("topic", nargs="?", default=None, + help="Topic path (leading '/' optional). Combined with " + "--namespace to find the region. Use --shm for " + "explicit SHM names.") + sp.add_argument("--shm", default=None, + help="Raw SHM name (leading '/' optional). Overrides " + "positional topic when both are given.") + sp.add_argument("-n", "--namespace", default=_default_namespace(), + help=f"Namespace used with positional topic. 
Defaults to " + f"${_ENV_VAR} if set, else \"kickmsg\".") + + +def _resolve_shm_name(args) -> str: + if args.shm is not None: + return _normalize_shm_name(args.shm) + + if args.topic is None: + raise SystemExit( + "error: need a topic (positional) or --shm SHM_NAME") + + topic = _normalize_topic(args.topic) + + # Registry lookup only — no pubsub-pattern fallback, since guessing + # silently gave the wrong SHM name for broadcast/mailbox topics. + registry = _native.Registry.try_open(args.namespace) + if registry is None: + raise SystemExit( + f"error: no registry for namespace '{args.namespace}'. " + f"No kickmsg participant has registered under this namespace yet.") + + for t in registry.list_topics(): + if t.topic_name == topic: + return t.shm_name + + raise SystemExit( + f"error: topic '{topic}' not found in namespace '{args.namespace}' " + f"registry. The region may not be published yet, or you may need " + f"--shm for a region whose creator isn't registered.") + + +# ---------------------------------------------------------------------- +# Output helpers +# ---------------------------------------------------------------------- + + +def _json_default(v): + if isinstance(v, bytes): + return v.hex() + return str(v) + + +def _to_json(obj) -> str: + def _normalize(x): + if hasattr(x, "__dataclass_fields__"): + return asdict(x) + if isinstance(x, list): + return [_normalize(y) for y in x] + if isinstance(x, dict): + return {k: _normalize(v) for k, v in x.items()} + return x + return json.dumps(_normalize(obj), indent=2, default=_json_default) + + +def _humanize_age(seconds: float | None) -> str: + if seconds is None: + return "-" + s = int(seconds) + if s < 60: + return f"{s}s" + if s < 3600: + return f"{s // 60}m{s % 60:02d}s" + return f"{s // 3600}h{(s % 3600) // 60:02d}m" + + +def _hex_or_dash(b: bytes | None) -> str: + if b is None: + return "-" + return b.hex() + + +# ---------------------------------------------------------------------- +# Subcommand: list +# 
---------------------------------------------------------------------- + + +def _col_topic(t): return t.topic_name +def _col_shm(t): return t.shm_name +def _col_kind(t): return t.kind +def _col_namespace(t): return t.kmsg_namespace +def _col_producers(t): return str(len(t.producers)) +def _col_consumers(t): return str(len(t.consumers)) +def _col_producer_pids(t): + joined = ",".join(str(p.pid) for p in t.producers) + if not joined: + return "-" + return joined +def _col_consumer_pids(t): + joined = ",".join(str(p.pid) for p in t.consumers) + if not joined: + return "-" + return joined +def _col_producer_names(t): + joined = ",".join(p.node_name for p in t.producers) + if not joined: + return "-" + return joined +def _col_consumer_names(t): + joined = ",".join(p.node_name for p in t.consumers) + if not joined: + return "-" + return joined +def _col_stall(t): return str(len(t.stall_producers) + len(t.stall_consumers)) +def _col_stall_pub(t): return str(len(t.stall_producers)) +def _col_stall_sub(t): return str(len(t.stall_consumers)) +def _col_schema(t): + if t.schema_name is None: + return "-" + return t.schema_name +def _col_ver(t): + if t.schema_version is None: + return "-" + return str(t.schema_version) +def _col_age(t): return _humanize_age(t.age_seconds) + + +_LIST_COLUMNS = { + "topic": ("TOPIC", _col_topic), + "shm": ("SHM", _col_shm), + "ns": ("NS", _col_namespace), + "namespace": ("NS", _col_namespace), + "kind": ("KIND", _col_kind), + "producers": ("PUB", _col_producers), + "pub": ("PUB", _col_producers), + "producers_pid": ("PUB_PID", _col_producer_pids), + "pub_pid": ("PUB_PID", _col_producer_pids), + "producers_names": ("PUB_NODES", _col_producer_names), + "pub_names": ("PUB_NODES", _col_producer_names), + "consumers": ("SUB", _col_consumers), + "sub": ("SUB", _col_consumers), + "consumers_pid": ("SUB_PID", _col_consumer_pids), + "sub_pid": ("SUB_PID", _col_consumer_pids), + "consumers_names": ("SUB_NODES", _col_consumer_names), + "sub_names": 
("SUB_NODES", _col_consumer_names), + "age": ("AGE", _col_age), + "stall_pub": ("STALL_PUB", _col_stall_pub), + "stall_sub": ("STALL_SUB", _col_stall_sub), + "stall": ("STALL", _col_stall), + "schema": ("SCHEMA", _col_schema), + "ver": ("VER", _col_ver), +} + +_LIST_DEFAULT_COLS = ["topic", "ns", "kind", "pub", "sub", "age", "stall"] +_LIST_ALL_COLS = [ + "topic", "ns", "kind", + "pub", "pub_pid", "pub_names", + "sub", "sub_pid", "sub_names", + "age", "stall_pub", "stall_sub", "stall", + "schema", "ver", "shm", +] + + +def _column_width(header: str, rows: Sequence[Sequence[str]], col: int) -> int: + width = len(header) + for r in rows: + if len(r[col]) > width: + width = len(r[col]) + return width + + +def _render_table(headers: Sequence[str], rows: Sequence[Sequence[str]]) -> str: + widths = [_column_width(h, rows, i) for i, h in enumerate(headers)] + lines = [" ".join(h.ljust(widths[i]) for i, h in enumerate(headers))] + for r in rows: + lines.append(" ".join(c.ljust(widths[i]) for i, c in enumerate(r))) + return "\n".join(lines) + + +def _parse_columns(spec: str | None) -> list[str]: + if spec == "all": + return list(_LIST_ALL_COLS) + if not spec: + return list(_LIST_DEFAULT_COLS) + return [c.strip() for c in spec.split(",")] + + +def cmd_list(args) -> int: + topics = diag.list_topics(kmsg_namespace=args.namespace) + + if args.json: + print(_to_json(topics)) + return 0 + + cols = _parse_columns(args.columns) + bad = [c for c in cols if c not in _LIST_COLUMNS] + if bad: + print(f"unknown columns: {', '.join(bad)}", file=sys.stderr) + print(f"available: {', '.join(sorted(_LIST_COLUMNS))}", file=sys.stderr) + return 2 + + headers = [_LIST_COLUMNS[c][0] for c in cols] + rows = [[_LIST_COLUMNS[c][1](t) for c in cols] for t in topics] + print(_render_table(headers, rows)) + if not topics: + # Friendly hint but no error — empty output is a valid state + # (no participants yet, or all cleanly deregistered). 
+        print(f"\n(no participants in namespace '{args.namespace}')",
+              file=sys.stderr)
+    return 0
+
+
+# ----------------------------------------------------------------------
+# Subcommand: info
+# ----------------------------------------------------------------------
+
+
+def cmd_info(args) -> int:
+    i = diag.info(_resolve_shm_name(args))
+    if args.json:
+        print(_to_json(i))
+        return 0
+    print(f"region: {i.shm_name}")
+    print(f"type: {i.channel_type}")
+    s = i.schema
+    if s.state == "set":
+        print(f"schema: name={s.name!r}, version={s.version}")
+        print(f" identity: {_hex_or_dash(s.identity)}")
+        print(f" layout: {_hex_or_dash(s.layout)}")
+    elif s.state == "claiming":
+        print("schema: claim in progress (possibly wedged — see `kickmsg diagnose`)")
+    else:
+        print("schema: unset")
+    return 0
+
+
+# ----------------------------------------------------------------------
+# Subcommand: stats
+# ----------------------------------------------------------------------
+
+
+def cmd_stats(args) -> int:
+    s = diag.stats(_resolve_shm_name(args))
+    if args.json:
+        print(_to_json(s))
+        return 0
+    print(f"region: {s.shm_name}")
+    print(f"subscribers: {s.live_rings} live")
+    print(f"pool: {s.pool_free} / {s.pool_size} free")
+    print()
+    headers = ["ring", "state", "in_flight", "write_pos", "dropped", "lost"]
+    rows = [
+        [
+            str(r.index), r.state, str(r.in_flight), str(r.write_pos),
+            str(r.dropped_count), str(r.lost_count),
+        ]
+        for r in s.rings
+    ]
+    print(_render_table(headers, rows))
+    print()
+    print(f"total writes: {s.total_writes}")
+    print(f"total drops: {s.total_drops}")
+    print(f"total losses: {s.total_losses}")
+    return 0
+
+
+# ----------------------------------------------------------------------
+# Subcommand: diagnose
+# ----------------------------------------------------------------------
+
+
+def cmd_diagnose(args) -> int:
+    h = diag.diagnose(_resolve_shm_name(args))
+    if args.json:
+        print(_to_json(h))
+        return 0
+    print(f"region: {h.shm_name}")
+    print(f"locked_entries: {h.locked_entries}")
+    print(f"retired_rings: {h.retired_rings}")
+    print(f"draining_rings: {h.draining_rings}")
+    print(f"live_rings: {h.live_rings}")
+    print(f"schema_stuck: {h.schema_stuck}")
+    print()
+    print(f"status: {h.status}")
+    if h.status == "crash residue":
+        flags = []
+        if h.locked_entries:
+            flags.append("--locked")
+        if h.retired_rings:
+            flags.append("--retired")
+        print(f"suggested: kickmsg repair {h.shm_name} {' '.join(flags)}")
+    if h.status == "healthy":
+        return 0
+    return 1
+
+
+# ----------------------------------------------------------------------
+# Subcommand: repair
+# ----------------------------------------------------------------------
+
+
+_UNSAFE_WARNINGS = {
+    "retired": (
+        "--retired calls reset_retired_rings(), which is ONLY safe after the "
+        "crashed publisher is confirmed gone. Resetting under a live "
+        "publisher will corrupt in-flight messages."),
+    "reclaim": (
+        "--reclaim calls reclaim_orphaned_slots(), which requires full "
+        "quiescence: no active publishers AND no live SampleView pins. "
+        "Reclaiming under traffic will free in-use slots."),
+}
+
+
+def _confirm_unsafe(unsafe_flags: list[str], assume_yes: bool) -> bool:
+    for flag in unsafe_flags:
+        print(f"warning: {_UNSAFE_WARNINGS[flag]}", file=sys.stderr)
+    if assume_yes:
+        return True
+    if not sys.stdin.isatty():
+        print("error: refusing to run unsafe repair non-interactively without --yes",
+              file=sys.stderr)
+        return False
+    flags = ", ".join(f"--{f}" for f in unsafe_flags)
+    answer = input(f"\nProceed with {flags}? [y/N] ").strip().lower()
+    return answer in ("y", "yes")
+
+
+def cmd_repair(args) -> int:
+    target = _resolve_shm_name(args)
+    if not (args.locked or args.retired or args.reclaim):
+        args.locked = True  # default: safe operation only
+
+    unsafe = []
+    if args.retired:
+        unsafe.append("retired")
+    if args.reclaim:
+        unsafe.append("reclaim")
+    if unsafe and not _confirm_unsafe(unsafe, args.yes):
+        return 2
+
+    if args.locked:
+        n = diag.repair_locked(target)
+        print(f"--locked: repaired {n} entries")
+    if args.retired:
+        n = diag.reset_retired(target)
+        print(f"--retired: reset {n} rings")
+    if args.reclaim:
+        n = diag.reclaim_orphaned(target)
+        print(f"--reclaim: reclaimed {n} slots")
+    return 0
+
+
+# ----------------------------------------------------------------------
+# Subcommand: watch
+# ----------------------------------------------------------------------
+
+
+def _format_rate(rate: float) -> str:
+    return f"{rate:.0f} msg/s"
+
+
+def cmd_watch(args) -> int:
+    target = _resolve_shm_name(args)
+    is_tty = sys.stdout.isatty()
+    try:
+        for frame in diag.watch(target, interval=args.interval):
+            if is_tty:
+                sys.stdout.write("\033[2J\033[H")  # clear screen + home
+            else:
+                sys.stdout.write("\n--- frame ---\n")
+            print(f"{frame.stats.shm_name} live: {frame.stats.live_rings} "
+                  f"pool: {frame.stats.pool_free}/{frame.stats.pool_size}")
+            print()
+            headers = ["ring", "state", "in_flight", "write_pos", "dropped", "lost", "rate"]
+            rows = [
+                [
+                    str(r.index), r.state, str(r.in_flight), str(r.write_pos),
+                    str(r.dropped_count), str(r.lost_count),
+                    _format_rate(rate),
+                ]
+                for r, rate in zip(frame.stats.rings, frame.rates_msg_per_sec)
+            ]
+            print(_render_table(headers, rows))
+            sys.stdout.flush()
+    except KeyboardInterrupt:
+        pass
+    return 0
+
+
+# ----------------------------------------------------------------------
+# Subcommand: schema / schema-diff
+# ----------------------------------------------------------------------
+
+
+def cmd_schema(args) -> int:
+    s = diag.schema(_resolve_shm_name(args))
+    if args.json:
+        print(_to_json(s))
+        return 0
+    if s.state == "unset":
+        print("unset (no schema published)")
+        return 1
+    if s.state == "claiming":
+        print("claiming (claim in progress — possibly wedged)")
+        return 1
+    print(f"name: {s.name}")
+    print(f"version: {s.version}")
+    print(f"identity: {_hex_or_dash(s.identity)}")
+    print(f"layout: {_hex_or_dash(s.layout)}")
+    print(f"identity_algo: {s.identity_algo}")
+    print(f"layout_algo: {s.layout_algo}")
+    print(f"flags: {s.flags}")
+    return 0
+
+
+def cmd_schema_diff(args) -> int:
+    d = diag.schema_diff(args.shm_a, args.shm_b)
+    if args.json:
+        print(_to_json(d))
+        if d.equal:
+            return 0
+        return 1
+    for field_name in ("identity", "layout", "version", "name", "identity_algo", "layout_algo"):
+        differs = getattr(d, field_name)
+        if differs:
+            status = "differ"
+        else:
+            status = "match"
+        print(f"{field_name:14} {status}")
+    if d.equal:
+        return 0
+    return 1
+
+
+# ----------------------------------------------------------------------
+# Entry point
+# ----------------------------------------------------------------------
+
+
+def build_parser() -> argparse.ArgumentParser:
+    p = argparse.ArgumentParser(prog="kickmsg",
+                                description="Inspect running kickmsg channels.")
+    sub = p.add_subparsers(dest="cmd", required=True)
+
+    sp = sub.add_parser("list", aliases=["ls"],
+                        help="List running topics (registry-backed)")
+    sp.add_argument("-n", "--namespace", default=_default_namespace(),
+                    help=f"Namespace to inspect. Defaults to ${_ENV_VAR} if "
+                         f"set, else \"kickmsg\".")
+    sp.add_argument("-o", "--columns", default=None,
+                    help="Comma-separated column list, or 'all'. "
+                         f"Default: {','.join(_LIST_DEFAULT_COLS)}")
+    sp.add_argument("--json", action="store_true")
+    sp.set_defaults(func=cmd_list)
+
+    sp = sub.add_parser("info", help="Static header metadata for one region")
+    _add_region_target(sp)
+    sp.add_argument("--json", action="store_true")
+    sp.set_defaults(func=cmd_info)
+
+    sp = sub.add_parser("stats", help="Runtime counter snapshot")
+    _add_region_target(sp)
+    sp.add_argument("--json", action="store_true")
+    sp.set_defaults(func=cmd_stats)
+
+    sp = sub.add_parser("diagnose", aliases=["diag"],
+                        help="Health check")
+    _add_region_target(sp)
+    sp.add_argument("--json", action="store_true")
+    sp.set_defaults(func=cmd_diagnose)
+
+    sp = sub.add_parser("repair", help="Run repair primitives")
+    _add_region_target(sp)
+    sp.add_argument("--locked", action="store_true",
+                    help="repair_locked_entries() — safe under live traffic")
+    sp.add_argument("--retired", action="store_true",
+                    help="reset_retired_rings() — only after crashed publisher confirmed gone")
+    sp.add_argument("--reclaim", action="store_true",
+                    help="reclaim_orphaned_slots() — requires full quiescence")
+    sp.add_argument("-y", "--yes", action="store_true",
+                    help="skip the confirmation prompt for --retired / --reclaim")
+    sp.set_defaults(func=cmd_repair)
+
+    sp = sub.add_parser("watch", help="top-like live stats view")
+    _add_region_target(sp)
+    sp.add_argument("-i", "--interval", type=float, default=1.0)
+    sp.set_defaults(func=cmd_watch)
+
+    sp = sub.add_parser("schema", help="Inspect the schema descriptor")
+    _add_region_target(sp)
+    sp.add_argument("--json", action="store_true")
+    sp.set_defaults(func=cmd_schema)
+
+    sp = sub.add_parser("schema-diff", aliases=["sdiff"],
+                        help="Compare schemas across two regions")
+    sp.add_argument("shm_a", type=_normalize_shm_name,
+                    help="SHM name of region A (leading '/' optional)")
+    sp.add_argument("shm_b", type=_normalize_shm_name,
+                    help="SHM name of region B (leading '/' optional)")
+    sp.add_argument("--json", action="store_true")
+    sp.set_defaults(func=cmd_schema_diff)
+
+    return p
+
+
+def main(argv: Sequence[str] | None = None) -> int:
+    parser = build_parser()
+    args = parser.parse_args(argv)
+    return args.func(args)
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
diff --git a/python/kickmsg/diagnostics.py b/python/kickmsg/diagnostics.py
new file mode 100644
index 0000000..247003b
--- /dev/null
+++ b/python/kickmsg/diagnostics.py
@@ -0,0 +1,447 @@
+"""Diagnostic API for kickmsg.
+
+Typed dataclass wrappers around the native bindings, intended for both
+the `kickmsg` CLI and third-party code (GUIs, exporters) that wants to
+inspect running channels without shelling out.
+"""
+
+from __future__ import annotations
+
+import time
+from dataclasses import dataclass, field
+from typing import Iterator
+
+from . import _native
+
+
+# ----------------------------------------------------------------------
+# Result dataclasses
+# ----------------------------------------------------------------------
+
+
+@dataclass(frozen=True)
+class Creator:
+    pid: int
+    name: str
+    created_at_ns: int
+
+
+@dataclass(frozen=True)
+class Geometry:
+    max_subscribers: int
+    sub_ring_capacity: int
+    pool_size: int
+    max_payload_size: int
+    commit_timeout_us: int
+    total_size: int
+
+
+@dataclass(frozen=True)
+class SchemaSnapshot:
+    """Result of `schema(shm_name)`.
+
+    `state` is one of "unset" / "claiming" / "set". `info` is populated
+    only when state == "set".
+    """
+    state: str
+    name: str | None = None
+    version: int | None = None
+    identity: bytes | None = None
+    layout: bytes | None = None
+    identity_algo: int | None = None
+    layout_algo: int | None = None
+    flags: int | None = None
+
+
+@dataclass(frozen=True)
+class SchemaDiff:
+    """Bit-decomposed schema::diff() result."""
+    equal: bool
+    identity: bool
+    layout: bool
+    version: bool
+    name: bool
+    identity_algo: bool
+    layout_algo: bool
+
+
+@dataclass(frozen=True)
+class RegionInfo:
+    """Static (non-runtime) snapshot of a SharedRegion header."""
+    shm_name: str
+    channel_type: str  # "pubsub" | "broadcast"
+    version: int
+    config_hash: int
+    geometry: Geometry
+    creator: Creator
+    schema: SchemaSnapshot
+
+
+@dataclass(frozen=True)
+class RingStats:
+    index: int
+    state: str  # "free" | "live" | "draining"
+    in_flight: int
+    write_pos: int
+    dropped_count: int
+    lost_count: int
+
+
+@dataclass(frozen=True)
+class RegionStats:
+    """Runtime counter snapshot — cheap, safe under live traffic."""
+    shm_name: str
+    rings: list[RingStats]
+    total_writes: int
+    total_drops: int
+    total_losses: int
+    live_rings: int
+    pool_free: int
+    pool_size: int
+
+
+@dataclass(frozen=True)
+class HealthSnapshot:
+    """Interpretation of `SharedRegion::diagnose()`."""
+    shm_name: str
+    locked_entries: int
+    retired_rings: int
+    draining_rings: int
+    live_rings: int
+    schema_stuck: bool
+    status: str  # "healthy" | "crash residue" | "schema wedged"
+
+
+@dataclass(frozen=True)
+class Participant:
+    pid: int
+    pid_starttime: int  # boot-relative (Linux), 0 elsewhere
+    node_name: str
+    topic_name: str  # user-facing logical path
+    shm_name: str  # POSIX SHM path (implementation detail)
+    kind: str  # "pubsub" | "broadcast" | "mailbox"
+    channel_type: str  # "pubsub" | "broadcast"
+    role: str  # "publisher" | "subscriber" | "both"
+    created_at_ns: int
+    # Liveness is carried by which list the entry was in on TopicSummary
+    # (producers vs stall_producers). Callers needing a fresh probe use
+    # `kickmsg.process_exists(p.pid)`.
+
+
+@dataclass(frozen=True)
+class TopicSummary:
+    topic_name: str  # user-facing logical path
+    shm_name: str  # POSIX SHM path
+    kind: str  # "pubsub" | "broadcast" | "mailbox"
+    channel_type: str
+    kmsg_namespace: str
+    age_seconds: float | None
+    producers: list[Participant] = field(default_factory=list)
+    consumers: list[Participant] = field(default_factory=list)
+    stall_producers: list[Participant] = field(default_factory=list)
+    stall_consumers: list[Participant] = field(default_factory=list)
+    schema_name: str | None = None
+    schema_version: int | None = None
+
+
+@dataclass(frozen=True)
+class WatchSnapshot:
+    """One frame of `watch()` output. Rates are msg/s deltas since the
+    previous snapshot; zero on the first frame."""
+    stats: RegionStats
+    rates_msg_per_sec: list[float]
+
+
+# ----------------------------------------------------------------------
+# Internal helpers
+# ----------------------------------------------------------------------
+
+
+_CHANNEL_NAME = {
+    _native.ChannelType.PubSub.value: "pubsub",
+    _native.ChannelType.Broadcast.value: "broadcast",
+}
+
+_ROLE_NAME = {
+    _native.Role.Publisher.value: "publisher",
+    _native.Role.Subscriber.value: "subscriber",
+    _native.Role.Both.value: "both",
+}
+
+_KIND_NAME = {
+    _native.Kind.Pubsub.value: "pubsub",
+    _native.Kind.Broadcast.value: "broadcast",
+    _native.Kind.Mailbox.value: "mailbox",
+}
+
+_RING_STATE = {0: "free", 1: "live", 2: "draining"}
+
+
+def _schema_snapshot(region: _native.SharedRegion) -> SchemaSnapshot:
+    # `SharedRegion.schema()` returns None when the slot is Unset or
+    # Claiming — the native API collapses both. We fall back to
+    # diagnose().schema_stuck to split "never set" from "mid-claim wedge".
+    info = region.schema()
+    if info is not None:
+        return SchemaSnapshot(
+            state="set",
+            name=info.name,
+            version=info.version,
+            identity=info.identity,
+            layout=info.layout,
+            identity_algo=info.identity_algo,
+            layout_algo=info.layout_algo,
+            flags=info.flags,
+        )
+    if region.diagnose().schema_stuck:
+        return SchemaSnapshot(state="claiming")
+    return SchemaSnapshot(state="unset")
+
+
+def _open(shm_name: str) -> _native.SharedRegion:
+    return _native.SharedRegion.open(shm_name)
+
+
+# ----------------------------------------------------------------------
+# Public API
+# ----------------------------------------------------------------------
+
+
+def info(shm_name: str) -> RegionInfo:
+    """Static header metadata for a region."""
+    region = _open(shm_name)
+    header = region.info()
+    return RegionInfo(
+        shm_name=header.shm_name,
+        channel_type=_CHANNEL_NAME.get(header.channel_type.value, "?"),
+        version=header.version,
+        config_hash=header.config_hash,
+        geometry=Geometry(
+            max_subscribers=header.max_subs,
+            sub_ring_capacity=header.sub_ring_capacity,
+            pool_size=header.pool_size,
+            max_payload_size=header.max_payload_size,
+            commit_timeout_us=header.commit_timeout_us,
+            total_size=header.total_size,
+        ),
+        creator=Creator(
+            pid=header.creator_pid,
+            name=header.creator_name,
+            created_at_ns=header.created_at_ns,
+        ),
+        schema=_schema_snapshot(region),
+    )
+
+
+def stats(shm_name: str) -> RegionStats:
+    """Runtime counter snapshot."""
+    region = _open(shm_name)
+    s = region.stats()
+    rings = [
+        RingStats(
+            index=i,
+            state=_RING_STATE.get(r.state, "?"),
+            in_flight=r.in_flight,
+            write_pos=r.write_pos,
+            dropped_count=r.dropped_count,
+            lost_count=r.lost_count,
+        )
+        for i, r in enumerate(s.rings)
+    ]
+    return RegionStats(
+        shm_name=region.name,
+        rings=rings,
+        total_writes=s.total_writes,
+        total_drops=s.total_drops,
+        total_losses=s.total_losses,
+        live_rings=s.live_rings,
+        pool_free=s.pool_free,
+        pool_size=s.pool_size,
+    )
+
+
+def diagnose(shm_name: str) -> HealthSnapshot:
+    """Wraps SharedRegion::diagnose() with an interpretation."""
+    region = _open(shm_name)
+    r = region.diagnose()
+
+    if r.locked_entries > 0 or r.retired_rings > 0:
+        status = "crash residue"
+    elif r.schema_stuck:
+        status = "schema wedged"
+    else:
+        status = "healthy"
+
+    return HealthSnapshot(
+        shm_name=region.name,
+        locked_entries=r.locked_entries,
+        retired_rings=r.retired_rings,
+        draining_rings=r.draining_rings,
+        live_rings=r.live_rings,
+        schema_stuck=r.schema_stuck,
+        status=status,
+    )
+
+
+def watch(shm_name: str, interval: float = 1.0) -> Iterator[WatchSnapshot]:
+    """Generator yielding stats snapshots every `interval` seconds.
+
+    First frame has zero rates. Caller drives the loop and breaks when
+    done. Cooperates with Ctrl-C via normal generator semantics.
+    """
+    region = _open(shm_name)
+    prev: list[int] | None = None
+    prev_t = 0.0
+    while True:
+        now = time.monotonic()
+        raw = region.stats()
+        if prev is None or now <= prev_t:
+            rates = [0.0] * len(raw.rings)
+        else:
+            dt = now - prev_t
+            rates = [
+                max(0.0, (r.write_pos - p) / dt) for r, p in zip(raw.rings, prev)
+            ]
+
+        rings = [
+            RingStats(
+                index=i,
+                state=_RING_STATE.get(r.state, "?"),
+                in_flight=r.in_flight,
+                write_pos=r.write_pos,
+                dropped_count=r.dropped_count,
+                lost_count=r.lost_count,
+            )
+            for i, r in enumerate(raw.rings)
+        ]
+        snap = RegionStats(
+            shm_name=region.name,
+            rings=rings,
+            total_writes=raw.total_writes,
+            total_drops=raw.total_drops,
+            total_losses=raw.total_losses,
+            live_rings=raw.live_rings,
+            pool_free=raw.pool_free,
+            pool_size=raw.pool_size,
+        )
+        yield WatchSnapshot(stats=snap, rates_msg_per_sec=rates)
+
+        prev = [r.write_pos for r in raw.rings]
+        prev_t = now
+        time.sleep(interval)
+
+
+def schema(shm_name: str) -> SchemaSnapshot:
+    """Focused read of just the schema slot."""
+    return _schema_snapshot(_open(shm_name))
+
+
+def schema_diff(shm_a: str, shm_b: str) -> SchemaDiff:
+    """Field-by-field diff of two schemas via `schema::diff()`.
+
+    Raises ValueError if either region has no published schema (state !=
+    'set') — there's nothing meaningful to diff otherwise.
+    """
+    a = _open(shm_a).schema()
+    b = _open(shm_b).schema()
+    if a is None or b is None:
+        raise ValueError("schema_diff requires both regions to have a published schema")
+    bits = _native.schema.diff(a, b)
+    return SchemaDiff(
+        equal=(bits == _native.schema.Diff.Equal.value),
+        identity=bool(bits & _native.schema.Diff.Identity.value),
+        layout=bool(bits & _native.schema.Diff.Layout.value),
+        version=bool(bits & _native.schema.Diff.Version.value),
+        name=bool(bits & _native.schema.Diff.Name.value),
+        identity_algo=bool(bits & _native.schema.Diff.IdentityAlgo.value),
+        layout_algo=bool(bits & _native.schema.Diff.LayoutAlgo.value),
+    )
+
+
+def repair_locked(shm_name: str) -> int:
+    """Commit entries stuck at LOCKED_SEQUENCE. Safe under live traffic."""
+    return _open(shm_name).repair_locked_entries()
+
+
+def reset_retired(shm_name: str) -> int:
+    """Reset retired rings. Only safe after confirming the crashed
+    publisher is gone."""
+    return _open(shm_name).reset_retired_rings()
+
+
+def reclaim_orphaned(shm_name: str) -> int:
+    """Reclaim orphaned pool slots. Requires full quiescence."""
+    return _open(shm_name).reclaim_orphaned_slots()
+
+
+# ----------------------------------------------------------------------
+# Topic-centric discovery
+# ----------------------------------------------------------------------
+
+
+def _to_participant(p: "_native.Participant") -> Participant:
+    return Participant(
+        pid=p.pid,
+        pid_starttime=p.pid_starttime,
+        node_name=p.node_name,
+        topic_name=p.topic_name,
+        shm_name=p.shm_name,
+        kind=_KIND_NAME.get(p.kind, "?"),
+        channel_type=_CHANNEL_NAME.get(p.channel_type, "?"),
+        role=_ROLE_NAME.get(p.role, "?"),
+        created_at_ns=p.created_at_ns,
+    )
+
+
+def list_topics(kmsg_namespace: str = "kickmsg") -> list[TopicSummary]:
+    """Topic-centric enumeration grouped from the registry.
+
+    Thin wrapper over `Registry::list_topics()` — the aggregation and
+    liveness classification happen in C++. This layer adds schema
+    enrichment (name + version from the region header, when published)
+    and converts the native dataclasses into Python ones.
+    """
+    registry = _native.Registry.try_open(kmsg_namespace)
+    if registry is None:
+        return []
+
+    now_ns = time.time_ns()
+    out: list[TopicSummary] = []
+    for native_t in registry.list_topics():
+        schema_name = None
+        schema_version = None
+        try:
+            region = _native.SharedRegion.open(native_t.shm_name)
+            s = _schema_snapshot(region)
+            if s.state == "set":
+                schema_name = s.name
+                schema_version = s.version
+        except RuntimeError:
+            # Region may have been unlinked between snapshot and open —
+            # skip schema enrichment, keep the row.
+            pass
+
+        # Topic age = now - earliest participant registration.
+        all_parts = (list(native_t.producers) + list(native_t.consumers)
+                     + list(native_t.stall_producers) + list(native_t.stall_consumers))
+        age_seconds = None
+        if all_parts:
+            earliest_ns = min(p.created_at_ns for p in all_parts)
+            if earliest_ns > 0:
+                age_seconds = max(0.0, (now_ns - earliest_ns) / 1e9)
+
+        out.append(TopicSummary(
+            topic_name=native_t.topic_name,
+            shm_name=native_t.shm_name,
+            kind=_KIND_NAME.get(native_t.kind, "?"),
+            channel_type=_CHANNEL_NAME.get(native_t.channel_type, "?"),
+            kmsg_namespace=kmsg_namespace,
+            age_seconds=age_seconds,
+            producers=[_to_participant(p) for p in native_t.producers],
+            consumers=[_to_participant(p) for p in native_t.consumers],
+            stall_producers=[_to_participant(p) for p in native_t.stall_producers],
+            stall_consumers=[_to_participant(p) for p in native_t.stall_consumers],
+            schema_name=schema_name,
+            schema_version=schema_version,
+        ))
+    return out
diff --git a/python/kickmsg/py.typed b/python/kickmsg/py.typed
new file mode 100644
index 0000000..e69de29
diff --git a/src/Node.cc b/src/Node.cc
index cec7dca..df7d6e9 100644
--- a/src/Node.cc
+++ b/src/Node.cc
@@ -1,75 +1,225 @@
 #include "kickmsg/Node.h"

+#include
+
 #include "kickmsg/Naming.h"

 namespace kickmsg
 {
-    Node::Node(std::string const& name, std::string const& prefix)
+    Node::Node(std::string const& name, std::string const& kmsg_namespace)
         : name_{sanitize_shm_component(name, "node")}
-        , prefix_{sanitize_shm_component(prefix, "namespace")}
+        , namespace_{sanitize_shm_component(kmsg_namespace, "namespace")}
+    {
+    }
+
+    Node::~Node()
+    {
+        if (registry_.has_value())
+        {
+            for (auto const& [_, rs] : registry_slots_)
+            {
+                registry_->deregister(rs.slot_index);
+            }
+        }
+        registry_slots_.clear();
+    }
+
+    Registry& Node::lazy_registry()
+    {
+        if (not registry_.has_value())
+        {
+            registry_.emplace(Registry::open_or_create(namespace_));
+        }
+        return *registry_;
+    }
+
+    namespace
+    {
+        /// Ensure the logical name starts with '/' for ROS-style display.
+        std::string with_leading_slash(std::string s)
+        {
+            if (s.empty() or s.front() != '/')
+            {
+                s.insert(s.begin(), '/');
+            }
+            return s;
+        }
+
+        /// Mailbox logical path: owner is part of the identity, so callers
+        /// (both sender and recipient) see the same "/owner/tag" topic.
+        std::string mailbox_topic(char const* owner, char const* tag)
+        {
+            std::string out = "/";
+            out += owner;
+            out += '/';
+            out += tag;
+            return out;
+        }
+    }
+
+    void Node::touch_registry(std::string const& shm_name,
+                              std::string const& topic_name,
+                              channel::Type channel_type,
+                              registry::Kind kind,
+                              registry::Role role)
     {
+        if (registry_disabled_)
+        {
+            return;
+        }
+        auto warn_full = [&]()
+        {
+            if (registry_full_warned_)
+            {
+                return;
+            }
+            std::fprintf(stderr,
+                         "kickmsg: registry for namespace '%s' is full; "
+                         "participant '%s' on '%s' will not appear in discovery "
+                         "(further registry-full events suppressed on this Node)\n",
+                         namespace_.c_str(), name_.c_str(), shm_name.c_str());
+            registry_full_warned_ = true;
+        };
+        try
+        {
+            auto& reg = lazy_registry();
+            auto it = registry_slots_.find(shm_name);
+            if (it != registry_slots_.end())
+            {
+                if (it->second.role != role and it->second.role != registry::Both)
+                {
+                    // Upgrade to Both via dereg + re-register; brief
+                    // visibility gap during the swap is acceptable since
+                    // the registry is diagnostic-only. On fill-failure
+                    // of the Both re-register, fall back to re-registering
+                    // the original role to keep at least partial discovery.
+                    reg.deregister(it->second.slot_index);
+                    uint32_t slot = reg.register_participant(
+                        shm_name, topic_name, channel_type, kind,
+                        registry::Both, name_);
+                    if (slot == INVALID_SLOT)
+                    {
+                        registry::Role prior = it->second.role;
+                        uint32_t fallback = reg.register_participant(
+                            shm_name, topic_name, channel_type, kind,
+                            prior, name_);
+                        if (fallback == INVALID_SLOT)
+                        {
+                            warn_full();
+                            registry_slots_.erase(it);
+                            return;
+                        }
+                        it->second = RegistrySlot{fallback, prior};
+                        return;
+                    }
+                    it->second = RegistrySlot{slot, registry::Both};
+                }
+                return;
+            }
+
+            uint32_t slot = reg.register_participant(
+                shm_name, topic_name, channel_type, kind, role, name_);
+            if (slot == INVALID_SLOT)
+            {
+                warn_full();
+                return;
+            }
+            registry_slots_[shm_name] = RegistrySlot{slot, role};
+        }
+        catch (std::exception const& e)
+        {
+            // Latch to avoid stderr spam on a Node that brings up many topics.
+            std::fprintf(stderr,
+                         "kickmsg: registry unavailable for namespace '%s': %s "
+                         "(further registry failures will be silent on this Node)\n",
+                         namespace_.c_str(), e.what());
+            registry_disabled_ = true;
+        }
     }

     Publisher Node::advertise(char const* topic, channel::Config const& cfg)
     {
-        auto shm_name = make_topic_name(topic);
+        auto shm_name   = make_topic_name(topic);
+        auto topic_path = with_leading_slash(topic);
         auto [it, _] = regions_.emplace(
             shm_name, SharedRegion::create(shm_name.c_str(), channel::PubSub,
                                            cfg, name_.c_str()));
+        touch_registry(shm_name, topic_path, channel::PubSub,
+                       registry::Pubsub, registry::Publisher);
         return Publisher(it->second);
     }

     Subscriber Node::subscribe(char const* topic)
     {
-        auto shm_name = make_topic_name(topic);
+        auto shm_name   = make_topic_name(topic);
+        auto topic_path = with_leading_slash(topic);
         if (auto* r = find_region(shm_name))
         {
+            touch_registry(shm_name, topic_path, channel::PubSub,
+                           registry::Pubsub, registry::Subscriber);
             return Subscriber(*r);
         }
         auto [it, _] = regions_.emplace(
             shm_name, SharedRegion::open(shm_name.c_str()));
+        touch_registry(shm_name, topic_path, channel::PubSub,
+                       registry::Pubsub, registry::Subscriber);
         return Subscriber(it->second);
     }

     Publisher Node::advertise_or_join(char const* topic, channel::Config const& cfg)
     {
-        auto shm_name = make_topic_name(topic);
+        auto shm_name   = make_topic_name(topic);
+        auto topic_path = with_leading_slash(topic);
         if (auto* r = find_region(shm_name))
         {
+            touch_registry(shm_name, topic_path, channel::PubSub,
+                           registry::Pubsub, registry::Publisher);
             return Publisher(*r);
         }
         auto [it, _] = regions_.emplace(
             shm_name, SharedRegion::create_or_open(
                 shm_name.c_str(), channel::PubSub, cfg, name_.c_str()));
+        touch_registry(shm_name, topic_path, channel::PubSub,
+                       registry::Pubsub, registry::Publisher);
         return Publisher(it->second);
     }

     Subscriber Node::subscribe_or_create(char const* topic, channel::Config const& cfg)
     {
-        auto shm_name = make_topic_name(topic);
+        auto shm_name   = make_topic_name(topic);
+        auto topic_path = with_leading_slash(topic);
         if (auto* r = find_region(shm_name))
         {
+            touch_registry(shm_name, topic_path, channel::PubSub,
+                           registry::Pubsub, registry::Subscriber);
             return Subscriber(*r);
         }
         auto [it, _] = regions_.emplace(
             shm_name, SharedRegion::create_or_open(
                 shm_name.c_str(), channel::PubSub, cfg, name_.c_str()));
+        touch_registry(shm_name, topic_path, channel::PubSub,
+                       registry::Pubsub, registry::Subscriber);
         return Subscriber(it->second);
     }

     BroadcastHandle Node::join_broadcast(char const* channel, channel::Config const& cfg)
     {
-        auto shm_name = make_broadcast_name(channel);
+        auto shm_name   = make_broadcast_name(channel);
+        auto topic_path = with_leading_slash(channel);
         if (auto* r = find_region(shm_name))
         {
+            touch_registry(shm_name, topic_path, channel::Broadcast,
+                           registry::Broadcast, registry::Both);
             return BroadcastHandle{Publisher{*r}, Subscriber{*r}};
         }
         auto [it, _] = regions_.emplace(
             shm_name, SharedRegion::create_or_open(
                 shm_name.c_str(), channel::Broadcast, cfg, name_.c_str()));
+        touch_registry(shm_name, topic_path, channel::Broadcast,
+                       registry::Broadcast, registry::Both);
         return BroadcastHandle{Publisher{it->second}, Subscriber{it->second}};
     }

@@ -77,22 +227,32 @@ namespace kickmsg
     {
         channel::Config mbx_cfg = cfg;
         mbx_cfg.max_subscribers = 1;
-        auto shm_name = make_mailbox_name(name_.c_str(), tag);
+        auto shm_name   = make_mailbox_name(name_.c_str(), tag);
+        auto topic_path = mailbox_topic(name_.c_str(), tag);
         auto [it, _] = regions_.emplace(
             shm_name, SharedRegion::create(shm_name.c_str(), channel::PubSub,
                                            mbx_cfg, name_.c_str()));
+        // Mailbox owner is the one who receives — Subscriber role.
+        touch_registry(shm_name, topic_path, channel::PubSub,
+                       registry::Mailbox, registry::Subscriber);
         return Subscriber(it->second);
     }

     Publisher Node::open_mailbox(char const* owner_node, char const* tag)
     {
-        auto shm_name = make_mailbox_name(owner_node, tag);
+        auto shm_name   = make_mailbox_name(owner_node, tag);
+        auto topic_path = mailbox_topic(owner_node, tag);
         if (auto* r = find_region(shm_name))
         {
+            touch_registry(shm_name, topic_path, channel::PubSub,
+                           registry::Mailbox, registry::Publisher);
             return Publisher(*r);
         }
         auto [it, _] = regions_.emplace(
             shm_name, SharedRegion::open(shm_name.c_str()));
+        // Mailbox sender is the Publisher side.
+        touch_registry(shm_name, topic_path, channel::PubSub,
+                       registry::Mailbox, registry::Publisher);
         return Publisher(it->second);
     }

@@ -138,20 +298,20 @@ namespace kickmsg

     std::string Node::make_topic_name(char const* topic) const
     {
-        // prefix_ is pre-sanitized in the ctor; topic is user-supplied on
+        // namespace_ is pre-sanitized in the ctor; topic is user-supplied on
         // each call and may be a ROS-style "/a/b/c" path.
-        return "/" + prefix_ + "_" + sanitize_shm_component(topic, "topic");
+        return "/" + namespace_ + "_" + sanitize_shm_component(topic, "topic");
     }

     std::string Node::make_broadcast_name(char const* channel) const
     {
-        return "/" + prefix_ + "_broadcast_"
+        return "/" + namespace_ + "_broadcast_"
             + sanitize_shm_component(channel, "channel");
     }

     std::string Node::make_mailbox_name(char const* owner, char const* tag) const
     {
-        return "/" + prefix_ + "_"
+        return "/" + namespace_ + "_"
             + sanitize_shm_component(owner, "mailbox owner") + "_mbx_"
             + sanitize_shm_component(tag, "mailbox tag");
     }
diff --git a/src/Publisher.cc b/src/Publisher.cc
index 19f1102..ef14fe8 100644
--- a/src/Publisher.cc
+++ b/src/Publisher.cc
@@ -185,6 +185,7 @@ namespace kickmsg
             }

             ++dropped_;
+            ring->dropped_count.fetch_add(1, std::memory_order_relaxed);
             ++excess;
             ring->state_flight.fetch_sub(ring::IN_FLIGHT_ONE,
                                          std::memory_order_release);
@@ -263,7 +264,7 @@ namespace kickmsg
                                 microseconds timeout)
     {
         constexpr int CHECK_INTERVAL = 1024;
-        nanoseconds start = kickmsg::since_epoch();
+        nanoseconds start = kickmsg::monotonic_ns();
         int i = 0;

         while (true)
diff --git a/src/Region.cc b/src/Region.cc
index f8afb02..5e25e4c 100644
--- a/src/Region.cc
+++ b/src/Region.cc
@@ -2,14 +2,9 @@
 #include
 #include
 #include
-#ifdef _WIN32
-#include
-#define getpid _getpid
-#else
-#include
-#endif

 #include "kickmsg/Region.h"
+#include "kickmsg/os/Process.h"
 #include "kickmsg/os/Time.h"

 namespace kickmsg
@@ -89,7 +84,7 @@ namespace kickmsg
         h->sub_ring_stride = ring_stride;
         h->commit_timeout_us = static_cast(cfg.commit_timeout.count());
         h->config_hash = compute_config_hash(type, cfg);
-        h->creator_pid = static_cast(getpid());
+        h->creator_pid = kickmsg::current_pid();
         h->created_at_ns = static_cast(kickmsg::since_epoch().count());
         h->creator_name_len = creator_len;
         std::memcpy(header_creator_name(h), creator_name, creator_len);
@@ -120,9 +115,11 @@ namespace kickmsg
         for (uint32_t i = 0; i < cfg.max_subscribers; ++i)
         {
             auto* ring = sub_ring_at(base(), h, i);
-            ring->state_flight = ring::make_packed(ring::Free);
-            ring->write_pos = 0;
-            ring->has_waiter = 0;
+            ring->state_flight  = ring::make_packed(ring::Free);
+            ring->write_pos     = 0;
+            ring->has_waiter    = 0;
+            ring->dropped_count = 0;
+            ring->lost_count    = 0;
         }

         // Write magic LAST with release: create_or_open() polls magic with
@@ -449,6 +446,94 @@ namespace kickmsg
                                   std::memory_order_relaxed);
     }

+    RegionStats SharedRegion::stats() const
+    {
+        auto const* b = base();
+        auto const* h = header();
+
+        RegionStats out{};
+        out.pool_size = h->pool_size;
+        out.rings.reserve(h->max_subs);
+
+        for (uint64_t i = 0; i < h->max_subs; ++i)
+        {
+            // sub_ring_at needs a non-const base*/header*, but the operation
+            // is read-only — const_cast is safe here.
+            auto* ring = sub_ring_at(const_cast(b),
+                                     h, static_cast(i));
+            uint32_t packed = ring->state_flight.load(std::memory_order_acquire);
+
+            RingStats rs{};
+            rs.state = static_cast(ring::get_state(packed));
+            rs.in_flight = ring::get_in_flight(packed);
+            rs.write_pos = ring->write_pos.load(std::memory_order_acquire);
+            rs.dropped_count = ring->dropped_count.load(std::memory_order_relaxed);
+            rs.lost_count = ring->lost_count.load(std::memory_order_relaxed);
+
+            if (rs.state == ring::Live)
+            {
+                ++out.live_rings;
+            }
+            // Max across ALL rings: a Free ring's write_pos is frozen at
+            // whatever value it had when its last subscriber left, so it's
+            // a valid past observation. Using max (not sum) matches the
+            // "publish events observed by the channel" semantic and stays
+            // monotonic across subscriber churn.
+            if (rs.write_pos > out.total_writes)
+            {
+                out.total_writes = rs.write_pos;
+            }
+            out.total_drops += rs.dropped_count;
+            out.total_losses += rs.lost_count;
+
+            out.rings.push_back(rs);
+        }
+
+        // Approximate free-slot count: walk the Treiber stack from the head,
+        // bounded by pool_size so a concurrent push/pop storm can't fool us
+        // into an unbounded loop. Under churn we can undercount (a slot
+        // being popped mid-walk) or overcount (a slot's next_free pointing
+        // to a just-pushed node we've already counted) — acceptable for a
+        // diagnostic view.
+        uint64_t top = h->free_top.load(std::memory_order_acquire);
+        uint32_t idx = tagged_idx(top);
+        uint64_t count = 0;
+        uint64_t const limit = h->pool_size;
+        while (idx != INVALID_SLOT and count < limit)
+        {
+            if (idx >= h->pool_size) break;
+            auto* slot = slot_at(const_cast(b), h, idx);
+            idx = slot->next_free.load(std::memory_order_relaxed);
+            ++count;
+        }
+        out.pool_free = count;
+
+        return out;
+    }
+
+    RegionInfo SharedRegion::info() const
+    {
+        auto const* h = header();
+        RegionInfo out{};
+        out.shm_name = name_;
+        out.channel_type = h->channel_type;
+        out.version = h->version;
+        out.config_hash = h->config_hash;
+        out.total_size = h->total_size;
+        out.max_subs = h->max_subs;
+        out.sub_ring_capacity = h->sub_ring_capacity;
+        out.pool_size = h->pool_size;
+        out.max_payload_size = h->slot_data_size;
+        out.commit_timeout_us = h->commit_timeout_us;
+        out.creator_pid = h->creator_pid;
+        out.created_at_ns = h->created_at_ns;
+
+        // Creator name tail: bytes written at offset sizeof(Header).
+ auto const* tail = static_cast(base()) + sizeof(Header); + out.creator_name.assign(tail, h->creator_name_len); + return out; + } + std::size_t SharedRegion::reclaim_orphaned_slots() { auto* b = base(); diff --git a/src/Registry.cc b/src/Registry.cc new file mode 100644 index 0000000..4425bf0 --- /dev/null +++ b/src/Registry.cc @@ -0,0 +1,434 @@ +#include "kickmsg/Registry.h" + +#include +#include +#include +#include + +#include "kickmsg/Naming.h" +#include "kickmsg/os/Process.h" +#include "kickmsg/os/Time.h" + +namespace kickmsg +{ + std::size_t Registry::region_size(uint32_t capacity) + { + return sizeof(RegistryHeader) + + static_cast(capacity) * sizeof(ParticipantEntry); + } + + std::string Registry::make_shm_name(std::string const& kmsg_namespace) + { + return "/" + sanitize_shm_component(kmsg_namespace, "namespace") + + "_registry"; + } + + RegistryHeader* Registry::header() + { + return static_cast(shm_.address()); + } + + RegistryHeader const* Registry::header() const + { + return static_cast(shm_.address()); + } + + ParticipantEntry* Registry::entries() + { + return reinterpret_cast( + static_cast(shm_.address()) + sizeof(RegistryHeader)); + } + + ParticipantEntry const* Registry::entries() const + { + return reinterpret_cast( + static_cast(shm_.address()) + sizeof(RegistryHeader)); + } + + uint32_t Registry::capacity() const + { + return header()->capacity; + } + + void Registry::init_as_creator(uint32_t capacity) + { + std::memset(shm_.address(), 0, region_size(capacity)); + + auto* h = header(); + h->version = registry::VERSION; + h->capacity = capacity; + + // MAGIC published last — readers spin on it with acquire. 
+ h->magic.store(registry::MAGIC, std::memory_order_release); + } + + std::optional Registry::spin_open(std::string const& name) + { + for (int i = 0; i < 200; ++i) + { + SharedMemory shm; + if (shm.try_open(name)) + { + auto const* h = static_cast(shm.address()); + if (h->magic.load(std::memory_order_acquire) == registry::MAGIC) + { + if (h->version != registry::VERSION) + { + throw std::runtime_error( + "Registry version mismatch on " + name); + } + Registry out; + out.name_ = name; + out.shm_ = std::move(shm); + return out; + } + } + kickmsg::sleep(10ms); + } + return std::nullopt; + } + + Registry Registry::open_or_create(std::string const& kmsg_namespace, uint32_t capacity) + { + if (capacity == 0) + { + throw std::invalid_argument("Registry capacity must be > 0"); + } + + std::string name = make_shm_name(kmsg_namespace); + std::size_t bytes = region_size(capacity); + + { + Registry r; + r.name_ = name; + if (r.shm_.try_create(name, bytes)) + { + r.init_as_creator(capacity); + return r; + } + } + + auto opened = spin_open(name); + if (opened.has_value()) + { + return std::move(*opened); + } + throw std::runtime_error("Timed out waiting for registry init: " + name); + } + + std::optional Registry::try_open(std::string const& kmsg_namespace) + { + std::string name = make_shm_name(kmsg_namespace); + SharedMemory probe; + if (not probe.try_open(name)) + { + return std::nullopt; + } + return spin_open(name); + } + + void Registry::unlink(std::string const& kmsg_namespace) + { + SharedMemory::unlink(make_shm_name(kmsg_namespace)); + } + + uint32_t Registry::register_participant(std::string const& shm_name, + std::string const& topic_name, + channel::Type channel_type, + registry::Kind kind, + registry::Role role, + std::string const& node_name) + { + auto try_claim = [&]() -> uint32_t + { + auto* h = header(); + auto* es = entries(); + uint32_t cap = h->capacity; + + auto copy_field = [](char* dst, std::size_t dst_size, + std::string const& src) + { + 
std::memset(dst, 0, dst_size); + std::size_t n = std::min(src.size(), dst_size - 1); + std::memcpy(dst, src.data(), n); + }; + + uint64_t my_pid = current_pid(); + uint64_t my_starttime = process_starttime(my_pid); + uint64_t now_ns = static_cast( + kickmsg::since_epoch().count()); + + for (uint32_t i = 0; i < cap; ++i) + { + uint32_t expected = registry::Free; + if (not es[i].state.compare_exchange_strong( + expected, registry::Claiming, + std::memory_order_acq_rel, + std::memory_order_relaxed)) + { + continue; + } + + // pid_starttime must be written before pid's release-store + // so a sweeper's acquire-load of pid sees a matching + // starttime. + es[i].pid_starttime.store(my_starttime, std::memory_order_relaxed); + es[i].pid.store(my_pid, std::memory_order_release); + + es[i].generation.fetch_add(1, std::memory_order_relaxed); + + es[i].channel_type.store(static_cast(channel_type), + std::memory_order_relaxed); + es[i].role.store(static_cast(role), + std::memory_order_relaxed); + es[i].kind.store(static_cast(kind), + std::memory_order_relaxed); + es[i].created_at_ns.store(now_ns, std::memory_order_relaxed); + + copy_field(es[i].shm_name, sizeof(es[i].shm_name), shm_name); + copy_field(es[i].topic_name, sizeof(es[i].topic_name), topic_name); + copy_field(es[i].node_name, sizeof(es[i].node_name), node_name); + std::memset(es[i]._padding, 0, sizeof(es[i]._padding)); + + es[i].state.store(registry::Active, std::memory_order_release); + return i; + } + return INVALID_SLOT; + }; + + uint32_t slot = try_claim(); + if (slot != INVALID_SLOT) + { + return slot; + } + // Registry full — sweep dead-pid residue and retry. Bounded to + // avoid livelock when many registrants race on a full registry. 
+ for (int attempt = 0; attempt < 3; ++attempt) + { + if (sweep_stale() == 0) + { + break; + } + slot = try_claim(); + if (slot != INVALID_SLOT) + { + return slot; + } + } + return INVALID_SLOT; + } + + void Registry::deregister(uint32_t slot_index) + { + if (slot_index == INVALID_SLOT) + { + return; + } + auto* h = header(); + auto* es = entries(); + if (slot_index >= h->capacity) + { + return; + } + // Fields are intentionally not zeroed: a concurrent snapshot may + // still be reading them, and partial zeroing before state=Free + // would create torn reads that the seqlock can't catch. The + // next claim overwrites every field. + es[slot_index].generation.fetch_add(1, std::memory_order_relaxed); + es[slot_index].state.store(registry::Free, std::memory_order_release); + } + + std::vector Registry::snapshot() const + { + auto const* h = header(); + auto const* es = entries(); + uint32_t cap = h->capacity; + + std::vector out; + out.reserve(cap); + for (uint32_t i = 0; i < cap; ++i) + { + uint32_t s1 = es[i].state.load(std::memory_order_acquire); + if (s1 != registry::Active) + { + continue; + } + uint32_t g1 = es[i].generation.load(std::memory_order_acquire); + + Participant p{}; + p.pid = es[i].pid.load(std::memory_order_relaxed); + p.pid_starttime = es[i].pid_starttime.load(std::memory_order_relaxed); + p.created_at_ns = es[i].created_at_ns.load(std::memory_order_relaxed); + p.channel_type = es[i].channel_type.load(std::memory_order_relaxed); + p.role = es[i].role.load(std::memory_order_relaxed); + p.kind = es[i].kind.load(std::memory_order_relaxed); + p.shm_name.assign( + es[i].shm_name, + ::strnlen(es[i].shm_name, sizeof(es[i].shm_name))); + p.topic_name.assign( + es[i].topic_name, + ::strnlen(es[i].topic_name, sizeof(es[i].topic_name))); + p.node_name.assign( + es[i].node_name, + ::strnlen(es[i].node_name, sizeof(es[i].node_name))); + + // Seqlock recheck: reject if state changed or generation bumped. 
+ uint32_t g2 = es[i].generation.load(std::memory_order_acquire); + uint32_t s2 = es[i].state.load(std::memory_order_acquire); + if (s2 != registry::Active or g1 != g2) + { + continue; + } + out.push_back(std::move(p)); + } + return out; + } + + std::vector Registry::list_topics() const + { + auto raw = snapshot(); + + std::unordered_map by_shm; + by_shm.reserve(raw.size()); + + for (auto const& p : raw) + { + auto [iter, inserted] = by_shm.try_emplace(p.shm_name); + auto& sum = iter->second; + if (inserted) + { + sum.shm_name = p.shm_name; + sum.topic_name = p.topic_name; + sum.channel_type = p.channel_type; + sum.kind = p.kind; + } + + bool alive = process_exists(p.pid); + bool is_pub = (p.role == registry::Publisher + or p.role == registry::Both); + bool is_sub = (p.role == registry::Subscriber + or p.role == registry::Both); + + if (is_pub) + { + if (alive) + { + sum.producers.push_back(p); + } + else + { + sum.stall_producers.push_back(p); + } + } + if (is_sub) + { + if (alive) + { + sum.consumers.push_back(p); + } + else + { + sum.stall_consumers.push_back(p); + } + } + } + + std::vector out; + out.reserve(by_shm.size()); + for (auto& [_, sum] : by_shm) + { + out.push_back(std::move(sum)); + } + std::sort(out.begin(), out.end(), + [](TopicSummary const& a, TopicSummary const& b) + { return a.shm_name < b.shm_name; }); + return out; + } + + uint32_t Registry::sweep_stale() + { + auto* h = header(); + auto* es = entries(); + uint32_t cap = h->capacity; + + // Live PID + matching boot-relative starttime = same process. + // If either starttime is 0 (platform doesn't expose it), degrade + // to trust-pid-alone. 
+ auto is_dead = [](uint64_t pid, uint64_t stored_start) -> bool + { + if (not process_exists(pid)) + { + return true; + } + uint64_t live_start = process_starttime(pid); + if (stored_start == 0 or live_start == 0) + { + return false; + } + return stored_start != live_start; + }; + + uint32_t freed = 0; + for (uint32_t i = 0; i < cap; ++i) + { + uint32_t s = es[i].state.load(std::memory_order_acquire); + if (s != registry::Active and s != registry::Claiming) + { + continue; + } + // Acquire syncs with register_participant's release-store of + // pid, so we see a valid pid even while state==Claiming. + uint64_t pid = es[i].pid.load(std::memory_order_acquire); + if (pid == 0) + { + // Registrant hasn't stored its pid yet — reclaiming here + // would race with its pending field writes. + continue; + } + uint64_t stored_start = es[i].pid_starttime.load( + std::memory_order_relaxed); + if (not is_dead(pid, stored_start)) + { + continue; + } + + // Phase 1: CAS to Reclaiming to block concurrent registrants + // and close the ABA window on a direct state→Free CAS. + uint32_t expected = s; + if (not es[i].state.compare_exchange_strong( + expected, registry::Reclaiming, + std::memory_order_acq_rel, + std::memory_order_relaxed)) + { + continue; + } + + // Re-verify under our exclusive hold. A full dereg+register + // cycle could have slipped in between our initial pid read + // and the CAS above. + uint64_t post_pid = es[i].pid.load(std::memory_order_acquire); + uint64_t post_start = es[i].pid_starttime.load( + std::memory_order_relaxed); + if (post_pid != pid or post_start != stored_start or + not is_dead(post_pid, post_start)) + { + // Restore via CAS, not blind store: a live tenant may + // have legitimately dereg'd (state==Free) and a blind + // store of `s` would resurrect the slot. 
+ uint32_t reclaiming_expected = registry::Reclaiming; + es[i].state.compare_exchange_strong( + reclaiming_expected, s, + std::memory_order_release, + std::memory_order_relaxed); + continue; + } + + // Phase 2: finalize. Fields are left untouched — same + // reasoning as deregister(). + es[i].generation.fetch_add(1, std::memory_order_relaxed); + es[i].state.store(registry::Free, std::memory_order_release); + ++freed; + } + return freed; + } +} diff --git a/src/Subscriber.cc b/src/Subscriber.cc index 076b3d2..c901bf6 100644 --- a/src/Subscriber.cc +++ b/src/Subscriber.cc @@ -71,7 +71,7 @@ namespace kickmsg // Wait for all admitted publishers to finish. bool quiesced = true; microseconds deadline{header_->commit_timeout_us}; - nanoseconds start = kickmsg::since_epoch(); + nanoseconds start = kickmsg::monotonic_ns(); while (ring::get_in_flight( ring->state_flight.load(std::memory_order_acquire)) > 0) { @@ -85,7 +85,7 @@ namespace kickmsg ++drain_timeouts_; break; } - kickmsg::sleep(0ns); + kickmsg::yield(); } if (quiesced) @@ -168,7 +168,9 @@ namespace kickmsg uint64_t capacity = header_->sub_ring_capacity; if (wp - read_pos_ > capacity) { - lost_ += (wp - read_pos_) - capacity; + uint64_t skipped = (wp - read_pos_) - capacity; + lost_ += skipped; + ring->lost_count.fetch_add(skipped, std::memory_order_relaxed); read_pos_ = wp - capacity; } @@ -189,6 +191,7 @@ namespace kickmsg } // Entry was overwritten (seq > expected): advance and retry. ++lost_; + ring->lost_count.fetch_add(1, std::memory_order_relaxed); ++read_pos_; continue; } @@ -199,6 +202,7 @@ namespace kickmsg if (slot_idx >= header_->pool_size or payload_len > header_->slot_data_size) { ++lost_; + ring->lost_count.fetch_add(1, std::memory_order_relaxed); ++read_pos_; continue; } @@ -224,6 +228,7 @@ namespace kickmsg { // refcount == 0: slot already freed, count as lost. 
++lost_; + ring->lost_count.fetch_add(1, std::memory_order_relaxed); ++read_pos_; continue; } @@ -240,6 +245,7 @@ namespace kickmsg treiber_push(header_->free_top, slot, slot_idx); } ++lost_; + ring->lost_count.fetch_add(1, std::memory_order_relaxed); ++read_pos_; continue; } @@ -262,7 +268,7 @@ namespace kickmsg std::optional Subscriber::receive(nanoseconds timeout) { auto* ring = sub_ring_at(base_, header_, ring_idx_); - nanoseconds start = kickmsg::since_epoch(); + nanoseconds start = kickmsg::monotonic_ns(); while (true) { @@ -308,7 +314,9 @@ namespace kickmsg uint64_t capacity = header_->sub_ring_capacity; if (wp - read_pos_ > capacity) { - lost_ += (wp - read_pos_) - capacity; + uint64_t skipped = (wp - read_pos_) - capacity; + lost_ += skipped; + ring->lost_count.fetch_add(skipped, std::memory_order_relaxed); read_pos_ = wp - capacity; } @@ -324,6 +332,7 @@ namespace kickmsg return std::nullopt; } ++lost_; + ring->lost_count.fetch_add(1, std::memory_order_relaxed); ++read_pos_; continue; } @@ -334,6 +343,7 @@ namespace kickmsg if (slot_idx >= header_->pool_size or payload_len > header_->slot_data_size) { ++lost_; + ring->lost_count.fetch_add(1, std::memory_order_relaxed); ++read_pos_; continue; } @@ -355,6 +365,7 @@ namespace kickmsg if (not pinned) { ++lost_; + ring->lost_count.fetch_add(1, std::memory_order_relaxed); ++read_pos_; continue; } @@ -370,6 +381,7 @@ namespace kickmsg treiber_push(header_->free_top, slot, slot_idx); } ++lost_; + ring->lost_count.fetch_add(1, std::memory_order_relaxed); ++read_pos_; continue; } @@ -383,7 +395,7 @@ namespace kickmsg std::optional Subscriber::receive_view(nanoseconds timeout) { auto* ring = sub_ring_at(base_, header_, ring_idx_); - nanoseconds start = kickmsg::since_epoch(); + nanoseconds start = kickmsg::monotonic_ns(); while (true) { diff --git a/src/os/darwin/Process.cc b/src/os/darwin/Process.cc new file mode 100644 index 0000000..feec902 --- /dev/null +++ b/src/os/darwin/Process.cc @@ -0,0 +1,30 @@ +#include 
"kickmsg/os/Process.h" + +#include +#include +#include + +namespace kickmsg +{ + uint64_t process_starttime(uint64_t pid) noexcept + { + if (pid == 0) + { + return 0; + } + // sysctl({CTL_KERN, KERN_PROC, KERN_PROC_PID, pid}) → kinfo_proc. + // kp_proc.p_starttime is a struct timeval set at fork time; pack + // into microseconds-since-epoch for a single uint64_t. + int mib[4] = {CTL_KERN, KERN_PROC, KERN_PROC_PID, + static_cast(pid)}; + struct kinfo_proc kp; + std::size_t len = sizeof(kp); + if (::sysctl(mib, 4, &kp, &len, nullptr, 0) != 0 or len == 0) + { + return 0; + } + auto const& tv = kp.kp_proc.p_starttime; + return static_cast(tv.tv_sec) * 1'000'000ull + + static_cast(tv.tv_usec); + } +} diff --git a/src/os/darwin/SharedMemory.cc b/src/os/darwin/SharedMemory.cc index 4c48089..5efeae4 100644 --- a/src/os/darwin/SharedMemory.cc +++ b/src/os/darwin/SharedMemory.cc @@ -1,5 +1,5 @@ -// macOS uses POSIX shared memory (same as Linux). -// shm_open / ftruncate / mmap are available on all supported macOS versions. +// macOS-specific SharedMemory::create(). Other methods live in +// src/os/posix/SharedMemory.cc. 
#include "kickmsg/os/SharedMemory.h" #include @@ -8,173 +8,30 @@ #include #include #include -#include -#include namespace kickmsg { - static void throw_system_error(char const* context) - { - throw std::system_error(errno, std::system_category(), context); - } - - SharedMemory::SharedMemory(SharedMemory&& other) noexcept - : size_{other.size_} - , address_{other.address_} - , fd_{other.fd_} - { - other.size_ = 0; - other.address_ = nullptr; - other.fd_ = INVALID_SHM_HANDLE; - } - - SharedMemory& SharedMemory::operator=(SharedMemory&& other) noexcept - { - if (this != &other) - { - close(); - size_ = other.size_; - address_ = other.address_; - fd_ = other.fd_; - other.size_ = 0; - other.address_ = nullptr; - other.fd_ = INVALID_SHM_HANDLE; - } - return *this; - } - - SharedMemory::~SharedMemory() - { - close(); - } - - void SharedMemory::map(std::size_t size) - { - if (::ftruncate(fd_, static_cast(size)) < 0) - { - ::close(fd_); - fd_ = INVALID_SHM_HANDLE; - throw_system_error("SharedMemory: ftruncate()"); - } - - address_ = ::mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd_, 0); - if (address_ == MAP_FAILED) - { - address_ = nullptr; - ::close(fd_); - fd_ = INVALID_SHM_HANDLE; - throw_system_error("SharedMemory: mmap()"); - } - - size_ = size; - } - void SharedMemory::create(std::string const& name, std::size_t size) { - // macOS has two shm_open / ftruncate quirks that the original - // `O_CREAT | O_TRUNC` Linux pattern trips over: - // 1. shm_open(O_CREAT|O_TRUNC) on an existing SHM object - // returns EINVAL — Linux accepts it, Darwin rejects it. - // 2. ftruncate() can only be called once per SHM object; a - // second call on the same object returns EINVAL. - // Unlink-then-exclusive-create sidesteps both: the subsequent - // shm_open sees a name that either didn't exist or was just - // detached, and the following ftruncate is always the first - // sizing call on a fresh object. 
+ // Darwin's shm_open(O_CREAT|O_TRUNC) returns EINVAL on an existing + // object, and ftruncate can only be called once per object. Unlink + // first, then exclusive-create, to sidestep both. // - // This function is called by SharedRegion::create() (the strict - // factory where the caller intends exclusive ownership). The - // race-prone caller SharedRegion::create_or_open() was refactored - // to NOT re-enter this function after its try_create probe — it - // stamps the header directly on the probe's mapping. + // NOTE: the unlink + exclusive-create sequence is NOT safe under + // concurrent callers (two processes could both unlink, then both + // exclusive-create different objects with the same name). This + // function is called only from SharedRegion::create(), whose + // documented contract is "caller intends exclusive ownership" + // (publisher-arrives-first ordering). The race-prone multi- + // creator path uses try_create (in posix/SharedMemory.cc), which + // is a pure O_EXCL probe with no unlink and is safe. ::shm_unlink(name.c_str()); fd_ = ::shm_open(name.c_str(), O_RDWR | O_CREAT | O_EXCL, 0666); if (fd_ < 0) { - throw_system_error("SharedMemory: shm_open(create)"); + throw std::system_error(errno, std::system_category(), + "SharedMemory: shm_open(create)"); } map(size); } - - bool SharedMemory::try_create(std::string const& name, std::size_t size) - { - // Keep the fd and do the full setup (ftruncate + mmap) inline. - // We must NOT close the fd and call create() — create() would - // shm_unlink the name (to sidestep Darwin's O_TRUNC quirk) and - // recreate a different object, racing any concurrent caller that - // observed the original name between our close and create's CAS. 
- fd_ = ::shm_open(name.c_str(), O_RDWR | O_CREAT | O_EXCL, 0666); - if (fd_ < 0) - { - if (errno == EEXIST) - { - fd_ = INVALID_SHM_HANDLE; - return false; - } - throw_system_error("SharedMemory: shm_open(try_create)"); - } - map(size); - return true; - } - - void SharedMemory::open(std::string const& name) - { - if (not try_open(name)) - { - throw_system_error("SharedMemory: shm_open(open)"); - } - } - - bool SharedMemory::try_open(std::string const& name) - { - fd_ = ::shm_open(name.c_str(), O_RDWR, 0); - if (fd_ < 0) - { - if (errno == ENOENT) - { - fd_ = INVALID_SHM_HANDLE; - return false; - } - throw_system_error("SharedMemory: shm_open(try_open)"); - } - - struct stat st{}; - if (::fstat(fd_, &st) < 0) - { - ::close(fd_); - fd_ = INVALID_SHM_HANDLE; - throw_system_error("SharedMemory: fstat()"); - } - - size_ = static_cast(st.st_size); - address_ = ::mmap(nullptr, size_, PROT_READ | PROT_WRITE, MAP_SHARED, fd_, 0); - if (address_ == MAP_FAILED) - { - address_ = nullptr; - ::close(fd_); - fd_ = INVALID_SHM_HANDLE; - throw_system_error("SharedMemory: mmap()"); - } - return true; - } - - void SharedMemory::close() - { - if (address_ != nullptr) - { - ::munmap(address_, size_); - address_ = nullptr; - } - if (fd_ != INVALID_SHM_HANDLE) - { - ::close(fd_); - fd_ = INVALID_SHM_HANDLE; - } - size_ = 0; - } - - void SharedMemory::unlink(std::string const& name) - { - ::shm_unlink(name.c_str()); - } } diff --git a/src/os/darwin/Time.cc b/src/os/darwin/Time.cc index 0f67177..d7348b4 100644 --- a/src/os/darwin/Time.cc +++ b/src/os/darwin/Time.cc @@ -1,5 +1,5 @@ -// macOS uses POSIX clock_gettime (available since macOS 10.12). -// clock_nanosleep is NOT available on macOS — use nanosleep instead. +// macOS-specific sleep(). clock_nanosleep is unavailable; nanosleep is the +// POSIX fallback. Other Time entry points live in src/os/posix/Time.cc. 
#include "kickmsg/os/Time.h" #include @@ -18,7 +18,7 @@ namespace kickmsg while (true) { timespec required = remaining; - int result = nanosleep(&required, &remaining); + int result = ::nanosleep(&required, &remaining); if (result == 0) { return; @@ -30,16 +30,4 @@ namespace kickmsg throw std::system_error(errno, std::system_category(), "nanosleep()"); } } - - nanoseconds since_epoch() - { - timespec ts; - clock_gettime(CLOCK_MONOTONIC, &ts); - return seconds{ts.tv_sec} + nanoseconds{ts.tv_nsec}; - } - - nanoseconds elapsed_time(nanoseconds start) - { - return since_epoch() - start; - } } diff --git a/src/os/linux/Process.cc b/src/os/linux/Process.cc new file mode 100644 index 0000000..f839d90 --- /dev/null +++ b/src/os/linux/Process.cc @@ -0,0 +1,52 @@ +#include "kickmsg/os/Process.h" + +#include +#include + +namespace kickmsg +{ + uint64_t process_starttime(uint64_t pid) noexcept + { + if (pid == 0) + { + return 0; + } + // /proc//stat field 22 (`starttime`, clock ticks since boot). + // The `comm` field (2) can contain spaces and parens — skip to + // the last ')' and parse space-separated fields from there. 
+ char path[64]; + std::snprintf(path, sizeof(path), "/proc/%llu/stat", + static_cast(pid)); + std::FILE* f = std::fopen(path, "r"); + if (f == nullptr) + { + return 0; + } + char buf[512]; + std::size_t n = std::fread(buf, 1, sizeof(buf) - 1, f); + std::fclose(f); + if (n == 0) + { + return 0; + } + buf[n] = '\0'; + char const* close_paren = std::strrchr(buf, ')'); + if (close_paren == nullptr) + { + return 0; + } + char const* p = close_paren + 1; + for (int i = 0; i < 19; ++i) + { + while (*p == ' ') ++p; + while (*p != '\0' and *p != ' ') ++p; + } + while (*p == ' ') ++p; + unsigned long long starttime = 0; + if (std::sscanf(p, "%llu", &starttime) != 1) + { + return 0; + } + return static_cast(starttime); + } +} diff --git a/src/os/linux/SharedMemory.cc b/src/os/linux/SharedMemory.cc index d2ac21b..3503a37 100644 --- a/src/os/linux/SharedMemory.cc +++ b/src/os/linux/SharedMemory.cc @@ -1,3 +1,5 @@ +// Linux-specific SharedMemory::create(). Other methods live in +// src/os/posix/SharedMemory.cc. 
#include "kickmsg/os/SharedMemory.h" #include @@ -6,155 +8,17 @@ #include #include #include -#include -#include namespace kickmsg { - static void throw_system_error(char const* context) - { - throw std::system_error(errno, std::system_category(), context); - } - - SharedMemory::SharedMemory(SharedMemory&& other) noexcept - : size_{other.size_} - , address_{other.address_} - , fd_{other.fd_} - { - other.size_ = 0; - other.address_ = nullptr; - other.fd_ = INVALID_SHM_HANDLE; - } - - SharedMemory& SharedMemory::operator=(SharedMemory&& other) noexcept - { - if (this != &other) - { - close(); - size_ = other.size_; - address_ = other.address_; - fd_ = other.fd_; - other.size_ = 0; - other.address_ = nullptr; - other.fd_ = INVALID_SHM_HANDLE; - } - return *this; - } - - SharedMemory::~SharedMemory() - { - close(); - } - - void SharedMemory::map(std::size_t size) - { - if (::ftruncate(fd_, static_cast(size)) < 0) - { - ::close(fd_); - fd_ = -1; - throw_system_error("SharedMemory.cc: ftruncate()"); - } - - address_ = ::mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd_, 0); - if (address_ == MAP_FAILED) - { - address_ = nullptr; - ::close(fd_); - fd_ = -1; - throw_system_error("SharedMemory.cc: mmap()"); - } - - size_ = size; - } - void SharedMemory::create(std::string const& name, std::size_t size) { fd_ = ::shm_open(name.c_str(), O_RDWR | O_CREAT | O_TRUNC, 0666); if (fd_ < 0) { - throw_system_error("SharedMemory.cc: shm_open(create)"); + throw std::system_error(errno, std::system_category(), + "SharedMemory: shm_open(create)"); } map(size); } - - bool SharedMemory::try_create(std::string const& name, std::size_t size) - { - // Keep the fd and do the full setup (ftruncate + mmap) inline. - // SharedRegion::create_or_open consumes the resulting mapping - // directly — there's no reason to close here and re-enter create(), - // and the old round-trip pattern caused a subtle race on Darwin. 
- fd_ = ::shm_open(name.c_str(), O_RDWR | O_CREAT | O_EXCL, 0666); - if (fd_ < 0) - { - if (errno == EEXIST) - { - fd_ = INVALID_SHM_HANDLE; - return false; - } - throw_system_error("SharedMemory.cc: shm_open(try_create)"); - } - map(size); - return true; - } - - void SharedMemory::open(std::string const& name) - { - if (not try_open(name)) - { - throw_system_error("SharedMemory: shm_open(open)"); - } - } - - bool SharedMemory::try_open(std::string const& name) - { - fd_ = ::shm_open(name.c_str(), O_RDWR, 0); - if (fd_ < 0) - { - if (errno == ENOENT) - { - fd_ = INVALID_SHM_HANDLE; - return false; - } - throw_system_error("SharedMemory: shm_open(try_open)"); - } - - struct stat st{}; - if (::fstat(fd_, &st) < 0) - { - ::close(fd_); - fd_ = INVALID_SHM_HANDLE; - throw_system_error("SharedMemory: fstat()"); - } - - size_ = static_cast(st.st_size); - address_ = ::mmap(nullptr, size_, PROT_READ | PROT_WRITE, MAP_SHARED, fd_, 0); - if (address_ == MAP_FAILED) - { - address_ = nullptr; - ::close(fd_); - fd_ = INVALID_SHM_HANDLE; - throw_system_error("SharedMemory: mmap()"); - } - return true; - } - - void SharedMemory::close() - { - if (address_ != nullptr) - { - ::munmap(address_, size_); - address_ = nullptr; - } - if (fd_ != INVALID_SHM_HANDLE) - { - ::close(fd_); - fd_ = -1; - } - size_ = 0; - } - - void SharedMemory::unlink(std::string const& name) - { - ::shm_unlink(name.c_str()); - } } diff --git a/src/os/linux/Time.cc b/src/os/linux/Time.cc index 89d323a..efb8b97 100644 --- a/src/os/linux/Time.cc +++ b/src/os/linux/Time.cc @@ -1,10 +1,11 @@ +// Linux-specific sleep(). Other Time entry points live in +// src/os/posix/Time.cc. 
#include "kickmsg/os/Time.h" #include #include #include #include -#include namespace kickmsg { @@ -17,7 +18,7 @@ namespace kickmsg while (true) { timespec required = remaining; - int result = clock_nanosleep(CLOCK_MONOTONIC, 0, &required, &remaining); + int result = ::clock_nanosleep(CLOCK_MONOTONIC, 0, &required, &remaining); if (result == 0) { return; @@ -29,16 +30,4 @@ namespace kickmsg throw std::system_error(result, std::system_category(), "clock_nanosleep()"); } } - - nanoseconds since_epoch() - { - timespec ts; - clock_gettime(CLOCK_MONOTONIC, &ts); - return seconds{ts.tv_sec} + nanoseconds{ts.tv_nsec}; - } - - nanoseconds elapsed_time(nanoseconds start) - { - return since_epoch() - start; - } } diff --git a/src/os/posix/Process.cc b/src/os/posix/Process.cc new file mode 100644 index 0000000..4eeb77a --- /dev/null +++ b/src/os/posix/Process.cc @@ -0,0 +1,28 @@ +#include "kickmsg/os/Process.h" + +#include +#include +#include +#include + +namespace kickmsg +{ + uint64_t current_pid() noexcept + { + return static_cast(::getpid()); + } + + bool process_exists(uint64_t pid) noexcept + { + if (pid == 0) + { + return false; + } + if (::kill(static_cast(pid), 0) == 0) + { + return true; + } + // EPERM means the process exists but we can't signal it. + return errno == EPERM; + } +} diff --git a/src/os/posix/SharedMemory.cc b/src/os/posix/SharedMemory.cc new file mode 100644 index 0000000..b98a85e --- /dev/null +++ b/src/os/posix/SharedMemory.cc @@ -0,0 +1,155 @@ +// Parts of SharedMemory that are identical on Linux and macOS. +// Platform-specific create() lives in src/os/{linux,darwin}/SharedMemory.cc. 
+#include "kickmsg/os/SharedMemory.h" + +#include +#include +#include +#include +#include +#include +#include +#include + +namespace kickmsg +{ + namespace + { + [[noreturn]] void throw_system_error(char const* context) + { + throw std::system_error(errno, std::system_category(), context); + } + } + + SharedMemory::SharedMemory(SharedMemory&& other) noexcept + : size_{other.size_} + , address_{other.address_} + , fd_{other.fd_} + { + other.size_ = 0; + other.address_ = nullptr; + other.fd_ = INVALID_SHM_HANDLE; + } + + SharedMemory& SharedMemory::operator=(SharedMemory&& other) noexcept + { + if (this != &other) + { + close(); + size_ = other.size_; + address_ = other.address_; + fd_ = other.fd_; + other.size_ = 0; + other.address_ = nullptr; + other.fd_ = INVALID_SHM_HANDLE; + } + return *this; + } + + SharedMemory::~SharedMemory() + { + close(); + } + + void SharedMemory::map(std::size_t size) + { + if (::ftruncate(fd_, static_cast(size)) < 0) + { + ::close(fd_); + fd_ = INVALID_SHM_HANDLE; + throw_system_error("SharedMemory: ftruncate()"); + } + + address_ = ::mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd_, 0); + if (address_ == MAP_FAILED) + { + address_ = nullptr; + ::close(fd_); + fd_ = INVALID_SHM_HANDLE; + throw_system_error("SharedMemory: mmap()"); + } + + size_ = size; + } + + bool SharedMemory::try_create(std::string const& name, std::size_t size) + { + // Keep the fd and do the full setup (ftruncate + mmap) inline. + // SharedRegion::create_or_open consumes the resulting mapping + // directly — there's no reason to close here and re-enter create(), + // and the old round-trip pattern caused a subtle race on Darwin. 
+ fd_ = ::shm_open(name.c_str(), O_RDWR | O_CREAT | O_EXCL, 0666); + if (fd_ < 0) + { + if (errno == EEXIST) + { + fd_ = INVALID_SHM_HANDLE; + return false; + } + throw_system_error("SharedMemory: shm_open(try_create)"); + } + map(size); + return true; + } + + void SharedMemory::open(std::string const& name) + { + if (not try_open(name)) + { + throw_system_error("SharedMemory: shm_open(open)"); + } + } + + bool SharedMemory::try_open(std::string const& name) + { + fd_ = ::shm_open(name.c_str(), O_RDWR, 0); + if (fd_ < 0) + { + if (errno == ENOENT) + { + fd_ = INVALID_SHM_HANDLE; + return false; + } + throw_system_error("SharedMemory: shm_open(try_open)"); + } + + struct stat st{}; + if (::fstat(fd_, &st) < 0) + { + ::close(fd_); + fd_ = INVALID_SHM_HANDLE; + throw_system_error("SharedMemory: fstat()"); + } + + size_ = static_cast(st.st_size); + address_ = ::mmap(nullptr, size_, PROT_READ | PROT_WRITE, MAP_SHARED, fd_, 0); + if (address_ == MAP_FAILED) + { + address_ = nullptr; + ::close(fd_); + fd_ = INVALID_SHM_HANDLE; + throw_system_error("SharedMemory: mmap()"); + } + return true; + } + + void SharedMemory::close() + { + if (address_ != nullptr) + { + ::munmap(address_, size_); + address_ = nullptr; + } + if (fd_ != INVALID_SHM_HANDLE) + { + ::close(fd_); + fd_ = INVALID_SHM_HANDLE; + } + size_ = 0; + } + + void SharedMemory::unlink(std::string const& name) + { + ::shm_unlink(name.c_str()); + } +} diff --git a/src/os/posix/Time.cc b/src/os/posix/Time.cc new file mode 100644 index 0000000..69c0906 --- /dev/null +++ b/src/os/posix/Time.cc @@ -0,0 +1,42 @@ +// Parts of the Time API that are identical on Linux and macOS. +// Per-platform sleep() lives in src/os/{linux,darwin}/Time.cc. 
+#include "kickmsg/os/Time.h" + +#include +#include +#include +#include +#include + +namespace kickmsg +{ + void yield() + { + ::sched_yield(); + } + + nanoseconds monotonic_ns() + { + timespec ts; + if (::clock_gettime(CLOCK_MONOTONIC, &ts) != 0) + { + throw std::system_error(errno, std::system_category(), "clock_gettime(MONOTONIC)"); + } + return seconds{ts.tv_sec} + nanoseconds{ts.tv_nsec}; + } + + nanoseconds since_epoch() + { + timespec ts; + if (::clock_gettime(CLOCK_REALTIME, &ts) != 0) + { + throw std::system_error(errno, std::system_category(), "clock_gettime(REALTIME)"); + } + return seconds{ts.tv_sec} + nanoseconds{ts.tv_nsec}; + } + + nanoseconds elapsed_time(nanoseconds start) + { + return monotonic_ns() - start; + } +} diff --git a/src/os/windows/Process.cc b/src/os/windows/Process.cc new file mode 100644 index 0000000..3e0aee7 --- /dev/null +++ b/src/os/windows/Process.cc @@ -0,0 +1,59 @@ +#include "kickmsg/os/Process.h" + +#define WIN32_LEAN_AND_MEAN +#include + +namespace kickmsg +{ + uint64_t current_pid() noexcept + { + return static_cast(::GetCurrentProcessId()); + } + + uint64_t process_starttime(uint64_t pid) noexcept + { + if (pid == 0) + { + return 0; + } + HANDLE h = ::OpenProcess(PROCESS_QUERY_LIMITED_INFORMATION, + FALSE, + static_cast(pid)); + if (h == nullptr) + { + return 0; + } + FILETIME creation{}, exit{}, kernel{}, user{}; + BOOL ok = ::GetProcessTimes(h, &creation, &exit, &kernel, &user); + ::CloseHandle(h); + if (not ok) + { + return 0; + } + // FILETIME is 100-ns intervals since 1601-01-01; packing the two + // halves is enough for the equality comparison sweep_stale uses. 
+ return (static_cast(creation.dwHighDateTime) << 32) + | static_cast(creation.dwLowDateTime); + } + + bool process_exists(uint64_t pid) noexcept + { + if (pid == 0) + { + return false; + } + // PROCESS_QUERY_LIMITED_INFORMATION is granted even for processes + // running under different integrity levels / sessions, which is + // what we want for a cross-user discovery tool. + HANDLE h = ::OpenProcess(PROCESS_QUERY_LIMITED_INFORMATION, + FALSE, + static_cast(pid)); + if (h != nullptr) + { + ::CloseHandle(h); + return true; + } + // ERROR_ACCESS_DENIED: process exists but we can't open it. + return ::GetLastError() == ERROR_ACCESS_DENIED; + } +} diff --git a/src/os/windows/Time.cc b/src/os/windows/Time.cc index 4042403..53334f3 100644 --- a/src/os/windows/Time.cc +++ b/src/os/windows/Time.cc @@ -10,13 +10,18 @@ namespace kickmsg auto ms = duration_cast(ns); if (ms.count() <= 0) { - SwitchToThread(); + yield(); return; } Sleep(static_cast(ms.count())); } - nanoseconds since_epoch() + void yield() + { + ::SwitchToThread(); + } + + nanoseconds monotonic_ns() { static LARGE_INTEGER freq{}; if (freq.QuadPart == 0) @@ -34,8 +39,22 @@ namespace kickmsg return seconds{secs} + nanoseconds{nanos}; } + nanoseconds since_epoch() + { + // FILETIME is 100-ns intervals since 1601-01-01 UTC. Shift the + // epoch to 1970-01-01 UTC: 11644473600 seconds. 
+ FILETIME ft; + GetSystemTimePreciseAsFileTime(&ft); + ULARGE_INTEGER u; + u.LowPart = ft.dwLowDateTime; + u.HighPart = ft.dwHighDateTime; + constexpr uint64_t epoch_offset_100ns = 116444736000000000ULL; + uint64_t ns100 = u.QuadPart - epoch_offset_100ns; + return nanoseconds{ns100 * 100}; + } + nanoseconds elapsed_time(nanoseconds start) { - return since_epoch() - start; + return monotonic_ns() - start; } } diff --git a/src/types.cc b/src/types.cc index 1b86c62..b932c64 100644 --- a/src/types.cc +++ b/src/types.cc @@ -37,22 +37,17 @@ namespace kickmsg return reinterpret_cast(h) + sizeof(Header); } - // FNV-1a hash of config fields, used to detect parameter mismatches - // when opening an existing region. Chained through hash::fnv1a_64() - // so the hashed byte sequence (and therefore the resulting value) is - // identical to a single fnv1a_64 over the concatenation of the same - // raw field bytes in the same order — do NOT reorder these fields - // without bumping VERSION, since existing regions on disk are hashed - // with this ordering. + // FNV-1a over the config fields, detecting parameter mismatches at + // open time. Field order is part of the on-disk hash — do NOT + // reorder without bumping VERSION. 
uint64_t compute_config_hash(channel::Type type, channel::Config const& cfg) { - uint64_t h; - h = hash::fnv1a_64(&type, sizeof(type)); - h = hash::fnv1a_64(&cfg.max_subscribers, sizeof(cfg.max_subscribers), h); - h = hash::fnv1a_64(&cfg.sub_ring_capacity, sizeof(cfg.sub_ring_capacity), h); - h = hash::fnv1a_64(&cfg.pool_size, sizeof(cfg.pool_size), h); - h = hash::fnv1a_64(&cfg.max_payload_size, sizeof(cfg.max_payload_size), h); - h = hash::fnv1a_64(&cfg.commit_timeout, sizeof(cfg.commit_timeout), h); + uint64_t h = hash::fnv1a_64(type); + h = hash::fnv1a_64(cfg.max_subscribers, h); + h = hash::fnv1a_64(cfg.sub_ring_capacity, h); + h = hash::fnv1a_64(cfg.pool_size, h); + h = hash::fnv1a_64(cfg.max_payload_size, h); + h = hash::fnv1a_64(cfg.commit_timeout.count(), h); return h; } diff --git a/tests/CMakeLists.txt b/tests/CMakeLists.txt index 0adee45..fe81f25 100644 --- a/tests/CMakeLists.txt +++ b/tests/CMakeLists.txt @@ -5,6 +5,7 @@ add_executable(kickmsg_unit unit/subscriber-t.cc unit/naming-t.cc unit/node-t.cc + unit/registry-t.cc ) target_link_libraries(kickmsg_unit PRIVATE kickmsg GTest::gmock_main) set_target_properties(kickmsg_unit @@ -35,4 +36,10 @@ if(NOT WIN32) set_target_properties(kickmsg_crash_test PROPERTIES RUNTIME_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}) + + add_executable(kickmsg_registry_stress_test registry_stress_test.cc) + target_link_libraries(kickmsg_registry_stress_test PRIVATE kickmsg) + set_target_properties(kickmsg_registry_stress_test + PROPERTIES + RUNTIME_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}) endif() diff --git a/tests/crash_test.cc b/tests/crash_test.cc index 1726eb8..888d674 100644 --- a/tests/crash_test.cc +++ b/tests/crash_test.cc @@ -1,12 +1,22 @@ /// @file crash_test.cc /// @brief Multi-process crash recovery test for kickmsg. /// -/// Forks a child publisher, kills it mid-commit with SIGKILL, then verifies -/// the channel recovers via diagnose() + repair_locked_entries() + -/// reset_retired_rings() + reclaim_orphaned_slots(). 
A subscriber running -/// throughout validates that no corruption occurs before or after the crash. +/// Phase 1 — publisher crash rounds (the original scenario): fork a child +/// publisher, kill it mid-commit with SIGKILL, verify the channel +/// recovers via diagnose() + repair_locked_entries() + +/// reset_retired_rings() + reclaim_orphaned_slots(). A subscriber +/// running throughout validates that no corruption occurs. +/// +/// Phase 2 — subscriber crash: a subscriber that SIGKILLs itself while +/// holding live SampleView pins. Exercises reclaim_orphaned_slots(). +/// +/// Phase 3 — multi-publisher simultaneous crash: four publishers all +/// killed at once. Exercises the repair sequence when Case-A (locked +/// sequences) and Case-B (retired rings) residue coexist at multiple +/// ring positions. #include +#include #include #include #include @@ -15,6 +25,7 @@ #include #include #include +#include #include "kickmsg/os/Time.h" #include "kickmsg/Publisher.h" @@ -49,7 +60,7 @@ static void child_publisher_main(int /*round*/) auto* ptr = pub.allocate(sizeof(CrashPayload)); if (ptr == nullptr) { - kickmsg::sleep(0ns); + kickmsg::yield(); continue; } @@ -102,12 +113,17 @@ static void child_subscriber_main(int result_fd, int signal_fd) ++received; } - // Report results via pipe struct { uint64_t recv; uint64_t corrupt; } result = {received, corrupted}; ssize_t written = write(result_fd, &result, sizeof(result)); - (void)written; close(result_fd); close(signal_fd); + // Partial write → parent would read zero-initialized bytes and + // interpret that as "no corruption" — a silent false pass. Exit + // non-zero so waitpid surfaces the anomaly. + if (written != sizeof(result)) + { + std::_Exit(3); + } } struct RoundResult @@ -118,12 +134,25 @@ struct RoundResult bool subscriber_ok; }; +/// Aborts on fork failure: an unchecked -1 return would make kill(-1, ...) +/// wipe the entire process group. 
+static pid_t checked_fork(char const* site) +{ + pid_t p = fork(); + if (p < 0) + { + std::fprintf(stderr, "fork() failed at %s: errno=%d\n", site, errno); + std::_Exit(2); + } + return p; +} + static RoundResult run_one_round(int round) { RoundResult result{}; // Fork publisher - pid_t pub_pid = fork(); + pid_t pub_pid = checked_fork("run_one_round publisher"); if (pub_pid == 0) { child_publisher_main(round); @@ -168,7 +197,7 @@ static RoundResult run_one_round(int round) } // Fork a new publisher to verify the channel still works - pid_t pub2_pid = fork(); + pid_t pub2_pid = checked_fork("run_one_round verify publisher"); if (pub2_pid == 0) { auto reg = kickmsg::SharedRegion::open(SHM_NAME); @@ -182,7 +211,7 @@ static RoundResult run_one_round(int round) msg.checksum = compute_checksum(msg); while (pub.send(&msg, sizeof(msg)) < 0) { - kickmsg::sleep(0ns); + kickmsg::yield(); } } _exit(0); @@ -194,6 +223,236 @@ static RoundResult run_one_round(int round) return result; } +/// Phase 2: subscriber SIGKILLed while holding SampleView pins — +/// `reclaim_orphaned_slots` must release them. +static bool test_subscriber_crash() +{ + std::printf("\n--- Phase 2: subscriber killed mid-receive ---\n"); + + constexpr char const* SUB_SHM = "/kickmsg_crash_test_sub"; + kickmsg::SharedMemory::unlink(SUB_SHM); + + kickmsg::channel::Config cfg; + cfg.max_subscribers = 2; + cfg.sub_ring_capacity = 16; + cfg.pool_size = 32; + cfg.max_payload_size = sizeof(CrashPayload); + + auto region = kickmsg::SharedRegion::create( + SUB_SHM, kickmsg::channel::PubSub, cfg, "crash_test_sub"); + + // Fork subscriber that pins every sample it receives. + pid_t sub_pid = checked_fork("subscriber"); + if (sub_pid == 0) + { + auto r = kickmsg::SharedRegion::open(SUB_SHM); + kickmsg::Subscriber sub(r); + // receive_view() returns SampleView which pins the slot — exactly + // what we want to orphan on SIGKILL. 
+ std::vector pins; + while (true) + { + auto s = sub.receive_view(200ms); + if (s) + { + pins.push_back(std::move(*s)); + } + } + } + // Let the subscriber attach. + kickmsg::sleep(50ms); + + // Pool saturates once pins outnumber released slots; send() returns <0. + kickmsg::Publisher pub(region); + uint32_t published = 0; + for (uint32_t i = 0; i < cfg.pool_size * 2; ++i) + { + CrashPayload msg{}; + msg.magic = CrashPayload::MAGIC; + msg.seq = i; + msg.checksum = compute_checksum(msg); + if (pub.send(&msg, sizeof(msg)) >= 0) + { + ++published; + } + kickmsg::sleep(1ms); + } + + kill(sub_pid, SIGKILL); + int status; + waitpid(sub_pid, &status, 0); + + auto pre = region.diagnose(); + + std::size_t repaired = region.repair_locked_entries(); + std::size_t reset = region.reset_retired_rings(); + std::size_t reclaimed = region.reclaim_orphaned_slots(); + + auto post = region.diagnose(); + bool clean = (post.locked_entries == 0 and post.retired_rings == 0); + + // Verify the channel is still writable after repair. + bool writable = false; + { + pid_t v = checked_fork("verify child"); + if (v == 0) + { + auto r = kickmsg::SharedRegion::open(SUB_SHM); + kickmsg::Publisher p(r); + for (uint32_t i = 0; i < 10; ++i) + { + CrashPayload msg{}; + msg.magic = CrashPayload::MAGIC; + msg.seq = 3000000 + i; + msg.checksum = compute_checksum(msg); + int rc = 0; + for (int k = 0; k < 100 and rc <= 0; ++k) + { + rc = p.send(&msg, sizeof(msg)); + if (rc <= 0) + { + kickmsg::yield(); + } + } + if (rc <= 0) + { + _exit(2); + } + } + _exit(0); + } + int v_status = 0; + waitpid(v, &v_status, 0); + writable = (WIFEXITED(v_status) and WEXITSTATUS(v_status) == 0); + } + + std::printf(" Published %u, pre: locked=%u retired=%u, " + "repaired=%zu reset=%zu reclaimed=%zu, " + "final_clean=%s, writable_after=%s\n", + published, pre.locked_entries, pre.retired_rings, + repaired, reset, reclaimed, + clean ? "yes" : "no", + writable ? 
"yes" : "no"); + + kickmsg::SharedMemory::unlink(SUB_SHM); + return clean and writable; +} + +/// Phase 3: four publishers SIGKILLed simultaneously — repair sequence +/// must handle Case-A + Case-B residue coexisting at multiple positions. +static bool test_multi_publisher_crash() +{ + std::printf("\n--- Phase 3: multi-publisher simultaneous crash ---\n"); + + constexpr char const* MULTI_SHM = "/kickmsg_crash_test_multi"; + kickmsg::SharedMemory::unlink(MULTI_SHM); + + kickmsg::channel::Config cfg; + cfg.max_subscribers = 2; + cfg.sub_ring_capacity = 16; + cfg.pool_size = 64; + cfg.max_payload_size = sizeof(CrashPayload); + + auto region = kickmsg::SharedRegion::create( + MULTI_SHM, kickmsg::channel::PubSub, cfg, "crash_test_multi"); + // Subscriber attaches so rings go live and can accumulate damage. + kickmsg::Subscriber sub(region); + + constexpr int N_PUBS = 4; + pid_t pubs[N_PUBS]; + for (int i = 0; i < N_PUBS; ++i) + { + pubs[i] = checked_fork("multi-pub child"); + if (pubs[i] == 0) + { + auto r = kickmsg::SharedRegion::open(MULTI_SHM); + kickmsg::Publisher p(r); + for (uint32_t seq = 0; ; ++seq) + { + auto* ptr = p.allocate(sizeof(CrashPayload)); + if (ptr == nullptr) + { + kickmsg::yield(); + continue; + } + CrashPayload msg{}; + msg.magic = CrashPayload::MAGIC; + msg.seq = seq; + msg.checksum = compute_checksum(msg); + std::memcpy(ptr, &msg, sizeof(msg)); + p.publish(); + } + } + } + + kickmsg::sleep(30ms); + + for (int i = 0; i < N_PUBS; ++i) + { + kill(pubs[i], SIGKILL); + } + for (int i = 0; i < N_PUBS; ++i) + { + int st; + waitpid(pubs[i], &st, 0); + } + + auto pre = region.diagnose(); + + std::size_t repaired = region.repair_locked_entries(); + std::size_t reset = region.reset_retired_rings(); + std::size_t reclaimed = region.reclaim_orphaned_slots(); + + auto post = region.diagnose(); + bool clean = (post.locked_entries == 0 and post.retired_rings == 0); + + // Confirm throughput resumes: fresh publisher, N messages, no errors. 
+ bool resumed = false; + { + pid_t v = checked_fork("verify child"); + if (v == 0) + { + auto r = kickmsg::SharedRegion::open(MULTI_SHM); + kickmsg::Publisher p(r); + for (uint32_t i = 0; i < 50; ++i) + { + CrashPayload msg{}; + msg.magic = CrashPayload::MAGIC; + msg.seq = 4000000 + i; + msg.checksum = compute_checksum(msg); + int rc = 0; + for (int k = 0; k < 100 and rc <= 0; ++k) + { + rc = p.send(&msg, sizeof(msg)); + if (rc <= 0) + { + kickmsg::yield(); + } + } + if (rc <= 0) + { + _exit(2); + } + } + _exit(0); + } + int v_status = 0; + waitpid(v, &v_status, 0); + resumed = (WIFEXITED(v_status) and WEXITSTATUS(v_status) == 0); + } + + std::printf(" N=%d, pre: locked=%u retired=%u, " + "repaired=%zu reset=%zu reclaimed=%zu, " + "final_clean=%s, resumed=%s\n", + N_PUBS, pre.locked_entries, pre.retired_rings, + repaired, reset, reclaimed, + clean ? "yes" : "no", + resumed ? "yes" : "no"); + + kickmsg::SharedMemory::unlink(MULTI_SHM); + return clean and resumed; +} + int main() { std::printf("=== Kickmsg Multi-Process Crash Test ===\n\n"); @@ -214,10 +473,13 @@ int main() // result_pipe: subscriber writes results to [1], parent reads from [0]. int signal_pipe[2]; int result_pipe[2]; - pipe(signal_pipe); - pipe(result_pipe); + if (pipe(signal_pipe) != 0 or pipe(result_pipe) != 0) + { + std::fprintf(stderr, "pipe() failed: errno=%d\n", errno); + return 2; + } - pid_t sub_pid = fork(); + pid_t sub_pid = checked_fork("subscriber"); if (sub_pid == 0) { close(signal_pipe[1]); // close write end @@ -247,23 +509,36 @@ int main() // Signal subscriber to exit close(signal_pipe[1]); - // Read subscriber results + // Read subscriber results. A short read means the child either + // crashed before writing its result struct or we lost bytes on the + // pipe — either way, we can't trust a zero-initialised sub_result + // and should fail rather than silently pass the corruption check. 
struct { uint64_t recv; uint64_t corrupt; } sub_result{}; - read(result_pipe[0], &sub_result, sizeof(sub_result)); + ssize_t got = read(result_pipe[0], &sub_result, sizeof(sub_result)); close(result_pipe[0]); int sub_status; waitpid(sub_pid, &sub_status, 0); - std::printf(" Subscriber: received %" PRIu64 ", corrupted %" PRIu64 "\n", - sub_result.recv, sub_result.corrupt); - - if (sub_result.corrupt > 0) + if (got != static_cast(sizeof(sub_result))) { - std::fprintf(stderr, " [FAIL] Subscriber saw %" PRIu64 " corrupted messages!\n", - sub_result.corrupt); + std::fprintf(stderr, + " [FAIL] Subscriber result pipe short read: got=%zd expected=%zu\n", + got, sizeof(sub_result)); all_ok = false; } + else + { + std::printf(" Subscriber: received %" PRIu64 ", corrupted %" PRIu64 "\n", + sub_result.recv, sub_result.corrupt); + if (sub_result.corrupt > 0) + { + std::fprintf(stderr, + " [FAIL] Subscriber saw %" PRIu64 " corrupted messages!\n", + sub_result.corrupt); + all_ok = false; + } + } std::printf("\n Rounds: %d, rounds with recovery: %d\n", NUM_ROUNDS, any_recovery); @@ -290,13 +565,22 @@ int main() kickmsg::SharedMemory::unlink(SHM_NAME); + if (not test_subscriber_crash()) + { + all_ok = false; + } + if (not test_multi_publisher_crash()) + { + all_ok = false; + } + if (all_ok) { - std::printf(" [PASS]\n"); + std::printf("\n [PASS]\n"); } else { - std::printf(" [FAIL]\n"); + std::printf("\n [FAIL]\n"); } return all_ok ? 0 : 1; diff --git a/tests/python/test_diagnostics.py b/tests/python/test_diagnostics.py new file mode 100644 index 0000000..f54f11c --- /dev/null +++ b/tests/python/test_diagnostics.py @@ -0,0 +1,310 @@ +"""Tests for the `kickmsg.diagnostics` typed API. + +Covers the shape of the returned dataclasses and the round-trip +through the registry that backs `list_topics`. 
+""" + +from __future__ import annotations + +import os + +import pytest + +import kickmsg +from kickmsg import diagnostics as diag + + +@pytest.fixture +def kmsg_namespace() -> str: + """Namespace unique to this test run; cleaned before and after.""" + ns = f"pytest_diag_{os.getpid()}" + kickmsg.Registry.unlink(ns) + yield ns + kickmsg.Registry.unlink(ns) + + +# ---------------------------------------------------------------------- +# list_topics +# ---------------------------------------------------------------------- + + +def test_list_topics_empty_returns_empty(kmsg_namespace): + assert diag.list_topics(kmsg_namespace) == [] + + +def test_list_topics_aggregates_by_shm(kmsg_namespace, small_cfg): + pub_node = kickmsg.Node("producer", namespace=kmsg_namespace) + pub = pub_node.advertise("telemetry", small_cfg) + + sub_node = kickmsg.Node("consumer", namespace=kmsg_namespace) + sub = sub_node.subscribe("telemetry") + + topics = diag.list_topics(kmsg_namespace) + assert len(topics) == 1 + t = topics[0] + assert t.shm_name == f"/{kmsg_namespace}_telemetry" + assert t.channel_type == "pubsub" + assert len(t.producers) == 1 + assert len(t.consumers) == 1 + assert t.producers[0].node_name == "producer" + assert t.consumers[0].node_name == "consumer" + assert len(t.stall_producers) == 0 + assert len(t.stall_consumers) == 0 + + # Node destructors deregister — referenced to keep them alive above. + del pub, sub, pub_node, sub_node + kickmsg.unlink_shm(f"/{kmsg_namespace}_telemetry") + + +def test_list_topics_broadcast_is_both_role(kmsg_namespace, small_cfg): + node = kickmsg.Node("bcast", namespace=kmsg_namespace) + handle = node.join_broadcast("events", small_cfg) + + topics = diag.list_topics(kmsg_namespace) + assert len(topics) == 1 + t = topics[0] + assert t.channel_type == "broadcast" + # Broadcast node counts as both producer and consumer. 
+ assert len(t.producers) == 1 + assert len(t.consumers) == 1 + assert t.producers[0].role == "both" + + del handle, node + kickmsg.unlink_shm(f"/{kmsg_namespace}_broadcast_events") + + +# ---------------------------------------------------------------------- +# stats +# ---------------------------------------------------------------------- + + +def test_stats_reports_write_pos_after_publishes(kmsg_namespace, small_cfg): + node = kickmsg.Node("writer", namespace=kmsg_namespace) + pub = node.advertise("data", small_cfg) + sub = node.subscribe("data") + + for _ in range(5): + pub.send(b"x" * 16) + + s = diag.stats(f"/{kmsg_namespace}_data") + assert s.live_rings == 1 + assert s.total_writes == 5 + assert s.total_drops == 0 + # RingStats shape + live_rings = [r for r in s.rings if r.state == "live"] + assert len(live_rings) == 1 + assert live_rings[0].write_pos == 5 + + del pub, sub, node + kickmsg.unlink_shm(f"/{kmsg_namespace}_data") + + +# ---------------------------------------------------------------------- +# diagnose +# ---------------------------------------------------------------------- + + +def test_diagnose_healthy(kmsg_namespace, small_cfg): + node = kickmsg.Node("healthy", namespace=kmsg_namespace) + pub = node.advertise("topic", small_cfg) + + h = diag.diagnose(f"/{kmsg_namespace}_topic") + assert h.status == "healthy" + assert h.locked_entries == 0 + assert h.retired_rings == 0 + + del pub, node + kickmsg.unlink_shm(f"/{kmsg_namespace}_topic") + + +# ---------------------------------------------------------------------- +# schema +# ---------------------------------------------------------------------- + + +def test_schema_unset_by_default(kmsg_namespace, small_cfg): + node = kickmsg.Node("n", namespace=kmsg_namespace) + pub = node.advertise("topic", small_cfg) + + s = diag.schema(f"/{kmsg_namespace}_topic") + assert s.state == "unset" + assert s.name is None + + del pub, node + kickmsg.unlink_shm(f"/{kmsg_namespace}_topic") + + +def 
test_schema_set_after_claim(kmsg_namespace, small_cfg): + info = kickmsg.SchemaInfo() + info.name = "my/Type" + info.version = 3 + info.identity = b"\x01" * 64 + info.layout = b"\x02" * 64 + + cfg = small_cfg + cfg.schema = info + + node = kickmsg.Node("n", namespace=kmsg_namespace) + pub = node.advertise("topic", cfg) + + s = diag.schema(f"/{kmsg_namespace}_topic") + assert s.state == "set" + assert s.name == "my/Type" + assert s.version == 3 + assert s.identity == b"\x01" * 64 + + del pub, node + kickmsg.unlink_shm(f"/{kmsg_namespace}_topic") + + +def test_schema_diff_detects_version_delta(kmsg_namespace, small_cfg): + info_a = kickmsg.SchemaInfo() + info_a.name = "Type" + info_a.version = 1 + info_a.identity = b"\x01" * 64 + info_a.layout = b"\x02" * 64 + + info_b = kickmsg.SchemaInfo() + info_b.name = "Type" + info_b.version = 2 # differs + info_b.identity = b"\x01" * 64 + info_b.layout = b"\x02" * 64 + + cfg_a = kickmsg.Config() + cfg_a.max_subscribers = 2 + cfg_a.sub_ring_capacity = 4 + cfg_a.pool_size = 8 + cfg_a.max_payload_size = 32 + cfg_a.schema = info_a + + cfg_b = kickmsg.Config() + cfg_b.max_subscribers = 2 + cfg_b.sub_ring_capacity = 4 + cfg_b.pool_size = 8 + cfg_b.max_payload_size = 32 + cfg_b.schema = info_b + + node = kickmsg.Node("n", namespace=kmsg_namespace) + pa = node.advertise("topic_a", cfg_a) + pb = node.advertise("topic_b", cfg_b) + + d = diag.schema_diff( + f"/{kmsg_namespace}_topic_a", + f"/{kmsg_namespace}_topic_b", + ) + assert not d.equal + assert d.version + assert not d.identity + assert not d.layout + + del pa, pb, node + kickmsg.unlink_shm(f"/{kmsg_namespace}_topic_a") + kickmsg.unlink_shm(f"/{kmsg_namespace}_topic_b") + + +# ---------------------------------------------------------------------- +# CLI smoke — just that the module parses and main() runs end-to-end. 
+# ---------------------------------------------------------------------- + + +def test_cli_list_empty_still_prints_header(kmsg_namespace, capsys): + from kickmsg.cli import main as cli_main + rc = cli_main(["list", "--namespace", kmsg_namespace]) + captured = capsys.readouterr() + # Empty list is not an error — exit 0, header still rendered so the + # column structure is visible. + assert rc == 0 + assert "TOPIC" in captured.out + assert "NS" in captured.out + + +def test_cli_list_shows_registered_topic(kmsg_namespace, small_cfg, capsys): + from kickmsg.cli import main as cli_main + + node = kickmsg.Node("cli_test", namespace=kmsg_namespace) + pub = node.advertise("cli_topic", small_cfg) + + rc = cli_main(["list", "--namespace", kmsg_namespace]) + captured = capsys.readouterr() + assert rc == 0 + assert "cli_topic" in captured.out + assert "pubsub" in captured.out + + del pub, node + kickmsg.unlink_shm(f"/{kmsg_namespace}_cli_topic") + + +@pytest.mark.parametrize("kind,setup,topic,expected_suffix", [ + # (label, fixture-style setup on a Node, logical topic path, SHM suffix) + ("pubsub", + lambda node, ns, cfg: node.advertise("reading", cfg), + "/reading", + "_reading"), + ("broadcast", + lambda node, ns, cfg: node.join_broadcast("events", cfg), + "/events", + "_broadcast_events"), + ("mailbox", + lambda node, ns, cfg: node.create_mailbox("reply", cfg), + None, # mailbox topic is //, filled below from node name + "_resolver_mbx_reply"), +]) +def test_cli_resolves_topic_via_registry(kmsg_namespace, small_cfg, + kind, setup, topic, expected_suffix): + """Every channel kind (pubsub / broadcast / mailbox) must resolve a + logical topic to the right SHM name via registry lookup. + + Regression: when `_native` wasn't imported in `_cli.py`, the lookup + silently fell through to a pubsub-pattern guess that worked only for + pubsub — broadcast and mailbox silently resolved to the wrong SHM. 
+ """ + from kickmsg.cli import _resolve_shm_name + + node = kickmsg.Node("resolver", namespace=kmsg_namespace) + handle = setup(node, kmsg_namespace, small_cfg) + + # Mailbox logical topic is //. + logical = topic if topic is not None else f"/resolver/reply" + + class Args: + shm = None + namespace = kmsg_namespace + Args.topic = logical + + resolved = _resolve_shm_name(Args()) + assert resolved == f"/{kmsg_namespace}{expected_suffix}", ( + f"{kind}: expected /{kmsg_namespace}{expected_suffix}, got {resolved}" + ) + + del handle, node + kickmsg.unlink_shm(f"/{kmsg_namespace}{expected_suffix}") + + +def test_cli_resolve_rejects_unknown_topic(kmsg_namespace): + """With no fallback, an unknown topic is an explicit error, not a + silent wrong-region guess.""" + from kickmsg.cli import _resolve_shm_name + + class Args: + shm = None + topic = "/nonexistent" + namespace = kmsg_namespace + + with pytest.raises(SystemExit) as info: + _resolve_shm_name(Args()) + msg = str(info.value).lower() + # Either "no registry exists for this namespace" or "topic not found + # in this namespace" — both are explicit errors, not a silent guess. + assert "no registry" in msg or "not found" in msg + + +def test_cli_resolve_shm_overrides_topic(kmsg_namespace): + """--shm always wins, even if --namespace/topic are set.""" + from kickmsg.cli import _resolve_shm_name + + class Args: + shm = "raw_name" # auto-prepended leading '/' + topic = "/ignored" + namespace = kmsg_namespace + + assert _resolve_shm_name(Args()) == "/raw_name" diff --git a/tests/registry_stress_test.cc b/tests/registry_stress_test.cc new file mode 100644 index 0000000..5a318bb --- /dev/null +++ b/tests/registry_stress_test.cc @@ -0,0 +1,187 @@ +/// @file registry_stress_test.cc +/// @brief Multi-process stress test for the kickmsg registry. 
+/// +/// Forks several child processes that each hammer +/// register_participant / deregister / snapshot in tight loops, while +/// the parent periodically calls snapshot() and asserts structural +/// invariants: +/// +/// - Every Active entry has a non-empty topic_name and node_name. +/// - Every Active entry's pid is either this process or a live child. +/// - snapshot() never returns an entry whose state subsequently +/// reports Free/Claiming (the seqlock recheck in snapshot handles +/// this, so the invariant is 'no torn reads'). +/// +/// Exits 0 on success, non-zero on any assertion failure. + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "kickmsg/Registry.h" +#include "kickmsg/os/Process.h" +#include "kickmsg/os/Time.h" + +using namespace kickmsg; +using namespace std::chrono_literals; + +static constexpr char const* NS = "kickmsg_regstress"; +static constexpr int N_CHILDREN = 8; +static constexpr int OPS_PER_CHILD = 2000; + +static void child_loop(int child_id) +{ + auto reg = Registry::open_or_create(NS); + + std::string base_topic = "/child_" + std::to_string(child_id) + "/t_"; + std::string node_name = "child_" + std::to_string(child_id); + + for (int i = 0; i < OPS_PER_CHILD; ++i) + { + std::string topic = base_topic + std::to_string(i % 32); + std::string shm_name = "/" + std::string{NS} + + "_child_" + std::to_string(child_id) + + "_t_" + std::to_string(i % 32); + + uint32_t slot = reg.register_participant( + shm_name, topic, channel::PubSub, + registry::Pubsub, registry::Publisher, node_name); + + if (slot == INVALID_SLOT) + { + // Registry full — should be rare with opportunistic sweep. 
+ continue; + } + reg.deregister(slot); + } +} + +int main() +{ + Registry::unlink(NS); + auto reg = Registry::open_or_create(NS); + + std::vector kids; + kids.reserve(N_CHILDREN); + for (int i = 0; i < N_CHILDREN; ++i) + { + pid_t p = fork(); + if (p == 0) + { + child_loop(i); + _exit(0); + } + if (p < 0) + { + std::fprintf(stderr, "fork failed\n"); + return 1; + } + kids.push_back(p); + } + + // Parent loop: keep snapshotting until all children exit. + uint64_t snapshots = 0; + uint64_t bad_rows = 0; + auto t_start = std::chrono::steady_clock::now(); + while (true) + { + auto snap = reg.snapshot(); + ++snapshots; + + for (auto const& p : snap) + { + // Active rows must have non-empty identity fields. A torn + // seqlock read (not caught by the generation recheck) would + // surface as empty topic_name or garbled node_name here. + if (p.topic_name.empty() or p.node_name.empty()) + { + ++bad_rows; + std::fprintf(stderr, + "bad row: topic=%zu node=%zu pid=%llu shm=%zu\n", + p.topic_name.size(), p.node_name.size(), + (unsigned long long)p.pid, p.shm_name.size()); + } + // PID must match one of our children (or us). + if (p.pid != current_pid()) + { + bool found = std::any_of(kids.begin(), kids.end(), + [&](pid_t k) { return static_cast(k) == p.pid; }); + if (not found) + { + ++bad_rows; + std::fprintf(stderr, "unknown pid in snapshot: %llu\n", + (unsigned long long)p.pid); + } + } + } + + // Periodic sweep to keep the registry healthy under churn. + reg.sweep_stale(); + + // Check if all children are done. + int status; + pid_t done = waitpid(-1, &status, WNOHANG); + if (done > 0) + { + // Remove from kids list. + kids.erase(std::remove(kids.begin(), kids.end(), done), kids.end()); + if (kids.empty()) + { + break; + } + } + } + + // Drain any remaining children. 
+    while (not kids.empty())
+    {
+        int status;
+        pid_t done = waitpid(-1, &status, 0);
+        if (done > 0)
+        {
+            kids.erase(std::remove(kids.begin(), kids.end(), done), kids.end());
+        }
+    }
+
+    auto elapsed = std::chrono::duration_cast(
+        std::chrono::steady_clock::now() - t_start).count();
+
+    std::printf("Registry stress: %d children × %d ops, "
+                "%llu snapshots in %lldms, bad_rows=%llu\n",
+                N_CHILDREN, OPS_PER_CHILD,
+                (unsigned long long)snapshots,
+                (long long)elapsed,
+                (unsigned long long)bad_rows);
+
+    // Final state: all children gone, registry should settle to empty.
+    reg.sweep_stale();
+    auto final_snap = reg.snapshot();
+    if (not final_snap.empty())
+    {
+        std::fprintf(stderr, "expected empty registry after child exit, "
+                             "got %zu entries\n", final_snap.size());
+        // This is likely a benign race (children just deregistered their
+        // last entry before exiting), so don't fail on it.
+    }
+
+    Registry::unlink(NS);
+
+    if (bad_rows == 0)
+    {
+        std::printf(" [PASS]\n");
+        return 0;
+    }
+    else
+    {
+        std::printf(" [FAIL] %llu torn/invalid rows\n",
+                    (unsigned long long)bad_rows);
+        return 1;
+    }
+}
diff --git a/tests/stress/churn.cc b/tests/stress/churn.cc
index 672a54b..68ad3db 100644
--- a/tests/stress/churn.cc
+++ b/tests/stress/churn.cc
@@ -33,7 +33,7 @@ bool run_subscriber_churn()
             while (pub.send(&msg, sizeof(msg)) < 0)
             {
-                kickmsg::sleep(0ns);
+                kickmsg::yield();
             }
         }
         pub_done = true;
diff --git a/tests/stress/edge_cases.cc b/tests/stress/edge_cases.cc
index efe4114..8d4c0b8 100644
--- a/tests/stress/edge_cases.cc
+++ b/tests/stress/edge_cases.cc
@@ -21,7 +21,7 @@ bool run_single_slot_ring()
     auto region = kickmsg::SharedRegion::create(
         shm_name, kickmsg::channel::PubSub, cfg, "single_slot_ring");
-    nanoseconds t0 = kickmsg::since_epoch();
+    nanoseconds t0 = kickmsg::monotonic_ns();
     std::vector sub_results(NUM_SUBS);
     std::vector sub_threads;
@@ -54,7 +54,7 @@ bool run_single_slot_ring()
         t.join();
     }
-    nanoseconds t1 = kickmsg::since_epoch();
+    nanoseconds t1 = kickmsg::monotonic_ns();
     int64_t elapsed_ms = std::chrono::duration_cast(t1 - t0).count();
     uint64_t total_sent = static_cast(NUM_PUBS) * NUM_MSGS;
@@ -210,7 +210,7 @@ bool run_subscriber_saturation()
                     std::fprintf(stderr, " [FATAL] send() returned %d\n", rc);
                     std::abort();
                 }
-                kickmsg::sleep(0ns);
+                kickmsg::yield();
             }
         }
     }
diff --git a/tests/stress/live_repair.cc b/tests/stress/live_repair.cc
index 55c4d0b..f393d2a 100644
--- a/tests/stress/live_repair.cc
+++ b/tests/stress/live_repair.cc
@@ -50,7 +50,7 @@ bool run_live_repair()
             }
             else if (rc == -EAGAIN)
             {
-                kickmsg::sleep(0ns);
+                kickmsg::yield();
             }
         }
     };
diff --git a/tests/stress/mpmc.cc b/tests/stress/mpmc.cc
index e11c859..a76c808 100644
--- a/tests/stress/mpmc.cc
+++ b/tests/stress/mpmc.cc
@@ -25,7 +25,7 @@ bool run_stress_test(TestConfig const& tc)
     auto region = kickmsg::SharedRegion::create(
         shm_name, kickmsg::channel::PubSub, cfg, "stress_test");
-    nanoseconds t0 = kickmsg::since_epoch();
+    nanoseconds t0 = kickmsg::monotonic_ns();
     std::vector sub_threads;
     std::vector sub_results(static_cast(tc.num_subscribers));
@@ -63,7 +63,7 @@ bool run_stress_test(TestConfig const& tc)
         t.join();
     }
-    nanoseconds t1 = kickmsg::since_epoch();
+    nanoseconds t1 = kickmsg::monotonic_ns();
     int64_t elapsed_ms = std::chrono::duration_cast(t1 - t0).count();
     uint64_t total_sent = static_cast(tc.num_publishers) * tc.msgs_per_pub;
diff --git a/tests/stress/pool_exhaustion.cc b/tests/stress/pool_exhaustion.cc
index 729f5c0..d22dd45 100644
--- a/tests/stress/pool_exhaustion.cc
+++ b/tests/stress/pool_exhaustion.cc
@@ -46,7 +46,7 @@ bool run_pool_exhaustion()
                     std::abort();
                 }
                 eagain_count.fetch_add(1, std::memory_order_relaxed);
-                kickmsg::sleep(0ns);
+                kickmsg::yield();
             }
         }
     };
diff --git a/tests/stress/treiber.cc b/tests/stress/treiber.cc
index 0bccfb0..6314a6f 100644
--- a/tests/stress/treiber.cc
+++ b/tests/stress/treiber.cc
@@ -30,7 +30,7 @@ bool run_treiber_stress()
             if (idx == kickmsg::INVALID_SLOT)
             {
                 contention_hits.fetch_add(1);
-                kickmsg::sleep(0ns);
+                kickmsg::yield();
                 --i;
                 continue;
             }
diff --git a/tests/unit/node-t.cc b/tests/unit/node-t.cc
index 5b63302..d0ba578 100644
--- a/tests/unit/node-t.cc
+++ b/tests/unit/node-t.cc
@@ -62,7 +62,7 @@ TEST_F(NodeTest, NamingConventions)
 {
     kickmsg::Node node("mynode", "app");
     EXPECT_EQ(node.name(), "mynode");
-    EXPECT_EQ(node.prefix(), "app");
+    EXPECT_EQ(node.kmsg_namespace(), "app");
 }

 TEST_F(NodeTest, JoinBroadcastTwoNodes)
@@ -314,10 +314,10 @@ TEST_F(NodeTest, RosStyleTopicNamesAreSanitizedIntoShmPath)
     ASSERT_TRUE(got.has_value());
     EXPECT_EQ(std::memcmp(got->data(), &val, sizeof(val)), 0);
-    // Also confirm Node::name() / prefix() return the sanitized form so
-    // callers can log/introspect the actual identifiers in use.
-    EXPECT_EQ(pub_node.prefix(), "test.ns");
-    EXPECT_EQ(pub_node.name(), "drv");
+    // Also confirm Node::name() / kmsg_namespace() return the sanitized
+    // form so callers can log/introspect the actual identifiers in use.
+    EXPECT_EQ(pub_node.kmsg_namespace(), "test.ns");
+    EXPECT_EQ(pub_node.name(), "drv");
 }

 TEST_F(NodeTest, EmptyTopicNameThrows)
diff --git a/tests/unit/region-t.cc b/tests/unit/region-t.cc
index 4415f3e..06e105a 100644
--- a/tests/unit/region-t.cc
+++ b/tests/unit/region-t.cc
@@ -8,12 +8,8 @@
 #include
 #include
 #include
-#ifdef _WIN32
-#include
-#define getpid _getpid
-#else
-#include
-#endif
+
+#include "kickmsg/os/Process.h"

 class RegionTest : public ::testing::Test
 {
@@ -120,7 +116,7 @@ TEST_F(RegionTest, HeaderStoresCreatorMetadata)
         SHM_NAME, kickmsg::channel::PubSub, cfg, "my_node");
     auto* hdr = region.header();
-    EXPECT_EQ(hdr->creator_pid, static_cast(getpid()));
+    EXPECT_EQ(hdr->creator_pid, kickmsg::current_pid());
     EXPECT_GT(hdr->created_at_ns, 0u);
     EXPECT_NE(hdr->config_hash, 0u);
 }
@@ -918,3 +914,120 @@ TEST_F(RegionTest, SchemaDoesNotAffectConfigHash)
     ASSERT_TRUE(got.has_value());
     EXPECT_STREQ(got->name, "creator/Type");
 }
+
+// -----------------------------------------------------------------------------
+// stats() — cross-process counter snapshot
+// -----------------------------------------------------------------------------
+
+TEST_F(RegionTest, StatsOnFreshRegionReportsZeros)
+{
+    auto cfg = default_cfg();
+    auto region = kickmsg::SharedRegion::create(
+        SHM_NAME, kickmsg::channel::PubSub, cfg, "stats");
+    auto s = region.stats();
+
+    EXPECT_EQ(s.rings.size(), cfg.max_subscribers);
+    EXPECT_EQ(s.live_rings, 0u);
+    EXPECT_EQ(s.total_writes, 0u);
+    EXPECT_EQ(s.total_drops, 0u);
+    EXPECT_EQ(s.total_losses, 0u);
+    EXPECT_EQ(s.pool_size, cfg.pool_size);
+    // Fresh region: every slot is on the free stack.
+    EXPECT_EQ(s.pool_free, cfg.pool_size);
+
+    for (auto const& r : s.rings)
+    {
+        EXPECT_EQ(r.state, kickmsg::ring::Free);
+        EXPECT_EQ(r.in_flight, 0u);
+        EXPECT_EQ(r.write_pos, 0u);
+        EXPECT_EQ(r.dropped_count, 0u);
+        EXPECT_EQ(r.lost_count, 0u);
+    }
+}
+
+TEST_F(RegionTest, StatsWritePosAdvancesWithPublishes)
+{
+    auto cfg = default_cfg();
+    auto region = kickmsg::SharedRegion::create(
+        SHM_NAME, kickmsg::channel::PubSub, cfg, "stats");
+
+    kickmsg::Subscriber sub(region);
+    kickmsg::Publisher pub(region);
+
+    constexpr int N = 5;
+    uint32_t payload = 0xC0FFEE;
+    for (int i = 0; i < N; ++i)
+    {
+        ASSERT_GE(pub.send(&payload, sizeof(payload)), 0);
+    }
+
+    auto s = region.stats();
+    EXPECT_EQ(s.live_rings, 1u);
+    EXPECT_EQ(s.total_writes, static_cast(N));
+
+    // Exactly one ring should be Live and carry write_pos == N.
+    std::size_t live_seen = 0;
+    for (auto const& r : s.rings)
+    {
+        if (r.state == kickmsg::ring::Live)
+        {
+            ++live_seen;
+            EXPECT_EQ(r.write_pos, static_cast(N));
+        }
+    }
+    EXPECT_EQ(live_seen, 1u);
+}
+
+TEST_F(RegionTest, StatsLostCountMatchesSubscriberLostOnOverflow)
+{
+    auto cfg = default_cfg(); // sub_ring_capacity = 8
+    auto region = kickmsg::SharedRegion::create(
+        SHM_NAME, kickmsg::channel::PubSub, cfg, "stats");
+
+    kickmsg::Subscriber sub(region);
+    kickmsg::Publisher pub(region);
+
+    // Publish more than the ring can hold without draining — forces the
+    // subscriber's drain-ahead path to bump lost_count on its next read.
+    uint32_t payload = 0;
+    std::size_t const to_publish = cfg.sub_ring_capacity * 3;
+    for (std::size_t i = 0; i < to_publish; ++i)
+    {
+        payload = static_cast(i);
+        ASSERT_GE(pub.send(&payload, sizeof(payload)), 0);
+    }
+
+    // Drive the subscriber: the first try_receive hits the drain-ahead
+    // branch and jumps read_pos forward, recording the skipped count.
+    while (sub.try_receive()) { /* drain */ }
+
+    EXPECT_GT(sub.lost(), 0u);
+
+    auto s = region.stats();
+    // Exactly one ring is Live — its lost_count equals the subscriber's.
+    uint64_t ring_lost = 0;
+    for (auto const& r : s.rings)
+    {
+        ring_lost += r.lost_count;
+    }
+    EXPECT_EQ(ring_lost, sub.lost());
+    EXPECT_EQ(s.total_losses, sub.lost());
+}
+
+TEST_F(RegionTest, StatsPoolFreeTracksAllocations)
+{
+    auto cfg = default_cfg();
+    auto region = kickmsg::SharedRegion::create(
+        SHM_NAME, kickmsg::channel::PubSub, cfg, "stats");
+
+    kickmsg::Subscriber sub(region);
+    kickmsg::Publisher pub(region);
+
+    // Hold a slot mid-publish (allocate without publish).
+    auto* ptr = pub.allocate(8);
+    ASSERT_NE(ptr, nullptr);
+
+    auto s = region.stats();
+    // One slot is popped from the free stack and not yet returned.
+    EXPECT_EQ(s.pool_free, cfg.pool_size - 1);
+}
diff --git a/tests/unit/registry-t.cc b/tests/unit/registry-t.cc
new file mode 100644
index 0000000..0bd5389
--- /dev/null
+++ b/tests/unit/registry-t.cc
@@ -0,0 +1,401 @@
+#include
+#include
+#include
+
+#include
+
+#include "kickmsg/Node.h"
+#include "kickmsg/Registry.h"
+
+class RegistryTest : public ::testing::Test
+{
+protected:
+    static constexpr char const* KMSG_NAMESPACE = "kickmsg_regtest";
+
+    void SetUp() override
+    {
+        kickmsg::Registry::unlink(KMSG_NAMESPACE);
+    }
+
+    void TearDown() override
+    {
+        kickmsg::Registry::unlink(KMSG_NAMESPACE);
+        for (auto const& name : shm_to_unlink_)
+        {
+            kickmsg::SharedMemory::unlink(name);
+        }
+    }
+
+    void track(std::string name)
+    {
+        shm_to_unlink_.push_back(std::move(name));
+    }
+
+private:
+    std::vector shm_to_unlink_;
+};
+
+TEST_F(RegistryTest, OpenOrCreateIsIdempotent)
+{
+    auto r1 = kickmsg::Registry::open_or_create(KMSG_NAMESPACE);
+    EXPECT_EQ(r1.name(), std::string{"/"} + KMSG_NAMESPACE + "_registry");
+
+    // Second call opens the existing region — same name, same capacity.
+    auto r2 = kickmsg::Registry::open_or_create(KMSG_NAMESPACE);
+    EXPECT_EQ(r1.capacity(), r2.capacity());
+    EXPECT_EQ(r1.name(), r2.name());
+}
+
+TEST_F(RegistryTest, RegisterAndSnapshotRoundTrip)
+{
+    auto reg = kickmsg::Registry::open_or_create(KMSG_NAMESPACE);
+
+    uint32_t s1 = reg.register_participant(
+        "/test_topic_a", "/topic_a", kickmsg::channel::PubSub,
+        kickmsg::registry::Pubsub, kickmsg::registry::Publisher, "node_alpha");
+    ASSERT_NE(s1, kickmsg::INVALID_SLOT);
+
+    uint32_t s2 = reg.register_participant(
+        "/test_topic_a", "/topic_a", kickmsg::channel::PubSub,
+        kickmsg::registry::Pubsub, kickmsg::registry::Subscriber, "node_beta");
+    ASSERT_NE(s2, kickmsg::INVALID_SLOT);
+    EXPECT_NE(s1, s2);
+
+    auto snap = reg.snapshot();
+    ASSERT_EQ(snap.size(), 2u);
+
+    // Collect into a set so we don't depend on iteration order.
+    std::unordered_set roles_by_node;
+    for (auto const& p : snap)
+    {
+        EXPECT_EQ(p.shm_name, "/test_topic_a");
+        EXPECT_EQ(p.channel_type, kickmsg::channel::PubSub);
+        roles_by_node.insert(p.node_name + ":" + std::to_string(p.role));
+    }
+    EXPECT_TRUE(roles_by_node.count("node_alpha:1")); // Publisher = 1
+    EXPECT_TRUE(roles_by_node.count("node_beta:2")); // Subscriber = 2
+
+    reg.deregister(s1);
+    auto after = reg.snapshot();
+    ASSERT_EQ(after.size(), 1u);
+    EXPECT_EQ(after[0].node_name, "node_beta");
+}
+
+TEST_F(RegistryTest, DeregisterInvalidSlotIsNoop)
+{
+    auto reg = kickmsg::Registry::open_or_create(KMSG_NAMESPACE);
+    // Should not crash or throw.
+    reg.deregister(kickmsg::INVALID_SLOT);
+    reg.deregister(99999); // Past capacity — silently ignored.
+    EXPECT_EQ(reg.snapshot().size(), 0u);
+}
+
+TEST_F(RegistryTest, CapacityExhaustionReturnsInvalidSlot)
+{
+    // Small capacity so we can fill it quickly.
+    constexpr uint32_t CAP = 4;
+    auto reg = kickmsg::Registry::open_or_create(KMSG_NAMESPACE, CAP);
+    EXPECT_EQ(reg.capacity(), CAP);
+
+    std::vector slots;
+    for (uint32_t i = 0; i < CAP; ++i)
+    {
+        auto topic = "/topic_" + std::to_string(i);
+        uint32_t s = reg.register_participant(
+            "/test_topic_" + std::to_string(i), topic,
+            kickmsg::channel::PubSub, kickmsg::registry::Pubsub,
+            kickmsg::registry::Publisher, "node");
+        ASSERT_NE(s, kickmsg::INVALID_SLOT);
+        slots.push_back(s);
+    }
+
+    // One more push tips it over.
+    uint32_t full = reg.register_participant(
+        "/overflow", "/overflow", kickmsg::channel::PubSub,
+        kickmsg::registry::Pubsub, kickmsg::registry::Publisher, "node");
+    EXPECT_EQ(full, kickmsg::INVALID_SLOT);
+
+    // Free a slot and try again — should succeed.
+    reg.deregister(slots[0]);
+    uint32_t reclaimed = reg.register_participant(
+        "/after_free", "/after_free", kickmsg::channel::PubSub,
+        kickmsg::registry::Pubsub, kickmsg::registry::Subscriber, "node2");
+    EXPECT_NE(reclaimed, kickmsg::INVALID_SLOT);
+}
+
+TEST_F(RegistryTest, VersionMismatchOnSmallerExistingRegionThrows)
+{
+    // Validate the open path still works when the region already exists:
+    // open_or_create should happily attach to an existing compatible
+    // region of a different capacity (capacity is only used on create).
+    auto created = kickmsg::Registry::open_or_create(KMSG_NAMESPACE, 8);
+    EXPECT_EQ(created.capacity(), 8u);
+
+    auto opened = kickmsg::Registry::open_or_create(KMSG_NAMESPACE, 1024);
+    // Capacity from the existing region, not the requested one.
+    EXPECT_EQ(opened.capacity(), 8u);
+}
+
+TEST_F(RegistryTest, SweepStaleRemovesDeadPidEntries)
+{
+    auto reg = kickmsg::Registry::open_or_create(KMSG_NAMESPACE);
+
+    // Live entry — current process pid.
+    uint32_t alive = reg.register_participant(
+        "/live_topic", "/live_topic", kickmsg::channel::PubSub,
+        kickmsg::registry::Pubsub, kickmsg::registry::Publisher, "alive");
+    ASSERT_NE(alive, kickmsg::INVALID_SLOT);
+
+    // Live entry for this process (via another participant).
+    reg.register_participant(
+        "/live_topic2", "/live_topic2", kickmsg::channel::PubSub,
+        kickmsg::registry::Pubsub, kickmsg::registry::Subscriber, "alive2");
+
+    EXPECT_EQ(reg.snapshot().size(), 2u);
+
+    // No sweep needed yet — both pids alive.
+    EXPECT_EQ(reg.sweep_stale(), 0u);
+    EXPECT_EQ(reg.snapshot().size(), 2u);
+}
+
+TEST_F(RegistryTest, SweepStaleReclaimsWedgedClaimingSlot)
+{
+    // A registrant that dies between the Free→Claiming CAS and the
+    // release-store of Active leaves the slot stuck. sweep_stale must
+    // reclaim it — otherwise the registry leaks capacity on every such
+    // crash. Simulate by reaching into the raw SHM and patching a slot.
+    auto reg = kickmsg::Registry::open_or_create(KMSG_NAMESPACE);
+
+    // Fill slot 0 with a legitimate entry.
+    ASSERT_NE(reg.register_participant(
+                  "/keeper", "/keeper", kickmsg::channel::PubSub,
+                  kickmsg::registry::Pubsub, kickmsg::registry::Publisher, "keeper"),
+              kickmsg::INVALID_SLOT);
+
+    // Open the registry SHM directly to install a wedged Claiming slot.
+    auto shm_name = std::string{"/"} + KMSG_NAMESPACE + "_registry";
+    kickmsg::SharedMemory raw;
+    raw.open(shm_name);
+    auto* entries = reinterpret_cast(
+        static_cast(raw.address()) + sizeof(kickmsg::RegistryHeader));
+
+    constexpr uint32_t wedge_slot = 5;
+    ASSERT_EQ(entries[wedge_slot].state.load(), kickmsg::registry::Free);
+
+    entries[wedge_slot].pid = 0x7fffffff; // guaranteed-dead PID
+    entries[wedge_slot].state.store(kickmsg::registry::Claiming,
+                                    std::memory_order_release);
+
+    // Sweep should reclaim the wedged Claiming slot.
+    EXPECT_EQ(reg.sweep_stale(), 1u);
+    EXPECT_EQ(entries[wedge_slot].state.load(), kickmsg::registry::Free);
+}
+
+TEST_F(RegistryTest, SweepStaleSkipsClaimingSlotsWithoutPid)
+{
+    // A Claiming slot with pid==0 may be a registrant between CAS and its
+    // first field write. Reclaiming would race with its stores. Must skip.
+    auto reg = kickmsg::Registry::open_or_create(KMSG_NAMESPACE);
+
+    auto shm_name = std::string{"/"} + KMSG_NAMESPACE + "_registry";
+    kickmsg::SharedMemory raw;
+    raw.open(shm_name);
+    auto* entries = reinterpret_cast(
+        static_cast(raw.address()) + sizeof(kickmsg::RegistryHeader));
+
+    constexpr uint32_t wedge_slot = 3;
+    entries[wedge_slot].pid = 0;
+    entries[wedge_slot].state.store(kickmsg::registry::Claiming,
+                                    std::memory_order_release);
+
+    EXPECT_EQ(reg.sweep_stale(), 0u);
+    EXPECT_EQ(entries[wedge_slot].state.load(), kickmsg::registry::Claiming);
+
+    // Put the slot back so cleanup doesn't trip over it.
+    entries[wedge_slot].state.store(kickmsg::registry::Free,
+                                    std::memory_order_release);
+}
+
+// -----------------------------------------------------------------------------
+// Node integration — Node advertise/subscribe/etc should populate the registry
+// -----------------------------------------------------------------------------
+
+TEST_F(RegistryTest, NodeAdvertiseRegistersPublisher)
+{
+    kickmsg::channel::Config cfg;
+    cfg.max_subscribers = 2;
+    cfg.sub_ring_capacity = 4;
+    cfg.pool_size = 8;
+    cfg.max_payload_size = 32;
+
+    {
+        kickmsg::Node n("pub_node", KMSG_NAMESPACE);
+        auto pub = n.advertise("topicX", cfg);
+        track("/" + std::string{KMSG_NAMESPACE} + "_topicX");
+
+        auto reg = kickmsg::Registry::open_or_create(KMSG_NAMESPACE);
+        auto snap = reg.snapshot();
+        ASSERT_EQ(snap.size(), 1u);
+        EXPECT_EQ(snap[0].node_name, "pub_node");
+        EXPECT_EQ(snap[0].role, kickmsg::registry::Publisher);
+        EXPECT_EQ(snap[0].shm_name,
+                  std::string{"/"} + KMSG_NAMESPACE + "_topicX");
+    }
+
+    // Node went out of scope — entry should be gone.
+    auto reg = kickmsg::Registry::open_or_create(KMSG_NAMESPACE);
+    EXPECT_EQ(reg.snapshot().size(), 0u);
+}
+
+TEST_F(RegistryTest, NodeBroadcastRegistersBoth)
+{
+    kickmsg::channel::Config cfg;
+    cfg.max_subscribers = 2;
+    cfg.sub_ring_capacity = 4;
+    cfg.pool_size = 8;
+    cfg.max_payload_size = 32;
+
+    kickmsg::Node n("bcast_node", KMSG_NAMESPACE);
+    auto bh = n.join_broadcast("chanX", cfg);
+    track("/" + std::string{KMSG_NAMESPACE} + "_broadcast_chanX");
+
+    auto reg = kickmsg::Registry::open_or_create(KMSG_NAMESPACE);
+    auto snap = reg.snapshot();
+    ASSERT_EQ(snap.size(), 1u);
+    EXPECT_EQ(snap[0].role, kickmsg::registry::Both);
+    EXPECT_EQ(snap[0].channel_type, kickmsg::channel::Broadcast);
+}
+
+TEST_F(RegistryTest, NodeAdvertiseThenSubscribeUpgradesToBoth)
+{
+    // A Node that both advertises and subscribes to the same topic should
+    // appear once in the registry with role=Both (not two entries).
+    kickmsg::channel::Config cfg;
+    cfg.max_subscribers = 2;
+    cfg.sub_ring_capacity = 4;
+    cfg.pool_size = 8;
+    cfg.max_payload_size = 32;
+
+    kickmsg::Node n("dual_node", KMSG_NAMESPACE);
+    auto pub = n.advertise("dualtopic", cfg);
+    auto sub = n.subscribe("dualtopic");
+    track("/" + std::string{KMSG_NAMESPACE} + "_dualtopic");
+
+    auto reg = kickmsg::Registry::open_or_create(KMSG_NAMESPACE);
+    auto snap = reg.snapshot();
+    ASSERT_EQ(snap.size(), 1u);
+    EXPECT_EQ(snap[0].role, kickmsg::registry::Both);
+    EXPECT_EQ(snap[0].node_name, "dual_node");
+}
+
+TEST_F(RegistryTest, MultipleNodesEachAppearOnce)
+{
+    kickmsg::channel::Config cfg;
+    cfg.max_subscribers = 4;
+    cfg.sub_ring_capacity = 4;
+    cfg.pool_size = 8;
+    cfg.max_payload_size = 32;
+
+    kickmsg::Node pub("pub_a", KMSG_NAMESPACE);
+    auto p = pub.advertise("shared", cfg);
+    track("/" + std::string{KMSG_NAMESPACE} + "_shared");
+
+    kickmsg::Node s1("sub_a", KMSG_NAMESPACE);
+    auto s1_h = s1.subscribe("shared");
+    kickmsg::Node s2("sub_b", KMSG_NAMESPACE);
+    auto s2_h = s2.subscribe("shared");
+
+    auto reg = kickmsg::Registry::open_or_create(KMSG_NAMESPACE);
+    auto snap = reg.snapshot();
+    EXPECT_EQ(snap.size(), 3u);
+
+    std::unordered_set nodes;
+    for (auto const& part : snap)
+    {
+        nodes.insert(part.node_name);
+    }
+    EXPECT_TRUE(nodes.count("pub_a"));
+    EXPECT_TRUE(nodes.count("sub_a"));
+    EXPECT_TRUE(nodes.count("sub_b"));
+}
+
+// -----------------------------------------------------------------------------
+// list_topics — topic-centric aggregation
+// -----------------------------------------------------------------------------
+
+TEST_F(RegistryTest, ListTopicsGroupsByShmName)
+{
+    kickmsg::channel::Config cfg;
+    cfg.max_subscribers = 4;
+    cfg.sub_ring_capacity = 4;
+    cfg.pool_size = 8;
+    cfg.max_payload_size = 32;
+
+    kickmsg::Node pub("pub_a", KMSG_NAMESPACE);
+    auto p = pub.advertise("telemetry", cfg);
+    track("/" + std::string{KMSG_NAMESPACE} + "_telemetry");
+
+    kickmsg::Node s1("sub_a", KMSG_NAMESPACE);
+    auto s1_h = s1.subscribe("telemetry");
+    kickmsg::Node s2("sub_b", KMSG_NAMESPACE);
+    auto s2_h = s2.subscribe("telemetry");
+
+    auto reg = kickmsg::Registry::open_or_create(KMSG_NAMESPACE);
+    auto topics = reg.list_topics();
+
+    ASSERT_EQ(topics.size(), 1u);
+    auto const& t = topics[0];
+    EXPECT_EQ(t.shm_name, std::string{"/"} + KMSG_NAMESPACE + "_telemetry");
+    EXPECT_EQ(t.channel_type, kickmsg::channel::PubSub);
+    EXPECT_EQ(t.producers.size(), 1u);
+    EXPECT_EQ(t.consumers.size(), 2u);
+    EXPECT_EQ(t.stall_producers.size(), 0u);
+    EXPECT_EQ(t.stall_consumers.size(), 0u);
+    EXPECT_EQ(t.producers[0].node_name, "pub_a");
+}
+
+TEST_F(RegistryTest, ListTopicsBroadcastRoleBothInEveryLane)
+{
+    kickmsg::channel::Config cfg;
+    cfg.max_subscribers = 4;
+    cfg.sub_ring_capacity = 4;
+    cfg.pool_size = 8;
+    cfg.max_payload_size = 32;
+
+    kickmsg::Node node("bcast", KMSG_NAMESPACE);
+    auto bh = node.join_broadcast("events", cfg);
+    track("/" + std::string{KMSG_NAMESPACE} + "_broadcast_events");
+
+    auto reg = kickmsg::Registry::open_or_create(KMSG_NAMESPACE);
+    auto topics = reg.list_topics();
+
+    ASSERT_EQ(topics.size(), 1u);
+    // A Both role counts as one producer AND one consumer.
+    EXPECT_EQ(topics[0].producers.size(), 1u);
+    EXPECT_EQ(topics[0].consumers.size(), 1u);
+    EXPECT_EQ(topics[0].producers[0].pid, topics[0].consumers[0].pid);
+}
+
+TEST_F(RegistryTest, ListTopicsSortedByShmName)
+{
+    kickmsg::channel::Config cfg;
+    cfg.max_subscribers = 2;
+    cfg.sub_ring_capacity = 4;
+    cfg.pool_size = 8;
+    cfg.max_payload_size = 32;
+
+    kickmsg::Node node("n", KMSG_NAMESPACE);
+    auto pc = node.advertise("c_topic", cfg);
+    track("/" + std::string{KMSG_NAMESPACE} + "_c_topic");
+    auto pa = node.advertise("a_topic", cfg);
+    track("/" + std::string{KMSG_NAMESPACE} + "_a_topic");
+    auto pb = node.advertise("b_topic", cfg);
+    track("/" + std::string{KMSG_NAMESPACE} + "_b_topic");
+
+    auto reg = kickmsg::Registry::open_or_create(KMSG_NAMESPACE);
+    auto topics = reg.list_topics();
+
+    ASSERT_EQ(topics.size(), 3u);
+    EXPECT_LT(topics[0].shm_name, topics[1].shm_name);
+    EXPECT_LT(topics[1].shm_name, topics[2].shm_name);
+}
diff --git a/tests/unit/subscriber-t.cc b/tests/unit/subscriber-t.cc
index 2634a1a..bb46de6 100644
--- a/tests/unit/subscriber-t.cc
+++ b/tests/unit/subscriber-t.cc
@@ -168,9 +168,9 @@ TEST_F(SubscriberTest, BlockingReceiveTimesOut)
     auto region = kickmsg::SharedRegion::create(SHM_NAME, kickmsg::channel::PubSub, cfg);
     kickmsg::Subscriber sub(region);
-    nanoseconds start = kickmsg::since_epoch();
+    nanoseconds start = kickmsg::monotonic_ns();
     auto sample = sub.receive(milliseconds{50});
-    nanoseconds elapsed = kickmsg::since_epoch() - start;
+    nanoseconds elapsed = kickmsg::monotonic_ns() - start;
     EXPECT_FALSE(sample.has_value());
     EXPECT_GE(elapsed, milliseconds{40});
@@ -442,7 +442,7 @@ TEST_F(SubscriberTest, SlowPublisherNoCorruption)
         // Wait for signal to start publishing
         while (not pub_start.load(std::memory_order_acquire))
         {
-            kickmsg::sleep(0ns);
+            kickmsg::yield();
         }
         // Publish several messages slowly (each takes > commit_timeout)