Skip to content

Port :pg to Elixir, add cluster-replicated metadata, expose Registry-shaped API#8

Merged
twinn merged 13 commits intomainfrom
pg-elixir-port
Apr 8, 2026
Merged

Port :pg to Elixir, add cluster-replicated metadata, expose Registry-shaped API#8
twinn merged 13 commits intomainfrom
pg-elixir-port

Conversation

@twinn
Copy link
Copy Markdown
Owner

@twinn twinn commented Apr 7, 2026

Summary

This branch turns PgRegistry into a self-contained, distributed, metadata-aware process registry. Key changes:

  1. PgRegistry.Pg — an Elixir port of OTP-27 pg.erl as a single GenServer per scope, owning an ETS :duplicate_bag of {key, pid, meta, tag} rows. The tag is an internal per-entry identity that preserves :pg's ref-counted multi-join semantics in the flat-row layout and makes cross-node leaves precise.

  2. Per-member metadata — entries carry values that are gossiped between nodes. Pg.join/4, Pg.update_meta/4, a new :update subscription verb, and a {:update_meta, ...} wire message.

  3. Full Registry-shaped API on PgRegistry:

    • Registering: register/3, unregister/2, register_name/2 (incl. 3-tuple via name), unregister_name/1
    • Reading: lookup/2, lookup_local/2 (via Pg), get_members/2, get_local_members/2, keys/2, count/1, values/3, which_groups/1
    • Match-spec queries: match/3, match/4, count_match/3,4, select/2, count_select/2, unregister_match/3,4 — all run as native ETS queries with user-supplied {key, pid, value} shape transparently rewritten to operate on the 4-tuple storage
    • Updating: update_value/3 (self) and update_value/4 (any pid)
    • Registry-level metadata: meta/2, put_meta/3, delete_meta/2 (local-only, sibling ETS table)
    • Dispatch: dispatch/3, dispatch/4
    • Subscriptions: monitor_scope/1, monitor/2, demonitor/2 — runtime, ref-based, non-blocking
  4. Listeners — Registry-shaped {:register, scope, key, pid, value} / {:unregister, scope, key, pid} messages to statically-configured registered names. Addressed by atom, silently drops missing listeners (never crashes the scope).

  5. Per-node :unique mode — each node enforces local uniqueness; the cluster as a whole can still have one holder per node. Useful for "one singleton per node" patterns. Deliberately NOT cluster-wide unique (use :global for that). Uses Process.alive?/1 to ignore dead holders and close the DOWN race.

  6. Registry-shaped keyword start_link/1PgRegistry.start_link(name: ..., listeners: ..., keys: :unique). Validates :keys, :partitions, :listeners, and unknown options in both the keyword form and the 2-arity start_link/2 with the same diagnostics.

  7. Wire-protocol version handshake@protocol_version 1 threaded through every discover message. Peers with mismatched versions refuse to peer with a Logger.warning instead of silently corrupting state. Bumping the constant on any future wire change is the migration story for rolling upgrades.

  8. Non-blocking monitor_scope/1 — the snapshot fold runs in the calling process via direct ETS reads, not inside the GenServer. Documented overlap window between subscription and snapshot completion.

Design decisions worth calling out

  • :set:duplicate_bag with per-entry monotonic tag. One row per entry makes match-specs run natively against ETS; the tag gives every entry its own identity even when (key, pid, meta) would otherwise duplicate.
  • Listeners are local-only. Each node fires for both locally-originated and gossiped events. No replication of listener configuration.
  • Scope-level metadata is local-only. Same reasoning — Registry's meta isn't clustered either.
  • No lock/3, no cluster-wide unique keys, no partitions > 1. Each is a deliberate "this would require consensus or complicated trade-offs that don't fit the gossip model" decision. :partitions and :keys validate loudly rather than silently shipping surprising behavior.
  • Sync-driven :update notifications are deliberately unimplemented. ETS state converges correctly across netsplits; only the notification stream during convergence is incomplete. Re-implementing would require entry-level identifiers on the wire and the multi-join diff is ambiguous. There's a long source comment on sync_one_group/4 explaining the reasoning.

Test plan

  • mix test137 tests, 0 failures
  • mix credo — clean
  • mix format --check-formatted — passes (styler plugin enabled)
  • mix docs — builds without warnings
  • CI green on Elixir 1.15/OTP 26, 1.17/OTP 27, 1.19/OTP 28
  • Single-node: 42 tests for Pg (join/leave/update_meta/monitor/listeners/unique mode/down races/etc), 79 for PgRegistry (Registry-shaped API surface)
  • Cluster: 7 tests via :peer + :erpc covering join/leave visibility, remote DOWN, remote process exit, monitor_scope across nodes, cross-node update_meta, remote join with metadata, per-node unique mode under remote DOWN
  • Listener delivery covered for join, leave, update (no listener message), fan-out, process-exit cleanup, missing-listener resilience, crashed-listener resilience
  • Wire protocol version handshake reject paths tested for both 2-arity (legacy) and 3-arity version-mismatch discovers
  • Option validation tested for both the keyword start_link/1 form and the 2-arity start_link/2 form

What's NOT in this PR

  • lock/3 — explicitly skipped
  • Cluster-wide unique keys — would require consensus, deliberately not attempted (see the README's "Per-node uniqueness" section)
  • Partitions — we use a single ETS table per scope; partitions > 1 raises
  • Sync-driven :update notifications — see design decisions above
  • flush/1 BEAM receive optimization — open investigation item; no action without benchmarks

Migration note

PgRegistry no longer uses Erlang :pg as its backend. Anyone reaching past the public API into :pg.get_local_members(scope, key) directly must switch to PgRegistry.Pg.get_local_members/2. The two storage worlds are now disjoint — a scope started by PgRegistry.start_link/1 lives in PgRegistry.Pg's ETS table, not in :pg's.

Version is bumped to 0.3.0. See CHANGELOG.md for the full list of new features, behaviour changes, and bug fixes.

twinn added 13 commits April 7, 2026 13:32
Reproduces OTP 27 pg.erl as a self-contained Elixir GenServer so we
can extend it with metadata support without forking OTP. Wire format
is byte-compatible with :pg, so a node running PgRegistry.Pg can mesh
with a node running :pg under the same scope name.

Tests cover both single-node behavior and a multi-node setup using
:peer + :erpc, with a small helper module under test/support so its
beam is loadable on the peer node.
Extends the :pg port with cluster-replicated metadata. Each entry in
the ETS table is now a {pid, meta} pair instead of a bare pid; meta
defaults to nil so the existing API is unchanged. New surface:

  Pg.join(scope, group, pid_or_pids, meta)
  Pg.lookup(scope, group)         -> [{pid, meta}]
  Pg.lookup_local(scope, group)   -> [{pid, meta}]
  Pg.update_meta(scope, group, pid, new_meta)

Subscription messages also carry entries now, plus a new verb:

  {ref, :join,   group, [{pid, meta}]}
  {ref, :leave,  group, [{pid, meta}]}
  {ref, :update, group, [{pid, old_meta, new_meta}]}

The wire protocol gossips entries between scope GenServers and a new
{:update_meta, ...} message lets remote nodes apply metadata changes
without leave+rejoin. Sync-driven updates after a netsplit do not yet
emit :update notifications (the diff is ambiguous for multi-join);
state still converges correctly, only notifications during recovery
are imperfect.

Adds credo and styler as dev tooling. Both pass clean.
PgRegistry.Pg now stores one ETS row per entry instead of aggregating
entries into a list per group. Each row is {key, pid, meta, tag} where
tag is an opaque per-node monotonic integer that gives every entry its
own identity. Tags are internal — callers never see them — but they let
ref-counted multi-join semantics survive a flat-row layout, and they
make cross-node leaves precise (each leave names a specific entry).

Wire protocol now carries tags in join/sync messages and uses
{group, pid, tag} triples in leave messages. The 71 existing tests
all pass against the new layout, including the 6 cluster tests.

PgRegistry now uses PgRegistry.Pg as its backend and exposes the full
Registry-shaped API: register/3, unregister/2, values/3, match/3,
match/4, count_match/3, count_match/4, select/2, count_select/2,
unregister_match/3, unregister_match/4, dispatch/4, meta/2, put_meta/3,
delete_meta/2. The match-spec functions translate user-supplied
{key, pid, value} match-specs to the 4-tuple storage shape by
appending :_ for the tag, and run natively against ETS.

Scope-level metadata (meta/2 etc) is local-only and lives in a sibling
ETS table named :"#{scope}_meta", created and torn down by Pg's init
and terminate. Reads bypass the GenServer; writes go through it.

86 tests passing, credo clean. lock/3 is intentionally not implemented.
Listeners are configured at scope start-up via `listeners: [Atom]` and
receive raw Registry-shaped messages on register/unregister:

  {:register,   scope, key, pid, value}
  {:unregister, scope, key, pid}

They fire for both locally-originated and gossiped events, and on
local DOWN, remote leave, and remote scope crash. Updates do not
fire listener messages (matching Registry).

Three new sugar surfaces for users porting from Registry:

  PgRegistry.update_value(scope, key, value)         # self() implicit
  PgRegistry.start_link(name: :reg, listeners: [L])  # keyword shape
  {PgRegistry, name: :reg, listeners: [L]}           # child_spec keyword

The keyword form validates :keys and :partitions instead of silently
accepting anything: :duplicate / 1 are no-ops, :unique and partitions > 1
raise ArgumentError with multi-line messages explaining why and
pointing at :global / scope-splitting respectively. The intent is
that users porting from Registry hit a load-bearing error rather
than discovering the semantic gap in production.

100 tests passing, credo clean.
The test process and the local scope GenServer were both monitoring
the remote scope pid independently. After killing the remote scope,
the test waited only for its own DOWN, then called :sys.get_state on
the local scope. But the local scope's DOWN may still be in flight on
the dist channel — :sys.get_state only drains messages already in the
mailbox, not in-flight ones, so the assertion that members are gone
could fire before the local DOWN handler had run.

Fix: subscribe to monitor_scope before killing, then assert_receive
on the leave notification. The notification is fired from inside the
DOWN handler, so receiving it proves the ETS row has been removed.
Elixir 1.17's formatter rewrites the inline `receive do: (pattern -> body)`
shorthand to a broken `receive do: ->(pattern, body)` shape that fails
to parse. The multi-line form formats consistently across versions.
Covers all the new surface: register/3, lookup, match/select, listeners,
scope metadata, the {key, pid, value, tag} storage layout, and the
deliberate gaps (no :unique, no partitions, no sync-driven :update
notifications) with the reasoning behind each.
PgRegistry.Pg now accepts keys: :unique. Each node enforces local
uniqueness — a second join under the same key returns
{:error, {:already_registered, pid}} — but the cluster as a whole
can still have one holder per node, with no consensus or locking.

Pg.join's return shape grows: :ok | {:error, {:already_registered, pid}}.
Multi-pid joins raise ArgumentError in unique mode (the API check
runs in the public Pg.join wrapper so callers see a proper raise
instead of a gen_server exit).

PgRegistry.register/3 propagates the error tuple. PgRegistry.register_name/2
returns :no on collision so the :via tuple integrates correctly with
GenServer.start_link, surfacing {:error, {:already_started, pid}}
naturally. parse_keyword_opts now accepts keys: :unique and threads
it through.

README documents the per-node-only semantics with explicit pointers
to :global for the cluster-wide-unique case and to leader-election
libraries for the singleton case.

13 new tests covering the Pg layer, the PgRegistry layer, the via
tuple integration, and the cluster-level "two nodes, two holders"
property. 112 tests passing, credo clean.
1. unregister_match used to delete the LIFO head of self()'s entries
   under the key, not the entry that matched. Concrete: join :a then
   :b, unregister_match :a → :b was deleted, :a remained.

   Fix: add Pg.unregister_match/5 as a gen-server-side primitive that
   builds an ETS match-spec returning the full row + tag for each
   match, then deletes by exact tag and updates state.local entry-by-
   entry. PgRegistry.unregister_match becomes a thin wrapper.

2. count_select stripped the user's body and substituted [true],
   silently ignoring any filter the user expressed in the body
   (e.g. body returning false → count was wrong). Fix: pass through
   the rewritten spec as-is so :ets.select_count respects the user's
   body.

3. count/1 walked which_groups + lookup per key, O(N×M) over the whole
   table. Replaced with :ets.info(scope, :size) — O(1).

4. notify_listeners called send/2 with a registered name as
   destination. If the name wasn't currently bound — listener crashed,
   not yet started, mid-restart — send/2 raised ArgumentError inside
   the gen_server, killing the entire scope. Fix: deliver_listener/2
   uses Process.whereis first and silently drops missing listeners,
   matching what Elixir's Registry does internally.

5. The wire format had no version negotiation. Rolling deploys
   between this branch and any older version (or any future
   incompatible version) would corrupt state silently. Added
   @protocol_version 1, threaded through every discover message,
   reject 2-arity (legacy) discovers and version-mismatched discovers
   with a Logger.warning. Bumping the constant on any future wire
   change is now the migration story.

Eight new tests cover all five fixes including the LIFO ordering bug,
the count_select [false] case, the missing-listener crash path, the
crashed-listener path, and both protocol-version reject paths.

120 tests passing, credo clean.
A. down_local_pid no longer fabricates a {pid, nil} leave notification
   when the ETS row is missing during cleanup. The row being gone
   means some other path already removed it (and notified), so we
   skip the entry instead of synthesizing a fake meta.

B. keys: :unique check now ignores dead local holders. Closes the
   race where a :DOWN for the previous holder is queued in the
   GenServer mailbox but hasn't been processed yet — without this,
   a subsequent register would spuriously fail with
   {:already_registered, dead_pid}.

C. do_unregister no longer uses && — both branches of Pg.leave's
   return are truthy, so the && was decorative and confusing.

D. Pg.leave with an empty pid list now returns :ok instead of
   :not_joined. Added a head clause for the empty case.

L. Option validation now runs in the 2-arity start_link/2 form too,
   not just the keyword form. parse_keyword_opts and the 2-arity
   path both call a shared validate_pg_opts! that rejects unknown
   options, validates :keys, :partitions, and :listeners with the
   same diagnostics. Refactored into per-concern helpers
   (validate_known_opts!, validate_keys!, validate_partitions!,
   validate_listeners!) to keep cyclomatic complexity reasonable.

H. PgRegistry now exposes monitor_scope/1, monitor/2, and
   demonitor/2 as defdelegates. Users no longer have to drop
   into PgRegistry.Pg for runtime subscriptions.

Bumped to 0.3.0 with a CHANGELOG entry covering the metadata
extension, the Registry-shaped API, listeners, per-node :unique
mode, the wire-protocol break, and the migration story.

Eight new tests for the fixes (including a deterministic test for
the dead-holder race that uses :sys.replace_state to inject the
stale row, and a test that proves down_local_pid no longer fires
spurious notifications). 129 tests passing, credo clean.
unmangle pipes, doc clarifications, cluster test for unique-mode
remote DOWN

Functional changes:

- pop_first_group/2 is now tail-recursive with an accumulator and
  Enum.reverse(acc, tail). For pids joined to many groups, popping
  one entry no longer rebuilds the entire prefix body-recursively.

- dispatch/4 raises ArgumentError on any non-empty option list rather
  than silently ignoring them. parallel: true silently going serial
  was a footgun; if anyone passes an option, they'll find out
  immediately. Updated docstring to point at "spawn tasks from the
  callback yourself" as the parallelism story.

- which_groups/1 and all_local_entries/1 no longer have the
  styler-mangled "lambda |> :ets.foldl(acc, table)" shape. Both use
  named module-level helpers (which_groups_collect/2,
  collect_local_entry/2) and call :ets.foldl directly with positional
  args. Cleaner to read and stable across re-formats.

Doc clarifications (no behavior change):

- Pg.start_link/2: explicitly state listeners must be locally-
  registered names ({name, node()} is not supported), and that
  missing listener atoms are silently dropped.

- sync_one_group/4: long source comment explaining why no :update
  notifications fire during sync, with a pointer to the README's
  net-split section. Future contributors won't accidentally "fix"
  the deliberate gap.

- Pg.join/4: explicit "Failure modes" section listing both the
  {:error, {:already_registered, pid}} return for unique-mode
  collisions and the ArgumentError raise for multi-pid joins.

- PgRegistry.update_value/4: warning callout that all of pid's
  matching entries are collapsed to new_value, with a note that
  this is different from Registry.update_value/3 (which only exists
  in :unique mode).

- PgRegistry.register_name/2: explain that :no doesn't carry the
  holder pid, and tell direct callers to use whereis_name/1 to
  recover it.

- PgRegistry.whereis_name/1: explicit note about the asymmetry
  with the 3-tuple via name (value field is consumed only at
  registration, ignored on lookup).

- PgRegistry.match/3: note about the single-clause limitation,
  pointing at select/2 for multi-clause cases.

- README: comparison table is more honest about :global's
  net-split behavior (collisions resolved by user-supplied
  resolver, may kill processes) and PgRegistry's per-register cost
  (gossip per peer).

New cluster test: in unique mode, killing the holder on a peer
node propagates a leave to other nodes via DOWN, and the slot
becomes available locally. Subscribed via monitor_scope so the
assertion is deterministic.

133 tests passing, credo clean.
The snapshot fold used to run inside handle_call(:monitor, ...),
which blocked all other operations on the scope for the duration —
multi-millisecond pauses on scopes with thousands of entries, with
joins/leaves queueing behind it.

Split into a cheap subscribe call (Process.monitor + Map.put, runs
in the GenServer) and a snapshot read (the foldl, runs in the
caller's own process via direct ETS access). The GenServer is now
free for concurrent operations during the snapshot.

The trade-off is a race window: between the subscription being
established and the snapshot read finishing, concurrent activity
may result in entries appearing both in the snapshot and as
delivered events. Documented in the docstring; consumers should
absorb events idempotently (Map/MapSet keyed by (group, pid))
rather than relying on exactly-once delivery. State always
converges to correct.

Same fix applied to monitor/2 for the single-key case — its
snapshot is just an :ets.lookup so it's even cheaper, but the
structure now matches.

Four new tests covering: pre-subscribe entries appear in snapshot,
post-subscribe events fire, snapshot doesn't block concurrent
operations on a 2000-entry scope, snapshot is correct across many
keys. 137 tests passing, credo clean.
- README linked PgRegistry.Pg.sync_one_group/4 which is a private
  function, so mix docs warned on every build. Rewrote the prose to
  point at the source file directly and summarise the rationale
  (multi-join ambiguity) inline.

- mix.exs description still said "backed by Erlang's :pg module".
  That was true at the start of this branch but we've since moved to
  PgRegistry.Pg. Updated the blurb to match what actually ships in
  0.3.0 — distributed, metadata-aware, Registry-shaped API, listeners,
  ETS match-spec queries.

Version still 0.3.0. 137 tests, 0 failures.
@twinn twinn merged commit 57eee5b into main Apr 8, 2026
3 checks passed
@twinn twinn deleted the pg-elixir-port branch April 10, 2026 15:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant