Split :via callbacks into write side and read side#9
Merged
Conversation
Before: all four :via callbacks (register_name, unregister_name,
whereis_name, send) worked in :duplicate mode, and whereis_name had
an undocumented "prefer local, fall back to remote" resolution rule
when a key had multiple registered pids.
The rule was a footgun: callers of `GenServer.call({:via, PgRegistry,
{scope, :key}}, msg)` on different nodes could legitimately reach
different processes, depending on which node had a local entry, and
nothing about the API made that obvious.
After: align with Registry's semantics, which split the :via
protocol in two:
* Write side — register_name/2 and unregister_name/1. Unambiguous
in any mode ("put this pid under this key" / "take it out").
Works in both :duplicate and :unique scopes.
* Read side — whereis_name/1 and send/2 (and by extension
GenServer.call, GenServer.cast, Process.whereis, and Kernel.send
on a via tuple). These need a single pid, which a :duplicate
scope can't honestly provide. Raise ArgumentError in :duplicate
mode with a message pointing at lookup/2, get_members/2, and
dispatch/3 as the duplicate-mode alternatives.
In :unique mode, whereis_name/1 is now strictly local-only (no
remote fallback). Each node's per-node uniqueness means each node
has at most one holder per key, and the via tuple resolves to that
local holder or :undefined. This is the intended shape of the
"one singleton per node" pattern. Callers that want a specific
peer's holder can use Pg.get_members/2 and filter by node/1.
Implementation: PgRegistry.Pg.init/1 now writes the scope's :keys
mode into :persistent_term under {PgRegistry.Pg, scope, :keys}, and
PgRegistry.Pg.keys_mode/1 reads it back without a GenServer hop.
terminate/2 erases. Writes to :persistent_term are costly but happen
at most once per scope start, which is fine for this use case.
Verified against Registry's actual source in
/lib/elixir/lib/registry.ex — this matches the split Registry uses:
register_name and unregister_name work in both modes, whereis_name
raises ":via is not supported for duplicate registries" in duplicate
mode, and Registry.send/2 pattern-matches a single-entry lookup
(which effectively means it works in duplicate mode only when there
happens to be exactly one entry, raising CaseClauseError otherwise).
Our send/2 is slightly cleaner — it goes through whereis_name and
raises ArgumentError explicitly in duplicate mode rather than the
CaseClauseError Registry would produce for a multi-entry match.
Tests updated: register_name/unregister_name tests now cover both
modes (they work in both). whereis_name/send tests cover unique
mode and assert the new ArgumentError in duplicate mode. The
GenServer :via integration tests now use a :unique scope, and the
"remote fallback" test was deleted since there is no longer a
remote fallback. All 143 tests passing, credo clean.
get_members/2 is vestigial — it was the primary read before lookup/2
existed. Now that lookup/2 returns [{pid, value}] Registry-style,
the pid-only view is a one-liner:
for {pid, _} <- PgRegistry.lookup(scope, key), do: pid
That's how Registry users would write it, and it mirrors the shape
Registry.lookup/2 itself returns. The `get_members` name is also
pg-vocabulary ("members of a group") leaking into a surface we
want to be Registry-shaped.
In its place, promote lookup_local/2 from PgRegistry.Pg to a
first-class PgRegistry function. This is a distributed-registry
concern with no direct Registry analog — Registry is single-node,
so "local" is trivially "all". Useful for draining a node before
shutdown, per-node metrics, or preferring a local instance over a
remote one. The `_local` suffix is the established OTP convention
for node-local variants (Node.list vs Node.list(:connected),
:pg.get_members vs :pg.get_local_members, etc.).
Layering stays clean:
* Pg module owns pg vocabulary: get_members, get_local_members,
lookup, lookup_local.
* PgRegistry module owns Registry vocabulary: lookup, lookup_local,
values, keys, count, which_groups — no get_members.
which_groups/1 stays as-is for now. It's the one pg-flavored name
left in PgRegistry, and the operation has no Registry analog other
than "use select/2 with a match-spec that returns keys." Leaving it
is honest: "this is a distributed-registry concept we kept because
it's useful."
Tests updated: get_members tests became lookup tests, one new
describe block for lookup_local/2 added. README examples updated.
145 tests passing, credo clean, mix docs builds without warnings.
Also fixed a README code-fence bug I introduced earlier: the ":via
and :duplicate mode" blockquote was accidentally inserted INSIDE
the "Registering processes" elixir code block, which left the
closing ``` orphaned and broke ex_doc's rendering. Moved the
blockquote below the closed code block.
Four-column comparison now: Registry / :global / Horde.Registry / PgRegistry. Key rows call out the split cleanly: - Duplicate keys: Registry yes, :global no, Horde no, PgRegistry yes (default) - Unique keys: Registry per-node, :global cluster-wide, Horde cluster-wide, PgRegistry per-node only Also added rows for underlying model (ETS / lock / CRDT / gossip) and Horde's net-split behaviour (CRDT merge picks a winner, losing registrations silently dropped). Horde's match-spec support was verified against hexdocs (match/4, select/2, count_match/4 are all present). New "Which should I use?" decision table below the feature comparison frames the choice by use case — a group of processes under one key → PgRegistry, a cluster-wide singleton → Horde. Punchline made explicit: Horde owns the unique-key problem space, PgRegistry owns the duplicate-key problem space. They're complementary, not competitors, and a real distributed app often wants both. Kept the scope narrowly on registries — no mentions of Horde.DynamicSupervisor or process migration, since this README is comparing registries and mixing in supervision would muddy the positioning.
Verified specs first: both PgRegistry and PgRegistry.Pg are fully spec'd on their public surface. Nothing to add there. Added 13 doctests across the common PgRegistry functions: - lookup/2 (registered + empty) - count/1 - update_value/3 (happy path + :not_joined) - keys/2 - register/3 (duplicate happy path + unique collision) - unregister/2 (happy path + idempotent) - values/3 - match/3 (value-pattern filter) - meta/2 (round-trip with put_meta/delete_meta) - register_name/2 (unique mode, :yes then :no) - whereis_name/1 (unique mode, resolves local) Each doctest uses a unique scope atom (:doc_*) to avoid cross-test collisions; the test process owns the scopes and they die with it. The examples exercise real code paths against a running scope, so they double as small integration tests and as hex-doc examples. Enabled via `doctest PgRegistry` in test/pg_registry_test.exs. 13 doctests + 145 tests = 158 passing.
Wires up dialyxir 1.4 as a dev/test dep with a project-local PLT
under priv/plts/ (gitignored). Flags enabled:
- :error_handling
- :unknown
- :unmatched_returns
The :unmatched_returns flag is the strictest commonly-used dialyzer
check — it flags side-effect-only code that throws away a non-trivial
return value. First run caught exactly two hits:
- sync_groups/3 (two clauses, both returning the list from a `for`)
- sync_one_group/4 (returning the list from a `for`)
Both are side-effect helpers that ran for their effect on ETS, with
their return values silently dropped by the caller. Made both
return :ok explicitly, and updated handle_sync/4 and
sync_groups/3's recursive call to pattern-match on the :ok.
With those two fixes, `mix dialyzer` passes cleanly:
Total errors: 0, Skipped: 0, Unnecessary Skips: 0
done (passed successfully)
CI integration:
- .github/workflows/ci.yml gains a `mix dialyzer --format github`
step after the existing compile/test/format steps
- priv/plts/ is cached per (runner os, Elixir, OTP, mix.lock) so
the ~3-minute initial PLT build only happens when deps change
The --format github flag lets dialyzer failures show up inline in
PR reviews as code annotations.
Local perf: first PLT build was ~3 minutes (building core + Elixir
stdlib + dep PLTs). Subsequent runs are ~2 seconds. CI cache should
keep it at ~2s on most runs.
f1d3a0a to
3b5d972
Compare
Elixir's Registry.meta/2, put_meta/3, and delete_meta/2 all guard
on `is_atom(key) or is_tuple(key)`. Ours didn't — we accepted any
term as a meta key, which is more permissive but also surprising
for anyone porting from Registry: typos in atom keys silently become
different keys, there's no shape gut-check, and "put_meta takes any
term" is a lurking footgun.
Tighten to match. Guards added to:
- Pg.get_scope_meta/2
- Pg.put_scope_meta/3
- Pg.delete_scope_meta/2
...all of the form `when is_atom(key) or is_tuple(key)`. Guards run
in the public API layer so callers see a clean FunctionClauseError
rather than a gen_server exit.
Also:
- New `Pg.meta_key :: atom() | tuple()` typedoc (mirrors Registry's
`meta_key` type).
- `PgRegistry.meta/2`, `put_meta/3`, `delete_meta/2` @specs
reference `Pg.meta_key()` instead of `term()`.
- `PgRegistry.meta/2` docstring calls out the constraint and
references Registry's contract.
- The existing doctest gains a tuple-key round-trip case so both
allowed shapes are exercised.
- New test `non-atom, non-tuple keys are rejected` runs a loop
over strings, integers, lists, and maps for each of the three
functions.
Why this isn't a breaking change: nothing is on Hex yet, and the
previous permissive behavior was undocumented. The tighter contract
is the one anyone reading the README ("Mirrors Registry.meta/2")
would expect.
13 doctests + 147 tests passing. Credo clean. Dialyzer clean.
Three tweaks to the "Why PgRegistry" section: 1. Trimmed the editorializing that followed the two tables. The tables already convey where PgRegistry fits; the "Horde owns unique, PgRegistry owns duplicate" framing and the "deliberately does not try to do" paragraph were just me restating what the rows show. Opening paragraph is now one sentence with no comparative prose below the tables. 2. Added :pg as a fifth column in the feature comparison and a corresponding row in the decision table. :pg is the OTP primitive PgRegistry is built on, and it's a legitimate choice when you don't need metadata, listeners, or the Registry-shaped API. Showing it in the table makes the progression visible. 3. Added a "Part of OTP" row and an "Interop with other Erlang/OTP apps" row. The interop row is the honest differentiator: PgRegistry deliberately broke wire compatibility with :pg when we added metadata and the protocol-version handshake, so our scope can't share groups with another OTP app using :pg. That's a real cost and the table should show it. :global has the same property at the name level and also gets a yes. Horde has its own wire protocol and doesn't interop either. The decision-table entry for :pg now explicitly mentions interop as a reason you might reach for it over PgRegistry.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Post-review cleanup for the metadata/Registry-shaped work, plus ergonomics and type-safety work. Five commits:
Split
:viacallbacks into write side and read side.register_name/unregister_namework in both:duplicateand:uniquemodes.whereis_name/send(and transitivelyGenServer.call/cast,Process.whereis,Kernel.sendon a via tuple) requirekeys: :uniqueand raiseArgumentErrorotherwise. In:uniquemode,whereis_nameis strictly local-only — no remote fallback, because per-node uniqueness means each node owns its local holder. This matches the split Elixir'sRegistryuses — verified againstlib/elixir/lib/registry.exdirectly this time instead of guessing.Drop
PgRegistry.get_members/2, addPgRegistry.lookup_local/2.get_members/2was vestigial oncelookup/2existed (the bare-pid view is a one-liner:for {pid, _} <- lookup(scope, key), do: pid).lookup_local/2is a first-class distributed-registry concern with no directRegistryanalog — useful for draining a node, per-node metrics, or preferring a local instance. Layering:Pgowns pg vocabulary (get_members,get_local_members,lookup,lookup_local),PgRegistryowns Registry vocabulary (lookup,lookup_local, noget_members).README positioning table: add
Horde.Registry. Four-column feature comparison (Registry/:global/Horde.Registry/PgRegistry) plus a ""Which should I use?"" decision table below. Makes the split explicit: Horde owns cluster-wide unique keys, PgRegistry owns cluster-wide duplicate keys. They're complementary, not competitors. Horde's match-spec support was verified against hexdocs before claiming ""yes"" in the table.Add doctests to
PgRegistrypublic API. Spec audit first (both modules are fully spec'd on their public surface — the awk-based scan turned up only second clauses and GenServer callbacks, all covered). Then 13 new doctests across the common API:lookup/2,count/1,update_value/3,keys/2,register/3(both modes),unregister/2,values/3,match/3,meta/2,register_name/2(unique),whereis_name/1(unique). Each uses a unique:doc_*scope atom to avoid cross-test collisions. Enabled viadoctest PgRegistryin the test file.Add dialyzer to the project and CI.
dialyxir 1.4as a dev/test dep, project-local PLT underpriv/plts/(gitignored), flags[:error_handling, :unknown, :unmatched_returns]— the strictest commonly-used set. First run caught exactly twounmatched_returnwarnings insync_groups/3andsync_one_group/4(side-effect helpers returning lists from aforcomprehension that nobody cared about). Both fixed by making them return:okexplicitly and pattern-matching at the call sites. CI workflow gains a cached PLT step andmix dialyzer --format githubthat runs beforemix test.Design decisions worth calling out
:viasplit matches Registry. Our behaviour is within a hair of Elixir's: Registry restrictsregister_nametopid == self(); we allow an arbitrary local pid (inherited fromPg.join/3). Registry'ssend/2pattern-matches[{pid, _}]and raisesCaseClauseErroron multiple entries; oursend/2goes throughwhereis_nameand raisesArgumentErrorexplicitly in:duplicatemode. Both are intentional small improvements, not divergences.lookup_local/2is the one Registry-absent API we're keeping. Registry is single-node, so ""local"" is trivially ""all."" We're distributed, and the local subset is genuinely useful for draining/metrics/preference-for-local patterns. The_localsuffix is standard OTP convention (Node.list/0vsNode.list(:connected),:pg.get_members/2vs:pg.get_local_members/2, etc.).which_groups/1stays as-is. It's the one remaining pg-flavored name inPgRegistry. Registry has no equivalent (the idiomatic replacement isRegistry.select/2with a match-spec returning keys) and the operation is genuinely useful as a convenience. Leaving it is honest about the fact that distributed registries sometimes want things single-node registries don't.Dialyzer flags are deliberately strict.
:unmatched_returnsis the one people usually disable because of false-positive volume — we enabled it and got exactly two warnings, both legitimate. Future contributors writing side-effect-only helpers will get flagged and have to think about whether the helper should return:okor whether the return value actually matters.Test plan
mix test— 13 doctests + 145 tests, 0 failuresmix credo— cleanmix format --check-formatted— clean (styler plugin enabled)mix docs— builds without warningsmix dialyzer— 0 errors with[:error_handling, :unknown, :unmatched_returns]flagsmix dialyzer --format githubbeforemix test(format checks + dialyzer fail fast before the slowest step):uniquescope includes{:already_started, pid}collisionRelease note
This PR was split off
mainafterv0.3.0was tagged and a GitHub release created (though nothing has been published to Hex). The changes here materially affect the semantics thatv0.3.0currently points at:whereis_nameno longer has silent local-preferred-then-remote fallbackwhereis_name/sendnow require:uniquescopesget_members/2is gone fromPgRegistrylookup_local/2is a new PgRegistry functionRecommended release workflow after merging: delete the existing
v0.3.0tag and GitHub release, re-tag against the new tip ofmain, then publish to Hex for the first time. Nothing depends on the old tag because nothing is on Hex. If you'd rather respect tag immutability, ship this asv0.3.1— but that publishes av0.3.0on Hex that we already know is worse, which seems silly for an unshipped version.