Measure whether an LLM agent's memory stayed the same after a fine-tune, migration, or personalisation pass.
A Nozickian closest-continuer metric for portable agent memory. Given two PAM snapshots of an agent (before and after a fine-tune, migration, or personalisation pass), ExitKit returns a continuity score in `[0, 1]` and a structured drift report, so you can answer "is it still the same agent?" with a number instead of vibes.
ExitKit is a Python library that measures whether an LLM agent's memory remained the same after an update. You pass two PAM-format MemoryStore snapshots — one captured before a fine-tune, provider migration, or personalisation pass, one captured after — and get back a single continuity score from 0.0 (completely different) to 1.0 (identical), plus a structured breakdown of which memories were added, removed, or silently rewritten.
The metric combines two components:
- Structural identity diff — which memory objects changed by ID and content hash
- Semantic drift — cosine distance between centroid embeddings
Both components are configurable, and you can plug in any embedding model you like.
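The structural component can be pictured as set arithmetic over memory IDs plus a content-hash comparison for IDs present in both snapshots. The sketch below is illustrative only: `content_hash` and `identity_sets` are hypothetical helpers, SHA-256 is an assumed hash, and plain dicts stand in for PAM `MemoryStore` objects. It is not ExitKit's implementation.

```python
import hashlib


def content_hash(text: str) -> str:
    # Assumed hash function; the hash PAM actually uses may differ.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def identity_sets(before: dict[str, str], after: dict[str, str]):
    """Classify memory IDs given {memory_id: content} mappings."""
    added = frozenset(after.keys() - before.keys())
    removed = frozenset(before.keys() - after.keys())
    # Same ID on both sides, but the content hash changed.
    mutated = frozenset(
        mid
        for mid in before.keys() & after.keys()
        if content_hash(before[mid]) != content_hash(after[mid])
    )
    return added, removed, mutated


before = {"m1": "Prefers async.", "m2": "Lives in Tokyo."}
after = {"m1": "Prefers async, except for I/O-bound.", "m3": "Working on ExitKit."}

added, removed, mutated = identity_sets(before, after)
print(sorted(added), sorted(removed), sorted(mutated))
# ['m3'] ['m2'] ['m1']
```

Comparing hashes rather than IDs alone is what lets a silently rewritten memory show up as `mutated` instead of passing as unchanged.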
LLM agents accumulate memory: preferences, facts, project context. As you fine-tune, switch providers, or run a personalisation pass, the memory mutates. Did the agent stay the same agent, or did you replace it?
Robert Nozick's Tracking Truth and closest-continuer framework (Philosophical Explanations, 1981, §1) gives a principled answer: the post-update snapshot is the continuer iff it remains the highest-scoring candidate by a continuity metric and exceeds a chosen threshold.
ExitKit ports that idea to PAM-format memory snapshots and returns:
- a deterministic continuity score in `[0, 1]`,
- the `added` / `removed` / `mutated` memory IDs (content-hash aware),
- the underlying `identity_diff` and `semantic_drift` components,
- the weights used (default 0.5 / 0.5, fully configurable).
```bash
pip install exitkit
```

Requires Python 3.11+.
```python
from portable_ai_memory import MemoryObject, MemoryStore, Owner
from exitkit import continuity_score


def make(memories):
    return MemoryStore(
        schema_version="1.0",
        owner=Owner(id="alice"),
        memories=[
            MemoryObject.create(id=mid, type="fact", content=c, platform="demo")
            for mid, c in memories
        ],
    )


before = make([("m1", "Prefers async."), ("m2", "Lives in Tokyo.")])
after = make([("m1", "Prefers async, except for I/O-bound."), ("m3", "Working on ExitKit.")])

report = continuity_score(before, after)
print(report.continuity)  # 0.0 - 1.0 (1.0 = identical)
print(report.added, report.removed, report.mutated)
```

See `examples/continuer_demo.py` for a runnable end-to-end demo.
`continuity_score(before, after, *, identity_weight=0.5, semantic_weight=0.5, embedder=None)` returns a `DriftReport`:

| Field | Meaning |
|---|---|
| `continuity` | `1 - (w_id * identity_diff + w_sem * semantic_drift)`, clipped to `[0, 1]`. |
| `identity_diff` | Content-hash-aware symmetric difference over `MemoryObject` IDs. |
| `semantic_drift` | `1 - cosine(centroid(before), centroid(after))`, rescaled to `[0, 1]`. |
| `added` | `frozenset[str]` of new memory IDs. |
| `removed` | `frozenset[str]` of dropped memory IDs. |
| `mutated` | `frozenset[str]` of kept-but-rewritten memory IDs (same ID, new `content_hash`). |
| `weights` | The `(identity_weight, semantic_weight)` actually used. |
| `n_before`, `n_after` | Memory counts. |
Weights must each lie in [0, 1] and sum to 1.
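As a worked example of the `continuity` formula and the weight constraint, the sketch below recombines two component values by hand. `combine_components` is a hypothetical helper, not part of ExitKit's API, and raising `ValueError` on invalid weights is an assumption about how such a check might behave.

```python
import math


def combine_components(
    identity_diff: float,
    semantic_drift: float,
    identity_weight: float = 0.5,
    semantic_weight: float = 0.5,
) -> float:
    """continuity = 1 - (w_id * identity_diff + w_sem * semantic_drift), clipped to [0, 1]."""
    for w in (identity_weight, semantic_weight):
        if not 0.0 <= w <= 1.0:
            raise ValueError("each weight must lie in [0, 1]")
    if not math.isclose(identity_weight + semantic_weight, 1.0):
        raise ValueError("weights must sum to 1")
    raw = 1.0 - (identity_weight * identity_diff + semantic_weight * semantic_drift)
    return min(1.0, max(0.0, raw))


# Half the memory identities changed, with moderate semantic drift:
# 1 - (0.5 * 0.5 + 0.5 * 0.25) = 0.625
score = combine_components(identity_diff=0.5, semantic_drift=0.25)
print(score)
```

Shifting the weights changes which kind of change dominates: with `identity_weight=1.0` and `semantic_weight=0.0` the same inputs give `0.5`, because only the structural diff counts.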
The default semantic component uses hashing_embedder: a pure-numpy, dependency-light, deterministic bag-of-words projection using the blake2b hashing trick (1024-dimensional, L2-normalised). No external model download required.
Tokenisation is \w+ (unicode). Content containing only punctuation, whitespace, or emoji collapses to zero tokens and contributes no semantic signal under the default embedder.
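To make the hashing trick concrete, here is a minimal standard-library sketch of a bag-of-words embedder in the same spirit. It is not ExitKit's `hashing_embedder` (which is numpy-based): the function name `hashing_embed`, the lowercasing step, the 8-byte digest size, and the modulo bucket mapping are all assumptions chosen for illustration.

```python
import hashlib
import math
import re

DIM = 1024
TOKEN_RE = re.compile(r"\w+")  # str patterns are Unicode-aware by default in Python 3


def hashing_embed(text: str, dim: int = DIM) -> list[float]:
    """Hash each token into one of `dim` buckets, count, then L2-normalise."""
    vec = [0.0] * dim
    for token in TOKEN_RE.findall(text.lower()):  # lowercasing is an assumption
        digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest()
        vec[int.from_bytes(digest, "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    # Zero tokens -> all-zero vector, i.e. no semantic signal.
    return [v / norm for v in vec] if norm > 0 else vec


v = hashing_embed("Prefers async.")
empty = hashing_embed("... !!! 🙂")

print(len(v), round(sum(x * x for x in v), 6))
print(any(empty))
```

Because every bucket holds a non-negative count, any two such vectors have a cosine of at least 0, which is why a non-negative bag-of-words embedder cannot reach the upper half of the drift range.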
For richer signals, pass any Callable[[Iterable[str]], np.ndarray]:
```python
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")
embed = lambda texts: np.asarray(model.encode(list(texts)))

report = continuity_score(before, after, embedder=embed)
```

- Use it as a drift binary classifier. Threshold the `continuity` score (e.g. `>= 0.8`) to flag whether a fine-tune, migration, or personalisation pass kept the agent's memory identity intact. The toy benchmark in `tests/test_auc.py` shows the default weights are discriminative (ROC-AUC >= 0.7 against unrelated agents).
- `semantic_drift` range depends on the embedder. With the default `hashing_embedder` (non-negative bag-of-words), cosines lie in `[0, 1]`, so `semantic_drift` is bounded by `[0, 0.5]` for non-empty stores. Pass a sentence-transformers (or other arbitrary-direction) embedder if you need the full `[0, 1]` range. Empty vs. empty (drift = 0.0) and empty vs. non-empty (drift = 1.0) remain reachable under any embedder.
- Weights apply to drift, not a normalised scale. `identity_diff` always spans `[0, 1]`, but with the default embedder `semantic_drift` only spans `[0, 0.5]`. The default `0.5 / 0.5` weighting therefore puts roughly twice the effective weight on the identity component. Pass an embedder that spans the full cosine range, or set `identity_weight` / `semantic_weight` explicitly, if that asymmetry matters for your use case.
- MemoryObject IDs must be unique per store. `continuity_score` raises `ValueError` if a `MemoryStore` contains duplicate IDs; silently collapsing duplicates produced subtle false-positive `mutated` results.
- Default tokenisation is alphanumeric (`\w+`, unicode). Memories whose content is only punctuation, whitespace, or emoji collapse to zero tokens under `hashing_embedder` and therefore contribute no semantic signal. Pass a richer custom embedder if those signals matter.
- One component, on purpose. The core metric is the continuer-select score: no UI, no provenance store. Cedar-based export policies (`exitkit[cedar]`, `cedar_loader.py`) and Sigstore-signed release manifests are already available in v0.2.
- Tracking Truth != aggregation. ExitKit does not try to aggregate users or vote on values; it measures whether a single agent's memory state continues, in the closest-continuer sense.
- Convergent goals, not endorsement. The "agent stays the agent you trained" objective overlaps with the positive alignment programme described in Laukkonen et al., Positive Alignment (arXiv:2605.10310, 2026); cited as a convergent vision, not a methodological commitment.
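The range and weighting notes above follow from simple arithmetic on the cosine. The sketch below assumes the rescaling is `(1 - cosine) / 2`, which is inferred from the stated bounds rather than confirmed from the source, and uses tiny hand-written vectors in place of real centroid embeddings. The `cosine` and `semantic_drift` functions here are illustrative stand-ins, not ExitKit's own.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def semantic_drift(a: list[float], b: list[float]) -> float:
    # Assumed rescaling: maps cosine in [-1, 1] onto drift in [0, 1].
    return (1.0 - cosine(a, b)) / 2.0


same = semantic_drift([1.0, 0.0], [1.0, 0.0])        # identical direction
orthogonal = semantic_drift([1.0, 0.0], [0.0, 1.0])  # worst case for non-negative vectors
opposite = semantic_drift([1.0, 0.0], [-1.0, 0.0])   # needs negative components
print(same, orthogonal, opposite)
# 0.0 0.5 1.0

# Largest possible contribution of each term under the default 0.5 / 0.5 weights
# when the embedder is non-negative:
max_identity_term = 0.5 * 1.0
max_semantic_term = 0.5 * orthogonal
print(max_identity_term / max_semantic_term)
# 2.0
```

The final ratio is the "roughly twice the effective weight" asymmetry: with a non-negative embedder, the semantic term can subtract at most 0.25 from the score, while the identity term can subtract up to 0.5. The empty-store special cases are not modelled here.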
memcanon v0.2+ accepts events from this repo via a thin in-process shim and content-hashes them into a local audit store. memcanon is not on PyPI yet; install it from the tagged release:

```bash
pip install "git+https://github.com/hinanohart/memcanon@v0.2.0a2"
```

```python
from memcanon.emit import emit
from memcanon.store.local import LocalStore

with LocalStore("audit") as store:
    emit("exitkit", {"kind": "...", "decision": "..."}, store=store)
```

Each record is tagged `source:exitkit` + `schema:memcanon-emit/1`. Memcanon's `memcanon export --format eu-ai-act-12 --to OUT.json` can then build an Article 12(2) paragraph-mapped audit-log artefact (SHAPE only, NOT a conformity assessment).
```bash
git clone https://github.com/hinanohart/exitkit
cd exitkit
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
pytest
ruff check .
mypy
```

- Santhosh Kumar Ravindran. Portable Agent Memory: A Protocol for Provenance-Verified Memory Transfer Across Heterogeneous LLM Agents. arXiv:2605.11032 (2026).
- Robert Nozick. Anarchy, State, and Utopia. Basic Books, 1974 — Part III, "A Framework for Utopia".
- Robert Nozick. Philosophical Explanations. Harvard University Press, 1981 — §1 Tracking Truth and §1 closest-continuer.
- Ruben Laukkonen, Seb Krier, Chloé Bakalar et al. Positive Alignment: Artificial Intelligence for Human Flourishing. arXiv:2605.10310 (2026).
MIT License — see LICENSE.
