ExitKit

Measure whether an LLM agent's memory stayed the same after a fine-tune, migration, or personalisation pass.

A Nozickian closest-continuer metric for portable agent memory. Given two PAM snapshots of an agent (before and after a fine-tune, migration, or personalisation pass), ExitKit returns a continuity score in [0, 1] and a structured drift report — so you can answer "is it still the same agent?" with a number instead of vibes.


What is this

ExitKit is a Python library that measures whether an LLM agent's memory remained the same after an update. You pass two PAM-format MemoryStore snapshots — one captured before a fine-tune, provider migration, or personalisation pass, one captured after — and get back a single continuity score from 0.0 (completely different) to 1.0 (identical), plus a structured breakdown of which memories were added, removed, or silently rewritten.

The metric combines two components:

  • Structural identity diff — which memory objects changed by ID and content hash
  • Semantic drift — cosine distance between centroid embeddings

Both components are configurable, and you can plug in any embedding model you like.


Why

LLM agents accumulate memory: preferences, facts, project context. As you fine-tune, switch providers, or run a personalisation pass, the memory mutates. Did the agent stay the same agent, or did you replace it?

Robert Nozick's closest-continuer framework, together with his Tracking Truth account (Philosophical Explanations, 1981, Chapters 1 and 3 respectively), gives a principled answer: the post-update snapshot is the continuer iff it remains the highest-scoring candidate by a continuity metric and exceeds a chosen threshold.
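The selection rule above can be sketched in a few lines. This is an illustration only, not part of the ExitKit API: select_continuer, the candidate names, and the 0.8 threshold are hypothetical, and treating an exact tie as "no continuer" is an assumption of this sketch.

```python
from typing import Mapping, Optional


def select_continuer(scores: Mapping[str, float], threshold: float = 0.8) -> Optional[str]:
    """Return the name of the closest continuer, or None if nobody qualifies.

    `scores` maps a candidate name to its continuity score in [0, 1].
    """
    if not scores:
        return None
    # Rank candidates from highest to lowest continuity.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    best_name, best_score = ranked[0]
    if best_score < threshold:
        return None  # nothing is close enough to count as a continuer
    if len(ranked) > 1 and ranked[1][1] == best_score:
        return None  # exact tie: no unique closest candidate
    return best_name


print(select_continuer({"after_finetune": 0.91, "fresh_agent": 0.40}))  # after_finetune
print(select_continuer({"after_finetune": 0.55}))                       # None
```

In the two-snapshot case that ExitKit covers there is only one candidate, so the rule reduces to the threshold check.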

ExitKit ports that idea to PAM-format memory snapshots and returns:

  • a deterministic continuity score in [0, 1],
  • the added / removed / mutated memory IDs (content-hash aware),
  • the underlying identity_diff and semantic_drift components,
  • the weights used (default 0.5 / 0.5, fully configurable).

Installation

pip install exitkit

Requires Python 3.11+.


Quick start

from portable_ai_memory import MemoryObject, MemoryStore, Owner
from exitkit import continuity_score


def make(memories):
    return MemoryStore(
        schema_version="1.0",
        owner=Owner(id="alice"),
        memories=[
            MemoryObject.create(id=mid, type="fact", content=c, platform="demo")
            for mid, c in memories
        ],
    )


before = make([("m1", "Prefers async."), ("m2", "Lives in Tokyo.")])
after  = make([("m1", "Prefers async, except for I/O-bound."), ("m3", "Working on ExitKit.")])

report = continuity_score(before, after)
print(report.continuity)             # 0.0 - 1.0  (1.0 = identical)
print(report.added, report.removed, report.mutated)

See examples/continuer_demo.py for a runnable end-to-end demo.
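To see what added, removed, and mutated should contain for the two snapshots above, the structural part can be worked out by hand. The sketch below does that with plain dictionaries instead of PAM objects. structural_diff is a hypothetical helper, and SHA-256 is only a stand-in for whatever content hash the PAM MemoryObject carries.

```python
import hashlib


def content_hash(text: str) -> str:
    # Hypothetical stand-in for the content hash stored on a PAM MemoryObject.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def structural_diff(before: dict[str, str], after: dict[str, str]):
    """Compare two {memory_id: content} mappings by ID and content hash."""
    before_ids, after_ids = set(before), set(after)
    added = frozenset(after_ids - before_ids)      # IDs only in the later snapshot
    removed = frozenset(before_ids - after_ids)    # IDs only in the earlier snapshot
    mutated = frozenset(                           # same ID, different content hash
        mid for mid in before_ids & after_ids
        if content_hash(before[mid]) != content_hash(after[mid])
    )
    return added, removed, mutated


before = {"m1": "Prefers async.", "m2": "Lives in Tokyo."}
after = {"m1": "Prefers async, except for I/O-bound.", "m3": "Working on ExitKit."}

added, removed, mutated = structural_diff(before, after)
print(sorted(added), sorted(removed), sorted(mutated))  # ['m3'] ['m2'] ['m1']
```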


How it works

continuity_score(before, after, *, identity_weight=0.5, semantic_weight=0.5, embedder=None) returns a DriftReport:

| Field | Meaning |
| --- | --- |
| continuity | 1 - (w_id * identity_diff + w_sem * semantic_drift), clipped to [0, 1] |
| identity_diff | Content-hash-aware symmetric difference over MemoryObject IDs. |
| semantic_drift | 1 - cosine(centroid(before), centroid(after)), rescaled to [0, 1]. |
| added | frozenset[str] of new memory IDs. |
| removed | frozenset[str] of dropped memory IDs. |
| mutated | frozenset[str] of kept-but-rewritten memory IDs (same ID, new content_hash). |
| weights | The (identity_weight, semantic_weight) actually used. |
| n_before, n_after | Memory counts. |

Weights must each lie in [0, 1] and sum to 1.
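As a worked example of the continuity formula and the weight constraints, here is a minimal sketch. combine is a hypothetical helper, not the library function, and raising ValueError on invalid weights is an assumption about how such a check might behave.

```python
import math


def combine(identity_diff: float, semantic_drift: float,
            identity_weight: float = 0.5, semantic_weight: float = 0.5) -> float:
    """Turn the two drift components into a continuity score in [0, 1]."""
    for w in (identity_weight, semantic_weight):
        if not 0.0 <= w <= 1.0:
            raise ValueError("each weight must lie in [0, 1]")
    if not math.isclose(identity_weight + semantic_weight, 1.0):
        raise ValueError("weights must sum to 1")
    drift = identity_weight * identity_diff + semantic_weight * semantic_drift
    # Clip to [0, 1] so rounding error cannot push the score out of range.
    return min(1.0, max(0.0, 1.0 - drift))


print(combine(0.0, 0.0))  # 1.0  (identical snapshots)
print(combine(1.0, 0.5))  # 0.25 (every ID changed, moderate semantic drift)
```

With a threshold of 0.8, the second score would flag the update as having changed the agent's memory identity.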

Default embedder

The default semantic component uses hashing_embedder: a pure-numpy, dependency-light, deterministic bag-of-words projection using the blake2b hashing trick (1024-dimensional, L2-normalised). No external model download required.

Tokenisation is \w+ (unicode). Content containing only punctuation, whitespace, or emoji collapses to zero tokens and contributes no semantic signal under the default embedder.
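The hashing trick described above can be illustrated without any dependencies. The sketch below is not the library's hashing_embedder: it uses plain Python lists rather than numpy, and the lowercasing step, the 8-byte digest size, and the name hash_embed are assumptions made for the example.

```python
import hashlib
import math
import re

DIM = 1024                  # dimensionality stated for the default embedder
TOKEN = re.compile(r"\w+")  # str patterns are Unicode-aware by default in Python 3


def hash_embed(text: str, dim: int = DIM) -> list[float]:
    """Bag-of-words vector via the blake2b hashing trick, L2-normalised."""
    vec = [0.0] * dim
    for token in TOKEN.findall(text.lower()):
        digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest()
        # Map the token to a fixed bucket; the same token always lands in the same slot.
        vec[int.from_bytes(digest, "big") % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    # Text with zero tokens stays the zero vector and carries no semantic signal.
    return [x / norm for x in vec] if norm else vec


print(hash_embed("Prefers async.") == hash_embed("Prefers async."))  # True (deterministic)
print(sum(hash_embed("!!! ...")))                                    # 0.0 (no \w+ tokens)
```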

Custom embedder

For richer signals, pass any Callable[[Iterable[str]], np.ndarray]:

from sentence_transformers import SentenceTransformer
import numpy as np

from exitkit import continuity_score

model = SentenceTransformer("all-MiniLM-L6-v2")


def embed(texts):
    # Materialise the iterable first, since encode() expects a list of sentences.
    return np.asarray(model.encode(list(texts)))


report = continuity_score(before, after, embedder=embed)

Architecture

[Figure: exitkit architecture diagram]

Design notes

  • Use it as a binary drift classifier. Threshold the continuity score (e.g. >= 0.8) to flag whether a fine-tune, migration, or personalisation pass kept the agent's memory identity intact. The toy benchmark in tests/test_auc.py shows the default weights are discriminative (ROC-AUC >= 0.7 against unrelated agents).
  • semantic_drift range depends on the embedder. With the default hashing_embedder (non-negative bag-of-words), cosines lie in [0, 1], so semantic_drift is bounded by [0, 0.5] for non-empty stores. Pass a sentence-transformers (or other arbitrary-direction) embedder if you need the full [0, 1] range. Empty vs. empty (drift = 0.0) and empty vs. non-empty (drift = 1.0) remain reachable under any embedder.
  • Weights apply to drift, not a normalised scale. identity_diff always spans [0, 1], but with the default embedder semantic_drift only spans [0, 0.5]. The default 0.5 / 0.5 weighting therefore puts roughly twice the effective weight on the identity component. Pass an embedder that spans the full cosine range, or set identity_weight / semantic_weight explicitly, if that asymmetry matters for your use case.
  • MemoryObject IDs must be unique per store. continuity_score raises ValueError if a MemoryStore contains duplicate IDs — silently collapsing duplicates produced subtle false-positive mutated results.
  • Default tokenisation is alphanumeric (\w+, unicode). Memories whose content is only punctuation, whitespace, or emoji collapse to zero tokens under hashing_embedder and therefore contribute no semantic signal. Pass a richer custom embedder if those signals matter.
  • One core metric, on purpose. The heart of the package is the continuer-selection score: no UI, no provenance store. Two optional extras already ship in v0.2: Cedar-based export policies (exitkit[cedar], cedar_loader.py) and Sigstore-signed release manifests.
  • Tracking Truth != aggregation. ExitKit does not try to aggregate users or vote on values; it measures whether a single agent's memory state continues, in the closest-continuer sense.
  • Convergent goals, not endorsement. The "agent stays the agent you trained" objective overlaps with the positive alignment programme described in Laukkonen et al., Positive Alignment (arXiv:2605.10310, 2026); cited as a convergent vision, not a methodological commitment.
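The semantic_drift range notes above can be checked with a little arithmetic. The sketch below assumes the rescaling is (1 - cosine) / 2, which is inferred from the stated [0, 0.5] bound rather than taken from the source code. The two-dimensional vectors are toy centroids, and rescaled_drift is a hypothetical helper.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rescaled_drift(u, v):
    # Assumed rescaling: maps cosine in [-1, 1] onto drift in [0, 1].
    return (1.0 - cosine(u, v)) / 2.0


# Non-negative (bag-of-words style) centroids that share no dimension.
disjoint = rescaled_drift([1.0, 0.0], [0.0, 1.0])
# Signed (dense-embedding style) centroids pointing in opposite directions.
opposite = rescaled_drift([1.0, 0.0], [-1.0, 0.0])
print(disjoint, opposite)  # 0.5 1.0

# Largest drift each component can contribute under the default 0.5 / 0.5 weights.
max_identity = 0.5 * 1.0
max_semantic_default = 0.5 * disjoint
print(max_identity, max_semantic_default)  # 0.5 0.25
```

Under these assumptions the identity component can move the score by up to 0.5 and the semantic component by at most 0.25, which is the two-to-one asymmetry described in the notes.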

Audit-trail integration (memcanon)

memcanon v0.2+ accepts events from this repo via a thin in-process shim and content-hashes them into a local audit store. memcanon is not on PyPI yet, so install it from the tagged release:

pip install "git+https://github.com/hinanohart/memcanon@v0.2.0a2"

Then emit events from Python:

from memcanon.emit import emit
from memcanon.store.local import LocalStore

with LocalStore("audit") as store:
    emit("exitkit", {"kind": "...", "decision": "..."}, store=store)

Each record is tagged source:exitkit + schema:memcanon-emit/1. The memcanon CLI command memcanon export --format eu-ai-act-12 --to OUT.json can then build an audit-log artefact mapped to the paragraphs of Article 12(2) of the EU AI Act. The artefact reproduces the expected shape only and is not a conformity assessment.


Development

git clone https://github.com/hinanohart/exitkit
cd exitkit
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
pytest
ruff check .
mypy

References

  • Santhosh Kumar Ravindran. Portable Agent Memory: A Protocol for Provenance-Verified Memory Transfer Across Heterogeneous LLM Agents. arXiv:2605.11032 (2026).
  • Robert Nozick. Anarchy, State, and Utopia. Basic Books, 1974 — Part III, "A Framework for Utopia".
  • Robert Nozick. Philosophical Explanations. Harvard University Press, 1981. Chapter 1 (closest-continuer theory) and Chapter 3 (tracking account of knowledge).
  • Ruben Laukkonen, Seb Krier, Chloé Bakalar et al. Positive Alignment: Artificial Intelligence for Human Flourishing. arXiv:2605.10310 (2026).

License

MIT License — see LICENSE.
