Most mid-market manufacturers and distributors sit on a mountain of customer data and nobody reads it. An account that used to reorder every six weeks quietly stretches to nine, then twelve. No status flag trips — it isn't dormant, it's just fading — and the rep finds out when the reorder that mattered never comes. Reter's ML models watch every customer account, every day, to predict churn/decline and identify retention/expansion opportunities months in advance.
Records tell you what happened. Reter tells you what's coming and what to do about it, account by account, every day, and when it goes to act, a human signs off first. It's built for manufacturers and distributors who sell consumables through reps. Opinionated on purpose: not an ERP or CRM you configure, but an intelligence layer that works with existing systems, takes a position, and hands you or your agents the next move.
Portfolio repo. Everything runs against a synthetic demo tenant (acme_industrial). There is no real customer data here. The headline ML figures from the original engagement were measured on private data and don't reproduce from this checkout; see ML, honestly.
ERP transactions ─┐
comms metadata    ├─▶ signals (SQL) ─▶ synthesis      ─▶ Cue     ─▶ OMA
human feedback   ─┘   anomaly /        (pattern-         (agent     (23 tools;
                      reorder-gap /     as-code →         loop)      every binding
                      buying-group /    a per-account                action gated
                      dormancy / …      narrative)                   on human approval)
                          │
            Postgres (RLS, tenant_id) ──▶ read-only MCP server
Signals → narrative. Per-signal SQL modules (anomaly, reorder window, buying-group transition, dormant triage, product-mix shift) feed packages/synthesis — a zero-dependency engine that walks an ordered, most-specific-first pattern library and emits one diagnosis per account in three beats: THE READ / DO THIS / WHAT I CAN'T SEE. The third beat is load-bearing: the system states its own blind spots instead of laundering them into false confidence. A model-overcall meta-pattern can override the churn model when ground-truth revenue pace contradicts its alarm.
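A minimal sketch of the first-match-wins walk described above. The signal fields, thresholds, and every pattern name except model-overcall are assumptions made for illustration; the real library lives in packages/synthesis and is not reproduced here.

```typescript
// Hypothetical shapes and thresholds; the real types live in packages/synthesis.
interface AccountSignals {
  churnScore: number;         // churn model output in [0, 1]
  revenuePaceVsPrior: number; // trailing revenue divided by prior-period revenue
  reorderGapRatio: number;    // current reorder gap divided by the usual gap
  hasCommsMetadata: boolean;
}

interface Diagnosis {
  pattern: string;
  theRead: string;
  doThis: string;
  whatICantSee: string;
}

interface Pattern {
  name: string;
  matches: (s: AccountSignals) => boolean;
  narrate: (s: AccountSignals) => { theRead: string; doThis: string };
}

// Ordered most-specific-first. The first match wins, so the meta-pattern that
// overrides the churn model must sit above the generic churn pattern.
const PATTERNS: Pattern[] = [
  {
    name: "model-overcall",
    matches: (s) => s.churnScore >= 0.7 && s.revenuePaceVsPrior >= 1.0,
    narrate: () => ({
      theRead: "The churn model is alarmed, but revenue pace is at or above the prior period.",
      doThis: "Treat the alarm as an overcall and hold the retention play.",
    }),
  },
  {
    name: "reorder-gap-stretch",
    matches: (s) => s.reorderGapRatio >= 1.5,
    narrate: (s) => ({
      theRead: `The reorder gap is ${s.reorderGapRatio.toFixed(1)}x this account's usual cadence.`,
      doThis: "Call before the next expected order window closes.",
    }),
  },
  {
    name: "churn-risk",
    matches: (s) => s.churnScore >= 0.7,
    narrate: () => ({
      theRead: "The churn model is alarmed and revenue pace agrees.",
      doThis: "Open a retention conversation this week.",
    }),
  },
  {
    name: "steady",
    matches: () => true, // catch-all keeps the walk total
    narrate: () => ({ theRead: "No pattern of concern.", doThis: "No action needed." }),
  },
];

// The third beat: say what the inputs cannot show instead of implying certainty.
function blindSpots(s: AccountSignals): string {
  return s.hasCommsMetadata
    ? "Competitor pricing and anything said off-channel."
    : "No comms metadata for this account, so rep contact is invisible here.";
}

function synthesize(s: AccountSignals): Diagnosis {
  for (const p of PATTERNS) {
    if (p.matches(s)) {
      return { pattern: p.name, ...p.narrate(s), whatICantSee: blindSpots(s) };
    }
  }
  throw new Error("pattern library must end with a catch-all");
}
```

Order is the whole mechanism here: moving model-overcall below churn-risk would let the model's alarm win even when revenue pace contradicts it.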
Cue — the agent runtime. Cue is a hand-rolled tool-use loop (packages/core/src/cue/runtime/session.ts), not a hosted/managed one. That's the deliberate part: keeping the loop means the guardrails are ours. It carries a reminder queue (how guardrail outcomes get surfaced back to the model mid-turn), a wall-clock budget, bounded retry, and clean cancellation that returns a degraded result instead of throwing. Why hand-rolled and not Managed Agents: ADR 0004.
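To make the moving parts concrete, here is a sketch of a loop with the four properties named above: a reminder queue, a wall-clock budget, bounded retry, and cancellation that degrades instead of throwing. Every type and function name is hypothetical, and the loop is synchronous for brevity where a real one would await the model and the tools. It is not the code in session.ts.

```typescript
// Every name below is hypothetical; session.ts is not reproduced in this README.
// Synchronous for brevity: a real loop would await the model and the tools.
type ModelTurn =
  | { kind: "final"; text: string }
  | { kind: "tool_call"; tool: string; input: unknown };

interface SessionDeps {
  callModel: (toolResults: string[], reminders: string[]) => ModelTurn;
  runTool: (tool: string, input: unknown) => string; // may throw
  now: () => number;          // injectable clock in ms, so the budget is testable
  isCancelled: () => boolean;
}

interface SessionResult {
  status: "ok" | "degraded";
  text: string;
  reason?: "budget_exceeded" | "cancelled";
}

function runSession(deps: SessionDeps, budgetMs: number, maxRetries = 2): SessionResult {
  const deadline = deps.now() + budgetMs;
  const toolResults: string[] = [];
  let reminders: string[] = [];

  while (true) {
    // Cancellation and budget overrun degrade; they never throw.
    if (deps.isCancelled()) return { status: "degraded", text: "", reason: "cancelled" };
    if (deps.now() >= deadline) return { status: "degraded", text: "", reason: "budget_exceeded" };

    // Drain the reminder queue into this model call, then start a fresh queue.
    const turn = deps.callModel(toolResults, reminders);
    reminders = [];
    if (turn.kind === "final") return { status: "ok", text: turn.text };

    // Bounded retry: after maxRetries extra attempts, tell the model and move on.
    for (let attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        toolResults.push(deps.runTool(turn.tool, turn.input));
        break;
      } catch {
        if (attempt === maxRetries) {
          reminders.push(`${turn.tool} failed ${attempt + 1} times; do not assume a result.`);
        }
      }
    }
  }
}
```

Injecting the clock and the cancellation check is a choice made for this sketch: it lets the budget and cancel paths be exercised in tests without real waiting.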
OMA — where the agent meets the real world. 23 tools in three tiers: read (6: customer, pricing, inventory, ship-date, open orders), internal (10: notify production/warehouse, log engagement/decisions, credit check), and approval (7: quote, sales order, customer email, release). Anything that touches money or leaves the building is approval-gated. The gate is wired end to end — agent enqueues → approval_requests row (tiered expiry: 4h / 24h / 72h) → human decides via the API → React approval cards — and the agent blocks on the decision before continuing. The model proposes; a person commits.
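The gate's state machine can be sketched in a few lines. This is an in-memory illustration only: the tier names, field names, and functions are assumptions, and only the 4h / 24h / 72h expiry windows come from the description above. The real gate persists an approval_requests row and is decided through the API.

```typescript
// Tier names are hypothetical; only the 4h / 24h / 72h windows come from the README.
type Tier = "urgent" | "standard" | "low";
const EXPIRY_HOURS: Record<Tier, number> = { urgent: 4, standard: 24, low: 72 };
const HOUR_MS = 60 * 60 * 1000;

type Status = "pending" | "approved" | "rejected" | "expired";

interface ApprovalRequest {
  id: number;
  tool: string;
  payload: unknown;
  status: Status;
  expiresAt: number; // epoch ms
}

let nextId = 1;

// The agent enqueues a proposal; nothing is committed at this point.
function enqueue(tool: string, payload: unknown, tier: Tier, nowMs: number): ApprovalRequest {
  return {
    id: nextId++,
    tool,
    payload,
    status: "pending",
    expiresAt: nowMs + EXPIRY_HOURS[tier] * HOUR_MS,
  };
}

// A human decision only lands on a pending request that has not expired.
function decide(req: ApprovalRequest, approve: boolean, nowMs: number): ApprovalRequest {
  if (req.status !== "pending") return req; // earlier decisions are final
  if (nowMs >= req.expiresAt) return { ...req, status: "expired" };
  return { ...req, status: approve ? "approved" : "rejected" };
}

// The side effect runs only behind an approved request.
function execute<T>(req: ApprovalRequest, commit: () => T): T {
  if (req.status !== "approved") {
    throw new Error(`request ${req.id} is ${req.status}; refusing to commit`);
  }
  return commit();
}
```

Keeping the side effect inside execute means a pending, rejected, or expired request cannot reach it by any code path in this sketch.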
Guardrails are tested code, not prompt etiquette (packages/core/src/cue/patterns/): verify_cite_verbatim (every number in the model's output must appear verbatim in a tool result), a critical-field gate that enforces the 0/1/many rule so the agent never guesses an ambiguous referent, an approval gate, a latency budget, and a drafts-only degrade when a tenant has outbound transmission off.
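Two of these checks are small enough to sketch. The function names, the number regex, and the result shapes below are assumptions for illustration, not the code in packages/core/src/cue/patterns/. The first flags any number the model wrote that no tool result contains verbatim; the second encodes the 0/1/many rule as a type, so the ambiguous case cannot be silently collapsed into a guess.

```typescript
// A sketch of two checks; function names and the number regex are assumptions.
const NUMBER = /\d+(?:,\d{3})*(?:\.\d+)?/g;

// Returns every numeric token in the model output that no tool result contains
// verbatim. "1,200" and "1200" are different tokens on purpose.
function unverifiedNumbers(output: string, toolResults: string[]): string[] {
  const seen = new Set<string>();
  for (const result of toolResults) {
    const tokens: string[] = result.match(NUMBER) ?? [];
    for (const tok of tokens) seen.add(tok);
  }
  const claimed: string[] = output.match(NUMBER) ?? [];
  return claimed.filter((tok) => !seen.has(tok));
}

// 0/1/many: a critical field resolves only when exactly one candidate matches.
type Resolution<T> =
  | { kind: "none" }                // nothing matched: report it, do not invent
  | { kind: "one"; value: T }       // safe to proceed
  | { kind: "many"; options: T[] }; // ambiguous: ask, never pick

function resolveCriticalField<T>(candidates: T[]): Resolution<T> {
  if (candidates.length === 0) return { kind: "none" };
  if (candidates.length === 1) return { kind: "one", value: candidates[0] };
  return { kind: "many", options: candidates };
}
```

Token comparison is deliberately strict in this sketch: a reformatted number counts as unverified, which trades some false alarms for never letting a recomputed figure through.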
Two ways to consume it. Reter is the surface the rep lives in (Cue), or the substrate other tools call into: a read-only MCP server (resolve-account, get-account-read, list-drifting-accounts, get-synthesis, …) exposes the same intelligence over stdio and HTTP — RLS-scoped, rate-limited, audit-logged, no write path imported.
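As a rough illustration of "rate-limited, audit-logged, no write path imported", the wrapper below registers read handlers only, applies a fixed-window limit per tenant, and records every call. It is a dependency-free sketch that does not use an MCP SDK; the handler signature, limiter, and audit shape are all assumptions, and only the tool names come from the list above.

```typescript
// Hypothetical wrapper; tool names come from the README, the rest is assumed.
type ReadHandler = (tenantId: string, args: Record<string, string>) => unknown;
type CallResult =
  | { ok: true; result: unknown }
  | { ok: false; error: "unknown_tool" | "rate_limited" };

interface AuditEntry {
  at: number;
  tenantId: string;
  tool: string;
  outcome: "ok" | "unknown_tool" | "rate_limited";
}

function makeReadOnlyServer(
  handlers: Map<string, ReadHandler>, // only read handlers are ever registered
  maxPerWindow: number,
  windowMs: number,
  now: () => number,
) {
  const audit: AuditEntry[] = [];
  const windows = new Map<string, { start: number; count: number }>();

  function call(tenantId: string, tool: string, args: Record<string, string>): CallResult {
    const at = now();
    const handler = handlers.get(tool);
    if (!handler) {
      audit.push({ at, tenantId, tool, outcome: "unknown_tool" });
      return { ok: false, error: "unknown_tool" };
    }
    // Fixed-window limiter, one window per tenant.
    const w = windows.get(tenantId);
    if (!w || at - w.start >= windowMs) {
      windows.set(tenantId, { start: at, count: 1 });
    } else if (w.count >= maxPerWindow) {
      audit.push({ at, tenantId, tool, outcome: "rate_limited" });
      return { ok: false, error: "rate_limited" };
    } else {
      w.count += 1;
    }
    audit.push({ at, tenantId, tool, outcome: "ok" });
    // The tenant id reaches every handler so the query behind it can be RLS-scoped.
    return { ok: true, result: handler(tenantId, args) };
  }

  return { call, audit };
}
```

Because a write tool is simply absent from the handler map, a request for one fails as unknown_tool and still leaves an audit entry.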
The churn model is real (XGBoost, served at request time through a dependency-free TypeScript tree-walker in packages/ml/inference.ts — no Python in the hot path). The honest part is what happened to its numbers:
A single broad churn classifier hit ~0.99 CV AUC. That was temporal leakage: segment/territory and peer-context features were computed across the full training period, so every training row could see data from after its own cutoff. Out-of-sample at real cutoffs, broad churn collapses toward chance (~0.40–0.43). So instead of shipping the headline number, prediction was routed into the lanes that actually generalize: reactivation ranking ~0.84–0.86, material-risk / revenue-decline ~0.70–0.72. Catching my own leakage and pivoting is the point, not a footnote. Full post-mortem: packages/ml/CALIBRATION_PLAYBOOK.md.
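The leakage is easy to reproduce in miniature. The sketch below contrasts a peer-context feature computed over the whole period with the same feature computed as of a cutoff. The row shape and function names are hypothetical and the numbers in the usage note are made up; none of this is taken from packages/ml.

```typescript
// Hypothetical row shape; day is an integer day index to keep the example small.
interface OrderRow {
  accountId: string;
  segment: string;
  day: number;
  amount: number;
}

function mean(xs: number[]): number | null {
  return xs.length === 0 ? null : xs.reduce((a, b) => a + b, 0) / xs.length;
}

// Leaky: averages the whole training period, including days after the cutoff
// at which a given row's label is decided.
function segmentMeanLeaky(rows: OrderRow[], segment: string): number | null {
  return mean(rows.filter((r) => r.segment === segment).map((r) => r.amount));
}

// Point-in-time: only rows strictly before the cutoff may feed the feature.
function segmentMeanAsOf(rows: OrderRow[], segment: string, cutoffDay: number): number | null {
  return mean(
    rows.filter((r) => r.segment === segment && r.day < cutoffDay).map((r) => r.amount),
  );
}
```

With segment orders of 100 and 100 before day 3 and an order of 10 on day 5, the leaky mean is 70 while the as-of-day-3 mean is 100. The later drop is exactly the outcome a churn label at that cutoff is trying to predict, which is how a feature like this inflates cross-validated scores.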
Shipped and exercised by tests: the monorepo, Cue, the OMA tool set and the approval queue end to end, churn inference, the synthesis engine, and the MCP server. Scaffolded and inert: a survival model — the scoring code and schema exist, but no trained artifact or trainer ships, so load_survival_model() returns None. The agent-platform vision (more agents, per-agent RAG, unified observability) is design, not code, and is fenced off in ROADMAP.md. The current architecture, with file references, is ARCHITECTURE.md.
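For the churn inference listed above, a dependency-free tree walk can be sketched as follows. The node shape follows XGBoost's JSON dump format (nodeid, split, split_condition, yes, no, missing, children, leaf); the function names, the baseMargin parameter, and the logistic link are assumptions about the artifact, and this is not the code in packages/ml/inference.ts.

```typescript
// Node shape follows XGBoost's JSON dump format; everything else is assumed.
interface LeafNode {
  nodeid: number;
  leaf: number;
}
interface SplitNode {
  nodeid: number;
  split: string;           // feature name
  split_condition: number; // threshold
  yes: number;             // child id when value < threshold
  no: number;              // child id otherwise
  missing: number;         // child id when the feature is absent
  children: TreeNode[];
}
type TreeNode = LeafNode | SplitNode;
type Features = Record<string, number | undefined>;

function isLeaf(n: TreeNode): n is LeafNode {
  return "leaf" in n;
}

// Walk one tree from the root to a leaf and return the leaf's margin contribution.
function scoreTree(node: TreeNode, x: Features): number {
  if (isLeaf(node)) return node.leaf;
  const v = x[node.split];
  let nextId: number;
  if (v === undefined || Number.isNaN(v)) {
    nextId = node.missing;
  } else if (Math.fround(v) < Math.fround(node.split_condition)) {
    // XGBoost stores feature values and thresholds as float32, hence fround.
    nextId = node.yes;
  } else {
    nextId = node.no;
  }
  const next = node.children.find((c) => c.nodeid === nextId);
  if (!next) throw new Error(`tree is missing child ${nextId}`);
  return scoreTree(next, x);
}

// baseMargin is the bias in log-odds space (an assumption about the artifact).
function predictProba(trees: TreeNode[], x: Features, baseMargin = 0): number {
  const margin = trees.reduce((sum, t) => sum + scoreTree(t, x), baseMargin);
  return 1 / (1 + Math.exp(-margin)); // logistic link for a binary objective
}
```

A walker like this needs only the dumped trees at request time, which is what keeps Python out of the serving path.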
npm install
# Zero setup — the synthesis engine is fully self-contained:
npm --workspace @reter/synthesis test # 99 tests, no DB, no keys
# Full stack (needs Postgres/Supabase — see .env.example):
cd packages/db && npm run migrate
npm run dev # API (:3001) + web together

Cue and the OMA tools additionally need an ANTHROPIC_API_KEY (the runtime calls Claude, pinned to a fixed Sonnet version). Test suites: synthesis 99 · core 495 · api 119 · web 57 · db 36 · mcp-server 19 (vitest); packages/ml has its own pytest suite.
- ARCHITECTURE.md: what's built today, with file references.
- packages/synthesis/synthesize.ts: the pattern library and the three-beat narrative.
- packages/core/src/cue/runtime/session.ts: the agent loop.
- packages/core/src/oma/approval-queue/: the human-in-the-loop gate.
- docs/adr/: the decisions, including ADR 0004 on owning the runtime.