v0.36.2.0 feat: ZeroEntropy as default + zero-based README rewrite #1136

Merged
garrytan merged 15 commits into master from garrytan/milan-v1
May 19, 2026
@garrytan (Owner)

Summary

ZeroEntropy is the new default. Faster, cheaper, better quality on real queries. Existing users get a one-shot switch prompt with cost estimate; new installs land on it out of the box.

Track A — ZE as default:

  • DEFAULT_EMBEDDING_MODEL flipped to zeroentropyai:zembed-1 at 1280d via Matryoshka
  • balanced mode bundle now enables zerank-2 cross-encoder reranker by default
  • New RetrievalUpgradePlanner consolidates v0.32.7 chunker-bump + v0.36 ZE switch into ONE re-embed pass (no double-charge)
  • New gbrain ze-switch CLI with --dry-run/--undo/--resume/--force/--non-interactive
  • Tagged-union ApplyResult enum (six states, no string-parsing reasons)
  • Three config keys separate UI state / user intent / work-done
  • Full JSON snapshot in ze_switch_previous_snapshot for symmetric --undo
  • Two new doctor checks (ze_embedding_health, embedding_width_consistency)
  • OpenAI Matryoshka range validation extended in dims.ts (paste-ready AIConfigError.fix)
  • Schema transition atomic: DROP indexes + ALTER COLUMN + CREATE INDEX inside one transaction

Track B — README zero-based rewrite:

  • 884 lines → 139 lines (33 H2s → 8)
  • Cut three competing "New in v0.X.Y" hero blocks, the 136-line Commands section, and the 6-table skills enumeration
  • New docs/INSTALL.md, docs/architecture/RETRIEVAL.md, docs/ethos/ORIGIN.md
  • Hero retains every load-bearing fact: OpenClaw + Hermes credit, production numbers, BrainBench numbers, ZE comparison numbers

The numbers that matter (real-corpus benchmark, 20 queries, 17K-page brain):

| Metric | OpenAI | Voyage | ZeroEntropy |
|---|---|---|---|
| Top-1 wins | 6 | 4 | 11 |
| Avg latency | 973ms | 559ms | 442ms |
| Cost/1M tokens | $0.13 | $0.06 | $0.05 (regular) |
| Reshuffle as reranker | n/a | n/a | 60% |

Test Coverage

7 new test files (+80 cases):

  • test/retrieval-upgrade-planner.test.ts (24 cases) — state machine, atomicity, undo
  • test/ze-switch-cli.test.ts (11 cases) — every flag + envelope shape
  • test/doctor-ze-checks.test.ts (8 cases) — both new doctor checks
  • test/ai/dims-openai.test.ts (16 cases) — Matryoshka range validation
  • test/asymmetric-encoding-contract.test.ts (6 cases) — D17 behavior-mock for embedQuery on read path
  • test/balanced-reranker-default.test.ts (10 cases) — D6 mode-bundle flip + fail-open contract
  • test/readme-hero-anchors.test.ts (5 cases) — D9 regression guard for headline facts

Existing test updates: test/ai/gateway.test.ts, test/search-mode.test.ts, test/openai-compat-multimodal.test.ts, test/e2e/v0_28_5-fix-wave.test.ts updated for the new defaults.

Tests: 6907 pass / 0 fail (unit + serial, full suite ~7 min). E2E: 623/624 pass (1 pre-existing flake, voyage-multimodal.test.ts rejects test fixture — confirmed identical failure on master).

Pre-Landing Review

Two-pass /plan-eng-review ran on the plan before implementation: 18 decisions resolved across two interactive passes (D1–D18) + 4 baked-in nits from codex outside-voice review. Critical fixes caught pre-implementation:

Full review report in ~/.claude/plans/system-instruction-you-are-working-rippling-petal.md.

Plan Completion

All 11 implementation tasks (T1–T11) shipped:

  • T1–T6: Track A core (dims, planner, CLI, defaults, asymmetric encoding, doctor)
  • T7: README rewrite + 3 new docs files
  • T8: Regen llms.txt + llms-full.txt
  • T9: Version trio bump + migration skill
  • T10: README hero anchor test
  • T11: Full test suite green (verify + 6907 unit + serial + E2E lifecycle)

To take advantage of v0.36.2.0

gbrain upgrade runs the consolidated retrieval-upgrade prompt automatically.

  1. Run the upgrade:
    gbrain upgrade
  2. Read the comparison numbers in the prompt. Press s to switch, Enter to stay (default), l to ask later, n to never ask.
  3. If you switched: refill embeddings. The schema is rebuilt at 1280d but embeddings are NULL until you re-embed:
    gbrain embed --stale     # serial; cost estimate from the upgrade prompt
    The autopilot cycle's embed phase also walks through this on its own cadence.
  4. Verify:
    gbrain doctor
    ze_embedding_health should be green; embedding_width_consistency should show schema + config both at 1280.
  5. If you regret it: gbrain ze-switch --undo. Restores prior model + dim + reranker with a symmetric cost prompt.

Test plan

  • bun run verify (privacy + jsonb + progress + wasm + typecheck) green
  • Unit suite: 6907 pass / 0 fail across 8 parallel shards + serial
  • E2E with real Postgres: 623/624 pass (1 pre-existing Voyage real-API flake, unrelated)
  • Trio audit: VERSION + package.json + CHANGELOG header all 0.36.2.0
  • README hero anchor test: 5/5 pass (D9 regression guard)
  • After merge: re-ran full unit suite — 6907/6907 still green

Credit: ZeroEntropy (@zeroentropy) for the embedding + reranker stack. Codex outside-voice review caught the double-re-embed bug class pre-implementation.

🤖 Generated with Claude Code

garrytan and others added 10 commits May 17, 2026 19:01
dimsProviderOptions now fails loud at the embed boundary when the
configured embedding_dimensions is outside the model's native range
(1..1536 for -small, 1..3072 for -large). Paste-ready fix hint in the
AIConfigError.fix field. Closes the silent-HTTP-400 path that would
have bitten OpenAI-fallback users on v0.36.0.0 ZE-default installs.

16 new test cases in test/ai/dims-openai.test.ts pinning the contract
across native-openai and openai-compatible adapter paths.

…nker

Default embedding model is now zeroentropyai:zembed-1 at 1280d via
Matryoshka. Real-corpus benchmark: 2.2x faster than OpenAI, 2.6x
cheaper at regular pricing, wins 11/20 head-to-head queries.
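
The speed and cost multiples quoted above can be re-derived from the benchmark figures in the PR summary. The snippet below is only a worked check of that arithmetic; the object and function names are made up for illustration.

```typescript
// Benchmark figures copied from the PR summary table.
const latencyMs = { openai: 973, voyage: 559, zeroentropy: 442 };
const usdPerMillionTokens = { openai: 0.13, voyage: 0.06, zeroentropy: 0.05 };

// How many times larger the baseline is than the candidate.
function ratio(baseline: number, candidate: number): number {
  return baseline / candidate;
}

// 973 / 442, which rounds to the "2.2x faster" claim.
const speedupVsOpenAI = ratio(latencyMs.openai, latencyMs.zeroentropy);

// 0.13 / 0.05, which rounds to the "2.6x cheaper" claim.
const costRatioVsOpenAI = ratio(
  usdPerMillionTokens.openai,
  usdPerMillionTokens.zeroentropy,
);
```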

1280 is the closest valid ZE Matryoshka step to the prior OpenAI 1536d
default (valid set: 2560/1280/640/320/160/80/40). 1024 (Voyage's step)
is NOT on ZE's list — pinned by AIConfigError fail-loud in dims.ts.
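
The dimension rules described in these two commits can be sketched as a small validator. This is a hypothetical illustration, not the contents of dims.ts: `DimsError`, `validateEmbeddingDims`, and the bare model IDs used as keys are assumptions. Only the ZE step list and the 1..1536 / 1..3072 ranges come from the commit messages.

```typescript
// Hypothetical sketch: names are illustrative, not the real dims.ts API.
class DimsError extends Error {
  fix: string; // paste-ready remediation hint
  constructor(message: string, fix: string) {
    super(message);
    this.name = "DimsError";
    this.fix = fix;
  }
}

// Valid ZE Matryoshka steps, as listed in the commit message.
const ZE_STEPS: readonly number[] = [2560, 1280, 640, 320, 160, 80, 40];

// Upper bound of the native 1..max range for each OpenAI model.
const OPENAI_MAX_DIMS: Record<string, number> = {
  "text-embedding-3-small": 1536,
  "text-embedding-3-large": 3072,
};

function validateEmbeddingDims(model: string, dims: number): void {
  if (model === "zembed-1") {
    if (!ZE_STEPS.includes(dims)) {
      // Suggest the closest valid step so the hint can be pasted directly.
      const nearest = ZE_STEPS.reduce((best, step) =>
        Math.abs(step - dims) < Math.abs(best - dims) ? step : best);
      throw new DimsError(
        `${dims} is not a zembed-1 Matryoshka step`,
        `set embedding_dimensions to ${nearest}`,
      );
    }
    return;
  }
  const max = OPENAI_MAX_DIMS[model];
  if (max === undefined) return; // unknown model: nothing to check in this sketch
  if (!Number.isInteger(dims) || dims < 1 || dims > max) {
    throw new DimsError(
      `${dims} is outside 1..${max} for ${model}`,
      `set embedding_dimensions to a value between 1 and ${max}`,
    );
  }
}
```

With this sketch, asking for 1024 on zembed-1 throws and the hint points at 1280, the nearest step. That matches the reasoning above for picking 1280 as the replacement for the old 1536 default.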

balanced mode bundle now defaults reranker_enabled=true. zerank-2
reshuffles 60% of top-1 results in benchmarks. Missing-key fail-open
contract in src/core/search/rerank.ts handles unauthenticated cases.
Opt out with: gbrain config set search.reranker.enabled false

Existing tests updated (gateway.test.ts, search-mode.test.ts) and a
new test/balanced-reranker-default.test.ts (10 cases) pins the
fail-open invariants.

New src/core/retrieval-upgrade-planner.ts is the consolidated planner
that computes the brain's pending retrieval-upgrade work (chunker
bumps + ZE switch) in one pass and applies the schema transition +
config updates atomically.

Tagged-union ApplyResult enum (D15): 'applied' |
'skipped_already_applied' | 'skipped_no_work' | 'declined' |
'planned' | 'failed'. No string-parsing reasons.
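
A minimal sketch of what a tagged union over those six states could look like. The six state names come from the commit message; the `status` discriminant, the payload fields, and `describeResult` are assumptions made for illustration.

```typescript
// Hypothetical shape: only the six state names come from the commit message.
type ApplyResult =
  | { status: "applied"; chunksToReembed: number }
  | { status: "skipped_already_applied" }
  | { status: "skipped_no_work" }
  | { status: "declined" }
  | { status: "planned"; chunksToReembed: number }
  | { status: "failed"; error: string };

// Callers branch on the tag instead of parsing a free-form reason string.
function describeResult(result: ApplyResult): string {
  switch (result.status) {
    case "applied":
      return `applied; ${result.chunksToReembed} chunks to re-embed`;
    case "skipped_already_applied":
      return "already applied";
    case "skipped_no_work":
      return "nothing to do";
    case "declined":
      return "declined by user";
    case "planned":
      return `dry run; would re-embed ${result.chunksToReembed} chunks`;
    case "failed":
      return `failed: ${result.error}`;
    default: {
      // Compile-time exhaustiveness check: adding a seventh state without
      // handling it here becomes a type error.
      const unreachable: never = result;
      throw new Error(`unhandled result: ${JSON.stringify(unreachable)}`);
    }
  }
}
```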

Three config keys (D12): ze_switch_prompt_shown (UI state),
ze_switch_requested (user intent), ze_switch_applied (work done).
Plus ze_switch_previous_snapshot (JSON, full prior config for --undo
per D16) and ze_switch_declined_at (90-day re-ask window).

Schema transition (D18) is atomic: DROP indexes + ALTER COLUMN +
CREATE INDEX inside a single engine.transaction(). HNSW recreation
is part of the same transaction — no silent slow-search window.
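
A rough sketch of the single-transaction pattern described above. Everything here is an assumption for illustration: the `Engine` and `Tx` interfaces, the index name, the cosine operator class, and the explicit `SET embedding = NULL` step (added because the PR notes embeddings are NULL until re-embed). Only the DROP / ALTER / CREATE ordering, the HNSW index type, and the `content_chunks.embedding` column come from the PR text.

```typescript
// Hypothetical minimal interfaces; the real engine API may differ.
interface Tx {
  exec(sql: string): Promise<void>;
}
interface Engine {
  transaction<T>(fn: (tx: Tx) => Promise<T>): Promise<T>;
}

// Build the ordered statements for a width change.
function widthTransitionStatements(newDims: number): string[] {
  if (!Number.isInteger(newDims) || newDims < 1) {
    throw new Error(`invalid vector width: ${newDims}`);
  }
  return [
    "DROP INDEX IF EXISTS idx_chunks_embedding",
    // Old vectors have the wrong width, so clear them before retyping.
    "UPDATE content_chunks SET embedding = NULL",
    `ALTER TABLE content_chunks ALTER COLUMN embedding TYPE vector(${newDims})`,
    "CREATE INDEX idx_chunks_embedding ON content_chunks USING hnsw (embedding vector_cosine_ops)",
  ];
}

// Run every statement inside one transaction so a failure at any step
// rolls back to the previous schema, index included.
async function applyWidthTransition(engine: Engine, newDims: number): Promise<void> {
  const statements = widthTransitionStatements(newDims);
  await engine.transaction(async (tx) => {
    for (const sql of statements) {
      await tx.exec(sql);
    }
  });
}
```

Keeping index recreation inside the same transaction is what avoids the window where the column exists at the new width but searches fall back to a sequential scan.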

C3 eligibility logic: ze_switch_offered iff NOT on ZE + NOT declined
recently + NOT applied + (legacy default OR >100 pages).
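
The C3 rule reads naturally as a single boolean expression. The sketch below is an assumption-laden illustration: the `BrainState` shape, its field names, and the example model string are invented, and the 90-day window is taken from the `ze_switch_declined_at` note above.

```typescript
// Hypothetical state shape; field names are illustrative only.
interface BrainState {
  embeddingModel: string; // e.g. "zeroentropyai:zembed-1"
  declinedAt: Date | null; // when the user last declined, if ever
  applied: boolean; // switch already completed
  usesLegacyDefault: boolean; // still on the previous default model
  pageCount: number;
}

const REASK_WINDOW_MS = 90 * 24 * 60 * 60 * 1000; // 90-day re-ask window

function isZeSwitchOffered(state: BrainState, now: Date): boolean {
  const onZe = state.embeddingModel.startsWith("zeroentropyai:");
  const declinedRecently =
    state.declinedAt !== null &&
    now.getTime() - state.declinedAt.getTime() < REASK_WINDOW_MS;
  return (
    !onZe &&
    !declinedRecently &&
    !state.applied &&
    (state.usesLegacyDefault || state.pageCount > 100)
  );
}
```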

C4 cost math: MAX(chunker_pending, dim_pending) not SUM — one
re-embed pass invalidates both surfaces simultaneously.
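
A small worked example of the MAX-not-SUM rule. The function names, the average tokens-per-chunk parameter, and the sample counts are assumptions. The $0.05 per 1M tokens figure is the ZeroEntropy regular price from the PR summary.

```typescript
// One re-embed pass covers both kinds of pending work, so the chunk count
// to bill is the larger of the two, not their sum.
function pendingReembedChunks(chunkerPending: number, dimPending: number): number {
  return Math.max(chunkerPending, dimPending);
}

// Rough cost estimate; avgTokensPerChunk is a caller-supplied assumption.
function estimateCostUsd(
  chunks: number,
  avgTokensPerChunk: number,
  usdPerMillionTokens: number,
): number {
  return (chunks * avgTokensPerChunk * usdPerMillionTokens) / 1_000_000;
}

// Example: 40,000 chunks pending from a chunker bump and 120,000 pending
// from the width change are billed as 120,000 chunks, not 160,000.
const chunksToBill = pendingReembedChunks(40_000, 120_000);
const estimatedUsd = estimateCostUsd(chunksToBill, 300, 0.05);
```

Summing the two counts would charge twice for every chunk that appears in both sets, which is the double-charge the consolidated planner is meant to avoid.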

New src/core/retrieval-upgrade-prompt.ts wires the planner to a
TTY-only interactive prompt with two-line cost split (D10) and
privacy callout for the reranker flip.

Tests: test/retrieval-upgrade-planner.test.ts (24 cases) pins the
state machine. test/asymmetric-encoding-contract.test.ts (6 cases)
pins D17: search read path uses gateway.embedQuery() not embed(),
asserted via __setEmbedTransportForTests mock.

New gbrain ze-switch CLI with --dry-run, --json, --resume, --force,
--undo, --non-interactive, --confirm-reembed, --ignore-missing-key
flags. Mirrors the upgrade prompt's UX symmetry: --undo presents a
cost-warning before re-embedding back to the prior width.

src/cli.ts: dispatch case + CLI_ONLY entry. ze-switch owns its own
engine lifecycle (mirrors the doctor pattern).

test/ze-switch-cli.test.ts (11 cases): --help, --dry-run, --json,
--non-interactive, --ignore-missing-key, --resume, --undo,
--confirm-reembed. Uses captureExit harness to test process.exit()
paths without breaking the test process.

Two new doctor checks (D-A5):

ze_embedding_health: when embedding_model starts with zeroentropyai:,
verify ZEROENTROPY_API_KEY is set (env or config). Paste-ready setup
hint with the signup URL on failure.

embedding_width_consistency: cross-check that the configured
embedding_dimensions matches the actual vector(N) column width on
content_chunks.embedding. Catches the half-applied switch state
(schema migrated but config write crashed) with a paste-ready
gbrain ze-switch --resume hint.
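
The width cross-check can be sketched as a pure function over the configured dimension and the column type string. This is an illustration under assumptions: `CheckResult`, `parseVectorWidth`, and the status values are invented, and it assumes the column type is available as text such as `vector(1280)`.

```typescript
// Hypothetical result shape for a doctor check.
type CheckStatus = "ok" | "warn" | "fail";
interface CheckResult {
  name: string;
  status: CheckStatus;
  message: string;
  fix?: string; // paste-ready command, when one applies
}

// Extract N from a column type string like "vector(1280)".
function parseVectorWidth(columnType: string): number | null {
  const match = /^vector\((\d+)\)$/.exec(columnType.trim());
  return match ? Number(match[1]) : null;
}

function checkEmbeddingWidthConsistency(
  configuredDims: number,
  columnType: string,
): CheckResult {
  const name = "embedding_width_consistency";
  const actual = parseVectorWidth(columnType);
  if (actual === null) {
    return { name, status: "warn", message: `could not parse column type "${columnType}"` };
  }
  if (actual === configuredDims) {
    return { name, status: "ok", message: `schema + config both at ${actual}` };
  }
  // Drift: typically a half-applied switch where the schema moved but the
  // config write did not land.
  return {
    name,
    status: "fail",
    message: `config says ${configuredDims} but column is vector(${actual})`,
    fix: "gbrain ze-switch --resume",
  };
}
```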

Wired into runDoctor between reranker_health and the existing
sync_freshness checks. Both checks gracefully no-op on non-ZE
embedding configs.

test/doctor-ze-checks.test.ts (8 cases) pins both checks across
happy + missing-key + missing-config + drift paths. Uses withEnv()
helper to clear ZEROENTROPY_API_KEY for the no-key path so tests
are hermetic against contributor env state.

test/e2e/v0_28_5-fix-wave.test.ts + test/openai-compat-multimodal.test.ts:
updated to explicit-configure the gateway when the test depends on
specific dims that diverge from the v0.36.0.0 default (1280d).

Strip 4 months of accreted "New in v0.X.Y" hero blocks and reorganize
around what gbrain does today. 33 H2s -> 8. The Commands section
(136 lines duplicating gbrain --help) moved out; the 6-table skills
enumeration collapsed to a one-paragraph capability description with
a link to skills/RESOLVER.md.

Hero retains load-bearing facts: OpenClaw + Hermes credit, production
numbers (17,888 pages / 4,383 people / 723 companies), BrainBench
numbers (P@5 49.1% / R@5 97.9% / +31.4 lift), ZE comparison numbers,
30-min install claim. Adds one paragraph announcing the v0.36.0.0 ZE
default with the explicit gbrain config set escape for OpenAI/Voyage
users.

New files:
- docs/INSTALL.md: every install path consolidated (agent platform,
  CLI standalone, MCP server). Thin-client mode covered.
- docs/architecture/RETRIEVAL.md: why the hybrid + graph stack works.
  BrainBench numbers, why each strategy alone fails, the source-aware
  ranking + intent classification + multi-query expansion story.
- docs/ethos/ORIGIN.md: origin story lifted from the old README so
  the front door stays factual + concrete.

test/readme-hero-anchors.test.ts (5 cases) is the D9 regression
guard. Five load-bearing strings: OpenClaw, Hermes, ZE,
production-numbers regex, P@5/R@5. Light anchors that let voice/
structure evolve but block accidental loss of headline facts.
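
A minimal sketch of how such a light anchor guard could work. The regex patterns and the `missingAnchors` helper are assumptions, not the contents of test/readme-hero-anchors.test.ts. Only the five anchor categories come from the commit message.

```typescript
// Hypothetical anchors: one pattern per load-bearing fact.
const HERO_ANCHORS: { label: string; pattern: RegExp }[] = [
  { label: "OpenClaw credit", pattern: /OpenClaw/ },
  { label: "Hermes credit", pattern: /Hermes/ },
  { label: "ZeroEntropy mention", pattern: /ZeroEntropy/ },
  { label: "production numbers", pattern: /[\d,]+ pages/ },
  { label: "BrainBench P@5/R@5", pattern: /P@5[\s\S]*R@5/ },
];

// Return the labels of anchors that no longer appear in the README text.
function missingAnchors(readme: string): string[] {
  return HERO_ANCHORS
    .filter((anchor) => !anchor.pattern.test(readme))
    .map((anchor) => anchor.label);
}
```

A real test would read README.md from disk and assert that `missingAnchors` returns an empty array. Matching loose patterns rather than exact sentences is what lets the wording change without tripping the guard.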

scripts/check-test-real-names.sh: allowlist entries for OpenClaw +
Hermes literals in the anchor test (it explicitly asserts those
strings appear in README).

ZeroEntropy as the new default for embedding (zembed-1 at 1280d via
Matryoshka) and reranker (zerank-2 cross-encoder, on by default in
balanced mode bundle). README zero-based rewrite (884 -> 139 lines).
3 new docs files. Two new doctor checks. New gbrain ze-switch CLI
with --undo for symmetric reversibility.

skills/migrations/v0.36.0.0.md tells the agent how to surface the
retrieval-upgrade prompt post-upgrade.

llms-full.txt regenerated via bun run build:llms.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts:
#	CHANGELOG.md
#	README.md
#	VERSION
#	llms-full.txt
#	package.json
#	src/cli.ts

Three open PRs were claiming v0.36.0.0 (#1130 skillpack, #1139
hindsight, #1136 this PR). Ship-aware queue allocator says this
branch lands at v0.36.2.0.

Trio audit:
  VERSION       0.36.2.0
  package.json  0.36.2.0
  CHANGELOG     ## [0.36.2.0] - 2026-05-17

Updates: VERSION, package.json, CHANGELOG header + body refs,
README "New default in v0.36.2.0" announcement + credit line,
skills/migrations/v0.36.0.0.md renamed to v0.36.2.0.md with
frontmatter + body refs updated. llms-full.txt regenerated.

@garrytan changed the title from "v0.36.0.0 feat: ZeroEntropy as default + zero-based README rewrite" to "v0.36.2.0 feat: ZeroEntropy as default + zero-based README rewrite" on May 18, 2026
garrytan added 5 commits May 18, 2026 06:24
# Conflicts:
#	CHANGELOG.md
#	VERSION
#	package.json

CI shard 1 reported 10 failures across `query-cache.test.ts` (6) and
`consolidate-valid-until.test.ts` (4). Both files hardcode 1536-dim
vectors but rely on `PGLiteEngine.initSchema()` to size
`vector(__EMBEDDING_DIMS__)` at the right width.

Root cause: v0.36.2.0 flipped DEFAULT_EMBEDDING_DIMENSIONS from 1536
to 1280 (ZE Matryoshka step). The gateway module is process-singleton;
when ANOTHER test file in the same shard's bun-test process configures
the gateway before us, `pglite-engine.ts:216` reads
`getEmbeddingDimensions() === 1280` and sizes the schema columns at
vector(1280). The hardcoded 1536-dim INSERTs then fail with
"expected 1280 dimensions, not 1536".

Locally these tests pass in isolation because the gateway falls back
through the try/catch at pglite-engine.ts:218 (1536 default). CI runs
multiple test files in one process, so cross-file state poisons the
schema width.

Fix: explicit `resetGateway()` + `configureGateway({embedding_dimensions:
1536, ...})` at the top of `beforeAll`, plus `resetGateway()` in
`afterAll`. Pins the schema width regardless of cross-file state.
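
The failure mode and the fix can be reproduced with a toy process-wide singleton. The functions below are simplified stand-ins that borrow the names from the commit message. They are not the real gateway module, whose config takes more fields than the single one shown, and `schemaWidth` only imitates the try/catch fallback described above.

```typescript
// Stand-in for the process-singleton gateway; not the real implementation.
interface GatewayConfig {
  embedding_dimensions: number;
}

let currentConfig: GatewayConfig | null = null;

function configureGateway(config: GatewayConfig): void {
  currentConfig = config;
}

function resetGateway(): void {
  currentConfig = null;
}

function getEmbeddingDimensions(): number {
  if (currentConfig === null) throw new Error("gateway not configured");
  return currentConfig.embedding_dimensions;
}

// Imitates the schema sizing step: use the gateway's width when configured,
// otherwise fall back to 1536.
function schemaWidth(): number {
  try {
    return getEmbeddingDimensions();
  } catch {
    return 1536;
  }
}

// What the fixed test files do at the top of beforeAll: discard whatever an
// earlier test file left behind, then pin the width the fixtures expect.
function pinGatewayForTest(dims: number): void {
  resetGateway();
  configureGateway({ embedding_dimensions: dims });
}
```

In isolation `schemaWidth()` returns 1536 through the fallback. Once another file in the same process has configured 1280, it returns 1280 until the test pins its own width, which is why the failures only showed up in the shared CI shard.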
# Conflicts:
#	CHANGELOG.md
#	README.md
#	VERSION
#	llms-full.txt
#	package.json
#	scripts/check-test-real-names.sh
# Conflicts:
#	CHANGELOG.md
#	README.md
#	VERSION
#	llms-full.txt
#	package.json
# Conflicts:
#	CHANGELOG.md
#	VERSION
#	package.json
@garrytan merged commit cdba533 into master on May 19, 2026
7 checks passed