Evaluate extracting reusable packages from internal/

## Context

The `internal/` tree in knowing contains several components with clean boundaries that could serve broader use cases as independent Go packages. This issue tracks the evaluation of what to extract, when, and under what stability guarantees.

## Candidates

### Strong candidates (clean boundaries, general-purpose)

| Package | Current location | What it does | Potential consumers |
|---------|-----------------|-------------|-------------------|
| Hierarchical Merkle tree | `internal/snapshot/hierarchical.go` + `merkle.go` | Semantic-boundary Merkle trees with subgraph roots, typed diffs, context pack roots | Any content-addressed system: config management, infrastructure graphs, dependency trackers, audit systems |
| Subgraph cache | `internal/cache/subgraph.go` | TTL-bounded cache keyed by Merkle roots with selective package-scoped invalidation | Anything using content-addressed caching |
| GCF/TOON wire formats | `internal/wire/gcf.go` + `toon.go` | Token-optimized wire formats for LLM context delivery | Any MCP server, any tool sending structured data to agents |
| Community detection | `internal/community/` | Pluggable Algorithm interface with Louvain + label propagation | Graph analysis, visualization, social network analysis |
| Hash identity | `internal/types/types.go` (Hash, NewHash, domain prefixes, Verify) | Content-addressed identity with type-safe domain prefixes | Any Go project using SHA-256 content addressing |
| Equivalence classes | `internal/context/equivalence.go` + `universal_seeds.go` | Vocabulary bridging between natural language and code symbol names | Code search tools, developer-facing retrieval systems |

### Moderate candidates (need interface decoupling)

| Package | Blocker |
|---------|---------|
| Tree-sitter extractor | Depends on types package; needs a minimal interface |
| LSP enrichment | Depends on types + store; the pattern is reusable but needs abstraction |
| RWR + HITS algorithms | Graph algorithms are general but wrapped in knowing-specific scoring |

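One way to read "needs a minimal interface" for the graph algorithms is that the algorithm takes a narrow, read-only view of the graph instead of knowing's types or store. The sketch below shows random walk with restart (RWR) written against such a view. The `Graph` interface, the `adjacency` adapter, and the function signature are all hypothetical, and sending the mass of dangling nodes back to the seed is a design choice made for this sketch, not something taken from knowing.

```go
package main

import "fmt"

// Graph is a hypothetical minimal read-only view that an extracted package
// could depend on instead of knowing's own types and store.
type Graph interface {
	Nodes() []string
	Neighbors(id string) []string
}

// adjacency is a tiny in-memory adapter used only for this example.
type adjacency map[string][]string

func (a adjacency) Nodes() []string {
	ids := make([]string, 0, len(a))
	for id := range a {
		ids = append(ids, id)
	}
	return ids
}

func (a adjacency) Neighbors(id string) []string { return a[id] }

// RandomWalkWithRestart runs iters rounds of power iteration. At each step
// the walker jumps back to seed with probability restart, otherwise it moves
// to a uniformly chosen out-neighbor. Nodes without out-neighbors send their
// mass back to seed. Assumes seed and all neighbors appear in g.Nodes().
func RandomWalkWithRestart(g Graph, seed string, restart float64, iters int) map[string]float64 {
	p := map[string]float64{seed: 1}
	for i := 0; i < iters; i++ {
		next := make(map[string]float64, len(p))
		next[seed] += restart // total mass is 1, so the restart share is constant
		for _, n := range g.Nodes() {
			mass := p[n]
			if mass == 0 {
				continue
			}
			out := g.Neighbors(n)
			if len(out) == 0 {
				next[seed] += (1 - restart) * mass
				continue
			}
			share := (1 - restart) * mass / float64(len(out))
			for _, m := range out {
				next[m] += share
			}
		}
		p = next
	}
	return p
}

func main() {
	g := adjacency{
		"a": {"b", "c"},
		"b": {"c"},
		"c": {},
	}
	scores := RandomWalkWithRestart(g, "a", 0.15, 50)
	fmt.Println(scores)
}
```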
### Not extractable (too knowing-specific)

- `internal/store/sqlite.go` (schema is knowing-specific)
- `internal/mcp/` (tool definitions are product-specific)
- `internal/daemon/` (watcher + lifecycle is product-specific)
- `cmd/knowing/` (CLI is the product)

## Stability concerns

The Merkle tree API is NOT stable yet:
- Hash domain prefixes shipped 2026-05-18 (broke all existing hashes)
- File-level roots are planned (Phase 4, would add a tree level)
- The flat tree was just dropped (hierarchical root is now canonical)

**Do not extract until:**
1. Hash format is stable for at least one release cycle
2. File-level roots ship or are explicitly deferred
3. Subgraph cache has been validated by daily use
4. At least one external consumer validates the API

## Suggested extraction order

1. **GCF/TOON** first (helps MCP ecosystem, creates network effect, format is stable)
2. **Community detection** second (generic algorithms, no competitive advantage from keeping private)
3. **Equivalence classes** third (the concept is broadly useful, but knowing's current classes are tuned for knowing and would need generalizing)
4. **Hierarchical Merkle tree** last (the differentiator, extract only after API is stable)

## Decision

This is a tracking issue. No extraction should happen until the conditions above are met. The purpose is to maintain awareness of what's extractable so internal code stays clean at the boundaries.
