From bb15d3e2e961e18591e346bd9d1d4ffe496a5474 Mon Sep 17 00:00:00 2001
From: "kiloconnect[bot]" <240665456+kiloconnect[bot]@users.noreply.github.com>
Date: Sun, 15 Feb 2026 06:18:00 +0000
Subject: [PATCH] Add scratch files: decompose project docs into 13
 implementation topics

Create topic-focused scratch files consolidating all requirements from
8 source documents into bite-sized, implementable chunks:

- 01: Embedding Model Stack (Qwen3-VL-Embedding-8B, reranker, fallbacks)
- 02: Chunking Strategies (7 methods: fixed, sentence, semantic, recursive, AST, multimodal, fusion)
- 03: Metadata Schema (12-dimension chunk metadata with TypeScript interfaces)
- 04: Database Schema (SQLite tables, indexes, relationships)
- 05: MemVid Storage (H.265 video encoding, quad-encoding, FAISS)
- 06: Memory Hierarchy (Hot/Warm/Cold: ByteRover, Graphiti, MemVid)
- 07: Retrieval Pipeline (two-stage recall + rerank, hybrid search, cross-modal)
- 08: Orchestration & Concurrency (agentic swarm, workers, MCP servers)
- 09: Proxy/Shim (Claude wrapper, context injection, sanitization)
- 10: Sleep-Time Compute (autonomous refinement loops, Tribunal, Mutator)
- 11: Quality Assurance (validation, error handling, verification queries)
- 12: Domain Configuration (prompts/codebase/research routing)
- 13: Strategic Integrations (hypergraph, active inference, formal verification)

TASK_INDEX.md provides implementation order, dependency graph, complexity
estimates, and cross-document conflict resolution.
---
 scratch/01-embedding-model-stack.md     | 187 ++++++++++++++++
 scratch/02-chunking-strategies.md       | 202 +++++++++++++++++
 scratch/03-metadata-schema.md           | 285 ++++++++++++++++++++++++
 scratch/04-database-schema.md           | 253 +++++++++++++++++++++
 scratch/05-memvid-storage.md            | 206 +++++++++++++++++
 scratch/06-memory-hierarchy.md          | 123 ++++++++++
 scratch/07-retrieval-pipeline.md        | 169 ++++++++++++++
 scratch/08-orchestration-concurrency.md | 160 +++++++++++++
 scratch/09-proxy-shim.md                | 103 +++++++++
 scratch/10-sleep-time-compute.md        | 138 ++++++++++++
 scratch/11-quality-assurance.md         | 120 ++++++++++
 scratch/12-domain-configuration.md      | 125 +++++++++++
 scratch/13-strategic-integrations.md    |  96 ++++++++
 scratch/TASK_INDEX.md                   | 122 ++++++++++
 14 files changed, 2289 insertions(+)
 create mode 100644 scratch/01-embedding-model-stack.md
 create mode 100644 scratch/02-chunking-strategies.md
 create mode 100644 scratch/03-metadata-schema.md
 create mode 100644 scratch/04-database-schema.md
 create mode 100644 scratch/05-memvid-storage.md
 create mode 100644 scratch/06-memory-hierarchy.md
 create mode 100644 scratch/07-retrieval-pipeline.md
 create mode 100644 scratch/08-orchestration-concurrency.md
 create mode 100644 scratch/09-proxy-shim.md
 create mode 100644 scratch/10-sleep-time-compute.md
 create mode 100644 scratch/11-quality-assurance.md
 create mode 100644 scratch/12-domain-configuration.md
 create mode 100644 scratch/13-strategic-integrations.md
 create mode 100644 scratch/TASK_INDEX.md

diff --git a/scratch/01-embedding-model-stack.md b/scratch/01-embedding-model-stack.md
new file mode 100644
index 0000000..b8a8074
--- /dev/null
+++ b/scratch/01-embedding-model-stack.md
@@ -0,0 +1,187 @@
+# Topic: Embedding Model Stack
+
+## Summary
+Configuration, initialization, and integration of the Qwen3-VL multimodal embedding models and reranker for the RAG v3.0 system.
+
+---
+
+## Primary Model: Qwen3-VL-Embedding-8B
+
+> **Source:** [opus-prd1-v3.md](../opus-prd1-v3.md), [opus-prd2-v3.md](../opus-prd2-v3.md), [opus-prd3-v3.md](../opus-prd3-v3.md), [docs/UNIFIED_PRD.md](../docs/UNIFIED_PRD.md)
+
+- **Model ID:** `Qwen/Qwen3-VL-Embedding-8B`
+- **Released:** January 7-8, 2026 (arXiv:2601.04720)
+- **Parameters:** 8.14B
+- **Layers:** 36
+- **Architecture:** Dual-Tower (qwen3_vl)
+- **Context Length:** 32,768 tokens maximum (default `max_length`: 8,192)
+- **Native Embedding Dimensions:** 4096
+- **MRL Support:** Yes — options: [256, 512, 1024, 2048, 4096]
+  - Storage dimension: 1024 (truncated for MemVid efficiency)
+  - Retrieval dimension: 2048 (higher precision for queries)
+- **Quantization:** bf16 (recommended), fp16, int8, int4
+- **Instruction-Aware:** Yes
+
+### Benchmarks
+| Benchmark | Score |
+|-----------|-------|
+| MMEB-V2 | 77.8 (Rank #1) |
+| MMTEB | 67.88 |
+| Image Retrieval | 80.0 |
+| Video Retrieval | 67.1 |
+| VisDoc Retrieval | 82.4 |
+
+### Supported Input Modalities
+- Pure text
+- Pure image
+- Pure video
+- Text + image (mixed)
+- Text + video (mixed)
+- Image + video (mixed)
+- Text + image + video (mixed)
+- Screenshots (treated as images with OCR awareness)
+
+### Vision Configuration
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md)
+
+- `min_pixels`: 4096
+- `max_pixels`: 1,843,200 (1280×1440)
+- `total_video_pixels`: 7,864,320
+- `default_fps`: 1.0
+- `default_frames`: 64
+- `max_frames`: 64
+
+### Inference Configuration
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md)
+
+- `torch_dtype`: bfloat16
+- `attn_implementation`: flash_attention_2
+- `device_map`: auto
+
+### Architecture Details
+> **Source:** [opus-prd3-v3.md](../opus-prd3-v3.md)
+
+- Extracts `[EOS]` token hidden state from last layer as final representation
+- Cross-modal pretraining with unified modality projection
+- Integrates supervised tasks, masked modeling, and multimodal alignment objectives
+- Enables efficient independent encoding for large-scale retrieval
+
+---
+
+## Boundary Detection Model: Qwen3-Embedding-0.6B
+
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md), [docs/UNIFIED_PRD.md](../docs/UNIFIED_PRD.md)
+
+- **Model ID:** `Qwen/Qwen3-Embedding-0.6B`
+- **Type:** Text-only
+- **Parameters:** 595.8M
+- **Native Dimensions:** 1024
+- **Purpose:** Cheap/fast similarity detection for semantic chunking boundary detection
+
+---
+
+## Reranker: Qwen3-VL-Reranker-8B
+
+> **Source:** [opus-prd1-v3.md](../opus-prd1-v3.md), [opus-prd2-v3.md](../opus-prd2-v3.md)
+
+- **Model ID:** `Qwen/Qwen3-VL-Reranker-8B`
+- **Parameters:** 8.14B
+- **Layers:** 36
+- **Architecture:** Single-Tower with Cross-Attention
+- **Input:** (Query, Document) pairs — both can be mixed-modal
+- **Output:** Relevance score (via yes/no token generation probability)
+- **Supported Modalities:** text, image, video, mixed
+- **Inference:** bfloat16, flash_attention_2
+
+### Smaller Variant: Qwen3-VL-Reranker-2B
+- **Parameters:** 2.13B
+- Same architecture (Single-Tower)
+
+---
+
+## Fallback Model: Qwen3-Embedding-8B (Text-Only)
+
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md)
+
+- **Model ID:** `Qwen/Qwen3-Embedding-8B`
+- **Type:** Text-only
+- **Parameters:** 7.57B
+- **Native Dimensions:** 4096
+- **MTEB Score:** 70.58 (Rank #1)
+- **Note:** Higher MTEB score than VL model (70.58 vs 67.88) but lacks multimodal capabilities
+
+---
+
+## Model Initialization Code
+
+> **Source:** [chatgpt5.2-prd.md](../chatgpt5.2-prd.md) (Phase 1), [opus-prd1-v3.md](../opus-prd1-v3.md)
+
+```python
+import torch
+from src.models.qwen3_vl_embedding import Qwen3VLEmbedder
+from src.models.qwen3_vl_reranker import Qwen3VLReranker
+
+# Primary Embedding Model
+embedder = Qwen3VLEmbedder(
+    model_name_or_path="Qwen/Qwen3-VL-Embedding-8B",
+    max_length=8192,
+    min_pixels=4096,
+    max_pixels=1843200,
+    total_pixels=7864320,
+    fps=1.0,
+    num_frames=64,
+    max_frames=64,
+    torch_dtype=torch.bfloat16,
+    attn_implementation="flash_attention_2"
+)
+
+# Precision Reranker
+reranker = Qwen3VLReranker(
+    model_name_or_path="Qwen/Qwen3-VL-Reranker-8B",
+    torch_dtype=torch.bfloat16,
+    attn_implementation="flash_attention_2"
+)
+```
+
+---
+
+## Alternative Models Considered
+
+> **Source:** [chatgpt5.2-prd.md](../chatgpt5.2-prd.md)
+
+- **Gemini Text-Embedding-001:** Upcoming model (replacing text-embedding-004), expected January 16, 2026. Considered as alternative/complement.
+- **Qwen3-VL-Embedding-2B:** Lightweight variant (2.13B params, 2048 dims, MMEB-V2: 73.2)
+
+---
+
+## Cost Analysis
+
+> **Source:** [docs/UNIFIED_PRD.md](../docs/UNIFIED_PRD.md), [chatgpt5.2-prd.md](../chatgpt5.2-prd.md)
+
+| Component | Model | Cost |
+|-----------|-------|------|
+| Embedding | Qwen3-VL-Embedding-8B | ~$0.03/1M tokens* |
+| Reranking | Qwen3-VL-Reranker-8B | ~$0.05/1M tokens* |
+| Ingestion (35MB, one-time) | - | ~$0.10 |
+| Queries (10K/day, annual) | - | ~$5.00 |
+
+*Estimated. The models are not yet on OpenRouter, so these figures assume either self-hosting or waiting for API availability.
+
+---
+
+## Implementation Requirements
+
+1. Set up Qwen3-VL-Embedding-8B environment with flash_attention_2
+2. Implement model wrapper classes (`Qwen3VLEmbedder`, `Qwen3VLReranker`)
+3. Support MRL dimension truncation for storage vs retrieval
+4. Implement multimodal input preprocessing (text, image, video, mixed)
+5. Add fallback to text-only model on multimodal failure
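+6. Integrate with OpenRouter for remote inference once the models are listed there; until then, self-hosting may be required (see the Hosting item under Conflicts / Ambiguities)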
+
+---
+
+## Conflicts / Ambiguities
+
+- **⚠️ Dimension mismatch:** chatgpt5.2-prd.md mentions "1526 or 3746 or 3182" as possible embedding sizes — these don't match the actual Qwen3-VL dimensions (4096 native, MRL options: 256/512/1024/2048/4096). The opus PRDs provide the correct values.
+- **⚠️ Gemini alternative:** chatgpt5.2-prd.md suggests potentially using both Qwen and Gemini embeddings. No other document addresses dual-embedding strategy.
+- **⚠️ Hosting:** chatgpt5.2-prd.md assumes OpenRouter availability; cost estimates are speculative since the model may require self-hosting.
diff --git a/scratch/02-chunking-strategies.md b/scratch/02-chunking-strategies.md
new file mode 100644
index 0000000..e90692c
--- /dev/null
+++ b/scratch/02-chunking-strategies.md
@@ -0,0 +1,202 @@
+# Topic: Chunking Strategies
+
+## Summary
+Seven distinct chunking methods for ingesting different content types (text, code, mixed-modal) into the RAG system. Covers per-method configuration, routing logic, and the four-layer epistemic scaffolding model.
+
+---
+
+## Conceptual Framework: Four-Layer Epistemic Scaffolding
+
+> **Source:** [opus-prd3-v3.md](../opus-prd3-v3.md), [docs/UNIFIED_PRD.md](../docs/UNIFIED_PRD.md)
+
+Chunking has evolved into a four-layer system:
+1. **Fixed-length chunking** — mechanical, deterministic
+2. **Sentence/semantic-unit chunking** — linguistic awareness
+3. **Semantic coherence chunking (agentic)** — meaning-aware boundaries
+4. **Recursive hierarchical chunking (agentic)** — document-structure-aware
+
+---
+
+## Method 1: Fixed-Size Chunking
+
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md), [chatgpt5.2-prd.md](../chatgpt5.2-prd.md), [docs/UNIFIED_PRD.md](../docs/UNIFIED_PRD.md)
+
+- **Window tokens:** 512
+- **Overlap tokens:** 50
+- **Applies to:** configuration files, data files
+- **Modalities:** text only
+- **Agent required:** No — can be done programmatically
+
+### Implementation Notes
+> **Source:** [chatgpt5.2-prd.md](../chatgpt5.2-prd.md)
+
+- Length-based chunking can be done programmatically without an LLM agent
+- Simplest method; serves as the fallback when AST chunking fails to parse (see `fallback_to_fixed` under Method 5)
+
+---
+
+## Method 2: Sentence-Based Chunking
+
+> **Source:** [docs/UNIFIED_PRD.md](../docs/UNIFIED_PRD.md), [chatgpt5.2-prd.md](../chatgpt5.2-prd.md)
+
+- **Window size:** 3 sentences
+- **Min chunk tokens:** 128
+- **Max chunk tokens:** 2048
+- **Agent required:** No — can be done programmatically
+
+---
+
+## Method 3: Semantic Chunking (Agentic)
+
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md), [docs/UNIFIED_PRD.md](../docs/UNIFIED_PRD.md), [chatgpt5.2-prd.md](../chatgpt5.2-prd.md)
+
+- **Similarity threshold:** 0.75
+- **Window size:** 3 sentences
+- **Boundary detection model:** `Qwen/Qwen3-Embedding-0.6B`
+- **Min chunk tokens:** 128
+- **Max chunk tokens:** 2048
+- **Applies to:** documentation, research papers
+- **Modalities:** text only
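+- **Agent required:** Disputed. chatgpt5.2-prd.md says yes; opus-prd2-v3.md treats boundary detection as algorithmic (embedding-similarity threshold using the 0.6B model, no LLM agent). See Conflicts / Ambiguities.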
+
+### How It Works
+> **Source:** [opus-prd3-v3.md](../opus-prd3-v3.md)
+
+Embeds adjacent sentence windows with the lightweight 0.6B model and compares their similarity to detect topic shifts. When similarity between two neighboring windows drops below the threshold (0.75), a chunk boundary is placed between them. Using the small model keeps boundary detection cheap.
+
+---
+
+## Method 4: Recursive Hierarchical Chunking (Agentic)
+
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md), [docs/UNIFIED_PRD.md](../docs/UNIFIED_PRD.md), [chatgpt5.2-prd.md](../chatgpt5.2-prd.md)
+
+- **Chunk size tokens:** 1024
+- **Overlap tokens:** 100
+- **Separators** (in priority order):
+  1. `"\n\n"` — Paragraphs
+  2. `"\n"` — Lines
+  3. `". "` — Sentences
+  4. `" "` — Words (last resort)
+- **Applies to:** documentation, conversation
+- **Modalities:** text only
+- **Agent required:** Yes — requires understanding of document structure
+
+---
+
+## Method 5: AST Structural Chunking (Code)
+
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md), [docs/UNIFIED_PRD.md](../docs/UNIFIED_PRD.md)
+
+### Supported Languages & Parsers
+
+| Language | Parser | AST Nodes |
+|----------|--------|-----------|
+| Python | tree-sitter-python | function_definition, class_definition, decorated_definition |
+| TypeScript | tree-sitter-typescript | function_declaration, class_declaration, method_definition, interface_declaration |
+| JavaScript | tree-sitter-javascript | function_declaration, class_declaration, method_definition |
+| Go | tree-sitter-go | function_declaration, method_declaration, type_declaration |
+| Rust | tree-sitter-rust | function_item, impl_item, struct_item, trait_item |
+| Java | tree-sitter-java | method_declaration, class_declaration, constructor_declaration, interface_declaration |
+
+### Configuration
+- `prepend_parent_context`: true
+- `preserve_docstrings`: true
+- `preserve_imports`: true
+- `extract_dependencies`: true
+- `compute_complexity`: true
+- `fallback_to_fixed`: true (falls back to fixed-size 512 tokens on parse failure)
+
+### Applies to
+- Content types: code
+- Modalities: text
+
+---
+
+## Method 6: Multimodal Boundary Detection (NEW)
+
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md), [docs/UNIFIED_PRD.md](../docs/UNIFIED_PRD.md)
+
+- **Visual context window:** 1 paragraph before/after
+- **Caption detection:** true
+- **Figure reference detection:** true
+- **Preserve figure-caption pairs:** true
+- **Applies to:** documentation, research papers
+- **Modalities:** mixed_text_image, mixed_all
+
+### Purpose
+Detects boundaries between text and visual content in mixed documents. Ensures figures, diagrams, and their captions are kept together as coherent chunks.
+
+---
+
+## Method 7: Screenshot-Code Fusion (NEW)
+
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md), [docs/UNIFIED_PRD.md](../docs/UNIFIED_PRD.md)
+
+- **Matching strategies:**
+  - `filename_similarity` — match screenshots to code files by name
+  - `ocr_text_matching` — extract text from screenshots, match to code
+  - `reference_comment_detection` — find code comments referencing screenshots
+- **Applies to:** code
+- **Modalities:** mixed_text_image
+
+### Purpose
+Fuses UI screenshots with the code that generates them, creating cross-modal chunks that link visual output to source code.
+
+---
+
+## Content Type → Chunking Method Routing
+
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md) (domains section)
+
+| Domain | Chunking Methods |
+|--------|-----------------|
+| Prompts | semantic, fixed_size |
+| Codebase | ast_structural, fixed_size, screenshot_code_fusion |
+| Research | recursive_hierarchical, semantic, multimodal_boundary |
+
+---
+
+## Asynchronous / Multi-Agent Chunking
+
+> **Source:** [chatgpt5.2-prd.md](../chatgpt5.2-prd.md)
+
+### Agent Assignment by Method
+- **Fixed-size & Sentence-based:** Programmatic (no LLM needed)
+- **Semantic chunking:** Requires LLM intelligence — can use Haiku/Flash-class model
+- **Recursive hierarchical:** Requires higher intelligence — Sonnet/Pro-class model recommended
+
+### Key Questions from Requirements
+- Can semantic and recursive hierarchical chunking be done in a single pass by one agent, or do they require separate passes?
+- The user suggests asynchronous processing across files is ideal given the multi-file corpus
+
+### Model Recommendations for Agentic Chunking
+> **Source:** [chatgpt5.2-prd.md](../chatgpt5.2-prd.md)
+
+- Sonnet/Gemini Pro class: For recursive hierarchical chunking
+- Haiku/Gemini Flash class: For semantic chunking
+- Free models (e.g., MIMO V2 via OpenRouter): For simpler tasks
+
+---
+
+## Quad Encoding (MemVid-Specific Chunking)
+
+> **Source:** [gemini-prd.md](../gemini-prd.md) (Appendix I)
+
+MemVid uses "Quad Encoding" — encoding the same content at four resolutions:
+
+| Resolution | What it Encodes | Agent Query Type |
+|-----------|----------------|-----------------|
+| Word (Token) | Keywords & Entities | Exact definitions, variable names |
+| Sentence | Discrete Facts | Return types, specific error codes |
+| Paragraph | Local Context | How a flow handles edge cases |
+| Boundary | Relationships & Flow | What connects between sections |
+
+Quad encoding runs during sleep-time compute rather than in real time, because encoding the same content at four resolutions carries a 4x embedding cost.
+
+---
+
+## Conflicts / Ambiguities
+
+- **⚠️ Chunk size inconsistency:** chatgpt5.2-prd.md mentions "1.5-3K tokens" for chunks; opus-prd2-v3.md specifies 512 tokens (fixed), 1024 tokens (recursive), 128-2048 tokens (semantic). The AGGREGATION_PLAN.md lists yet another set: "1.5-3k tokens with 200-400 token overlap" for fixed-size. The opus-prd2 YAML config should be treated as authoritative.
+- **⚠️ Number of methods:** UNIFIED_PRD.md lists 7 methods; chatgpt5.2-prd.md discusses 4 core methods; opus-prd2-v3.md defines 6 in YAML config. The 7-method list (adding sentence-based as distinct from semantic) is the most complete.
+- **⚠️ Agentic vs programmatic:** chatgpt5.2-prd.md suggests semantic and recursive hierarchical need LLM agents; opus-prd2-v3.md treats semantic chunking as algorithmic (embedding similarity threshold). Resolution: semantic chunking uses the lightweight 0.6B model algorithmically, not a full LLM agent.
\ No newline at end of file
diff --git a/scratch/03-metadata-schema.md b/scratch/03-metadata-schema.md
new file mode 100644
index 0000000..6d20688
--- /dev/null
+++ b/scratch/03-metadata-schema.md
@@ -0,0 +1,285 @@
+# Topic: Metadata Schema (12 Dimensions)
+
+## Summary
+The 12-dimensional chunk metadata schema for the RAG v3.0 system, including TypeScript interfaces and YAML configuration for enabling/disabling dimensions.
+
+---
+
+## Schema Overview
+
+> **Source:** [opus-prd1-v3.md](../opus-prd1-v3.md) (Phase 2), [docs/SCHEMA_REFERENCE.md](../docs/SCHEMA_REFERENCE.md), [docs/UNIFIED_PRD.md](../docs/UNIFIED_PRD.md)
+
+The metadata schema has 12 dimensions, each capturing a different aspect of chunk information:
+
+```
+1. IDENTITY       — Unique identification and versioning
+2. PROVENANCE     — Complete audit trail
+3. CONTENT        — What the chunk contains
+4. STRUCTURE      — How the chunk was created
+5. HIERARCHY      — Document structure preservation
+6. SEMANTIC       — Extracted meaning and classification
+7. CODE_SPECIFIC  — Code-only metadata (AST, complexity, imports)
+8. MULTIMODAL     — Cross-modal relationships
+9. EMBEDDING      — Vector representation metadata
+10. GRAPH         — Knowledge graph relationships
+11. QUALITY       — Quality metrics and validation
+12. RETRIEVAL     — Retrieval analytics and feedback
+```
+
+---
+
+## Dimension 1: IDENTITY
+
+> **Source:** [docs/SCHEMA_REFERENCE.md](../docs/SCHEMA_REFERENCE.md), [opus-prd1-v3.md](../opus-prd1-v3.md)
+
+| Field | Type | Description |
+|-------|------|-------------|
+| chunk_id | string | UUID v7 (time-sortable) |
+| content_hash | string | SHA-256 of raw content (deduplication) |
+| version | number | Incremental version for updates |
+| parent_chunk_id | string/null | If this is a sub-chunk |
+| root_document_id | string | Original document this came from |
+| corpus_id | string | Which corpus/domain (prompts/code/research) |
+
+**Config:** `generate_uuid_v7: true`, `compute_content_hash: true`
+
+---
+
+## Dimension 2: PROVENANCE
+
+> **Source:** [docs/SCHEMA_REFERENCE.md](../docs/SCHEMA_REFERENCE.md), [opus-prd1-v3.md](../opus-prd1-v3.md)
+
+| Field | Type | Description |
+|-------|------|-------------|
+| source_uri | string | file://path or https://url |
+| source_type | enum | local_file, git_repo, web_url, api, user_upload |
+| git_metadata | object | repository, commit_sha, branch, timestamp, author, file_path |
+| author | object | name, email, organization |
+| license | string | SPDX identifier |
+| created_at | string | ISO 8601 |
+| modified_at | string | ISO 8601 |
+| ingested_at | string | ISO 8601 |
+| ingestion_pipeline_version | string | e.g., "3.0.0" |
+
+**Config:** `git_integration: true`, `track_authors: true`, `track_license: true`
+
+---
+
+## Dimension 3: CONTENT
+
+> **Source:** [docs/SCHEMA_REFERENCE.md](../docs/SCHEMA_REFERENCE.md)
+
+| Field | Type | Description |
+|-------|------|-------------|
+| content_type | enum | code, documentation, research_paper, prompt, configuration, data, conversation, mixed |
+| modalities | Modality[] | text, image, video, audio, screenshot, diagram |
+| primary_modality | Modality | Dominant modality |
+| language.natural | string | ISO 639-1 (e.g., 'en') |
+| language.programming | string | e.g., 'python', 'typescript' |
+| mime_type | string | e.g., 'text/markdown' |
+| byte_size | number | Size in bytes |
+| encoding | string | e.g., 'utf-8' |
+
+**Config:** `detect_language: true`, `detect_modalities: true`
+
+---
+
+## Dimension 4: STRUCTURE
+
+> **Source:** [docs/SCHEMA_REFERENCE.md](../docs/SCHEMA_REFERENCE.md)
+
+| Field | Type | Description |
+|-------|------|-------------|
+| chunking_method | enum | fixed_size, sentence_based, semantic, recursive_hierarchical, ast_structural, multimodal_boundary, manual |
+| chunking_config | object | target_tokens, overlap_tokens, similarity_threshold, separators |
+| token_count | number | Token count |
+| char_count | number | Character count |
+| word_count | number | Word count |
+| line_count | number | Line count |
+| overlap.previous_chunk_id | string | Previous chunk reference |
+| overlap.previous_overlap_tokens | number | Overlap with previous |
+| overlap.next_chunk_id | string | Next chunk reference |
+| overlap.next_overlap_tokens | number | Overlap with next |
+| boundaries.start_offset | number | Start byte offset in source |
+| boundaries.end_offset | number | End byte offset in source |
+| boundaries.start_line | number | Start line number |
+| boundaries.end_line | number | End line number |
+
+**Config:** `track_overlaps: true`, `track_boundaries: true`
+
+---
+
+## Dimension 5: HIERARCHY
+
+> **Source:** [docs/SCHEMA_REFERENCE.md](../docs/SCHEMA_REFERENCE.md)
+
+| Field | Type | Description |
+|-------|------|-------------|
+| depth_level | number | 0=root, 1=section, 2=subsection... |
+| section_path | string[] | e.g., ["Chapter 1", "Introduction", "Background"] |
+| heading_text | string | Current section heading |
+| parent_heading | string | Parent section heading |
+| document_position | object | section_index, chunk_index_in_section, total_chunks_in_section, global_chunk_index, total_document_chunks |
+| sibling_chunk_ids | string[] | Other chunks at same level |
+| child_chunk_ids | string[] | Sub-chunks if hierarchical |
+
+**Config:** `max_depth: 10`, `track_siblings: true`
+
+---
+
+## Dimension 6: SEMANTIC
+
+> **Source:** [docs/SCHEMA_REFERENCE.md](../docs/SCHEMA_REFERENCE.md)
+
+| Field | Type | Description |
+|-------|------|-------------|
+| topic_cluster_id | string | Cluster assignment from topic modeling |
+| topic_keywords | string[] | Top keywords for this topic |
+| topic_confidence | number | Confidence score |
+| entities | NamedEntity[] | Extracted named entities with type, confidence, offsets |
+| keywords | object[] | term, tfidf_score, is_technical |
+| summary | string | Auto-generated 1-2 sentence summary |
+| intent_classification | object | primary_intent (explanation/tutorial/reference), confidence |
+| sentiment | object | polarity (-1 to 1), subjectivity (0 to 1) |
+| reading_level | string | technical, beginner, expert |
+
+**Entity Types:** PERSON, ORG, PRODUCT, TECH, CONCEPT, LOCATION, DATE, CODE_ELEMENT
+
+**Config:** `extract_entities: true`, `extract_keywords: true`, `generate_summaries: true`, `classify_intent: true`
+
+---
+
+## Dimension 7: CODE_SPECIFIC
+
+> **Source:** [docs/SCHEMA_REFERENCE.md](../docs/SCHEMA_REFERENCE.md)
+
+| Field | Type | Description |
+|-------|------|-------------|
+| ast_node_type | enum | module, class_definition, function_definition, method_definition, etc. |
+| parent_scope | string | e.g., "ClassName.method_name" |
+| fully_qualified_name | string | e.g., "module.ClassName.method_name" |
+| signature | string | Function/method signature |
+| return_type | string | Return type |
+| parameters | object[] | name, type, default_value |
+| imports | object[] | module, items, is_relative |
+| exports | string[] | Exported symbols |
+| docstring | object | summary, params, returns, raises, examples |
+| complexity | object | cyclomatic, cognitive, lines_of_code, lines_of_comments |
+| dependencies | object | internal (same codebase), external (packages) |
+| test_coverage | object | covered, test_file, coverage_percentage |
+
+**Config:** `extract_docstrings: true`, `compute_complexity: true`, `track_dependencies: true`, `track_test_coverage: false`
+
+---
+
+## Dimension 8: MULTIMODAL
+
+> **Source:** [docs/SCHEMA_REFERENCE.md](../docs/SCHEMA_REFERENCE.md)
+
+| Field | Type | Description |
+|-------|------|-------------|
+| visual_elements | VisualElement[] | figure, table, diagram, screenshot, equation, chart |
+| referenced_images | string[] | Image chunk IDs referenced |
+| referenced_code_blocks | string[] | Code chunk IDs referenced |
+| referenced_videos | string[] | Video chunk IDs referenced |
+| cross_modal_links | CrossModalLink[] | Links between modalities |
+| diagram_analysis | object | diagram_type, extracted_nodes, extracted_relationships |
+| ocr_extraction | object | full_text, confidence, language_detected |
+
+**CrossModalLink relationship types:** references, illustrates, implements, documents, derives_from, related_to
+
+**Config:** `extract_visual_elements: true`, `run_ocr: true`, `detect_diagram_types: true`, `build_cross_modal_links: true`
+
+---
+
+## Dimension 9: EMBEDDING
+
+> **Source:** [docs/SCHEMA_REFERENCE.md](../docs/SCHEMA_REFERENCE.md)
+
+| Field | Type | Description |
+|-------|------|-------------|
+| model_id | string | e.g., "qwen/qwen3-vl-embedding-8b" |
+| model_version | string | Model version |
+| native_dimensions | number | Original output dims (e.g., 4096) |
+| stored_dimensions | number | After MRL truncation (e.g., 1024) |
+| mrl_truncated | boolean | Whether MRL was applied |
+| quantization | string | bf16, fp16, int8, int4 |
+| instruction_used | string | The instruction prefix used |
+| embedding_hash | string | Hash of the embedding vector |
+| embedded_at | string | ISO 8601 timestamp |
+
+**Config:** `track_model_version: true`, `compute_embedding_hash: true`
+
+---
+
+## Dimension 10: GRAPH
+
+> **Source:** [docs/SCHEMA_REFERENCE.md](../docs/SCHEMA_REFERENCE.md)
+
+| Field | Type | Description |
+|-------|------|-------------|
+| incoming_refs | object[] | source_chunk_id, relationship_type, weight |
+| outgoing_refs | object[] | target_chunk_id, relationship_type, weight |
+| semantic_neighbors | object[] | chunk_id, similarity_score, model_id |
+| coreference_chain | string | Coreference chain ID |
+| dependency_graph | object | upstream_ids, downstream_ids |
+
+**Config:** `compute_semantic_neighbors: true`, `neighbor_top_k: 10`, `track_coreferences: false` (expensive, optional)
+
+---
+
+## Dimension 11: QUALITY
+
+> **Source:** [docs/SCHEMA_REFERENCE.md](../docs/SCHEMA_REFERENCE.md)
+
+| Field | Type | Description |
+|-------|------|-------------|
+| confidence_score | number | Overall confidence (0-1) |
+| validation_status | enum | valid, warning, error, pending |
+| error_flags | string[] | List of detected issues |
+| review_status | enum | auto_approved, needs_review, reviewed, rejected |
+| chunking_quality | object | coherence_score, completeness_score, boundary_quality |
+
+**Config:** `validate_chunks: true`, `compute_coherence: true`
+
+---
+
+## Dimension 12: RETRIEVAL
+
+> **Source:** [docs/SCHEMA_REFERENCE.md](../docs/SCHEMA_REFERENCE.md)
+
+| Field | Type | Description |
+|-------|------|-------------|
+| access_count | number | Times retrieved |
+| retrieval_success_rate | number | How often selected after retrieval |
+| user_feedback_score | number | Aggregated user feedback |
+| freshness_decay | number | Time-based relevance decay |
+| last_accessed_at | string | ISO 8601 |
+
+**Config:** `track_access: true`, `track_feedback: true`, `compute_freshness_decay: true`
+
+---
+
+## YAML Configuration Reference
+
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md), [docs/SCHEMA_REFERENCE.md](../docs/SCHEMA_REFERENCE.md)
+
+All 12 dimensions can be individually enabled/disabled via YAML config under `metadata.dimensions.<dimension>.enabled`.
+
+---
+
+## Implementation Requirements
+
+1. Define TypeScript interfaces for all 12 dimensions (reference code in SCHEMA_REFERENCE.md)
+2. Implement Pydantic models (Python) matching the TypeScript interfaces
+3. Build metadata extraction pipeline for each dimension
+4. Create configuration loader for enabling/disabling dimensions
+5. Implement content_hash computation (SHA-256)
+6. Implement UUID v7 generation for chunk_id
+
+---
+
+## Conflicts / Ambiguities
+
+- **⚠️ Schema completeness:** The UNIFIED_PRD.md schema overview shows fewer fields per dimension than the full TypeScript interfaces in SCHEMA_REFERENCE.md. The TypeScript interfaces are the authoritative source.
+- **⚠️ Hierarchy fields:** The overview diagram shows `sibling_ids[]` but the TypeScript interface uses `sibling_chunk_ids` and adds `child_chunk_ids`. Use the TypeScript interface names.
\ No newline at end of file
diff --git a/scratch/04-database-schema.md b/scratch/04-database-schema.md
new file mode 100644
index 0000000..f8a2b17
--- /dev/null
+++ b/scratch/04-database-schema.md
@@ -0,0 +1,253 @@
+# Topic: Database Schema (SQLite)
+
+## Summary
+SQLite database schema optimized for MemVid video-encoded storage, including all tables, relationships, and indexes for the RAG v3.0 system.
+
+---
+
+## Tables Overview
+
+> **Source:** [docs/SCHEMA_REFERENCE.md](../docs/SCHEMA_REFERENCE.md)
+
+| Table | Purpose |
+|-------|---------|
+| chunks | Core chunk storage with essential fields |
+| embeddings | Multiple embedding versions per chunk, MemVid integration |
+| chunk_relationships | Normalized graph relationships |
+| semantic_neighbors | Precomputed similar-chunk retrieval |
+| cross_modal_links | Multimodal retrieval support |
+| entities | Entity definitions |
+| chunk_entities | Entity-to-chunk mapping with offsets |
+| retrieval_events | Analytics tracking |
+| memvid_indices | MemVid video-encoded storage mapping |
+
+---
+
+## Table: chunks
+
+> **Source:** [docs/SCHEMA_REFERENCE.md](../docs/SCHEMA_REFERENCE.md)
+
+```sql
+CREATE TABLE chunks (
+    chunk_id TEXT PRIMARY KEY,
+    content_hash TEXT NOT NULL,
+    version INTEGER DEFAULT 1,
+    corpus_id TEXT NOT NULL,
+    root_document_id TEXT NOT NULL,
+    raw_content TEXT NOT NULL,
+    content_type TEXT NOT NULL,
+    modalities TEXT NOT NULL,           -- JSON array
+    primary_modality TEXT NOT NULL,
+    token_count INTEGER NOT NULL,
+    chunking_method TEXT NOT NULL,
+    parent_chunk_id TEXT,
+    depth_level INTEGER DEFAULT 0,
+    created_at TEXT NOT NULL,
+    modified_at TEXT NOT NULL,
+    ingested_at TEXT NOT NULL,
+    metadata_json TEXT NOT NULL,        -- Complete ChunkMetadata object
+    FOREIGN KEY (parent_chunk_id) REFERENCES chunks(chunk_id)
+);
+```
+
+**Design notes:**
+- `metadata_json` stores the complete 12-dimension metadata as JSON for flexibility
+- Core fields are denormalized for fast queries without JSON parsing
+- `modalities` stored as JSON array string
+
+---
+
+## Table: embeddings
+
+> **Source:** [docs/SCHEMA_REFERENCE.md](../docs/SCHEMA_REFERENCE.md)
+
+```sql
+CREATE TABLE embeddings (
+    embedding_id TEXT PRIMARY KEY,
+    chunk_id TEXT NOT NULL,
+    model_id TEXT NOT NULL,
+    dimensions INTEGER NOT NULL,
+    mrl_truncated INTEGER DEFAULT 0,
+    quantization TEXT,
+    vector BLOB,                        -- Or reference to MemVid frame
+    memvid_frame_index INTEGER,         -- If stored in MemVid
+    memvid_file TEXT,                   -- Which .mp4 file
+    instruction_used TEXT,
+    embedded_at TEXT NOT NULL,
+    embedding_hash TEXT NOT NULL,
+    FOREIGN KEY (chunk_id) REFERENCES chunks(chunk_id)
+);
+```
+
+**Design notes:**
+- Supports both direct BLOB storage and MemVid frame references
+- Multiple embeddings per chunk (different models, dimensions)
+
+---
+
+## Table: chunk_relationships
+
+> **Source:** [docs/SCHEMA_REFERENCE.md](../docs/SCHEMA_REFERENCE.md)
+
+```sql
+CREATE TABLE chunk_relationships (
+    relationship_id TEXT PRIMARY KEY,
+    source_chunk_id TEXT NOT NULL,
+    target_chunk_id TEXT NOT NULL,
+    relationship_type TEXT NOT NULL,
+    weight REAL DEFAULT 1.0,
+    evidence TEXT,
+    created_at TEXT NOT NULL,
+    FOREIGN KEY (source_chunk_id) REFERENCES chunks(chunk_id),
+    FOREIGN KEY (target_chunk_id) REFERENCES chunks(chunk_id)
+);
+```
+
+---
+
+## Table: semantic_neighbors
+
+> **Source:** [docs/SCHEMA_REFERENCE.md](../docs/SCHEMA_REFERENCE.md)
+
+```sql
+CREATE TABLE semantic_neighbors (
+    chunk_id TEXT NOT NULL,
+    neighbor_chunk_id TEXT NOT NULL,
+    similarity_score REAL NOT NULL,
+    computed_at TEXT NOT NULL,
+    model_id TEXT NOT NULL,
+    PRIMARY KEY (chunk_id, neighbor_chunk_id, model_id),
+    FOREIGN KEY (chunk_id) REFERENCES chunks(chunk_id),
+    FOREIGN KEY (neighbor_chunk_id) REFERENCES chunks(chunk_id)
+);
+```
+
+---
+
+## Table: cross_modal_links
+
+> **Source:** [docs/SCHEMA_REFERENCE.md](../docs/SCHEMA_REFERENCE.md)
+
+```sql
+CREATE TABLE cross_modal_links (
+    link_id TEXT PRIMARY KEY,
+    source_chunk_id TEXT NOT NULL,
+    target_chunk_id TEXT NOT NULL,
+    source_modality TEXT NOT NULL,
+    target_modality TEXT NOT NULL,
+    relationship_type TEXT NOT NULL,
+    confidence REAL NOT NULL,
+    anchor_text TEXT,
+    FOREIGN KEY (source_chunk_id) REFERENCES chunks(chunk_id),
+    FOREIGN KEY (target_chunk_id) REFERENCES chunks(chunk_id)
+);
+```
+
+---
+
+## Tables: entities & chunk_entities
+
+> **Source:** [docs/SCHEMA_REFERENCE.md](../docs/SCHEMA_REFERENCE.md)
+
+```sql
+CREATE TABLE entities (
+    entity_id TEXT PRIMARY KEY,
+    entity_text TEXT NOT NULL,
+    entity_type TEXT NOT NULL,
+    canonical_name TEXT,
+    knowledge_base_id TEXT
+);
+
+CREATE TABLE chunk_entities (
+    chunk_id TEXT NOT NULL,
+    entity_id TEXT NOT NULL,
+    mention_text TEXT NOT NULL,
+    start_offset INTEGER NOT NULL,
+    end_offset INTEGER NOT NULL,
+    confidence REAL NOT NULL,
+    PRIMARY KEY (chunk_id, entity_id, start_offset),
+    FOREIGN KEY (chunk_id) REFERENCES chunks(chunk_id),
+    FOREIGN KEY (entity_id) REFERENCES entities(entity_id)
+);
+```
+
+---
+
+## Table: retrieval_events
+
+> **Source:** [docs/SCHEMA_REFERENCE.md](../docs/SCHEMA_REFERENCE.md)
+
+```sql
+CREATE TABLE retrieval_events (
+    event_id TEXT PRIMARY KEY,
+    chunk_id TEXT NOT NULL,
+    query_text TEXT,
+    query_embedding_hash TEXT,
+    retrieval_rank INTEGER,
+    rerank_score REAL,
+    was_selected INTEGER,
+    user_feedback INTEGER,              -- -1, 0, 1
+    timestamp TEXT NOT NULL,
+    FOREIGN KEY (chunk_id) REFERENCES chunks(chunk_id)
+);
+```
+
+---
+
+## Table: memvid_indices
+
+> **Source:** [docs/SCHEMA_REFERENCE.md](../docs/SCHEMA_REFERENCE.md)
+
+```sql
+CREATE TABLE memvid_indices (
+    memvid_file TEXT NOT NULL,
+    frame_index INTEGER NOT NULL,
+    chunk_id TEXT NOT NULL,
+    embedding_id TEXT NOT NULL,
+    corpus_id TEXT NOT NULL,
+    PRIMARY KEY (memvid_file, frame_index),
+    FOREIGN KEY (chunk_id) REFERENCES chunks(chunk_id),
+    FOREIGN KEY (embedding_id) REFERENCES embeddings(embedding_id)
+);
+```
+
+---
+
+## Indexes
+
+> **Source:** [docs/SCHEMA_REFERENCE.md](../docs/SCHEMA_REFERENCE.md)
+
+```sql
+CREATE INDEX idx_chunks_corpus ON chunks(corpus_id);
+CREATE INDEX idx_chunks_content_type ON chunks(content_type);
+CREATE INDEX idx_chunks_document ON chunks(root_document_id);
+CREATE INDEX idx_chunks_parent ON chunks(parent_chunk_id);
+CREATE INDEX idx_embeddings_chunk ON embeddings(chunk_id);
+CREATE INDEX idx_embeddings_model ON embeddings(model_id);
+CREATE INDEX idx_relationships_source ON chunk_relationships(source_chunk_id);
+CREATE INDEX idx_relationships_target ON chunk_relationships(target_chunk_id);
+CREATE INDEX idx_relationships_type ON chunk_relationships(relationship_type);
+CREATE INDEX idx_neighbors_similarity ON semantic_neighbors(similarity_score DESC);
+CREATE INDEX idx_cross_modal_source ON cross_modal_links(source_chunk_id);
+CREATE INDEX idx_cross_modal_modality ON cross_modal_links(source_modality, target_modality);
+CREATE INDEX idx_entities_type ON entities(entity_type);
+CREATE INDEX idx_chunk_entities_entity ON chunk_entities(entity_id);
+```
+
+---
+
+## Implementation Requirements
+
+1. Create SQLite database initialization script with all tables
+2. Create migration system for schema versioning
+3. Implement data access layer (DAL) with CRUD operations for each table
+4. Add JSON validation for `metadata_json` and `modalities` fields
+5. Implement content_hash-based deduplication logic
+6. Build query helpers for common access patterns (by corpus, by document, by content_type)
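+
+A minimal sketch of requirements 1, 4, and 5, using Python's standard `sqlite3` module. The function names (`open_db`, `insert_chunk`), the SHA-256 choice for `content_hash`, the trimmed-down `chunks` table, and the `idx_chunks_hash` index are illustrative assumptions, not part of SCHEMA_REFERENCE.md. Note that SQLite enforces `FOREIGN KEY` clauses only when `PRAGMA foreign_keys = ON` is set on the connection.
+
+```python
+import hashlib
+import json
+import sqlite3
+
+# Trimmed-down stand-in for the full chunks table (assumption: illustration only).
+SCHEMA = """
+CREATE TABLE IF NOT EXISTS chunks (
+    chunk_id TEXT PRIMARY KEY,
+    content_hash TEXT NOT NULL,
+    corpus_id TEXT NOT NULL,
+    raw_content TEXT NOT NULL,
+    modalities TEXT NOT NULL,
+    parent_chunk_id TEXT,
+    FOREIGN KEY (parent_chunk_id) REFERENCES chunks(chunk_id)
+);
+-- Hypothetical index: the source schema does not define one on content_hash.
+CREATE INDEX IF NOT EXISTS idx_chunks_hash ON chunks(corpus_id, content_hash);
+"""
+
+
+def open_db(path: str = ":memory:") -> sqlite3.Connection:
+    conn = sqlite3.connect(path)
+    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
+    conn.executescript(SCHEMA)
+    return conn
+
+
+def insert_chunk(conn, chunk_id, corpus_id, raw_content, modalities):
+    """Insert a chunk unless the same content already exists in the corpus.
+
+    Returns the chunk_id that now holds this content.
+    """
+    if not isinstance(modalities, list) or not all(isinstance(m, str) for m in modalities):
+        raise ValueError("modalities must be a list of strings")
+    content_hash = hashlib.sha256(raw_content.encode("utf-8")).hexdigest()
+    row = conn.execute(
+        "SELECT chunk_id FROM chunks WHERE corpus_id = ? AND content_hash = ?",
+        (corpus_id, content_hash),
+    ).fetchone()
+    if row:
+        return row[0]  # duplicate content: reuse the existing chunk
+    conn.execute(
+        "INSERT INTO chunks (chunk_id, content_hash, corpus_id, raw_content, modalities)"
+        " VALUES (?, ?, ?, ?, ?)",
+        (chunk_id, content_hash, corpus_id, raw_content, json.dumps(modalities)),
+    )
+    conn.commit()
+    return chunk_id
+```
+
+Scoping the duplicate check to `corpus_id` is a design choice made here for illustration; the source documents do not say whether deduplication should be per corpus or global.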
+
+---
+
+## Conflicts / Ambiguities
+
+- **⚠️ SQLite vs other databases:** The schema is SQLite-specific, but gemini-prd.md mentions FalkorDB (via Bolt Protocol) for Graphiti and Qdrant/FAISS for vector search. The SQLite schema appears to be for the chunk metadata store only, not for vector search or graph queries.
+- **⚠️ Vector storage:** The embeddings table stores vectors as BLOB, but actual vector similarity search would use FAISS/HNSW indexes (see MemVid topic). SQLite is the metadata store, not the vector search engine.
diff --git a/scratch/05-memvid-storage.md b/scratch/05-memvid-storage.md
new file mode 100644
index 0000000..fb435a4
--- /dev/null
+++ b/scratch/05-memvid-storage.md
@@ -0,0 +1,206 @@
+# Topic: MemVid Storage (Video-Encoded Vector Storage)
+
+## Summary
+MemVid is the Cold Memory storage layer that encodes chunks and embeddings into H.265-compressed video files (QR frames), targeting 50-100x compression over raw storage. This topic covers encoder configuration, the quad-encoding strategy, and file organization.
+
+---
+
+## What is MemVid?
+
+> **Source:** [gemini-prd.md](../gemini-prd.md), [docs/UNIFIED_PRD.md](../docs/UNIFIED_PRD.md)
+
+MemVid is a deep archive (90d+) storage system that uses H.265 compressed video (QR frames) to store massive datasets. It provides:
+- 50-100x compression over raw storage
+- Quad-encoded vectors for high-fidelity retrieval
+- Portable archive files (.mp4)
+
+---
+
+## Encoder Configuration
+
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md)
+
+```yaml
+memvid:
+  encoder:
+    codec: "hevc"          # H.265
+    crf: 18                # Constant Rate Factor (quality)
+    gop: 30                # Group of Pictures
+    preset: "medium"       # Encoding speed/quality tradeoff
+  
+  vector_config:
+    input_dimensions: 4096   # Native Qwen3-VL output
+    storage_dimensions: 1024 # MRL-truncated for efficiency
+    similarity_sort: true    # Sort vectors for better compression
+  
+  features:
+    parallel_segments: true
+    smart_recall: true
+    text_search: true
+    hnsw_index: true
+```
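+
+MRL (Matryoshka) truncation, as configured in `vector_config`, is typically implemented by keeping the leading dimensions of the vector and re-normalizing to unit length. A minimal sketch using only the standard library follows; the function names `mrl_truncate` and `to_blob` and the float32 packing are assumptions for illustration, and whether the Qwen3-VL embeddings should be re-normalized after truncation needs to be checked against the model documentation.
+
+```python
+import math
+from array import array
+
+
+def mrl_truncate(vector, storage_dimensions=1024):
+    """Keep the first `storage_dimensions` values and L2-normalize the result."""
+    if len(vector) < storage_dimensions:
+        raise ValueError("vector is shorter than the requested storage dimension")
+    head = vector[:storage_dimensions]
+    norm = math.sqrt(sum(x * x for x in head))
+    if norm == 0.0:
+        return list(head)  # zero vector: nothing to normalize
+    return [x / norm for x in head]
+
+
+def to_blob(vector):
+    """Pack a vector as float32 bytes, e.g. for the embeddings.vector BLOB column."""
+    return array("f", vector).tobytes()
+```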
+
+---
+
+## File Organization
+
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md)
+
+Three separate MemVid files by domain:
+
+| File | Domain | Content |
+|------|--------|---------|
+| `codebase.mp4` | Codebase | Multi-repository source code and configs |
+| `research.mp4` | Research | Research papers, documentation, diagrams |
+| `prompts.mp4` | Prompts | User inputs and prompts to LLMs |
+
+---
+
+## Encoding Process (The "Freeze" Transition)
+
+> **Source:** [gemini-prd.md](../gemini-prd.md)
+
+The Warm → Cold transition ("Freeze") follows this process:
+
+1. **Deconstruction:** Serialize Graphiti nodes into JSON
+2. **Rendering:** Generate QR Code images (PNGs) of the JSON data
+   - QR Code Version 40, High Error Correction
+3. **Quad-Encoding:** Generate 4 vector layers per content block
+4. **Stitching:** Compile images into an H.265 `.mp4` video file
+
+### QR Code Specifications
+> **Source:** [gemini-prd.md](../gemini-prd.md)
+
+- **Version:** 40 (maximum capacity)
+- **Error Correction:** High
+- **Format:** PNG images compiled into video frames
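+
+A single Version 40 symbol at error-correction level H holds at most 1,273 bytes in byte mode (per the QR Code capacity tables), so a serialized node will often span several frames. The sketch below shows one way to split and reassemble the JSON; the function names and the one-slice-per-frame layout are assumptions, since the source does not say how oversized payloads are handled.
+
+```python
+import json
+
+QR_V40_H_MAX_BYTES = 1273  # byte-mode capacity of a Version 40, level H symbol
+
+
+def split_for_qr(node: dict, max_bytes: int = QR_V40_H_MAX_BYTES) -> list:
+    """Serialize a node to compact JSON and slice it into frame-sized payloads."""
+    data = json.dumps(node, separators=(",", ":"), sort_keys=True).encode("utf-8")
+    return [data[i:i + max_bytes] for i in range(0, len(data), max_bytes)]
+
+
+def join_from_qr(payloads: list) -> dict:
+    """Reassemble the payload slices (in frame order) back into the node."""
+    return json.loads(b"".join(payloads).decode("utf-8"))
+```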
+
+---
+
+## Quad-Encoding Strategy
+
+> **Source:** [gemini-prd.md](../gemini-prd.md) (Appendix I)
+
+Each content block is encoded at four resolutions with separate FAISS indices:
+
+| Layer | Resolution | What it Encodes | Use Case |
+|-------|-----------|----------------|----------|
+| 1 | Word/Token | Keywords & Entities | Exact definitions, variable names |
+| 2 | Sentence/Fact | Discrete Facts | Return types, specific values |
+| 3 | Paragraph/Context | Local Context | How flows handle edge cases |
+| 4 | Boundary | Relationships & Flow | Cross-section connections |
+
+### Why Quad-Encoding?
+- **Needle in a Haystack Fix:** Word/Sentence vectors allow precise fact retrieval without paragraph dilution
+- **Context Drift Fix:** Boundary vectors encode concept edges, preventing information loss at chunk boundaries
+
+### Storage Architecture: "Heavy Index, Light Payload"
+- **Index (4x larger):** 4 FAISS indices per video file
+- **Payload (50-100x smaller):** H.265 compressed video stores actual content
+- **Result:** Trading cheap disk space for high intelligence density
+
+---
+
+## MP4 RAG Encoder Implementation
+
+> **Source:** [gemini-prd.md](../gemini-prd.md) (Appendix II)
+
+```python
+class MP4RAGEncoder:
+    """Interface sketch; method bodies are stubs until the pipeline is implemented."""
+
+    def __init__(self, frame_width: int = 1920, frame_height: int = 1080):
+        self.frame_width = frame_width
+        self.frame_height = frame_height
+
+    def text_to_frame(self, text: str, chunk_id: str):
+        """Convert text chunk to image frame"""
+        # Render text with word wrap onto image
+        # Add chunk_id as QR code or metadata overlay
+        raise NotImplementedError
+
+    def encode_chunks_to_mp4(self, chunks, embeddings, metadata, output_path: str):
+        """Encode all chunks into MP4 with H.265 compression"""
+        # Write video with H.265 (HEVC) codec
+        # Store chunk index and embeddings as sidecar files
+        raise NotImplementedError
+
+    def decode_frame(self, mp4_path: str, frame_number: int):
+        """Quickly seek to specific frame and extract text"""
+        # OCR the frame to get text back
+        raise NotImplementedError
+```
+
+### Sidecar Files
+Each `.mp4` file has companion files:
+- `*_index.json` — Maps frame numbers to chunk IDs and metadata
+- `*_embeddings.npy` — NumPy array of embedding vectors
+
+---
+
+## MemVid Index Table (SQLite)
+
+> **Source:** [docs/SCHEMA_REFERENCE.md](../docs/SCHEMA_REFERENCE.md)
+
+```sql
+CREATE TABLE memvid_indices (
+    memvid_file TEXT NOT NULL,
+    frame_index INTEGER NOT NULL,
+    chunk_id TEXT NOT NULL,
+    embedding_id TEXT NOT NULL,
+    corpus_id TEXT NOT NULL,
+    PRIMARY KEY (memvid_file, frame_index),
+    FOREIGN KEY (chunk_id) REFERENCES chunks(chunk_id),
+    FOREIGN KEY (embedding_id) REFERENCES embeddings(embedding_id)
+);
+```
+
+---
+
+## Integration with Sleep-Time Compute
+
+> **Source:** [gemini-prd.md](../gemini-prd.md)
+
+Quad-encoding is computationally expensive (4x embedding time) and cannot be done in real-time. The workflow:
+
+1. **Live (ByteRover):** Simple paragraph chunks (fast, good enough)
+2. **Sleep Time (Daemon):** Explodes content into 4 layers, embeds all, encodes to MemVid
+3. **Next Day:** Agent has "super-resolution" access to yesterday's work
+
+---
+
+## Retrieval from MemVid: The "Zoom" Pattern
+
+> **Source:** [gemini-prd.md](../gemini-prd.md)
+
+Cascading lookup strategy (not all 4 layers at once):
+
+1. **Scout (Paragraph Layer):** Find general concepts — broad context
+2. **Snipe (Sentence Layer):** Check specific facts in the region — precise lines
+3. **Stitch (Boundary Layer):** Retrieve boundary vectors to see what connects next
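+
+The cascade above can be sketched with plain in-memory lists standing in for the per-layer FAISS indices. The entry layout (`region`, `text`, `vec`), the `zoom` function, and the use of cosine similarity are hypothetical; the source describes only the three steps, not the data structures.
+
+```python
+import math
+
+
+def cosine(a, b):
+    dot = sum(x * y for x, y in zip(a, b))
+    na = math.sqrt(sum(x * x for x in a))
+    nb = math.sqrt(sum(x * x for x in b))
+    return dot / (na * nb) if na and nb else 0.0
+
+
+def zoom(query_vec, layers, top_regions=1):
+    """Cascade over the layers: scout paragraphs, snipe sentences, stitch boundaries.
+
+    `layers` maps a layer name to a list of entries shaped like
+    {"region": str, "text": str, "vec": list[float]} (hypothetical layout).
+    """
+    # 1. Scout: rank paragraphs and keep the best region(s).
+    ranked = sorted(layers["paragraph"], key=lambda e: cosine(query_vec, e["vec"]), reverse=True)
+    regions = {e["region"] for e in ranked[:top_regions]}
+    # 2. Snipe: search sentences only inside the scouted regions.
+    sentences = [e for e in layers["sentence"] if e["region"] in regions]
+    best = max(sentences, key=lambda e: cosine(query_vec, e["vec"]), default=None)
+    # 3. Stitch: pull boundary entries for the same regions to see what connects next.
+    links = [e["text"] for e in layers["boundary"] if e["region"] in regions]
+    return {"regions": sorted(regions), "fact": best["text"] if best else None, "links": links}
+```
+
+Restricting the sentence search to the scouted regions is what keeps the lookup from touching all layers at once; a sentence that matches well but sits in an unrelated region is never scored.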
+
+---
+
+## Implementation Requirements
+
+1. Implement MP4RAGEncoder class with H.265 encoding
+2. Implement QR code generation for JSON serialization
+3. Build quad-encoding pipeline (4 FAISS indices per video)
+4. Create sidecar file management (index.json, embeddings.npy)
+5. Implement frame seeking and OCR-based decoding
+6. Build the "Zoom" pattern retrieval logic
+7. Integrate with sleep-time daemon for batch encoding
+
+---
+
+## Dependencies
+
+- FFmpeg (H.265/HEVC encoding)
+- OpenCV (cv2) for video I/O
+- FAISS for vector indexing
+- Pillow for image generation
+- QR code library (qrcode or similar)
+- Ghostscript (for QR rendering)
+
+---
+
+## Conflicts / Ambiguities
+
+- **⚠️ QR vs text rendering:** gemini-prd.md describes QR code frames, but the MP4RAGEncoder code in Appendix II renders text directly onto frames. These are two different approaches — QR is more robust for data integrity, text rendering is simpler. The QR approach (from Section 3.4) appears to be the intended production approach.
+- **⚠️ Vector storage location:** opus-prd2-v3.md mentions `hnsw_index: true` as a MemVid feature, but gemini-prd.md describes separate FAISS indices. These may be complementary, since FAISS itself provides HNSW-based index types (e.g. `IndexHNSWFlat`).
+- **⚠️ Sidecar vs embedded:** The code example uses sidecar files for index/embeddings, but the SQLite memvid_indices table provides a database-backed alternative. Both approaches may coexist.
diff --git a/scratch/06-memory-hierarchy.md b/scratch/06-memory-hierarchy.md
new file mode 100644
index 0000000..b66ddcf
--- /dev/null
+++ b/scratch/06-memory-hierarchy.md
@@ -0,0 +1,123 @@
+# Topic: Three-Tiered Memory Hierarchy
+
+## Summary
+The Hot/Warm/Cold memory architecture using ByteRover, Graphiti, and MemVid, including data lifecycle transitions and graduation protocols.
+
+---
+
+## Architecture Overview
+
+> **Source:** [gemini-prd.md](../gemini-prd.md), [docs/UNIFIED_PRD.md](../docs/UNIFIED_PRD.md)
+
+| Tier | Component | Role | Retention | Storage Format | Optimized For |
+|------|-----------|------|-----------|---------------|---------------|
+| Hot | ByteRover | Active Context | 0-24h | JSONL (filesystem) | Speed (grep/find) |
+| Warm | Graphiti | Knowledge Graph | 7-90d | Property Graph (FalkorDB) | Relationships |
+| Cold | MemVid | Deep Archive | 90d+ | H.265 video (QR frames) | Compression/Density |
+
+---
+
+## Hot Memory: ByteRover
+
+> **Source:** [gemini-prd.md](../gemini-prd.md)
+
+- **Type:** Filesystem-based active context
+- **Location:** `~/.byterover/inbox/`
+- **Format:** JSONL with strict Pydantic schema
+- **Schema fields:** type, summary, content, tags, timestamp
+- **Purpose:** Stores live "Working Memory," active Git branches, and "in-flight" ideas
+- **Optimization:** Speed via grep/find (no database overhead)
+- **Retention:** Purged nightly unless related to active Git branch
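+
+A minimal sketch of the inbox read/write path. To stay dependency-free it uses a standard-library dataclass as a stand-in for the Pydantic model named above; the class name `InboxRecord`, the helper names, and the specific validation rules are assumptions for illustration.
+
+```python
+import json
+from dataclasses import asdict, dataclass, field
+from datetime import datetime, timezone
+from pathlib import Path
+
+
+@dataclass
+class InboxRecord:
+    """Stand-in for the Pydantic model; field names follow the schema list above."""
+    type: str
+    summary: str
+    content: str
+    tags: list = field(default_factory=list)
+    timestamp: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())
+
+    def __post_init__(self):
+        for name in ("type", "summary", "content", "timestamp"):
+            if not isinstance(getattr(self, name), str) or not getattr(self, name):
+                raise ValueError(f"{name} must be a non-empty string")
+        if not all(isinstance(t, str) for t in self.tags):
+            raise ValueError("tags must be strings")
+
+
+def append_record(inbox_file: Path, record: InboxRecord) -> None:
+    """Append one record as a single JSON line."""
+    inbox_file.parent.mkdir(parents=True, exist_ok=True)
+    with inbox_file.open("a", encoding="utf-8") as fh:
+        fh.write(json.dumps(asdict(record)) + "\n")
+
+
+def read_records(inbox_file: Path) -> list:
+    """Load and re-validate every line of the JSONL file."""
+    with inbox_file.open(encoding="utf-8") as fh:
+        return [InboxRecord(**json.loads(line)) for line in fh if line.strip()]
+```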
+
+### Live Usage
+During active work, ByteRover uses simple paragraph chunks — fast and good enough for real-time context injection.
+
+---
+
+## Warm Memory: Graphiti
+
+> **Source:** [gemini-prd.md](../gemini-prd.md)
+
+- **Type:** Temporal Knowledge Graph
+- **Backend:** FalkorDB (via Bolt Protocol)
+- **Node Types:** Concept, Pattern, Decision, DecisionNode, PatternNode
+- **Edge Types:** IMPLEMENTS, DEPRECATES, DEPENDS_ON, MITIGATES
+- **Purpose:** Stores structured relationships, "Skill" storage, and "Lineage"
+- **Retention:** 7-90 days active; nodes >30 days inactive become candidates for archival
+
+### Tombstone Pointers
+When nodes are archived to MemVid, they are replaced with lightweight "Tombstone Pointers" (e.g., `See Archive W42`) to maintain graph connectivity.
+
+---
+
+## Cold Memory: MemVid
+
+> **Source:** [gemini-prd.md](../gemini-prd.md), [opus-prd2-v3.md](../opus-prd2-v3.md)
+
+See [scratch/05-memvid-storage.md](./05-memvid-storage.md) for full details.
+
+- **Type:** Deep archive with video-encoded storage
+- **Format:** H.265 compressed video with QR frames
+- **Vector Index:** Quad-encoded (4 FAISS indices per video)
+- **Metadata:** Sidecar JSON + SQLite memvid_indices table
+- **Retention:** Permanent
+
+---
+
+## Transition A: The "Digest" (Hot → Warm)
+
+> **Source:** [gemini-prd.md](../gemini-prd.md), [docs/UNIFIED_PRD.md](../docs/UNIFIED_PRD.md)
+
+- **Trigger:** Nightly "Sleep Cycle" Daemon (or system idle > 15 minutes)
+- **Input:** Raw interaction logs from ByteRover (cleaned via Proxy)
+- **Process (The Dreamer):**
+  1. **Structuring:** Convert raw logs into strict Graphiti Nodes (DecisionNode, PatternNode)
+  2. **Filtering:** Discard "chatter" (conversational noise). Keep only "Solved Problems" and "Architectural Decisions"
+- **Output:** New Nodes added to Graphiti. Raw logs purged from ByteRover (unless related to active Git branch)
+
+---
+
+## Transition B: The "Freeze" (Warm → Cold)
+
+> **Source:** [gemini-prd.md](../gemini-prd.md), [docs/UNIFIED_PRD.md](../docs/UNIFIED_PRD.md)
+
+- **Trigger:** Weekly "Archivist" Job (Sunday)
+- **Input:** Stale Graphiti nodes (>30 days inactive) + Curated "Gold Standard" datasets
+- **Process (The Renderer):**
+  1. **Deconstruction:** Serialize nodes into JSON
+  2. **Rendering:** Generate QR Code images (PNGs) of the JSON data
+  3. **Quad-Encoding:** Generate 4 vector layers (Token, Fact, Context, Boundary)
+  4. **Stitching:** Compile images into an H.265 `.mp4` video file
+- **Output:** Portable MemVid archive file. Stale nodes replaced with Tombstone Pointers in Graphiti.
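+
+The selection and tombstone steps of the Freeze can be sketched as follows. Nodes are modelled as plain dicts with a `last_accessed` datetime, which is a hypothetical shape; the real Graphiti node schema and the rendering/encoding steps are out of scope here.
+
+```python
+from datetime import timedelta
+
+
+def select_stale(nodes, now, max_inactive_days=30):
+    """Return nodes whose last access is more than `max_inactive_days` ago."""
+    cutoff = now - timedelta(days=max_inactive_days)
+    return [n for n in nodes if n["last_accessed"] < cutoff]
+
+
+def freeze(nodes, now, archive_label):
+    """Split nodes into (kept, archived) and leave a tombstone for each archived node."""
+    stale_ids = {n["id"] for n in select_stale(nodes, now)}
+    kept, archived = [], []
+    for node in nodes:
+        if node["id"] in stale_ids:
+            archived.append(node)
+            # The tombstone keeps the node id so existing edges still resolve.
+            kept.append({"id": node["id"], "tombstone": f"See Archive {archive_label}"})
+        else:
+            kept.append(node)
+    return kept, archived
+```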
+
+---
+
+## Data Flow Diagram
+
+```
+User Input → Proxy → ByteRover (Hot, 0-24h)
+                         ↓ [Nightly Digest]
+                     Graphiti (Warm, 7-90d)
+                         ↓ [Weekly Freeze]
+                     MemVid (Cold, 90d+)
+```
+
+---
+
+## Implementation Requirements
+
+1. Implement ByteRover filesystem layer with JSONL read/write and Pydantic validation
+2. Set up FalkorDB container for Graphiti (docker-compose)
+3. Define Graphiti node and edge schemas
+4. Implement the "Digest" transition daemon (Hot → Warm)
+5. Implement the "Freeze" transition daemon (Warm → Cold)
+6. Build Tombstone Pointer system for archived nodes
+7. Implement idle detection trigger (system idle > 15 minutes)
+
+---
+
+## Conflicts / Ambiguities
+
+- **⚠️ Retention periods:** gemini-prd.md says Hot is 0-24h and Warm is 7-90h (hours), but UNIFIED_PRD.md says Warm is 7-90d (days) and the Freeze trigger is >30 days inactive. The "h" in gemini-prd.md appears to be a typo — days is the intended unit based on context.
+- **⚠️ Graph database:** gemini-prd.md specifies FalkorDB via Bolt Protocol. No other document confirms this choice. The containerization section mentions docker-compose for FalkorDB.
+- **⚠️ ByteRover location:** gemini-prd.md hard-codes `~/.byterover/inbox/`, which assumes a Unix-style home directory. The location should be configurable.
diff --git a/scratch/07-retrieval-pipeline.md b/scratch/07-retrieval-pipeline.md
new file mode 100644
index 0000000..d06d412
--- /dev/null
+++ b/scratch/07-retrieval-pipeline.md
@@ -0,0 +1,169 @@
+# Topic: Retrieval Pipeline
+
+## Summary
+Two-stage retrieval system with broad recall, hybrid search, precision reranking, and cross-modal retrieval capabilities.
+
+---
+
+## Pipeline Overview
+
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md), [docs/UNIFIED_PRD.md](../docs/UNIFIED_PRD.md)
+
+```
+Query → Stage 1: Recall (top 100) → Hybrid Search → Stage 2: Rerank (top 10) → Results
+```
+
+---
+
+## Stage 1: Broad Recall
+
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md)
+
+```yaml
+recall:
+  model: "primary"              # Qwen3-VL-Embedding-8B
+  top_k: 100
+  similarity_threshold: 0.5
+  multimodal_query_support: true
+```
+
+- Embed the query using Qwen3-VL-Embedding-8B
+- Retrieve top 100 candidates by vector similarity
+- Minimum similarity threshold: 0.5
+- Supports multimodal queries (text, image, mixed)
+
+---
+
+## Hybrid Search
+
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md)
+
+```yaml
+hybrid:
+  enabled: true
+  vector_weight: 0.7
+  keyword_weight: 0.3
+  keyword_method: "bm25"
+```
+
+- Combines vector similarity (70%) with BM25 keyword matching (30%)
+- Improves recall for exact-match queries that pure vector search might miss
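+
+A sketch of the score combiner under the weights configured above. Vector similarities and BM25 scores live on different scales, so this example min-max normalizes each list before the weighted sum; that normalization step and the function names are assumptions, as the source specifies only the 0.7/0.3 weights.
+
+```python
+def min_max(scores):
+    """Rescale a {doc_id: score} map to the range [0, 1]."""
+    if not scores:
+        return {}
+    lo, hi = min(scores.values()), max(scores.values())
+    if hi == lo:
+        return {doc: 1.0 for doc in scores}
+    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}
+
+
+def hybrid_scores(vector_scores, bm25_scores, vector_weight=0.7, keyword_weight=0.3):
+    """Weighted sum of normalized scores; a doc missing from one list scores 0 there."""
+    v, k = min_max(vector_scores), min_max(bm25_scores)
+    combined = {
+        doc: vector_weight * v.get(doc, 0.0) + keyword_weight * k.get(doc, 0.0)
+        for doc in set(v) | set(k)
+    }
+    return sorted(combined.items(), key=lambda item: item[1], reverse=True)
+```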
+
+---
+
+## Stage 2: Precision Reranking
+
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md), [opus-prd1-v3.md](../opus-prd1-v3.md)
+
+```yaml
+reranking:
+  enabled: true
+  model: "reranker"             # Qwen3-VL-Reranker-8B
+  top_k_input: 100
+  top_k_output: 10
+  multimodal_rerank: true
+```
+
+- Takes 100 candidates from recall stage
+- Uses Qwen3-VL-Reranker-8B (Single-Tower, Cross-Attention)
+- Outputs top 10 most relevant results
+- Supports multimodal reranking (query and documents can be mixed-modal)
+- Relevance score via yes/no token generation probability
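+
+One common way to turn yes/no token generation into a relevance score is a two-way softmax over the logits of the "yes" and "no" tokens. The sketch below assumes the reranker exposes those two logits per candidate; how they are obtained from Qwen3-VL-Reranker-8B is not covered by the source, and the function names are hypothetical.
+
+```python
+import math
+
+
+def relevance_from_logits(yes_logit: float, no_logit: float) -> float:
+    """Two-way softmax: probability mass on "yes" relative to "no"."""
+    m = max(yes_logit, no_logit)  # subtract the max for numerical stability
+    e_yes = math.exp(yes_logit - m)
+    e_no = math.exp(no_logit - m)
+    return e_yes / (e_yes + e_no)
+
+
+def rerank(candidates, top_k_output=10):
+    """Sort (chunk_id, yes_logit, no_logit) tuples by relevance, keep the top k."""
+    scored = [(cid, relevance_from_logits(y, n)) for cid, y, n in candidates]
+    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k_output]
+```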
+
+---
+
+## Cross-Modal Retrieval
+
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md), [opus-prd1-v3.md](../opus-prd1-v3.md)
+
+```yaml
+cross_modal:
+  enabled: true
+  query_modalities: ["text", "image", "mixed"]
+  result_modalities: ["text", "image", "code", "mixed"]
+```
+
+Enables queries like:
+- Text query → retrieve images/code/video
+- Image query → retrieve related text/code
+- Mixed query (text + image) → retrieve any modality
+
+> **Source:** [opus-prd3-v3.md](../opus-prd3-v3.md)
+
+Example cross-modal queries:
+- "Find the diagram referenced by the code comment describing the vectorized kernel"
+- "Find video clips illustrating the algorithm described in Section 3.2"
+- "Find the code implementing the architecture in this screenshot"
+
+---
+
+## Performance Targets
+
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md)
+
+| Metric | Target |
+|--------|--------|
+| Max latency | 550ms (text), 600ms (image), 700ms (mixed) |
+| Min relevance score | 0.6 |
+
+---
+
+## MemVid "Zoom" Pattern Retrieval
+
+> **Source:** [gemini-prd.md](../gemini-prd.md)
+
+For MemVid (Cold Memory) retrieval, use cascading lookup across quad-encoded layers:
+
+1. **Scout (Paragraph Layer):** Find general concepts — broad context
+2. **Snipe (Sentence Layer):** Check specific facts in the region
+3. **Stitch (Boundary Layer):** Retrieve boundary vectors to see cross-section connections
+
+---
+
+## Query Classification & Routing
+
+> **Source:** [chatgpt5.2-prd.md](../chatgpt5.2-prd.md)
+
+The retrieval system needs to:
+1. Classify the user's query intent
+2. Determine which domain(s) to search (prompts, codebase, research)
+3. Select appropriate retrieval strategy based on query type
+4. Route to the correct MemVid file(s) and chunking method indices
+
+### Model for Query Classification
+> **Source:** [chatgpt5.2-prd.md](../chatgpt5.2-prd.md)
+
+A model is needed to categorize user requests and determine which RAG ingestion methodology to use for retrieval. The user suggests this could potentially be done recursively/iteratively.
+
+---
+
+## Context Injection (Proxy Integration)
+
+> **Source:** [gemini-prd.md](../gemini-prd.md)
+
+The Proxy enriches queries by:
+1. Running intent classification
+2. Querying Graphiti (Warm) + ByteRover (Hot)
+3. Prepending relevant context as a "System Note"
+
+---
+
+## Implementation Requirements
+
+1. Implement vector similarity search (FAISS/HNSW)
+2. Implement BM25 keyword search
+3. Build hybrid search score combiner (0.7 vector + 0.3 keyword)
+4. Integrate Qwen3-VL-Reranker-8B for precision reranking
+5. Build cross-modal query support
+6. Implement query classification/routing logic
+7. Build the "Zoom" pattern for MemVid retrieval
+8. Implement latency monitoring and optimization
+9. Build retrieval analytics tracking (retrieval_events table)
+
+---
+
+## Conflicts / Ambiguities
+
+- **⚠️ Query routing model:** chatgpt5.2-prd.md asks what model class is needed for query classification but doesn't specify one. No other document provides a concrete answer. This needs to be determined during implementation.
+- **⚠️ Latency targets vary:** 550ms for text queries, 600ms for image, 700ms for mixed — these are from verification_queries in opus-prd2-v3.md. The general target is 550ms. Mixed-modal queries may need relaxed targets.
+- **⚠️ Sub-second vs 550ms:** chatgpt5.2-prd.md mentions "sub-second latency" as a goal; opus-prd2-v3.md specifies 550ms. These are compatible but 550ms is the stricter target.
diff --git a/scratch/08-orchestration-concurrency.md b/scratch/08-orchestration-concurrency.md
new file mode 100644
index 0000000..ac88d19
--- /dev/null
+++ b/scratch/08-orchestration-concurrency.md
@@ -0,0 +1,160 @@
+# Topic: Orchestration & Concurrency
+
+## Summary
+Headless agentic orchestration via Claude Code, concurrency settings, batching strategies, MCP server configurations, and the agentic swarm architecture.
+
+---
+
+## Headless Operation
+
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md), [chatgpt5.2-prd.md](../chatgpt5.2-prd.md)
+
+```yaml
+orchestration:
+  headless:
+    enabled: true
+    logic_file: "orchestration_logic_v3.md"
+    checkpoint_interval_minutes: 5
+```
+
+- Orchestration logic is defined in natural language (markdown file)
+- Provider-agnostic, and compatible with the Anthropic agentic SDK
+- Deployed via Claude Code headless CLI with a monitoring layer (e.g., autoclot)
+- Uses `.claude/claude.md`-based skills, plugins, and MCP tools
+
+---
+
+## Concurrency Settings
+
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md)
+
+```yaml
+concurrency:
+  max_files: 50
+  modality_detector_workers: 2
+  content_router_workers: 2
+  code_specialist_workers: 8
+  text_specialist_workers: 8
+  multimodal_specialist_workers: 4
+  graph_builder_workers: 2
+  integration_workers: 2
+```
+
+### Worker Roles
+
+| Worker Type | Count | Purpose |
+|------------|-------|---------|
+| Modality Detector | 2 | Detect content modalities in incoming files |
+| Content Router | 2 | Route content to appropriate chunking pipeline |
+| Code Specialist | 8 | AST parsing, code chunking, dependency extraction |
+| Text Specialist | 8 | Semantic/recursive/sentence chunking for text |
+| Multimodal Specialist | 4 | Multimodal boundary detection, screenshot-code fusion |
+| Graph Builder | 2 | Entity extraction, relationship building, semantic neighbors |
+| Integration | 2 | Final assembly, quality validation, storage |
+
+---
+
+## Batching Configuration
+
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md)
+
+```yaml
+batching:
+  embedding_batch_size: 16          # Smaller for multimodal (memory constraints)
+  integration_buffer_size: 50
+  integration_flush_timeout_ms: 5000
+```
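+
+A sketch of how the three batching settings might be applied: embeddings are sliced into batches of 16, and integration results are buffered until 50 items accumulate or 5000 ms pass since the oldest buffered item. The class and function names, and the injectable `clock` parameter (there to make the timeout testable), are assumptions rather than anything specified in opus-prd2-v3.md.
+
+```python
+import time
+
+
+def batched(items, batch_size=16):
+    """Yield consecutive slices of at most `batch_size` items."""
+    for start in range(0, len(items), batch_size):
+        yield items[start:start + batch_size]
+
+
+class IntegrationBuffer:
+    """Collect results and flush when the buffer fills or the timeout elapses."""
+
+    def __init__(self, flush, size=50, timeout_ms=5000, clock=time.monotonic):
+        self._flush, self._size, self._timeout = flush, size, timeout_ms / 1000.0
+        self._clock, self._items, self._first_at = clock, [], None
+
+    def add(self, item):
+        if not self._items:
+            self._first_at = self._clock()  # timeout counts from the oldest buffered item
+        self._items.append(item)
+        self.maybe_flush()
+
+    def maybe_flush(self):
+        """Call periodically (e.g. from a timer) so a quiet buffer still drains."""
+        if not self._items:
+            return
+        expired = self._clock() - self._first_at >= self._timeout
+        if len(self._items) >= self._size or expired:
+            self._flush(self._items)
+            self._items = []
+```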
+
+---
+
+## MCP Server Configuration
+
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md)
+
+| MCP Server | Description | Config |
+|-----------|-------------|--------|
+| filesystem-mcp | Sandboxed file system access | — |
+| git-mcp | Git history for provenance | — |
+| embedding-mcp | Unified multimodal embedding API | default_model: Qwen3-VL-Embedding-8B |
+| memvid-mcp | Video-encoded vector storage | — |
+| entity-mcp | Named entity extraction | — |
+
+---
+
+## Agentic Swarm Architecture
+
+> **Source:** [chatgpt5.2-prd.md](../chatgpt5.2-prd.md), [docs/UNIFIED_PRD.md](../docs/UNIFIED_PRD.md)
+
+### Ingestion Pipeline Agents
+
+The orchestrator coordinates specialized agents for the ingestion pipeline:
+
+1. **Modality Detector Agent** — Classifies incoming content type and modalities
+2. **Content Router Agent** — Routes to appropriate chunking strategy
+3. **Chunking Agents** (per method):
+   - Fixed-size chunker (programmatic, no LLM)
+   - Sentence-based chunker (programmatic, no LLM)
+   - Semantic chunker (uses Qwen3-Embedding-0.6B for boundary detection)
+   - Recursive hierarchical chunker (may need Sonnet/Pro-class LLM)
+   - AST structural chunker (tree-sitter based, programmatic)
+   - Multimodal boundary chunker (needs VL model)
+   - Screenshot-code fusion chunker (needs VL model + OCR)
+4. **Entity Extraction Agent** — NER and relationship extraction
+5. **Graph Builder Agent** — Knowledge graph construction
+6. **Quality Validation Agent** — Chunk quality scoring
+7. **Integration Agent** — Final assembly and storage
+
+### Asynchronous Processing
+> **Source:** [chatgpt5.2-prd.md](../chatgpt5.2-prd.md)
+
+- Files should be processed asynchronously (multi-file corpus)
+- Different chunking methods can run in parallel on different files
+- Embedding can be batched across chunks
+
+---
+
+## Orchestration Logic File
+
+> **Source:** [chatgpt5.2-prd.md](../chatgpt5.2-prd.md)
+
+The orchestration logic should be:
+- Natural language (provider-agnostic)
+- Markdown-based configuration
+- Compatible with Anthropic agentic SDK
+- Deployable via Claude Code headless CLI
+
+---
+
+## Key Directories
+
+> **Source:** [docs/UNIFIED_PRD.md](../docs/UNIFIED_PRD.md)
+
+```
+src/models/          — Qwen3-VL model implementations
+src/chunking/        — Multi-strategy chunking algorithms
+src/memvid/          — Video-encoded storage system
+src/graphiti/        — Knowledge graph implementation
+src/orchestration/   — Agentic swarm coordination logic
+src/retrieval/       — Two-stage retrieval pipeline
+```
+
+---
+
+## Implementation Requirements
+
+1. Create orchestration logic markdown file
+2. Implement worker pool with configurable concurrency
+3. Build content routing logic (content type → chunking method)
+4. Implement async file processing pipeline
+5. Set up MCP server integrations
+6. Build checkpoint/resume system (5-minute intervals)
+7. Implement embedding batching with configurable batch size
+8. Create monitoring/logging for worker status
+
+---
+
+## Conflicts / Ambiguities
+
+- **⚠️ Agent vs programmatic:** chatgpt5.2-prd.md envisions LLM agents for semantic/hierarchical chunking, but opus-prd2-v3.md treats these as algorithmic processes with configurable parameters. The implementation should use algorithmic approaches with LLM fallback for edge cases.
+- **⚠️ Orchestration tool:** chatgpt5.2-prd.md mentions "autoclot or something" as monitoring layer. This is vague — the specific monitoring tool needs to be determined.
+- **⚠️ Worker counts:** The concurrency settings (8 code + 8 text + 4 multimodal = 20 specialist workers) assume significant compute resources. May need to be tuned for M3 Max MacBook Pro.
diff --git a/scratch/09-proxy-shim.md b/scratch/09-proxy-shim.md
new file mode 100644
index 0000000..ea8b033
--- /dev/null
+++ b/scratch/09-proxy-shim.md
@@ -0,0 +1,103 @@
+# Topic: Proxy / Shim (The Gatekeeper)
+
+## Summary
+The Claude-Proxy wrapper that intercepts user prompts and model outputs, injects context from memory, and sanitizes data for storage. This is the central hub of the system.
+
+---
+
+## Role
+
+> **Source:** [gemini-prd.md](../gemini-prd.md), [docs/UNIFIED_PRD.md](../docs/UNIFIED_PRD.md)
+
+| Component | Role | Responsibility |
+|-----------|------|---------------|
+| The Proxy (Shim) | The Gatekeeper | Wraps `claude` command. Intercepts user prompts and model outputs. Injects context from memory. Sanitizes data via Pydantic schemas before storage. |
+
+---
+
+## Proxy Logic Flow
+
+> **Source:** [gemini-prd.md](../gemini-prd.md) (Section 3.2)
+
+1. **Intercept:** Capture `stdin` (User Prompt)
+2. **Enrich:**
+   - Run Classification (Intent Detection)
+   - Query Graphiti (Warm) + ByteRover (Hot)
+   - Inject: Prepend relevant context as a "System Note"
+3. **Execute:** Pass modified payload to the real `claude` binary
+4. **Capture:** Read the resulting `stdout` and log files
+5. **Sanitize:** Pass output to Local LLM (Structure Gate) to strip noise
+6. **Ingest:** Write structured JSON to `~/.byterover/inbox/`
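+
+The six steps above can be sketched as a small pipeline. This is a minimal illustration, not the proxy's actual design: `build_payload`, `run_proxy`, and the injected `retrieve`/`execute`/`ingest` callables are hypothetical names, and how the real `claude` CLI accepts input is still an open question (see Conflicts / Ambiguities).
+
+```python
+from typing import Callable, Sequence
+
+
+def build_payload(prompt: str, context: Sequence[str]) -> str:
+    """Step 2 (Inject): prepend retrieved memory as a "System Note"."""
+    if not context:
+        return prompt
+    note = "\n".join(f"- {item}" for item in context)
+    return f"[System Note]\n{note}\n\n{prompt}"
+
+
+def run_proxy(
+    prompt: str,
+    retrieve: Callable[[str], Sequence[str]],
+    execute: Callable[[str], str],
+    ingest: Callable[[dict], None],
+) -> str:
+    """Run one intercept -> enrich -> execute -> capture -> ingest cycle."""
+    context = retrieve(prompt)                # query Hot + Warm memory
+    payload = build_payload(prompt, context)  # inject context
+    output = execute(payload)                 # call the wrapped binary
+    # Sanitization (step 5) is omitted here; the raw output is handed on.
+    ingest({"prompt": prompt, "output": output})
+    return output
+```
+
+Passing the stages in as callables keeps the flow testable without a real `claude` binary, Graphiti, or ByteRover, for example `run_proxy("fix bug", lambda p: ["uses SQLite"], str.upper, print)`.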
+
+---
+
+## Architecture
+
+> **Source:** [gemini-prd.md](../gemini-prd.md) (Section 3.1)
+
+The Claude-Proxy (Python) sits at the center with spokes:
+
+- **North:** StdIO Interface (User Terminal)
+- **South:** Anthropic API (Claude Code Execution)
+- **East (Storage):**
+  - ByteRover Interface (File I/O)
+  - Graphiti Interface (Bolt Protocol to FalkorDB)
+  - MemVid Interface (FFmpeg + FAISS)
+- **West (Compute):**
+  - OpenRouter API (Sleep-Time Models)
+  - Local LLM (Ollama — Pydantic Guardrails)
+
+---
+
+## User Experience
+
+> **Source:** [gemini-prd.md](../gemini-prd.md) (Section 1.6)
+
+- **Transparent Operation:** User types `claude` as normal. Proxy handles all complexity invisibly.
+- **Context Injection:** "God Mode" automatically prepends relevant Hot/Warm memory based on intent classification.
+- **Feedback Loop:** If user explicitly praises/scolds the agent, the Proxy tags that interaction for high-priority processing by the Tribunal during sleep.
+
+---
+
+## Installation
+
+> **Source:** [gemini-prd.md](../gemini-prd.md) (Section 3.5)
+
+A single `install.sh` script that:
+1. Sets up Python `venv`
+2. Installs `ffmpeg`, `ghostscript` (for QR)
+3. Aliases `claude` to `python ~/.bin/claude_proxy.py`
+4. Registers the `sleep_daemon` with `launchd`
+
+---
+
+## Data Sanitization
+
+> **Source:** [gemini-prd.md](../gemini-prd.md)
+
+- Uses Pydantic schemas for strict data validation
+- Local LLM (Ollama) acts as Structure Gate to strip conversational noise
+- Output format: Structured JSON with fields: type, summary, content, tags, timestamp
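+
+A record with the five listed fields might look like the sketch below. It uses a standard-library dataclass as a stand-in so the example runs without extra packages; the source calls for Pydantic schemas, so a real implementation would presumably express the same checks as a Pydantic model. `InboxRecord`, `ALLOWED_TYPES`, and the validation rules are assumptions, since the source names the fields but not their constraints.
+
+```python
+import json
+from dataclasses import asdict, dataclass, field
+from datetime import datetime, timezone
+
+# Hypothetical set of record types; the source does not enumerate them.
+ALLOWED_TYPES = {"decision", "solution", "feedback", "note"}
+
+
+def _utc_now() -> str:
+    return datetime.now(timezone.utc).isoformat()
+
+
+@dataclass
+class InboxRecord:
+    type: str
+    summary: str
+    content: str
+    tags: list = field(default_factory=list)
+    timestamp: str = field(default_factory=_utc_now)
+
+    def __post_init__(self) -> None:
+        # Reject records the Structure Gate should never emit.
+        if self.type not in ALLOWED_TYPES:
+            raise ValueError(f"unknown record type: {self.type!r}")
+        if not self.summary.strip() or not self.content.strip():
+            raise ValueError("summary and content must be non-empty")
+        # Normalize tags: trimmed, lowercase, de-duplicated, sorted.
+        self.tags = sorted({t.strip().lower() for t in self.tags if t.strip()})
+
+
+def to_json(record: InboxRecord) -> str:
+    """Serialize one validated record for the ByteRover inbox."""
+    return json.dumps(asdict(record), ensure_ascii=False)
+```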
+
+---
+
+## Implementation Requirements
+
+1. Create `claude_proxy.py` wrapper script
+2. Implement stdin/stdout interception
+3. Build intent classification module
+4. Implement context retrieval from ByteRover (Hot) and Graphiti (Warm)
+5. Build context injection (System Note prepending)
+6. Implement output capture and sanitization via Pydantic
+7. Build Local LLM integration (Ollama) for structure gating
+8. Create JSON writer for the ByteRover inbox (`~/.byterover/inbox/`)
+9. Implement feedback detection (praise/scold tagging)
+10. Create `install.sh` for turnkey setup
+
+---
+
+## Conflicts / Ambiguities
+
+- **⚠️ Local LLM dependency:** The proxy requires a local LLM (Ollama) for sanitization. This adds a dependency that may not be available on all systems. Could be made optional with a simpler regex-based fallback.
+- **⚠️ macOS-specific:** `launchd` registration is macOS-only. Linux would need systemd, Windows would need a service. Should be abstracted.
+- **⚠️ Claude binary wrapping:** Assumes the `claude` CLI binary exists and can be wrapped. The exact interception mechanism depends on Claude Code's CLI interface.
diff --git a/scratch/10-sleep-time-compute.md b/scratch/10-sleep-time-compute.md
new file mode 100644
index 0000000..9e1bd29
--- /dev/null
+++ b/scratch/10-sleep-time-compute.md
@@ -0,0 +1,138 @@
+# Topic: Sleep-Time Compute & Self-Improvement Loops
+
+## Summary
+The autonomous background processing system that runs during idle/sleep periods to refine skills, process memories, and improve the system without human intervention.
+
+---
+
+## Overview
+
+> **Source:** [gemini-prd.md](../gemini-prd.md) (Sections 1.5, 3.3)
+
+Sleep-Time Compute is the engine of self-improvement. It operates autonomously to upgrade the system's intelligence during idle periods, using free/cheap models via OpenRouter.
+
+---
+
+## Sleep-Time Daemon Architecture
+
+> **Source:** [gemini-prd.md](../gemini-prd.md) (Section 3.3)
+
+A background service (`launchd` on macOS) with a state machine:
+
+| State | Trigger | Action |
+|-------|---------|--------|
+| IDLE | Default | Monitoring system load |
+| DREAMING | Daily | Processing ByteRover inbox → Graphiti |
+| EVOLVING | Nightly | Full refinement loop (5 steps) |
+
+### EVOLVING Steps
+1. **Curiosity Module** generates task list (identifies knowledge gaps)
+2. **Creator** generates artifacts via OpenRouter (free models)
+3. **Tribunal** (Parallel Async) critiques artifacts
+4. **Mutator** updates `~/.skills/*.md` files
+5. **Archivist** renders approved artifacts to `~/.memvid/staging`
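+
+The state table and step list above could be modeled roughly as follows. The source gives the three states and their triggers but not how the triggers combine, so the precedence in `next_state` and its `inbox_items` and `is_night` inputs are assumptions; the 15-minute threshold comes from the Implementation Requirements below.
+
+```python
+from enum import Enum
+
+
+class State(Enum):
+    IDLE = "idle"
+    DREAMING = "dreaming"
+    EVOLVING = "evolving"
+
+
+IDLE_THRESHOLD_MINUTES = 15
+
+# The five EVOLVING steps, in the order listed above.
+EVOLVING_STEPS = ("curiosity", "creator", "tribunal", "mutator", "archivist")
+
+
+def next_state(idle_minutes: float, inbox_items: int, is_night: bool) -> State:
+    """Pick the daemon state (assumed precedence: inbox first, then night)."""
+    if idle_minutes <= IDLE_THRESHOLD_MINUTES:
+        return State.IDLE          # user is active; only monitor load
+    if inbox_items > 0:
+        return State.DREAMING      # drain ByteRover inbox into Graphiti
+    if is_night:
+        return State.EVOLVING      # run the full refinement loop
+    return State.IDLE
+```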
+
+---
+
+## Loop 1: The Simulator (Correction)
+
+> **Source:** [gemini-prd.md](../gemini-prd.md) (Section 1.5)
+
+- **Input:** Failed tests/specs from the day's active work
+- **Action:** Spawns a temporary git branch. Retries the failed spec with no time or retry limit.
+- **Result:** Upon success, creates a "Solution Node" in Graphiti
+- **Purpose:** Automatically fixes failures encountered during the day
+
+---
+
+## Loop 2: The Professor (Synthesis)
+
+> **Source:** [gemini-prd.md](../gemini-prd.md) (Section 1.5)
+
+- **Input:** High-quality external repositories (e.g., `shadcn/ui`, `actix-web`)
+- **Action:** "Reverse Engineers" the code to generate Synthetic PRDs
+- **Result:** Stores pairs of `{Synthetic_PRD} -> {Perfect_Code}` in MemVid for future RAG retrieval
+- **Purpose:** Learns from exemplary codebases
+
+---
+
+## Loop 3: The Evolutionary Forge (Creation)
+
+> **Source:** [gemini-prd.md](../gemini-prd.md) (Section 1.5)
+
+- **Input:** "Madlib" Inspiration Queue (Randomized Topic + Style + Constraint)
+- **Action:**
+  1. **Draft:** Creator Model generates artifact
+  2. **Gate:** Taste Oracle checks novelty (rejects if too similar/dissimilar to Gold Standard)
+  3. **Critique:** Tribunal (Personas) attacks the draft
+  4. **Mutate:** If score < 95, Mutator rewrites the Skill File (Prompt)
+- **Result:** A graduated "Skill File" v2.0 and a high-quality artifact for the archive
+
+---
+
+## The Tribunal (The Critic)
+
+> **Source:** [gemini-prd.md](../gemini-prd.md), [docs/UNIFIED_PRD.md](../docs/UNIFIED_PRD.md)
+
+A dynamic graph of adversarial personas that critique generated artifacts:
+- **Security Zealot** — Attacks security vulnerabilities
+- **Pedant** — Checks correctness and precision
+- **Visionary** — Evaluates innovation and forward-thinking
+
+Runs in parallel async during sleep cycles.
+
+---
+
+## The Mutator (The Evolution)
+
+> **Source:** [gemini-prd.md](../gemini-prd.md), [docs/UNIFIED_PRD.md](../docs/UNIFIED_PRD.md)
+
+- Uses Genetic Algorithms to rewrite "Skill Files" (prompts)
+- Based on Tribunal feedback scores
+- Skill Files stored at `~/.skills/*.md`
+
+---
+
+## The Taste Oracle (The Quality Gate)
+
+> **Source:** [gemini-prd.md](../gemini-prd.md), [docs/UNIFIED_PRD.md](../docs/UNIFIED_PRD.md)
+
+- Vector-based novelty detector
+- Compares outputs against a "Gold Standard" baseline in MemVid
+- Rejects derivative or hallucinated work
+- Uses cosine similarity in embedding space
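+
+A two-sided novelty gate of this kind can be sketched with plain cosine similarity. The thresholds `min_sim` and `max_sim` and the verdict strings are illustrative assumptions; the source only says that work too similar or too dissimilar to the Gold Standard is rejected.
+
+```python
+import math
+from typing import Sequence
+
+
+def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
+    dot = sum(x * y for x, y in zip(a, b))
+    norm_a = math.sqrt(sum(x * x for x in a))
+    norm_b = math.sqrt(sum(y * y for y in b))
+    if norm_a == 0 or norm_b == 0:
+        return 0.0
+    return dot / (norm_a * norm_b)
+
+
+def taste_gate(
+    candidate: Sequence[float],
+    gold_standard: Sequence[Sequence[float]],
+    min_sim: float = 0.35,   # hypothetical lower bound
+    max_sim: float = 0.92,   # hypothetical upper bound
+) -> str:
+    """Classify a candidate embedding by its nearest Gold Standard vector."""
+    if not gold_standard:
+        raise ValueError("gold_standard must contain at least one vector")
+    best = max(cosine_similarity(candidate, g) for g in gold_standard)
+    if best > max_sim:
+        return "reject_derivative"   # near-duplicate of known work
+    if best < min_sim:
+        return "reject_off_target"   # too far from anything trusted
+    return "accept"
+```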
+
+---
+
+## Cost Strategy
+
+> **Source:** [gemini-prd.md](../gemini-prd.md) (Section 1.2)
+
+- Uses free OpenRouter tiers for "Heavy Hitter" models during sleep cycles
+- Models: DeepSeek, Qwen, Mistral (free tier)
+- Target: zero API cost for autonomous improvement
+
+---
+
+## Implementation Requirements
+
+1. Implement sleep-time daemon (background service)
+2. Build state machine (IDLE → DREAMING → EVOLVING)
+3. Implement idle detection (system idle > 15 minutes)
+4. Build the Simulator loop (failed test retry on temp branches)
+5. Build the Professor loop (external repo analysis → synthetic PRDs)
+6. Build the Evolutionary Forge loop (creation + critique + mutation)
+7. Implement the Tribunal with configurable adversarial personas
+8. Implement the Mutator with genetic algorithm-based prompt rewriting
+9. Implement the Taste Oracle with vector novelty detection
+10. Build the Curiosity Module (knowledge gap detection)
+11. Create Skill File management system (`~/.skills/*.md`)
+12. Integrate with OpenRouter free tier for sleep-time models
+
+---
+
+## Conflicts / Ambiguities
+
+- **⚠️ Curiosity Module vs Madlib:** The Curiosity Module (Active Inference) is listed as a strategic integration in gemini-prd.md Section 2, replacing the random "Madlib" generator. But Loop 3 still references "Madlib Inspiration Queue." The Curiosity Module is the intended upgrade path.
+- **⚠️ Score threshold:** Loop 3 uses "score < 95" as the mutation threshold. This seems very high — may need calibration.
+- **⚠️ Platform dependency:** `launchd` is macOS-only. Needs cross-platform daemon support.
diff --git a/scratch/11-quality-assurance.md b/scratch/11-quality-assurance.md
new file mode 100644
index 0000000..b277f0b
--- /dev/null
+++ b/scratch/11-quality-assurance.md
@@ -0,0 +1,120 @@
+# Topic: Quality Assurance & Error Handling
+
+## Summary
+Chunk validation, coherence scoring, verification queries, outlier detection, and error handling procedures for the RAG v3.0 system.
+
+---
+
+## Chunk Validation
+
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md)
+
+```yaml
+quality:
+  validation:
+    validate_all_chunks: true
+    min_coherence_score: 0.6
+    min_completeness_score: 0.5
+    flag_outlier_embeddings: true
+```
+
+- All chunks are validated after creation
+- Minimum coherence score: 0.6 (how well the chunk holds together semantically)
+- Minimum completeness score: 0.5 (whether the chunk contains a complete thought)
+- Outlier embeddings are flagged for review
+
+---
+
+## Quality Metadata (Dimension 11)
+
+> **Source:** [docs/SCHEMA_REFERENCE.md](../docs/SCHEMA_REFERENCE.md)
+
+Each chunk carries quality metadata:
+
+| Field | Type | Description |
+|-------|------|-------------|
+| confidence_score | number | Overall confidence (0-1) |
+| validation_status | enum | valid, warning, error, pending |
+| error_flags | string[] | List of detected issues |
+| review_status | enum | auto_approved, needs_review, reviewed, rejected |
+| chunking_quality.coherence_score | number | Semantic coherence |
+| chunking_quality.completeness_score | number | Thought completeness |
+| chunking_quality.boundary_quality | number | How clean the chunk boundaries are |
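+
+One way the validation thresholds could populate these fields is sketched below. The thresholds come from the YAML in Chunk Validation; the flag names and the mapping from flags to `validation_status` and `review_status` are assumptions, and the scores themselves are taken as inputs because the scoring method is undefined.
+
+```python
+MIN_COHERENCE = 0.6      # quality.validation.min_coherence_score
+MIN_COMPLETENESS = 0.5   # quality.validation.min_completeness_score
+
+
+def validate_chunk(coherence: float, completeness: float,
+                   is_outlier: bool = False) -> dict:
+    """Turn raw scores into Dimension 11 status fields."""
+    flags = []
+    if coherence < MIN_COHERENCE:
+        flags.append("low_coherence")
+    if completeness < MIN_COMPLETENESS:
+        flags.append("low_completeness")
+    if is_outlier:
+        flags.append("outlier_embedding")
+
+    if not flags:
+        status, review = "valid", "auto_approved"
+    elif flags == ["outlier_embedding"]:
+        # Assumed policy: an outlier alone is a warning, not an error.
+        status, review = "warning", "needs_review"
+    else:
+        status, review = "error", "needs_review"
+    return {
+        "validation_status": status,
+        "error_flags": flags,
+        "review_status": review,
+    }
+```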
+
+---
+
+## Verification Queries
+
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md)
+
+Post-ingestion verification queries to validate the system works correctly:
+
+| Query | Type | Expected | Max Latency |
+|-------|------|----------|-------------|
+| "database connection setup" | text | codebase domain | 550ms |
+| "system architecture diagram" | text | image/mixed modalities | 600ms |
+| "Find code that implements this UI" + test_screenshot.png | mixed | codebase domain | 700ms |
+
+---
+
+## Error Handling
+
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md)
+
+### API Rate Limiting
+```yaml
+api_rate_limit:
+  initial_backoff_seconds: 5
+  max_backoff_seconds: 60
+  max_retries: 5
+```
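+
+The settings above map naturally onto exponential backoff. The sketch below assumes the delay doubles on each retry, which the YAML does not state; `backoff_delays` and `call_with_backoff` are hypothetical helper names, and `sleep` is injectable so the logic can be tested without waiting.
+
+```python
+import time
+from typing import Callable, List, Tuple, Type, TypeVar
+
+T = TypeVar("T")
+
+
+def backoff_delays(initial: float = 5.0, maximum: float = 60.0,
+                   retries: int = 5) -> List[float]:
+    """Delays before each retry: doubling from `initial`, capped at `maximum`."""
+    return [min(initial * 2 ** i, maximum) for i in range(retries)]
+
+
+def call_with_backoff(
+    fn: Callable[[], T],
+    *,
+    initial: float = 5.0,
+    maximum: float = 60.0,
+    retries: int = 5,
+    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
+    sleep: Callable[[float], None] = time.sleep,
+) -> T:
+    """Call `fn`, retrying up to `retries` times on the given exceptions."""
+    delays = backoff_delays(initial, maximum, retries)
+    for attempt in range(retries + 1):
+        try:
+            return fn()
+        except retry_on:
+            if attempt == retries:
+                raise                  # retries exhausted; surface the error
+            sleep(delays[attempt])
+    raise RuntimeError("unreachable")
+```
+
+With the configured values the retry delays are 5, 10, 20, 40, and 60 seconds.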
+
+### Embedding Failures
+```yaml
+embedding_failure:
+  retry_count: 3
+  fallback_to_text_only: true
+```
+- Retry up to 3 times
+- Fall back to text-only Qwen3-Embedding-8B if multimodal embedding fails
+
+### Parse Failures
+```yaml
+parse_failure:
+  log_file: "ingestion_errors.log"
+  continue_on_error: true
+  quarantine_failed: true
+```
+- Log errors to `ingestion_errors.log`
+- Continue processing other files on error
+- Quarantine failed files for manual review
+
+### Multimodal Failures
+```yaml
+multimodal_failure:
+  fallback_to_text_only: true
+  log_visual_errors: true
+```
+- Fall back to text-only processing
+- Log visual processing errors separately
+
+---
+
+## Implementation Requirements
+
+1. Implement chunk coherence scoring algorithm
+2. Implement chunk completeness scoring algorithm
+3. Build outlier embedding detection (statistical outlier in vector space)
+4. Create validation pipeline that runs after each chunk creation
+5. Implement verification query test suite
+6. Build exponential backoff retry logic for API calls
+7. Implement fallback chain (multimodal → text-only)
+8. Create error quarantine system for failed files
+9. Build ingestion error logging
+
+---
+
+## Conflicts / Ambiguities
+
+- **⚠️ Scoring algorithms undefined:** The documents specify minimum scores (0.6 coherence, 0.5 completeness) but don't define how these scores are computed. Implementation needs to determine the scoring methodology (e.g., embedding-based coherence, LLM-based completeness).
+- **⚠️ Outlier detection method:** "flag_outlier_embeddings" is specified but the detection method (z-score, IQR, isolation forest, etc.) is not defined.
diff --git a/scratch/12-domain-configuration.md b/scratch/12-domain-configuration.md
new file mode 100644
index 0000000..5a21faa
--- /dev/null
+++ b/scratch/12-domain-configuration.md
@@ -0,0 +1,125 @@
+# Topic: Domain Configuration & Content Routing
+
+## Summary
+Configuration for the three content domains (prompts, codebase, research), including per-domain chunking methods, storage files, retention policies, and content type detection.
+
+---
+
+## Domain Definitions
+
+> **Source:** [opus-prd2-v3.md](../opus-prd2-v3.md)
+
+### Domain: Prompts
+```yaml
+- name: "prompts"
+  description: "User inputs and prompts to LLMs"
+  storage: "prompts.mp4"
+  chunking_methods:
+    - "semantic"
+    - "fixed_size"
+  retention: "30_days_rolling"
+  multimodal: false
+```
+
+### Domain: Codebase
+```yaml
+- name: "codebase"
+  description: "Multi-repository source code and configs"
+  storage: "codebase.mp4"
+  chunking_methods:
+    - "ast_structural"
+    - "fixed_size"
+    - "screenshot_code_fusion"
+  retention: "version_controlled"
+  multimodal: true
+  cross_reference: true
+```
+
+### Domain: Research
+```yaml
+- name: "research"
+  description: "Research papers, documentation, diagrams"
+  storage: "research.mp4"
+  chunking_methods:
+    - "recursive_hierarchical"
+    - "semantic"
+    - "multimodal_boundary"
+  retention: "permanent"
+  multimodal: true
+```
+
+---
+
+## Content Type Detection
+
+> **Source:** [chatgpt5.2-prd.md](../chatgpt5.2-prd.md), [opus-prd2-v3.md](../opus-prd2-v3.md)
+
+### Supported File Types
+- **Markdown (.md)** — Primary format for writing/documentation
+- **Python (.py)** — Code
+- **JavaScript/TypeScript (.js/.ts)** — Code
+- **DOCX** — Documents (a small share of the corpus)
+- **PDF** — Research papers (a small share of the corpus)
+- **Config files** — Various formats
+
+### Content Type Mapping
+
+| File Type | Content Type | Domain | Chunking Methods |
+|-----------|-------------|--------|-----------------|
+| .md (writing) | documentation | research | recursive_hierarchical, semantic, multimodal_boundary |
+| .md (prompts) | prompt | prompts | semantic, fixed_size |
+| .py, .js, .ts | code | codebase | ast_structural, fixed_size |
+| .json, .yaml, .toml | configuration | codebase | fixed_size |
+| .pdf | research_paper | research | recursive_hierarchical, semantic |
+| .docx | documentation | research | recursive_hierarchical, semantic |
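+
+The mapping table can be expressed as a lookup keyed on file extension. In this sketch, `route`, `ROUTES`, and `PROMPT_DIRS` are hypothetical names, and telling prompt markdown from documentation markdown by a parent directory called `prompts` is only one possible rule (see Conflicts / Ambiguities).
+
+```python
+from pathlib import PurePosixPath
+
+CODE = ("code", "codebase", ["ast_structural", "fixed_size"])
+CONFIG = ("configuration", "codebase", ["fixed_size"])
+
+# extension -> (content type, domain, chunking methods), from the table above
+ROUTES = {
+    ".py": CODE, ".js": CODE, ".ts": CODE,
+    ".json": CONFIG, ".yaml": CONFIG, ".toml": CONFIG,
+    ".pdf": ("research_paper", "research",
+             ["recursive_hierarchical", "semantic"]),
+    ".docx": ("documentation", "research",
+              ["recursive_hierarchical", "semantic"]),
+}
+MD_PROMPT = ("prompt", "prompts", ["semantic", "fixed_size"])
+MD_DOC = ("documentation", "research",
+          ["recursive_hierarchical", "semantic", "multimodal_boundary"])
+
+PROMPT_DIRS = {"prompts"}  # assumed convention for locating prompt files
+
+
+def route(path: str):
+    """Return (content_type, domain, chunking_methods) for a file path."""
+    p = PurePosixPath(path)
+    ext = p.suffix.lower()
+    if ext == ".md":
+        parents = {part.lower() for part in p.parts[:-1]}
+        return MD_PROMPT if parents & PROMPT_DIRS else MD_DOC
+    if ext not in ROUTES:
+        raise ValueError(f"unsupported file type: {path}")
+    return ROUTES[ext]
+```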
+
+---
+
+## Multi-Repository Setup
+
+> **Source:** [chatgpt5.2-prd.md](../chatgpt5.2-prd.md)
+
+- Codebase is stored in a multi-repo setup (not monorepo)
+- Each repository should be tracked separately for provenance
+- Git metadata (commit SHA, branch, author) captured per chunk
+
+---
+
+## Corpus Size Estimates
+
+> **Source:** [chatgpt5.2-prd.md](../chatgpt5.2-prd.md), [gemini-prd.md](../gemini-prd.md)
+
+| Metric | Value |
+|--------|-------|
+| Initial text corpus | ~35MB |
+| Initial documents (gemini estimate) | 500 docs × 150 pages = 75,000 pages |
+| Weekly growth (gemini estimate) | +100 docs × 150 pages = +15,000 pages/week |
+
+---
+
+## Retention Policies
+
+| Domain | Policy | Description |
+|--------|--------|-------------|
+| Prompts | 30-day rolling | Older prompts archived to MemVid |
+| Codebase | Version-controlled | Tied to git history, never deleted |
+| Research | Permanent | Always retained |
+
+---
+
+## Implementation Requirements
+
+1. Implement content type detector (file extension + content analysis)
+2. Build domain router (content type → domain → chunking methods)
+3. Configure per-domain MemVid files
+4. Implement retention policy enforcement
+5. Build multi-repo ingestion support with git provenance tracking
+6. Create domain-specific embedding instructions (instruction-aware model)
+
+---
+
+## Conflicts / Ambiguities
+
+- **⚠️ Corpus size discrepancy:** chatgpt5.2-prd.md says ~35MB of text files; gemini-prd.md estimates 75,000 pages initially with 15,000 pages/week growth. These may refer to different corpora or different time horizons.
+- **⚠️ Prompt vs documentation detection:** Both prompts and documentation can be markdown files. The routing logic needs a way to distinguish user prompts from documentation (possibly by source directory or metadata).
+- **⚠️ Code tokenization:** chatgpt5.2-prd.md asks whether different tokenization is needed for config vs script vs library files. The opus-prd2 config uses fixed_size for configs and ast_structural for code, which implicitly answers this.
diff --git a/scratch/13-strategic-integrations.md b/scratch/13-strategic-integrations.md
new file mode 100644
index 0000000..80ab5e9
--- /dev/null
+++ b/scratch/13-strategic-integrations.md
@@ -0,0 +1,96 @@
+# Topic: Strategic Integrations (Advanced Features)
+
+## Summary
+Five advanced integrations to push the architecture from "Advanced" to "State-of-the-Art": Hypergraph Knowledge, Active Inference, Formal Verification, Model Merging, and Contrastive Value Alignment.
+
+---
+
+## Overview
+
+> **Source:** [gemini-prd.md](../gemini-prd.md) (Section 2)
+
+These are future-proofing add-ons, not core requirements. They represent the upgrade path from the base system.
+
+---
+
+## 1. Hypergraph Knowledge Representation
+
+> **Source:** [gemini-prd.md](../gemini-prd.md)
+
+- **What:** Standard graphs store binary edges as triplets (A → relation → B). Hypergraphs allow a single edge to connect _multiple_ nodes (Code + PRD + Timestamp + Author).
+- **Why:** Code is rarely binary. A function depends on a library, a requirement, and a specific node version simultaneously.
+- **Integration:** Use Hypergraph RAG in the Graphiti layer to allow "n-ary" relationships, reducing the number of "hops" needed to understand complex dependencies.
+- **Implementation:** Update Graphiti schema to support "Hyperedges" (Node-to-Edge connections)
+
+---
+
+## 2. Active Inference Curiosity Module (Fristonian AI)
+
+> **Source:** [gemini-prd.md](../gemini-prd.md)
+
+- **What:** Replaces the random "Madlib" generator in the Evolutionary Forge. The agent calculates "Free Energy" (uncertainty) across its knowledge base.
+- **Why:** The agent should learn _what it realizes it doesn't know_. If it knows React but not Svelte, the Curiosity Module detects that gap and generates a targeted learning task.
+- **Integration:** A "Curiosity Daemon" runs before Sleep Time, identifying sparse areas in the Graphiti vector space and generating targeted learning tasks.
+
+---
+
+## 3. Formal Verification Gate (VeriGuard Protocol)
+
+> **Source:** [gemini-prd.md](../gemini-prd.md)
+
+- **What:** Uses a mathematical prover (Coq, Lean, or lightweight Python-based CrossHair) to verify code correctness.
+- **Why:** "95% Confidence" is subjective. "Mathematically Proven" is absolute.
+- **Integration:** The Tribunal gains a **Math-Persona** that demands the agent write assertions. If assertions fail formal verification, the artifact is rejected immediately.
+- **Implementation:** Add a `verify.py` hook in the Tribunal loop.
+
+---
+
+## 4. Automated Model Merging (The "Frankenstein" Strategy)
+
+> **Source:** [gemini-prd.md](../gemini-prd.md)
+
+- **What:** Techniques like TIES-Merging or DARE allow merging weights of different fine-tuned models without retraining.
+- **Why:** Instead of just refining prompts, the system can merge a "Security Expert" LoRA with a "Creative Writer" LoRA to create a custom daily driver.
+- **Integration:** Monthly script that checks HuggingFace for compatible LoRAs and merges them.
+- **Frequency:** Once a month
+
+---
+
+## 5. Contrastive Value Alignment (Taste Oracle++)
+
+> **Source:** [gemini-prd.md](../gemini-prd.md)
+
+- **What:** Uses a learned Reward Model based on user-specific "Taste" vectors (Contrastive Learning).
+- **Why:** Simple vector distance is a crude proxy for "Good." A trained Reward Model can learn the nuance of why you like "Brutalist" code but dislike "Spaghetti" code, even if they look vectorially similar.
+- **Integration:** Train a small classifier (e.g., DeBERTa) on "Accepted" vs. "Rejected" tribunal outcomes to act as a highly accurate pre-filter for the Creation loop.
+
+---
+
+## Implementation Priority
+
+These are listed in suggested implementation order (after core system is built):
+
+1. **Hypergraph Knowledge** — Enhances existing Graphiti layer (medium complexity)
+2. **Active Inference Curiosity** — Replaces Madlib generator (medium complexity)
+3. **Contrastive Value Alignment** — Improves Taste Oracle (medium complexity)
+4. **Formal Verification** — Adds verification hook (small complexity)
+5. **Model Merging** — Monthly automation (large complexity, requires ML expertise)
+
+---
+
+## Implementation Requirements
+
+1. Research and select hypergraph library compatible with FalkorDB
+2. Implement Free Energy calculation for knowledge gap detection
+3. Integrate CrossHair or similar lightweight formal verifier
+4. Build LoRA merging pipeline with HuggingFace integration
+5. Train DeBERTa classifier on accepted/rejected outcomes
+6. Create monthly automation for model merging
+
+---
+
+## Conflicts / Ambiguities
+
+- **⚠️ These are aspirational:** Only gemini-prd.md describes these integrations. No other document references them. They should be treated as Phase 2+ features, not core requirements.
+- **⚠️ Model merging feasibility:** Merging LoRAs requires access to model weights and significant ML infrastructure. May not be practical for a local-first system on M3 Max.
+- **⚠️ Formal verification scope:** CrossHair (Python) is limited compared to Coq/Lean. The scope of what can be formally verified needs to be realistic.
diff --git a/scratch/TASK_INDEX.md b/scratch/TASK_INDEX.md
new file mode 100644
index 0000000..e5ff1f4
--- /dev/null
+++ b/scratch/TASK_INDEX.md
@@ -0,0 +1,122 @@
+# Task Index: RAG v3.0 Implementation Topics
+
+## Overview
+
+This index maps 13 topic-focused scratch files decomposed from the RAG v3.0 project documentation. Each file consolidates all requirements for a single topic from across 8 source documents.
+
+### Source Documents
+| Document | Focus |
+|----------|-------|
+| `chatgpt5.2-prd.md` | Original requirements, cost analysis, agentic deployment |
+| `gemini-prd.md` | Autodidactic Omni-Loop, memory hierarchy, sleep-time compute |
+| `opus-prd1-v3.md` | RAG v3.0 architecture, embedding models, metadata schema |
+| `opus-prd2-v3.md` | YAML configuration for all components |
+| `opus-prd3-v3.md` | Foundational theory, multimodal revolution, chunking theory |
+| `docs/UNIFIED_PRD.md` | Consolidated specification |
+| `docs/SCHEMA_REFERENCE.md` | Database schema, TypeScript interfaces, config schema |
+| `docs/AGGREGATION_PLAN.md` | Overlap/conflict analysis between documents |
+
+---
+
+## Topic Index
+
+| # | File | Topic | Complexity | Dependencies |
+|---|------|-------|-----------|-------------|
+| 01 | [01-embedding-model-stack.md](./01-embedding-model-stack.md) | Embedding Model Stack | Large | None |
+| 02 | [02-chunking-strategies.md](./02-chunking-strategies.md) | Chunking Strategies | Large | 01 |
+| 03 | [03-metadata-schema.md](./03-metadata-schema.md) | Metadata Schema - 12 Dimensions | Large | None |
+| 04 | [04-database-schema.md](./04-database-schema.md) | SQLite Database Schema | Medium | 03 |
+| 05 | [05-memvid-storage.md](./05-memvid-storage.md) | MemVid Video-Encoded Storage | Large | 01, 04 |
+| 06 | [06-memory-hierarchy.md](./06-memory-hierarchy.md) | Three-Tiered Memory Hierarchy | Medium | 05 |
+| 07 | [07-retrieval-pipeline.md](./07-retrieval-pipeline.md) | Retrieval Pipeline | Large | 01, 04, 05 |
+| 08 | [08-orchestration-concurrency.md](./08-orchestration-concurrency.md) | Orchestration & Concurrency | Medium | 02, 07 |
+| 09 | [09-proxy-shim.md](./09-proxy-shim.md) | Proxy / Shim - The Gatekeeper | Medium | 06, 07 |
+| 10 | [10-sleep-time-compute.md](./10-sleep-time-compute.md) | Sleep-Time Compute & Self-Improvement | Large | 06, 09 |
+| 11 | [11-quality-assurance.md](./11-quality-assurance.md) | Quality Assurance & Error Handling | Small | 03, 04 |
+| 12 | [12-domain-configuration.md](./12-domain-configuration.md) | Domain Configuration & Content Routing | Small | 02, 05 |
+| 13 | [13-strategic-integrations.md](./13-strategic-integrations.md) | Strategic Integrations - Advanced | Large | 06, 10 |
+
+---
+
+## Suggested Implementation Order
+
+### Stage 1: Foundation (no dependencies outside this stage)
+1. **03 - Metadata Schema** — Define all TypeScript interfaces and Pydantic models for the 12-dimension schema. This is the data contract everything else depends on.
+2. **01 - Embedding Model Stack** — Set up Qwen3-VL-Embedding-8B, reranker, and boundary detection model. Core capability needed by all pipelines.
+3. **04 - Database Schema** — Create SQLite tables, indexes, and data access layer. Depends on metadata schema being defined.
+
+### Stage 2: Chunking Pipeline
+4. **02 - Chunking Strategies** — Implement all 7 chunking methods. Depends on embedding models for semantic chunking.
+5. **12 - Domain Configuration** — Configure content routing (file type → domain → chunking methods). Depends on chunking strategies.
+6. **11 - Quality Assurance** — Implement chunk validation, coherence scoring, error handling. Can run in parallel with chunking.
+
+### Stage 3: Storage & Retrieval
+7. **05 - MemVid Storage** — Implement H.265 video encoding, QR frames, quad-encoding, FAISS indices. Depends on embedding models and database schema.
+8. **07 - Retrieval Pipeline** — Two-stage retrieval with hybrid search and reranking. Depends on MemVid and embedding models.
+9. **08 - Orchestration** — Wire up the agentic swarm, concurrency, MCP servers. Depends on chunking and retrieval being implemented.
+
+### Stage 4: Autonomous System
+10. **06 - Memory Hierarchy** — Implement ByteRover (Hot), Graphiti (Warm), transition daemons. Depends on MemVid for Cold tier.
+11. **09 - Proxy / Shim** — Build the Claude wrapper with context injection. Depends on memory hierarchy and retrieval.
+12. **10 - Sleep-Time Compute** — Implement autonomous refinement loops. Depends on proxy and memory hierarchy.
+
+### Stage 5: Advanced Features
+13. **13 - Strategic Integrations** — Hypergraph, Active Inference, Formal Verification, Model Merging, Contrastive Alignment. Only after core system is stable.
+
+---
+
+## Dependency Graph
+
+```
+Stage 1:  [03 Metadata] ──→ [04 Database]
+          [01 Embedding] ─┐
+                          │
+Stage 2:  [02 Chunking] ←─┘──→ [12 Domains]
+          [11 Quality] ←── [03] + [04]
+
+Stage 3:  [05 MemVid] ←── [01] + [04]
+          [07 Retrieval] ←── [01] + [04] + [05]
+          [08 Orchestration] ←── [02] + [07]
+
+Stage 4:  [06 Memory Hierarchy] ←── [05]
+          [09 Proxy] ←── [06] + [07]
+          [10 Sleep-Time] ←── [06] + [09]
+
+Stage 5:  [13 Strategic] ←── [06] + [10]
+```
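+
+The suggested order can be checked mechanically against the Dependencies column of the Topic Index. The sketch below transcribes both and uses the standard-library `graphlib`; `order_violations` and `valid_order` are hypothetical helper names. Run against the data as written, it reports one out-of-order pair: topic 12 lists 05 as a dependency but is scheduled before it.
+
+```python
+from graphlib import TopologicalSorter
+
+# topic -> prerequisites, transcribed from the Topic Index table
+DEPENDENCIES = {
+    "01": [], "02": ["01"], "03": [], "04": ["03"],
+    "05": ["01", "04"], "06": ["05"], "07": ["01", "04", "05"],
+    "08": ["02", "07"], "09": ["06", "07"], "10": ["06", "09"],
+    "11": ["03", "04"], "12": ["02", "05"], "13": ["06", "10"],
+}
+
+# Order from "Suggested Implementation Order"
+SUGGESTED_ORDER = ["03", "01", "04", "02", "12", "11", "05",
+                   "07", "08", "06", "09", "10", "13"]
+
+
+def order_violations(order, deps):
+    """Return (topic, prerequisite) pairs where the prerequisite comes later."""
+    position = {topic: i for i, topic in enumerate(order)}
+    return [(topic, dep) for topic in order for dep in deps[topic]
+            if position[dep] > position[topic]]
+
+
+def valid_order(deps):
+    """Return one ordering that satisfies every dependency."""
+    return list(TopologicalSorter(deps).static_order())
+
+
+print(order_violations(SUGGESTED_ORDER, DEPENDENCIES))
+```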
+
+---
+
+## Cross-Document Conflicts Summary
+
+| Conflict | Documents | Resolution |
+|----------|-----------|------------|
+| Embedding dimensions | chatgpt5.2-prd vs opus-prd2 | Use opus-prd2 values: 4096 native, MRL options [256,512,1024,2048,4096] |
+| Chunk sizes | chatgpt5.2-prd vs opus-prd2 vs AGGREGATION_PLAN | Use opus-prd2 YAML config as authoritative |
+| Number of chunking methods | Various (4, 6, or 7) | 7 methods is the complete list |
+| Retention periods (hours vs days) | gemini-prd | "h" is a typo; use days |
+| Agentic vs algorithmic chunking | chatgpt5.2-prd vs opus-prd2 | Semantic chunking is algorithmic (0.6B model), not full LLM agent |
+| QR frames vs text frames | gemini-prd Appendix I vs Appendix II | QR approach is production intent; text rendering is simplified example |
+| Gemini embedding alternative | chatgpt5.2-prd only | Not addressed elsewhere; treat as optional future consideration |
+| Corpus size | chatgpt5.2-prd (35MB) vs gemini-prd (75K pages) | Different corpora or time horizons; design for the larger estimate |
+
+---
+
+## Tech Stack Summary
+
+| Component | Technology |
+|-----------|-----------|
+| Primary Embedding | Qwen3-VL-Embedding-8B |
+| Reranker | Qwen3-VL-Reranker-8B |
+| Boundary Detection | Qwen3-Embedding-0.6B |
+| Text Fallback | Qwen3-Embedding-8B |
+| Metadata Store | SQLite |
+| Vector Search | FAISS (HNSW) |
+| Graph Database | FalkorDB (Bolt Protocol) |
+| Video Encoding | FFmpeg (H.265/HEVC) |
+| AST Parsing | tree-sitter |
+| Languages | Python (primary), TypeScript (interfaces) |
+| Orchestration | Headless Claude Code + MCP |
+| Local LLM | Ollama |
+| Remote Models | OpenRouter (free tier for sleep-time) |
+| Target Hardware | M3 Max MacBook Pro |