From b24a0e18efda4a99986268ac06b36cc5ab8bf831 Mon Sep 17 00:00:00 2001 From: Baseline User Date: Sat, 28 Feb 2026 14:04:31 +0530 Subject: [PATCH] fix(share): harden 3-stage pipeline and add demo script MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Priority 1 - Fix broken things: - Rewrite test/team.test.ts with bun:test (was using console.assert, wrong imports) - Fix unit-level dedup in share.ts (was matching on random UUID, now content_hash only) - Fix duration calculation in segment.ts (use actual timestamps, not messageCount/2) - Fix replace() → replaceAll() for nested category paths in share.ts Priority 2 - Code quality: - Extract shared callOllama() to src/team/ollama.ts with timeout + retry - Extract shared slugify/datePrefix to src/team/utils.ts - DRY up share.ts: extract resolveOutputDir, querySessions, writeManifest helpers - Use isValidCategory from categorize/schema.ts instead of duplicate in segment.ts - Remove unused SMRITI_DIR import from document.ts Priority 3 - Test coverage: - Replace misleading tests in team-segmented.test.ts with mocked Ollama tests - Add generateDocument, generateDocumentsSequential, and fallback path tests - Add real DB validation tests using isValidCategory Priority 4 - Sync integration: - Fix sync.ts to handle Stage 2 structured output (pipeline: segmented flag) - Segmented knowledge docs are no longer write-only Priority 5 - Prompt improvements: - Constrain Stage 1 to use only listed categories (remove "other valid" text) - Add {{title}} placeholder and heading instruction to all Stage 2 templates - Remove hallucination-prone "Links to further reading" from topic template Also adds docs/demo-script.md showing the full smriti workflow story. 
Co-Authored-By: Claude Opus 4.6 --- docs/demo-script.md | 323 ++++++++++++++++++++ src/db.ts | 17 +- src/team/document.ts | 56 +--- src/team/ollama.ts | 88 ++++++ src/team/prompts/stage1-segment.md | 5 +- src/team/prompts/stage2-architecture.md | 2 + src/team/prompts/stage2-base.md | 2 + src/team/prompts/stage2-bug.md | 2 + src/team/prompts/stage2-code.md | 2 + src/team/prompts/stage2-feature.md | 2 + src/team/prompts/stage2-project.md | 2 + src/team/prompts/stage2-topic.md | 4 +- src/team/segment.ts | 67 ++--- src/team/share.ts | 378 +++++++++--------------- src/team/sync.ts | 8 +- src/team/utils.ts | 19 ++ test/team-segmented.test.ts | 173 +++++++---- test/team.test.ts | 171 +++++++---- 18 files changed, 858 insertions(+), 463 deletions(-) create mode 100644 docs/demo-script.md create mode 100644 src/team/ollama.ts create mode 100644 src/team/utils.ts diff --git a/docs/demo-script.md b/docs/demo-script.md new file mode 100644 index 0000000..bdcf0a9 --- /dev/null +++ b/docs/demo-script.md @@ -0,0 +1,323 @@ +# Smriti Demo: From Deep Dive to Team Knowledge + +## The Problem + +Priya is a senior engineer at a startup. She just spent 2 hours in a Claude +Code session doing a deep review of their payment service — a critical codebase +she inherited when the original author left. + +During the session, she and Claude: + +- Traced a race condition in the webhook handler that causes duplicate charges +- Discovered the retry logic uses `setTimeout` instead of exponential backoff +- Decided to replace the hand-rolled queue with BullMQ +- Found that the Stripe SDK is 3 major versions behind and the API they use is deprecated +- Mapped out the full payment flow across 14 files +- Identified 3 missing error boundaries that silently swallow failures + +That's a **goldmine** of institutional knowledge. But the Claude session is +just a 400-message transcript buried in `~/.claude/projects/`. Tomorrow, when +her teammate Arjun picks up the webhook fix, he'll start from scratch. 
When the +intern asks "why BullMQ?", nobody will remember the tradeoff analysis. + +**This is the problem Smriti solves.** + +--- + +## Act 1: The Session Ends + +Priya's Claude Code session just finished. Here's what her terminal looks like: + +``` +$ # Session over. 2 hours of deep review — bugs, decisions, architecture notes. +$ # All sitting in a Claude transcript she'll never look at again. +``` + +She has two paths to preserve this knowledge: + +| Path | Command | What it does | +|------|---------|--------------| +| **Ingest** | `smriti ingest claude` | Import into searchable memory (personal) | +| **Share** | `smriti share --segmented` | Export as team documentation (git-committed) | + +She'll do both. + +--- + +## Act 2: Ingest — Building Personal Memory + +``` +$ smriti ingest claude --project payments +``` + +``` + Discovering sessions... + Found 1 new session in payments + +Agent: claude-code +Sessions found: 1 +Sessions ingested: 1 +Messages ingested: 412 +Skipped: 0 +``` + +That's it. 412 messages are now indexed — full-text searchable with BM25, +ready for vector embedding, tagged with project and agent metadata. + +**What just happened under the hood:** + +1. Smriti found the JSONL transcript in `~/.claude/projects/-Users-priya-src-payments/` +2. Parsed every message, tool call, file edit, and error +3. Stored messages in QMD's content-addressable store (SHA256 dedup) +4. Registered the session with project = `payments`, agent = `claude-code` +5. Auto-indexed into FTS5 for instant search + +Now Priya can search her memory: + +``` +$ smriti search "race condition webhook" --project payments +``` + +``` +[0.891] Payment Service Deep Review + assistant: The race condition occurs in src/webhooks/stripe.ts at line 47. + The handler processes the event, then checks idempotency — but between + those two operations, a duplicate webhook can slip through... + +[0.823] Payment Service Deep Review + user: What's the fix? Can we just add a mutex? 
+ +[0.756] Payment Service Deep Review + assistant: A mutex won't work in a multi-instance deployment. The proper + fix is to check idempotency BEFORE processing, using a database-level + unique constraint on the event ID... +``` + +Three weeks later, she barely remembers the session. But she can recall it: + +``` +$ smriti recall "why did we decide on BullMQ for payments" --synthesize +``` + +``` +[0.834] Payment Service Deep Review + assistant: After comparing the options, BullMQ is the clear winner... + +--- Synthesis --- + +The decision to adopt BullMQ for the payment queue was made during a deep +review of the payment service. The existing implementation used a hand-rolled +queue with setTimeout-based retries, which had several issues: + +1. No exponential backoff — failed jobs retry immediately, hammering Stripe +2. No dead-letter queue — permanently failed jobs disappear silently +3. No persistence — server restart loses the entire queue +4. No visibility — no way to inspect pending/failed jobs + +BullMQ was chosen over alternatives: +- **pg-boss**: Good, but adds Postgres load to an already-strained DB +- **Custom Redis queue**: Reinventing the wheel; BullMQ is battle-tested +- **SQS/Cloud queue**: Adds AWS dependency the team wants to avoid + +BullMQ provides exponential backoff, dead-letter queues, Redis persistence, +and a dashboard (Bull Board) — solving all four issues. +``` + +That synthesis didn't come from a new LLM call about BullMQ. It came from +**Priya's actual reasoning during the review**, reconstructed from her +session memory. + +--- + +## Act 3: Share — Exporting Team Knowledge + +Ingesting is personal. Sharing is for the team. + +``` +$ smriti share --project payments --segmented +``` + +``` + Segmenting session: Payment Service Deep Review... + Found 5 knowledge units (3 above relevance threshold) + Generating documentation... 
+ +Output: /Users/priya/src/payments/.smriti +Files created: 3 +Files skipped: 0 +``` + +Smriti's 3-stage pipeline just: + +**Stage 1 — Segment**: Analyzed the 412-message session and identified 5 +distinct knowledge units: + +| Unit | Category | Relevance | Action | +|------|----------|-----------|--------| +| Webhook race condition | bug/investigation | 9 | Shared | +| BullMQ decision | architecture/decision | 8 | Shared | +| Stripe SDK deprecation | project/dependency | 7 | Shared | +| General code navigation | uncategorized | 3 | Filtered out | +| Test setup discussion | uncategorized | 2 | Filtered out | + +**Stage 2 — Document**: Generated structured markdown using category-specific +templates. A bug gets Symptoms → Root Cause → Fix → Prevention. A decision +gets Context → Options → Decision → Consequences. + +**Stage 3 — Persist**: Wrote files, deduplicated via content hash, updated the +manifest. + +Here's what landed on disk: + +``` +payments/ +└── .smriti/ + ├── CLAUDE.md # Auto-discovered by Claude Code + ├── index.json + ├── config.json + └── knowledge/ + ├── bug-investigation/ + │ └── 2026-02-28_webhook-race-condition-duplicate-charges.md + ├── architecture-decision/ + │ └── 2026-02-28_bullmq-for-payment-queue.md + └── project-dependency/ + └── 2026-02-28_stripe-sdk-v3-deprecation.md +``` + +Let's look at the bug document: + +```markdown +--- +id: unit-a1b2c3 +session_id: 6de3c493-60fa +category: bug/investigation +pipeline: segmented +relevance_score: 9 +entities: ["Stripe webhooks", "idempotency", "race condition", "PostgreSQL"] +files: ["src/webhooks/stripe.ts", "src/db/events.ts"] +project: payments +author: priya +shared_at: 2026-02-28T17:45:00Z +--- + +# Webhook Race Condition Causing Duplicate Charges + +## Symptoms + +Customers occasionally receive duplicate charges for a single purchase. +The issue occurs under high webhook volume — Stripe sends the same event +twice within milliseconds, and both get processed. 
+ +## Root Cause + +In `src/webhooks/stripe.ts`, the handler processes the event first, then +checks the idempotency table. Between processing and the idempotency check, +a duplicate webhook slips through. + +The vulnerable window is ~15ms (database round-trip time), which is enough +for Stripe's retry mechanism to deliver a duplicate. + +## Investigation + +Traced the flow: `handleWebhook()` → `processEvent()` → `markProcessed()`. +The idempotency check happens inside `markProcessed()`, AFTER the charge +is executed. Should be BEFORE. + +## Fix + +Move the idempotency check to the entry point of `handleWebhook()`: + +1. Add a `UNIQUE` constraint on `webhook_events.stripe_event_id` +2. `INSERT OR IGNORE` before processing — if the insert fails, the event + was already handled +3. Wrap the entire handler in a database transaction + +## Prevention + +- Add integration test that fires duplicate webhooks concurrently +- Add monitoring alert on duplicate event IDs in the events table +- Consider adding Stripe's recommended `idempotency-key` header to all + API calls +``` + +That's not a raw transcript. It's a **structured incident document** that any +engineer can read, understand, and act on — without ever having been in the +original session. + +--- + +## Act 4: The Payoff + +### Monday morning — Arjun picks up the webhook fix + +He opens the payments repo. Claude Code automatically reads +`.smriti/CLAUDE.md` and sees the shared knowledge index. + +``` +$ smriti search "webhook duplicate" --project payments +``` + +He finds the full investigation, root cause, and fix — before writing a +single line of code. + +### Two weeks later — the intern asks "why BullMQ?" + +``` +$ smriti recall "why BullMQ instead of pg-boss" --synthesize --project payments +``` + +The original tradeoff analysis surfaces instantly, with Priya's reasoning +preserved verbatim. 
+ +### A month later — Priya reviews a different service + +She notices the same setTimeout retry pattern: + +``` +$ smriti search "setTimeout retry" --category bug +``` + +Her earlier finding surfaces. She already knows the fix. + +--- + +## The Commands + +```bash +# After a deep session — capture everything +smriti ingest claude + +# Share structured knowledge with the team +smriti share --project payments --segmented + +# Commit shared knowledge to git +cd /path/to/payments +git add .smriti/ +git commit -m "docs: share payment service review findings" + +# Teammates sync the knowledge +smriti sync --project payments + +# Search across all your sessions +smriti search "race condition" --project payments + +# Get synthesized answers from memory +smriti recall "how should we handle retries" --synthesize + +# Check what you've captured +smriti status +``` + +--- + +## What Makes This Different + +| Without Smriti | With Smriti | +|---|---| +| Session transcript sits in `~/.claude/` forever | Searchable, indexed, synthesizable memory | +| Knowledge dies when the session closes | Knowledge persists across sessions and engineers | +| Teammates start from scratch | Teammates find existing analysis instantly | +| "Why did we decide X?" — nobody remembers | `smriti recall "why X" --synthesize` | +| Deep dives produce code changes only | Deep dives produce code changes + documentation | + +The session is ephemeral. The knowledge doesn't have to be. diff --git a/src/db.ts b/src/db.ts index d468696..4223a3f 100644 --- a/src/db.ts +++ b/src/db.ts @@ -62,11 +62,20 @@ function initializeQmdStore(db: Database): void { ) `); - // Create virtual vec table for sqlite-vec + // vectors_vec is managed by QMD at embedding time because dimensions depend on + // the active embedding model. Do not eagerly create it here. + // Migration: older Smriti versions created an incompatible vectors_vec table + // (embedding-only, no hash_seq), which breaks embed/search paths. 
try { - db.exec(`CREATE VIRTUAL TABLE IF NOT EXISTS vectors_vec USING vec0(embedding float[1536])`); + const vecTable = db + .prepare(`SELECT sql FROM sqlite_master WHERE type='table' AND name='vectors_vec'`) + .get() as { sql: string } | null; + + if (vecTable?.sql && !vecTable.sql.includes("hash_seq")) { + db.exec(`DROP TABLE IF EXISTS vectors_vec`); + } } catch { - // May fail if model doesn't support this dimension, that's OK + // If sqlite-vec isn't loaded or table introspection fails, continue. } } @@ -356,7 +365,7 @@ export function initializeSmritiTables(db: Database): void { CREATE INDEX IF NOT EXISTS idx_smriti_shares_hash ON smriti_shares(content_hash); CREATE INDEX IF NOT EXISTS idx_smriti_shares_unit - ON smriti_shares(content_hash, unit_id); + ON smriti_shares(unit_id); -- Indexes (sidecar tables) CREATE INDEX IF NOT EXISTS idx_smriti_tool_usage_session diff --git a/src/team/document.ts b/src/team/document.ts index 9042940..848560f 100644 --- a/src/team/document.ts +++ b/src/team/document.ts @@ -5,10 +5,10 @@ * using category-specific templates and LLM synthesis. 
*/ -import { OLLAMA_HOST, OLLAMA_MODEL, SMRITI_DIR } from "../config"; -import { join, dirname, basename } from "path"; +import { join, dirname } from "path"; import type { KnowledgeUnit, DocumentationOptions, DocumentGenerationResult } from "./types"; -import { existsSync } from "fs"; +import { callOllama } from "./ollama"; +import { slugify } from "./utils"; // ============================================================================= // Template Loading @@ -113,7 +113,7 @@ export async function generateDocument( // Call LLM to synthesize let synthesis = ""; try { - synthesis = await callOllama(prompt, options.model); + synthesis = await callOllama(prompt, { model: options.model }); } catch (err) { console.warn(`Failed to synthesize unit ${unit.id}:`, err); // Fallback: return unit content as-is @@ -166,54 +166,6 @@ export async function generateDocumentsSequential( return results; } -// ============================================================================= -// Filename Generation -// ============================================================================= - -/** - * Generate a URL-friendly slug from text - */ -function slugify(text: string, maxLen: number = 50): string { - return text - .toLowerCase() - .replace(/[^a-z0-9\s-]/g, "") - .replace(/\s+/g, "-") - .replace(/-+/g, "-") - .slice(0, maxLen) - .replace(/-$/, ""); -} - -// ============================================================================= -// Ollama Integration -// ============================================================================= - -/** - * Call Ollama generate API - */ -async function callOllama(prompt: string, model?: string): Promise { - const ollamaModel = model || OLLAMA_MODEL; - - const response = await fetch(`${OLLAMA_HOST}/api/generate`, { - method: "POST", - headers: { "Content-Type": "application/json" }, - body: JSON.stringify({ - model: ollamaModel, - prompt, - stream: false, - temperature: 0.7, - }), - }); - - if (!response.ok) { - throw new Error( - 
`Ollama API error: ${response.status} ${response.statusText}` - ); - } - - const data = (await response.json()) as { response: string }; - return data.response || ""; -} - // ============================================================================= // Utilities // ============================================================================= diff --git a/src/team/ollama.ts b/src/team/ollama.ts new file mode 100644 index 0000000..6ea04f7 --- /dev/null +++ b/src/team/ollama.ts @@ -0,0 +1,88 @@ +/** + * team/ollama.ts - Shared Ollama HTTP client for team pipeline + * + * Centralized LLM call with timeout and retry support. + * Used by segment.ts (Stage 1) and document.ts (Stage 2). + */ + +import { OLLAMA_HOST, OLLAMA_MODEL } from "../config"; + +export type OllamaOptions = { + model?: string; + temperature?: number; + timeout?: number; + maxRetries?: number; +}; + +const DEFAULT_TIMEOUT = 120_000; +const DEFAULT_MAX_RETRIES = 2; +const BASE_DELAY_MS = 1_000; + +/** + * Call Ollama generate API with timeout and retry. + * + * Retries on 5xx errors and network failures with exponential backoff. + * Does NOT retry on 4xx (bad request, model not found, etc). + */ +export async function callOllama( + prompt: string, + options: OllamaOptions = {} +): Promise { + const model = options.model || OLLAMA_MODEL; + const temperature = options.temperature ?? 0.7; + const timeout = options.timeout ?? DEFAULT_TIMEOUT; + const maxRetries = options.maxRetries ?? 
DEFAULT_MAX_RETRIES; + + let lastError: Error | undefined; + + for (let attempt = 0; attempt <= maxRetries; attempt++) { + if (attempt > 0) { + const delay = BASE_DELAY_MS * 2 ** (attempt - 1); + await new Promise((r) => setTimeout(r, delay)); + } + + const controller = new AbortController(); + const timer = setTimeout(() => controller.abort(), timeout); + + try { + const response = await fetch(`${OLLAMA_HOST}/api/generate`, { + method: "POST", + headers: { "Content-Type": "application/json" }, + body: JSON.stringify({ + model, + prompt, + stream: false, + options: { temperature }, + }), + signal: controller.signal, + }); + + clearTimeout(timer); + + if (!response.ok) { + const msg = `Ollama API error: ${response.status} ${response.statusText}`; + // Don't retry client errors + if (response.status >= 400 && response.status < 500) { + throw new Error(msg); + } + lastError = new Error(msg); + continue; + } + + const data = (await response.json()) as { response: string }; + return data.response || ""; + } catch (err: any) { + clearTimeout(timer); + // Don't retry aborts (timeout) or client errors + if (err.name === "AbortError") { + throw new Error(`Ollama request timed out after ${timeout}ms`); + } + if (err.message?.startsWith("Ollama API error: 4")) { + throw err; + } + lastError = err; + } + } + + throw lastError || new Error("Ollama request failed after retries"); +} diff --git a/src/team/prompts/stage1-segment.md index 1f505ec..3a2eb12 100644 --- a/src/team/prompts/stage1-segment.md +++ b/src/team/prompts/stage1-segment.md @@ -28,7 +28,10 @@ Valid categories are: - `topic/learning` - Learning and tutorials - `topic/explanation` - Explanations and deep dives - `decision/technical` - Technical decisions -- Other valid category combinations with parent/child structure +- `decision/tooling` - Tooling decisions +- `project/dependency` - Dependencies and package management + +Use ONLY the categories listed above. Do not invent new categories. 
## Conversation diff --git a/src/team/prompts/stage2-architecture.md b/src/team/prompts/stage2-architecture.md index 955e5d1..550d6ee 100644 --- a/src/team/prompts/stage2-architecture.md +++ b/src/team/prompts/stage2-architecture.md @@ -4,6 +4,7 @@ You are documenting an architecture or technical decision. ## Knowledge Unit +**Title**: {{title}} **Topic**: {{topic}} **Category**: {{category}} **Entities**: {{entities}} @@ -23,4 +24,5 @@ Transform this into an Architecture Decision Record (ADR) format with these sect 4. **Consequences** - Positive impacts and tradeoffs 5. **Rationale** - Deeper reasoning or constraints +Start with a `# Heading` using the title above. Return only the markdown body, no frontmatter. Be concise but thorough on tradeoffs. diff --git a/src/team/prompts/stage2-base.md b/src/team/prompts/stage2-base.md index 0a64433..d5f5e30 100644 --- a/src/team/prompts/stage2-base.md +++ b/src/team/prompts/stage2-base.md @@ -4,6 +4,7 @@ You are transforming a technical knowledge unit into a polished, team-friendly d ## Knowledge Unit +**Title**: {{title}} **Topic**: {{topic}} **Category**: {{category}} **Entities**: {{entities}} @@ -21,5 +22,6 @@ Transform this knowledge unit into clear, concise documentation that: 3. Uses clear section headers and formatting 4. Extracts actionable insights +Start with a `# Heading` using the title above. Provide a well-structured markdown document suitable for team knowledge sharing. Do not include frontmatter or YAML, just the markdown body. diff --git a/src/team/prompts/stage2-bug.md b/src/team/prompts/stage2-bug.md index 516f751..36708ea 100644 --- a/src/team/prompts/stage2-bug.md +++ b/src/team/prompts/stage2-bug.md @@ -4,6 +4,7 @@ You are documenting a bug investigation or fix. ## Knowledge Unit +**Title**: {{title}} **Topic**: {{topic}} **Category**: {{category}} **Entities**: {{entities}} @@ -23,4 +24,5 @@ Transform this bug knowledge into a structured incident/fix document with these 4. 
**Fix** - What changed and why that fixes it 5. **Prevention** - How to avoid this in future (tests, checks, architecture) +Start with a `# Heading` using the title above. Return only the markdown body, no frontmatter. Use clear headings and be concise. diff --git a/src/team/prompts/stage2-code.md b/src/team/prompts/stage2-code.md index f9127f7..5aee510 100644 --- a/src/team/prompts/stage2-code.md +++ b/src/team/prompts/stage2-code.md @@ -4,6 +4,7 @@ You are documenting code implementation work. ## Knowledge Unit +**Title**: {{title}} **Topic**: {{topic}} **Category**: {{category}} **Entities**: {{entities}} @@ -23,4 +24,5 @@ Transform this into a code implementation guide with these sections: 4. **Usage Example** - Brief example of how to use this code 5. **Related Code** - Links to similar implementations or dependencies +Start with a `# Heading` using the title above. Return only the markdown body, no frontmatter. Include brief code snippets if helpful. diff --git a/src/team/prompts/stage2-feature.md b/src/team/prompts/stage2-feature.md index 919685e..94428fd 100644 --- a/src/team/prompts/stage2-feature.md +++ b/src/team/prompts/stage2-feature.md @@ -4,6 +4,7 @@ You are documenting feature design or implementation. ## Knowledge Unit +**Title**: {{title}} **Topic**: {{topic}} **Category**: {{category}} **Entities**: {{entities}} @@ -23,4 +24,5 @@ Transform this into feature documentation with these sections: 4. **Testing** - How to test or verify the feature 5. **Future Enhancements** - Known limitations or improvements +Start with a `# Heading` using the title above. Return only the markdown body, no frontmatter. Focus on clarity for future team members. diff --git a/src/team/prompts/stage2-project.md b/src/team/prompts/stage2-project.md index c1d9d4a..f8273a1 100644 --- a/src/team/prompts/stage2-project.md +++ b/src/team/prompts/stage2-project.md @@ -4,6 +4,7 @@ You are documenting project setup, configuration, or scaffolding work. 
## Knowledge Unit +**Title**: {{title}} **Topic**: {{topic}} **Category**: {{category}} **Entities**: {{entities}} @@ -23,4 +24,5 @@ Transform this into a project setup guide with these sections: 4. **Verification** - How to verify it worked 5. **Troubleshooting** - Common issues and solutions +Start with a `# Heading` using the title above. Return only the markdown body, no frontmatter. Make it step-by-step and actionable. diff --git a/src/team/prompts/stage2-topic.md b/src/team/prompts/stage2-topic.md index 42eacf4..feaf24b 100644 --- a/src/team/prompts/stage2-topic.md +++ b/src/team/prompts/stage2-topic.md @@ -4,6 +4,7 @@ You are documenting a learning topic or explanation. ## Knowledge Unit +**Title**: {{title}} **Topic**: {{topic}} **Category**: {{category}} **Entities**: {{entities}} @@ -21,6 +22,7 @@ Transform this into educational documentation with these sections: 2. **Relevance** - Why this matters (in our project/domain) 3. **Key Points** - Main takeaways (3-5 bullets) 4. **Examples** - Concrete examples from our codebase -5. **Resources** - Links to further reading +Start with a `# Heading` using the title above. Return only the markdown body, no frontmatter. Make it accessible to junior team members. +Do not include external links or URLs (they will be outdated). diff --git a/src/team/segment.ts b/src/team/segment.ts index b150d9c..dbd8b89 100644 --- a/src/team/segment.ts +++ b/src/team/segment.ts @@ -6,11 +6,12 @@ * documented independently. 
*/ -import { OLLAMA_HOST, OLLAMA_MODEL } from "../config"; import { join, dirname } from "path"; import type { Database } from "bun:sqlite"; import type { RawMessage } from "./formatter"; -import { filterMessages, mergeConsecutive, sanitizeContent } from "./formatter"; +import { filterMessages, mergeConsecutive } from "./formatter"; +import { callOllama } from "./ollama"; +import { isValidCategory } from "../categorize/schema"; import type { KnowledgeUnit, SegmentationResult, @@ -78,8 +79,19 @@ function extractSessionMetadata( ? "Tests run" : "No tests recorded"; - // Calculate duration - const duration = messages.length > 0 ? Math.ceil(messages.length / 2) : 0; + // Calculate duration from message timestamps + const msgTimestamps = db + .prepare( + `SELECT MIN(created_at) as first_at, MAX(created_at) as last_at + FROM memory_messages WHERE session_id = ?` + ) + .get(sessionId) as { first_at: string | null; last_at: string | null } | null; + + let duration = 0; + if (msgTimestamps?.first_at && msgTimestamps?.last_at) { + const diffMs = new Date(msgTimestamps.last_at).getTime() - new Date(msgTimestamps.first_at).getTime(); + duration = Math.max(1, Math.ceil(diffMs / 60_000)); + } return { duration_minutes: String(duration), @@ -159,23 +171,17 @@ function parseSegmentationResponse(text: string): RawSegmentationUnit[] { // ============================================================================= /** - * Validate and normalize a category against known taxonomy + * Validate and normalize a category against known taxonomy. + * Falls back to parent category, then "uncategorized". 
*/ function validateCategory(db: Database, category: string): string { - const valid = db - .prepare(`SELECT id FROM smriti_categories WHERE id = ?`) - .get(category) as { id: string } | null; - - if (valid) return category; + if (isValidCategory(db, category)) return category; // Try parent category const parts = category.split("/"); if (parts.length > 1) { const parent = parts[0]; - const parentValid = db - .prepare(`SELECT id FROM smriti_categories WHERE id = ?`) - .get(parent) as { id: string } | null; - if (parentValid) return parent; + if (isValidCategory(db, parent)) return parent; } return "uncategorized"; @@ -254,7 +260,7 @@ export async function segmentSession( let units: KnowledgeUnit[] = []; try { - const response = await callOllama(prompt, options.model); + const response = await callOllama(prompt, { model: options.model }); const rawUnits = parseSegmentationResponse(response); units = normalizeUnits(rawUnits, db, messages); } catch (err) { @@ -317,34 +323,3 @@ export function fallbackToSingleUnit( processingDurationMs: 0, }; } - -// ============================================================================= -// Ollama Integration -// ============================================================================= - -/** - * Call Ollama generate API - */ -async function callOllama(prompt: string, model?: string): Promise { - const ollamaModel = model || OLLAMA_MODEL; - - const response = await fetch(`${OLLAMA_HOST}/api/generate`, { - method: "POST", - headers: { "Content-Type": "application/json" }, - body: JSON.stringify({ - model: ollamaModel, - prompt, - stream: false, - temperature: 0.7, - }), - }); - - if (!response.ok) { - throw new Error( - `Ollama API error: ${response.status} ${response.statusText}` - ); - } - - const data = (await response.json()) as { response: string }; - return data.response || ""; -} diff --git a/src/team/share.ts b/src/team/share.ts index 7b6c6c5..49894fd 100644 --- a/src/team/share.ts +++ b/src/team/share.ts @@ -9,7 +9,7 
@@ import type { Database } from "bun:sqlite"; import { SMRITI_DIR, AUTHOR } from "../config"; import { hashContent } from "../qmd"; import { existsSync, mkdirSync } from "fs"; -import { join, basename } from "path"; +import { join } from "path"; import { formatSessionAsFallback, isSessionWorthSharing, @@ -25,6 +25,7 @@ import { } from "./reflect"; import { segmentSession } from "./segment"; import { generateDocumentsSequential, generateFrontmatter } from "./document"; +import { slugify, datePrefix } from "./utils"; import type { RawMessage } from "./formatter"; // ============================================================================= @@ -51,25 +52,9 @@ export type ShareResult = { }; // ============================================================================= -// Helpers +// Shared Helpers // ============================================================================= -/** Generate a slug from text */ -function slugify(text: string, maxLen: number = 50): string { - return text - .toLowerCase() - .replace(/[^a-z0-9\s-]/g, "") - .replace(/\s+/g, "-") - .replace(/-+/g, "-") - .slice(0, maxLen) - .replace(/-$/, ""); -} - -/** Format a date as YYYY-MM-DD */ -function datePrefix(isoDate: string): string { - return isoDate.slice(0, 10); -} - /** Generate YAML frontmatter */ function frontmatter(meta: Record): string { const lines = ["---"]; @@ -84,54 +69,34 @@ function frontmatter(meta: Record): string { return lines.join("\n"); } -// ============================================================================= -// Segmented Sharing (3-Stage Pipeline) -// ============================================================================= - -/** - * Share knowledge using 3-stage segmentation pipeline - * Stage 1: Segment session into knowledge units - * Stage 2: Generate documentation per unit - * Stage 3: Save and deduplicate (deferred) - */ -async function shareSegmentedKnowledge( - db: Database, - options: ShareOptions = {} -): Promise { - const author = 
options.author || AUTHOR;
-  const minRelevance = options.minRelevance ?? 6;
-
-  const result: ShareResult = {
-    filesCreated: 0,
-    filesSkipped: 0,
-    outputDir: "",
-    errors: [],
-  };
-
-  // Determine output directory
-  let outputDir: string;
+/** Resolve the output directory from options */
+function resolveOutputDir(db: Database, options: ShareOptions): string {
   if (options.outputDir) {
-    outputDir = options.outputDir;
-  } else if (options.project) {
+    return options.outputDir;
+  }
+  if (options.project) {
     const project = db
       .prepare(`SELECT path FROM smriti_projects WHERE id = ?`)
       .get(options.project) as { path: string } | null;
     if (project?.path) {
-      outputDir = join(project.path, SMRITI_DIR);
-    } else {
-      outputDir = join(process.cwd(), SMRITI_DIR);
+      return join(project.path, SMRITI_DIR);
     }
-  } else {
-    outputDir = join(process.cwd(), SMRITI_DIR);
   }
+  return join(process.cwd(), SMRITI_DIR);
+}
 
-  result.outputDir = outputDir;
-
-  // Ensure directory structure
-  const knowledgeDir = join(outputDir, "knowledge");
-  mkdirSync(knowledgeDir, { recursive: true });
-
-  // Build query for sessions to share
+/** Build and execute session query with filters */
+function querySessions(
+  db: Database,
+  options: ShareOptions
+): Array<{
+  id: string;
+  title: string;
+  created_at: string;
+  summary: string | null;
+  agent_id: string | null;
+  project_id: string | null;
+}> {
   const conditions: string[] = ["ms.active = 1"];
   const params: any[] = [];
@@ -161,7 +126,7 @@ async function shareSegmentedKnowledge(
     params.push(options.sessionId);
   }
 
-  const sessions = db
+  return db
     .prepare(
      `SELECT ms.id, ms.title, ms.created_at, ms.summary,
              sm.agent_id, sm.project_id
       FROM memory_sessions ms
       LEFT JOIN smriti_session_meta sm ON sm.session_id = ms.id
       WHERE ${conditions.join(" AND ")}
       ORDER BY ms.updated_at DESC`
     )
-    .all(...params) as Array<{
-      id: string;
-      title: string;
-      created_at: string;
-      summary: string | null;
-      agent_id: string | null;
-      project_id: string | null;
-    }>;
+    .all(...params) as any;
+}
+
+/** Get messages for a session */
+function getSessionMessages(
+  db: Database,
+  sessionId: string
+): Array<{
+  id: number;
+  role: string;
+  content: string;
+  hash: string;
+  created_at: string;
+}> {
+  return db
+    .prepare(
+      `SELECT mm.id, mm.role, mm.content, mm.hash, mm.created_at
+       FROM memory_messages mm
+       WHERE mm.session_id = ?
+       ORDER BY mm.id`
+    )
+    .all(sessionId) as any;
+}
+
+/** Write manifest and config files, generate CLAUDE.md */
+async function writeManifest(
+  outputDir: string,
+  newEntries: Array<{ id: string; category: string; file: string; shared_at: string }>
+): Promise<void> {
+  const indexPath = join(outputDir, "index.json");
+  let existingManifest: any[] = [];
+  try {
+    const existing = await Bun.file(indexPath).text();
+    existingManifest = JSON.parse(existing);
+  } catch {
+    // No existing manifest
+  }
+
+  const fullManifest = [...existingManifest, ...newEntries];
+  await Bun.write(indexPath, JSON.stringify(fullManifest, null, 2));
+
+  // Write config if it doesn't exist
+  const configPath = join(outputDir, "config.json");
+  if (!existsSync(configPath)) {
+    await Bun.write(
+      configPath,
+      JSON.stringify(
+        {
+          version: 1,
+          allowedCategories: ["*"],
+          autoSync: false,
+        },
+        null,
+        2
+      )
+    );
+  }
+
+  // Generate CLAUDE.md
+  await generateClaudeMd(outputDir, fullManifest);
+}
+
+// =============================================================================
+// Segmented Sharing (3-Stage Pipeline)
+// =============================================================================
+
+/**
+ * Share knowledge using 3-stage segmentation pipeline
+ * Stage 1: Segment session into knowledge units
+ * Stage 2: Generate documentation per unit
+ * Stage 3: Save and deduplicate
+ */
+async function shareSegmentedKnowledge(
+  db: Database,
+  options: ShareOptions = {}
+): Promise<ShareResult> {
+  const author = options.author || AUTHOR;
+  const minRelevance = options.minRelevance ?? 6;
+
+  const outputDir = resolveOutputDir(db, options);
+  const result: ShareResult = {
+    filesCreated: 0,
+    filesSkipped: 0,
+    outputDir,
+    errors: [],
+  };
+
+  // Ensure directory structure
+  const knowledgeDir = join(outputDir, "knowledge");
+  mkdirSync(knowledgeDir, { recursive: true });
+
+  const sessions = querySessions(db, options);
   const manifest: Array<{
     id: string;
     category: string;
@@ -188,22 +236,7 @@ async function shareSegmentedKnowledge(
 
   for (const session of sessions) {
     try {
-      // Get messages for this session
-      const messages = db
-        .prepare(
-          `SELECT mm.id, mm.role, mm.content, mm.hash, mm.created_at
-           FROM memory_messages mm
-           WHERE mm.session_id = ?
-           ORDER BY mm.id`
-        )
-        .all(session.id) as Array<{
-          id: number;
-          role: string;
-          content: string;
-          hash: string;
-          created_at: string;
-        }>;
-
+      const messages = getSessionMessages(db, session.id);
       if (messages.length === 0) continue;
 
       // Skip noise-only sessions
@@ -245,23 +278,23 @@
       // Write documents and track dedup
       for (const doc of docs) {
         try {
-          const categoryDir = join(knowledgeDir, doc.category.replace("/", "-"));
+          const categoryDir = join(knowledgeDir, doc.category.replaceAll("/", "-"));
           mkdirSync(categoryDir, { recursive: true });
 
           const filePath = join(categoryDir, doc.filename);
 
           // Build frontmatter
-          const frontmatter = generateFrontmatter(
+          const fm = generateFrontmatter(
             session.id,
             doc.unitId,
-            doc.frontmatter,
+            { ...doc.frontmatter, pipeline: "segmented" },
             author,
             session.project_id || undefined
           );
 
-          const content = frontmatter + "\n\n" + doc.markdown;
+          const content = fm + "\n\n" + doc.markdown;
 
-          // Check unit-level dedup
+          // Check unit-level dedup via content hash only
           const unitHash = await hashContent(
             JSON.stringify({
               content: doc.markdown,
@@ -272,10 +305,9 @@
          const exists = db
            .prepare(
-              `SELECT 1 FROM smriti_shares
-               WHERE content_hash = ? AND unit_id = ?`
+              `SELECT 1 FROM smriti_shares WHERE content_hash = ?`
            )
-            .get(unitHash, doc.unitId);
+            .get(unitHash);
 
          if (exists) {
            result.filesSkipped++;
@@ -301,10 +333,11 @@
            JSON.stringify(doc.frontmatter.entities)
          );
 
+          const relPath = `knowledge/${doc.category.replaceAll("/", "-")}/${doc.filename}`;
          manifest.push({
            id: session.id,
            category: doc.category,
-            file: `knowledge/${doc.category.replace("/", "-")}/${doc.filename}`,
+            file: relPath,
            shared_at: new Date().toISOString(),
          });
@@ -318,39 +351,7 @@
     }
   }
 
-  // Write manifest and CLAUDE.md
-  const indexPath = join(outputDir, "index.json");
-  let existingManifest: any[] = [];
-  try {
-    const existing = await Bun.file(indexPath).text();
-    existingManifest = JSON.parse(existing);
-  } catch {
-    // No existing manifest
-  }
-
-  const fullManifest = [...existingManifest, ...manifest];
-  await Bun.write(indexPath, JSON.stringify(fullManifest, null, 2));
-
-  // Write config if it doesn't exist
-  const configPath = join(outputDir, "config.json");
-  if (!existsSync(configPath)) {
-    await Bun.write(
-      configPath,
-      JSON.stringify(
-        {
-          version: 1,
-          allowedCategories: ["*"],
-          autoSync: false,
-        },
-        null,
-        2
-      )
-    );
-  }
-
-  // Generate CLAUDE.md
-  await generateClaudeMd(outputDir, fullManifest);
-
+  await writeManifest(outputDir, manifest);
   return result;
 }
 
@@ -373,84 +374,19 @@ export async function shareKnowledge(
   // Otherwise use legacy single-stage pipeline
   const author = options.author || AUTHOR;
+  const outputDir = resolveOutputDir(db, options);
   const result: ShareResult = {
     filesCreated: 0,
     filesSkipped: 0,
-    outputDir: "",
+    outputDir,
     errors: [],
   };
 
-  // Determine output directory
-  let outputDir: string;
-  if (options.outputDir) {
-    outputDir = options.outputDir;
-  } else if (options.project) {
-    // Look up project path
-    const project = db
-      .prepare(`SELECT path FROM smriti_projects WHERE id = ?`)
-      .get(options.project) as { path: string } | null;
-    if (project?.path) {
-      outputDir = join(project.path, SMRITI_DIR);
-    } else {
-      outputDir = join(process.cwd(), SMRITI_DIR);
-    }
-  } else {
-    outputDir = join(process.cwd(), SMRITI_DIR);
-  }
-
-  result.outputDir = outputDir;
-
   // Ensure directory structure
   const knowledgeDir = join(outputDir, "knowledge");
   mkdirSync(knowledgeDir, { recursive: true });
 
-  // Build query for sessions to share
-  const conditions: string[] = ["ms.active = 1"];
-  const params: any[] = [];
-
-  if (options.category) {
-    conditions.push(
-      `EXISTS (
-        SELECT 1 FROM smriti_session_tags st
-        WHERE st.session_id = ms.id
-        AND (st.category_id = ? OR st.category_id LIKE ? || '/%')
-      )`
-    );
-    params.push(options.category, options.category);
-  }
-
-  if (options.project) {
-    conditions.push(
-      `EXISTS (
-        SELECT 1 FROM smriti_session_meta sm
-        WHERE sm.session_id = ms.id AND sm.project_id = ?
-      )`
-    );
-    params.push(options.project);
-  }
-
-  if (options.sessionId) {
-    conditions.push(`ms.id = ?`);
-    params.push(options.sessionId);
-  }
-
-  const sessions = db
-    .prepare(
-      `SELECT ms.id, ms.title, ms.created_at, ms.summary,
-              sm.agent_id, sm.project_id
-       FROM memory_sessions ms
-       LEFT JOIN smriti_session_meta sm ON sm.session_id = ms.id
-       WHERE ${conditions.join(" AND ")}
-       ORDER BY ms.updated_at DESC`
-    )
-    .all(...params) as Array<{
-      id: string;
-      title: string;
-      created_at: string;
-      summary: string | null;
-      agent_id: string | null;
-      project_id: string | null;
-    }>;
+  const sessions = querySessions(db, options);
 
   // Get existing share hashes for dedup
   const existingHashes = new Set(
@@ -470,22 +406,7 @@ export async function shareKnowledge(
 
   for (const session of sessions) {
     try {
-      // Get messages for this session
-      const messages = db
-        .prepare(
-          `SELECT mm.id, mm.role, mm.content, mm.hash, mm.created_at
-           FROM memory_messages mm
-           WHERE mm.session_id = ?
-           ORDER BY mm.id`
-        )
-        .all(session.id) as Array<{
-          id: number;
-          role: string;
-          content: string;
-          hash: string;
-          created_at: string;
-        }>;
-
+      const messages = getSessionMessages(db, session.id);
       if (messages.length === 0) continue;
 
       // Check dedup via content hash
@@ -507,7 +428,7 @@
        categories[0]?.category_id || "uncategorized";
 
       // Create category subdirectory
-      const categoryDir = join(knowledgeDir, primaryCategory.replace("/", "-"));
+      const categoryDir = join(knowledgeDir, primaryCategory.replaceAll("/", "-"));
       mkdirSync(categoryDir, { recursive: true });
 
       // Skip noise-only sessions
@@ -582,10 +503,11 @@
        sessionHash
      );
 
+      const relPath = `knowledge/${primaryCategory.replaceAll("/", "-")}/$(unknown)`;
      manifest.push({
        id: session.id,
        category: primaryCategory,
-        file: `knowledge/${primaryCategory.replace("/", "-")}/$(unknown)`,
+        file: relPath,
        shared_at: new Date().toISOString(),
      });
@@ -595,39 +517,7 @@
     }
   }
 
-  // Write manifest
-  const indexPath = join(outputDir, "index.json");
-  let existingManifest: any[] = [];
-  try {
-    const existing = await Bun.file(indexPath).text();
-    existingManifest = JSON.parse(existing);
-  } catch {
-    // No existing manifest
-  }
-
-  const fullManifest = [...existingManifest, ...manifest];
-  await Bun.write(indexPath, JSON.stringify(fullManifest, null, 2));
-
-  // Write config if it doesn't exist
-  const configPath = join(outputDir, "config.json");
-  if (!existsSync(configPath)) {
-    await Bun.write(
-      configPath,
-      JSON.stringify(
-        {
-          version: 1,
-          allowedCategories: ["*"],
-          autoSync: false,
-        },
-        null,
-        2
-      )
-    );
-  }
-
-  // Generate CLAUDE.md so Claude Code discovers shared knowledge
-  await generateClaudeMd(outputDir, fullManifest);
-
+  await writeManifest(outputDir, manifest);
   return result;
 }
 
diff --git a/src/team/sync.ts b/src/team/sync.ts
index aa8d1f0..a0612a5 100644
--- a/src/team/sync.ts
+++ b/src/team/sync.ts
@@ -161,7 +161,13 @@ export async function syncTeamKnowledge(
       continue;
     }
 
-    const messages = extractMessages(body);
+    // Segmented pipeline docs don't have **user**/**assistant** patterns;
+    // treat the whole body as a single assistant message.
+    const isSegmented = meta.pipeline === "segmented";
+    const messages = isSegmented
+      ? [{ role: "assistant", content: body.trim() }]
+      : extractMessages(body);
+
     if (messages.length === 0) {
       result.skipped++;
       continue;
diff --git a/src/team/utils.ts b/src/team/utils.ts
new file mode 100644
index 0000000..18bc507
--- /dev/null
+++ b/src/team/utils.ts
@@ -0,0 +1,19 @@
+/**
+ * team/utils.ts - Shared utilities for the team pipeline
+ */
+
+/** Generate a URL-friendly slug from text */
+export function slugify(text: string, maxLen: number = 50): string {
+  return text
+    .toLowerCase()
+    .replace(/[^a-z0-9\s-]/g, "")
+    .replace(/\s+/g, "-")
+    .replace(/-+/g, "-")
+    .slice(0, maxLen)
+    .replace(/-$/, "");
+}
+
+/** Format a date as YYYY-MM-DD */
+export function datePrefix(isoDate: string): string {
+  return isoDate.slice(0, 10);
+}
diff --git a/test/team-segmented.test.ts b/test/team-segmented.test.ts
index e49df2e..cbc9727 100644
--- a/test/team-segmented.test.ts
+++ b/test/team-segmented.test.ts
@@ -2,12 +2,13 @@
  * test/team-segmented.test.ts - Tests for 3-stage segmentation pipeline
  */
 
-import { test, expect, beforeAll, afterAll } from "bun:test";
+import { test, expect, beforeAll, afterAll, mock } from "bun:test";
 import { initSmriti, closeDb, getDb } from "../src/db";
 import type { Database } from "bun:sqlite";
 import type { RawMessage } from "../src/team/formatter";
 import { segmentSession, fallbackToSingleUnit } from "../src/team/segment";
 import { generateDocument, generateDocumentsSequential } from "../src/team/document";
+import { isValidCategory } from "../src/categorize/schema";
 import type { KnowledgeUnit } from "../src/team/types";
 
 // =============================================================================
@@ -125,58 +126,119 @@ test("KnowledgeUnit has valid schema", () => {
 });
 
 // =============================================================================
-// Documentation Generation Tests
+// Documentation Generation Tests (with mocked Ollama)
 // =============================================================================
 
-test("generateDocument creates valid result", async () => {
-  const unit: KnowledgeUnit = {
-    id: "unit-test-1",
-    topic: "Token expiry bug fix",
-    category: "bug/fix",
-    relevance: 8,
-    entities: ["JWT", "Authentication"],
-    files: ["src/auth.ts"],
-    plainText: "Fixed token expiry by reading from environment variable",
-    lineRanges: [{ start: 0, end: 5 }],
-  };
-
-  // Mock Ollama to avoid network calls in tests
-  // For now, just validate the structure
-  const title = "Token Expiry Bug Fix";
-
-  // Check that we can create a document result structure
-  expect(unit.id).toBeDefined();
-  expect(unit.category).toBe("bug/fix");
+test("generateDocument creates valid result with mocked Ollama", async () => {
+  // Mock fetch to return a realistic Ollama response
+  const originalFetch = globalThis.fetch;
+  globalThis.fetch = mock(async () =>
+    new Response(
+      JSON.stringify({
+        response: "# Token Expiry Bug Fix\n\n## Symptoms\nSessions expired after 1 hour.\n\n## Root Cause\nHardcoded TTL of 3600s.",
+      }),
+      { status: 200 }
+    )
+  );
+
+  try {
+    const unit: KnowledgeUnit = {
+      id: "unit-test-1",
+      topic: "Token expiry bug fix",
+      category: "bug/fix",
+      relevance: 8,
+      entities: ["JWT", "Authentication"],
+      files: ["src/auth.ts"],
+      plainText: "Fixed token expiry by reading from environment variable",
+      lineRanges: [{ start: 0, end: 5 }],
+    };
+
+    const result = await generateDocument(unit, "Token Expiry Bug Fix");
+
+    expect(result.unitId).toBe("unit-test-1");
+    expect(result.category).toBe("bug/fix");
+    expect(result.title).toBe("Token Expiry Bug Fix");
+    expect(result.markdown).toContain("Token Expiry Bug Fix");
+    expect(result.filename).toMatch(/^\d{4}-\d{2}-\d{2}_token-expiry-bug-fix\.md$/);
+    expect(result.tokenEstimate).toBeGreaterThan(0);
+  } finally {
+    globalThis.fetch = originalFetch;
+  }
 });
 
-test("generateDocumentsSequential processes units in order", async () => {
-  const units: KnowledgeUnit[] = [
-    {
-      id: "unit-1",
-      topic: "First unit",
-      category: "code/implementation",
+test("generateDocumentsSequential processes units in order with mocked Ollama", async () => {
+  let callOrder = 0;
+  const originalFetch = globalThis.fetch;
+  globalThis.fetch = mock(async () => {
+    callOrder++;
+    return new Response(
+      JSON.stringify({
+        response: `# Document ${callOrder}\n\nContent for document ${callOrder}.`,
+      }),
+      { status: 200 }
+    );
+  });
+
+  try {
+    const units: KnowledgeUnit[] = [
+      {
+        id: "unit-1",
+        topic: "First unit",
+        category: "code/implementation",
+        relevance: 7,
+        entities: ["TypeScript"],
+        files: ["src/main.ts"],
+        plainText: "First unit content",
+        lineRanges: [{ start: 0, end: 2 }],
+      },
+      {
+        id: "unit-2",
+        topic: "Second unit",
+        category: "architecture/decision",
+        relevance: 8,
+        entities: ["Database"],
+        files: ["src/db.ts"],
+        plainText: "Second unit content",
+        lineRanges: [{ start: 3, end: 5 }],
+      },
+    ];
+
+    const results = await generateDocumentsSequential(units);
+
+    expect(results.length).toBe(2);
+    expect(results[0].unitId).toBe("unit-1");
+    expect(results[1].unitId).toBe("unit-2");
+    expect(results[0].category).toBe("code/implementation");
+    expect(results[1].category).toBe("architecture/decision");
+  } finally {
+    globalThis.fetch = originalFetch;
+  }
+});
+
+test("generateDocument falls back to plainText on Ollama failure", async () => {
+  const originalFetch = globalThis.fetch;
+  globalThis.fetch = mock(async () => {
+    throw new Error("Connection refused");
+  });
+
+  try {
+    const unit: KnowledgeUnit = {
+      id: "unit-fallback",
+      topic: "Fallback test",
+      category: "topic/learning",
      relevance: 7,
-      entities: ["TypeScript"],
-      files: ["src/main.ts"],
-      plainText: "First unit content",
-      lineRanges: [{ start: 0, end: 2 }],
-    },
-    {
-      id: "unit-2",
-      topic: "Second unit",
-      category: "architecture/decision",
-      relevance: 8,
-      entities: ["Database"],
-      files: ["src/db.ts"],
-      plainText: "Second unit content",
-      lineRanges: [{ start: 3, end: 5 }],
-    },
-  ];
+      entities: [],
+      files: [],
+      plainText: "This is the raw content that should appear as fallback",
+      lineRanges: [{ start: 0, end: 1 }],
+    };
+
+    const result = await generateDocument(unit, "Fallback Test");
 
-  // Verify units are distinct
-  expect(units[0].id).not.toBe(units[1].id);
-  expect(units[0].category).not.toBe(units[1].category);
-  expect(units.length).toBe(2);
+    expect(result.markdown).toContain("raw content that should appear as fallback");
+  } finally {
+    globalThis.fetch = originalFetch;
+  }
 });
 
// =============================================================================
@@ -258,10 +320,10 @@ test("Custom relevance threshold filters correctly", () => {
 });
 
 // =============================================================================
-// Category Validation Tests
+// Category Validation Tests (using real DB)
 // =============================================================================
 
-test("Valid categories pass validation", () => {
+test("Valid categories pass DB validation", () => {
   const validCategories = [
     "bug/fix",
     "architecture/decision",
@@ -273,17 +335,14 @@
   ];
 
   for (const cat of validCategories) {
-    // Should not throw
-    expect(cat.length > 0).toBe(true);
+    expect(isValidCategory(db, cat)).toBe(true);
   }
 });
 
-test("Invalid categories fallback gracefully", () => {
-  const invalidCategory = "made/up/invalid/category";
-
-  // In real implementation, this would validate against DB
-  // For test, just verify the structure handles it
-  expect(typeof invalidCategory).toBe("string");
+test("Invalid categories are rejected by DB validation", () => {
+  expect(isValidCategory(db, "made/up/invalid/category")).toBe(false);
+  expect(isValidCategory(db, "nonexistent")).toBe(false);
+  expect(isValidCategory(db, "")).toBe(false);
 });
 
 // =============================================================================
diff --git a/test/team.test.ts b/test/team.test.ts
index c5dd4c5..daa2c36 100644
--- a/test/team.test.ts
+++ b/test/team.test.ts
@@ -1,57 +1,114 @@
-import { isValidCategory } from './categorize/schema';
-import { parseFrontmatter } from '../src/team/sync';
-
-// Test cases for tag parsing
-const tagTests = [
-  {
-    input: 'tags: ["project", "project/dependency", "decision/tooling"]',
-    expected: ['project', 'project/dependency', 'decision/tooling']
-  },
-  {
-    input: 'tags: ["a", "b/c", "d"]',
-    expected: ['a', 'b/c', 'd']
-  },
-  {
-    input: 'category: project\ntags: ["a", "b"]',
-    expected: ['a', 'b']
-  }
-];
-
-// Test for backward compatibility
-const compatTestCases = [
-  {
-    input: 'category: project',
-    expected: ['project']
-  },
-  {
-    input: 'tags: ["invalid"]',
-    expected: []
-  }
-];
-
-// Roundtrip test
-const roundtripTestCases = [
-  {
-    input: 'category: project\ntags: ["a", "b/c"]',
-    expected: ['a', 'b/c']
-  }
-];
-
-// Run tests
-for (const test of tagTests) {
-  const parsed = parseFrontmatter(test.input);
-  console.assert(JSON.stringify(parsed.tags) === JSON.stringify(test.expected), `
-    Test failed: Input ${test.input} expected ${test.expected} but got ${parsed.tags}`);
-}
-
-for (const test of compatTestCases) {
-  const parsed = parseFrontmatter(test.input);
-  console.assert(JSON.stringify(parsed.tags) === JSON.stringify(test.expected), `
-    Compatibility test failed: Input ${test.input} expected ${test.expected} but got ${parsed.tags}`);
-}
-
-for (const test of roundtripTestCases) {
-  const parsed = parseFrontmatter(test.input);
-  console.assert(JSON.stringify(parsed.tags) === JSON.stringify(test.expected), `
-    Roundtrip test failed: Input ${test.input} expected ${test.expected} but got ${parsed.tags}`);
-}
+/**
+ * test/team.test.ts - Tests for team sharing pipeline utilities
+ */
+
+import { test, expect } from "bun:test";
+import { isValidCategory } from "../src/categorize/schema";
+import { parseFrontmatter } from "../src/team/sync";
+import { initSmriti, closeDb } from "../src/db";
+import type { Database } from "bun:sqlite";
+
+// =============================================================================
+// Setup
+// =============================================================================
+
+const db: Database = initSmriti(":memory:");
+
+// =============================================================================
+// Tag Parsing Tests
+// =============================================================================
+
+test("parseFrontmatter extracts tags array", () => {
+  const input = `---
+tags: ["project", "project/dependency", "decision/tooling"]
+---
+Body content here`;
+
+  const parsed = parseFrontmatter(input);
+  expect(parsed.meta.tags).toBe(`["project", "project/dependency", "decision/tooling"]`);
+  expect(parsed.body).toContain("Body content here");
+});
+
+test("parseFrontmatter extracts multiple fields", () => {
+  const input = `---
+category: project
+tags: ["a", "b"]
+---
+Body`;
+
+  const parsed = parseFrontmatter(input);
+  expect(parsed.meta.category).toBe("project");
+  expect(parsed.meta.tags).toBe(`["a", "b"]`);
+});
+
+test("parseFrontmatter handles content without frontmatter", () => {
+  const input = "Just plain text without frontmatter delimiters";
+  const parsed = parseFrontmatter(input);
+  expect(Object.keys(parsed.meta).length).toBe(0);
+  expect(parsed.body).toBe(input);
+});
+
+// =============================================================================
+// Backward Compatibility Tests
+// =============================================================================
+
+test("parseFrontmatter returns single category field", () => {
+  const input = `---
+category: project
+---
+Some body`;
+
+  const parsed = parseFrontmatter(input);
+  expect(parsed.meta.category).toBe("project");
+});
+
+test("parseFrontmatter extracts pipeline field for segmented docs", () => {
+  const input = `---
+category: bug/fix
+pipeline: segmented
+---
+# Bug Fix Title
+
+Some documented content`;
+
+  const parsed = parseFrontmatter(input);
+  expect(parsed.meta.pipeline).toBe("segmented");
+  expect(parsed.meta.category).toBe("bug/fix");
+});
+
+// =============================================================================
+// Category Validation Tests
+// =============================================================================
+
+test("isValidCategory accepts known categories", () => {
+  expect(isValidCategory(db, "bug/fix")).toBe(true);
+  expect(isValidCategory(db, "architecture/decision")).toBe(true);
+  expect(isValidCategory(db, "code/implementation")).toBe(true);
+});
+
+test("isValidCategory rejects unknown categories", () => {
+  expect(isValidCategory(db, "made/up/invalid")).toBe(false);
+  expect(isValidCategory(db, "nonexistent")).toBe(false);
+});
+
+// =============================================================================
+// Roundtrip Tests
+// =============================================================================
+
+test("parseFrontmatter roundtrip preserves body content", () => {
+  const input = `---
+category: project
+author: testuser
+---
+# Session Title
+
+**user**: Hello world
+
+**assistant**: Hi there`;
+
+  const parsed = parseFrontmatter(input);
+  expect(parsed.meta.category).toBe("project");
+  expect(parsed.meta.author).toBe("testuser");
+  expect(parsed.body).toContain("# Session Title");
+  expect(parsed.body).toContain("**user**: Hello world");
+});