PRD: Scribe v2. Writing Companion Redesign
Date: 2026-04-12
Status: Draft
Problem
The scribe writing companion takes 8-15 seconds to return grammar annotations. The bottleneck is not the AI model. It's the delivery path: DevTUI shells out to the overmind CLI, which manages a session, which calls the Claude API, which streams back through a subscriber process. That path crosses three process boundaries, adds IPC hops, and relies on a newline-escaping hack to work around overmind dropping multiline arguments.
The user writes blog posts in both English and Portuguese. The current scribe only provides grammar/spelling checks. There is no way to interact with the AI, ask questions about the draft, or pull context from the user's Obsidian vault (which has 10+ blog drafts, 44 TILs, and a structured idea backlog already indexed by qmd).
Writing is a conversation. The scribe today is a one-way annotation feed.
Background
The scribe was built as a proof of concept using overmind (the user's own Elixir-based AI agent platform). It works, but the architecture adds enough latency to make it feel sluggish. The user writes ~1 post per week, alternating between English and Portuguese.
Three insights from recent discussion:
- Grammar/spelling is a solved problem. harper-core (pure Rust, Automattic, 10K+ stars) lints English in milliseconds. No LLM needed for typos.
- The overmind middleman is the bottleneck. A direct HTTP call to Claude Haiku takes 2-3s. Through overmind it takes 8-15s.
- The real value is the conversation. The user has a rich Obsidian vault with ideas, TILs, and drafts. A writing companion that can search the vault and discuss the draft is more valuable than one that just flags typos.
Current architecture: DevTUI → overmind CLI → overmind daemon → Claude API → overmind daemon → overmind subscribe CLI → DevTUI.
Target architecture: DevTUI → harper-core (in-process) + sync HTTP client (ureq or reqwest blocking) → LLM API (direct).
Requirements
Must Have
- Instant English grammar/spelling. harper-core runs on every content change. Annotations appear in <50ms. No network, no API key, no external process.
- LLM writing hints for EN and PT-BR. Style, phrasing, factual suggestions via direct API call. Configurable provider (Groq or Claude). Triggers on demand or after idle. Works for both languages.
- Remove overmind dependency. Delete start_scribe_session, send_to_scribe, run_subscriber, escape_for_overmind, and kill_scribe_session from ops.rs. No sessions, no subscriber threads, no daemon management.
- Provider configuration. The user can set which LLM provider and model to use. API key read from environment variable. Defaults to Groq (Llama 3.1 8B) for speed.
Should Have
- Chat pane. Interactive writing companion as a new layout in the Ctrl+G cycle: Preview → Scribe → Chat → Editor only. The user types a question, AI responds with context from the current draft. Conversation history preserved while editing the same article.
- Vault context in chat. Before sending a chat message to the LLM, run qmd query -c vault "<question>" and include the top 3-5 results as context. The AI sees the draft + related vault notes + the question.
- Portuguese spelling via hunspell. Basic PT-BR spell check using hunspell with the pt_BR dictionary. Not grammar (no good pure-Rust option exists). Complements the LLM path for PT-BR posts.
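The vault-context step could be sketched roughly as follows. This is a minimal illustration, not the planned implementation: the function names vault_context and build_chat_prompt are hypothetical, the prompt layout is an assumption, and qmd's output is treated as opaque text because its format is not specified here. Limiting the results to the top 3-5 is left out for the same reason.

```rust
use std::process::Command;

/// Runs `<program> query -c vault "<question>"` and returns its stdout.
/// Returns `None` when the binary is missing, fails, or prints nothing,
/// so the caller can fall back to chatting without vault context.
/// (Hypothetical helper; the program name is a parameter for testability.)
fn vault_context(program: &str, question: &str) -> Option<String> {
    let output = Command::new(program)
        .args(["query", "-c", "vault", question])
        .output()
        .ok()?; // spawn failure (e.g. binary not found) => no vault context
    if !output.status.success() {
        return None;
    }
    let text = String::from_utf8_lossy(&output.stdout).trim().to_string();
    if text.is_empty() {
        None
    } else {
        Some(text)
    }
}

/// Assembles the chat prompt: draft, then vault notes (if any), then the question.
/// The section labels are an assumption, not a fixed format.
fn build_chat_prompt(draft: &str, vault: Option<&str>, question: &str) -> String {
    let mut prompt = format!("Draft:\n{draft}\n\n");
    if let Some(notes) = vault {
        prompt.push_str(&format!("Related vault notes:\n{notes}\n\n"));
    }
    prompt.push_str(&format!("Question:\n{question}\n"));
    prompt
}
```

A caller would pass "qmd" as the program and feed the result into build_chat_prompt via as_deref(). When vault_context returns None, the prompt simply omits the notes section.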
Out of Scope
- Full PT-BR grammar checking without LLM. No pure-Rust Portuguese grammar checker exists. LanguageTool (Java, 400MB+ RAM) is too heavy for a TUI. The LLM path covers this adequately.
- Streaming responses in scribe. The annotation format (JSON array) doesn't benefit from streaming. Chat pane streaming is a future enhancement.
- Local LLM inference (Ollama). The M4 16GB runs Llama 3.1 8B at ~30-50 tok/s (6-10s per response). Groq cloud delivers the same model in <1s. Local inference doesn't improve the experience at this hardware tier.
- Multiple vault collections. Chat searches the single vault qmd collection. Multi-collection support is unnecessary.
- Persisting chat history across articles. Chat resets when switching articles in CMS mode. The conversation is about the current draft.
Constraints
- harper-core is English-only. Supports US, UK, CA, AU dialects. No Portuguese, no Spanish. This is a library limitation, not a design choice.
- qmd must be installed for vault search. Chat works without qmd (just no vault context), but the full experience requires it. Graceful fallback: if qmd is not found, skip vault search and note it in the chat.
- API key required for LLM features. Grammar (harper) works offline. Hints and chat require a configured API key. Clear error message if missing.
- Sync HTTP in a thread. The editor uses no async runtime. LLM calls run in thread::spawn with a sync HTTP client (ureq or reqwest blocking), writing results into the existing Arc<Mutex<Option<Result>>> slot. This matches the current pattern.
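The sync-HTTP-in-a-thread constraint might look like the sketch below. It is an illustration under assumptions: the concrete type Result<String, String>, the Slot alias, and the names spawn_llm_call and poll are hypothetical, and the HTTP request is represented by a closure so the example stays free of external crates.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Shared result slot. The Ok/Err payload types are an assumption;
/// the real slot may carry parsed annotations and a richer error type.
type Slot = Arc<Mutex<Option<Result<String, String>>>>;

/// Runs `call` (standing in for a blocking ureq/reqwest request) on a
/// worker thread and stores its result in the slot when it finishes.
fn spawn_llm_call<F>(slot: Slot, call: F) -> thread::JoinHandle<()>
where
    F: FnOnce() -> Result<String, String> + Send + 'static,
{
    thread::spawn(move || {
        // Do the slow work before taking the lock, so the UI thread
        // is never blocked for the duration of the request.
        let result = call();
        *slot.lock().unwrap() = Some(result);
    })
}

/// Called from the UI loop on each tick: takes the result if one is
/// ready and leaves the slot empty for the next request.
fn poll(slot: &Slot) -> Option<Result<String, String>> {
    slot.lock().unwrap().take()
}
```

The UI loop never joins the worker. It calls poll on each redraw and renders annotations once a value appears.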
Acceptance Criteria
Instant Grammar (English)
- Given an English post with a typo ("teh"), when content changes, then a red annotation appears on the affected line within 100ms.
- Given a Portuguese post, when content changes, then harper does not run. The post's language is determined by language detection or a manual toggle.
- Given a post with code blocks, when content changes, then grammar checking skips fenced code blocks.
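One way to satisfy the code-block criterion is to filter the document before handing text to the linter. The sketch below is a simplified assumption, not harper-core's API: is_fence and prose_lines are hypothetical helpers, and the fence detection toggles on any line that opens with three or more backticks or tildes without matching fence lengths.

```rust
/// True when the line opens or closes a fenced code block
/// (three or more backticks or tildes after optional indentation).
fn is_fence(line: &str) -> bool {
    let t = line.trim_start();
    let ticks = t.chars().take_while(|c| *c == '`').count();
    let tildes = t.chars().take_while(|c| *c == '~').count();
    ticks >= 3 || tildes >= 3
}

/// Returns (zero-based line index, text) for every line outside a fenced
/// code block. Indices refer to the original document, so annotations
/// produced from these lines can still be placed on the right row.
fn prose_lines(markdown: &str) -> Vec<(usize, &str)> {
    let mut in_fence = false;
    let mut out = Vec::new();
    for (i, line) in markdown.lines().enumerate() {
        if is_fence(line) {
            in_fence = !in_fence; // the fence line itself is never linted
            continue;
        }
        if !in_fence {
            out.push((i, line));
        }
    }
    out
}
```

Keeping the original line indices lets an annotation for "teh" on a prose line land on the correct row even when code blocks precede it.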
LLM Writing Hints
- Given any post (EN or PT-BR), when the user triggers a check (Ctrl+T or idle >10s), then annotations appear within 3s (Groq) or 5s (Claude).
- Given no API key configured, when the user triggers a check, then the scribe pane shows "No API key. Set GROQ_API_KEY or ANTHROPIC_API_KEY."
- Given the LLM provider is unreachable, when a check is triggered, then the error appears in the status log and harper annotations remain visible.
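Provider selection and the missing-key message could be sketched as below. The Provider enum, key_var, and resolve_key are hypothetical names. Mapping Claude to ANTHROPIC_API_KEY and Groq to GROQ_API_KEY is inferred from the error text in the criteria above. The environment lookup is passed in as a closure so the logic can be exercised without touching the real environment.

```rust
/// Supported LLM providers. Groq is the default, per the requirements.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Provider {
    Groq,
    Claude,
}

impl Default for Provider {
    fn default() -> Self {
        Provider::Groq
    }
}

impl Provider {
    /// Environment variable expected to hold this provider's API key
    /// (assumed mapping, based on the error message in the criteria).
    fn key_var(self) -> &'static str {
        match self {
            Provider::Groq => "GROQ_API_KEY",
            Provider::Claude => "ANTHROPIC_API_KEY",
        }
    }
}

/// Message shown in the scribe pane when no key is configured.
const NO_KEY_MSG: &str = "No API key. Set GROQ_API_KEY or ANTHROPIC_API_KEY.";

/// Resolves the API key for `provider` using `lookup` (normally a thin
/// wrapper around `std::env::var`). Empty or missing values are errors.
fn resolve_key<F>(provider: Provider, lookup: F) -> Result<String, &'static str>
where
    F: Fn(&str) -> Option<String>,
{
    match lookup(provider.key_var()) {
        Some(key) if !key.trim().is_empty() => Ok(key),
        _ => Err(NO_KEY_MSG),
    }
}
```

In the editor this would be called as resolve_key(Provider::default(), |name| std::env::var(name).ok()), with the Err string rendered directly in the scribe pane.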
Chat Pane
- Given the user presses Ctrl+G to cycle to Chat, then an input area appears at the bottom of the right pane with conversation history above.
- Given the user types a question and presses Enter, then the question appears in the conversation, a "thinking..." indicator shows, and the AI response appears within 3-5s.
- Given qmd is installed, when the user sends a chat message, then the AI response reflects knowledge from related vault notes (visible as "Found N related notes" in the response context).
- Given qmd is not installed, when the user sends a chat message, then the chat works without vault context and shows a one-time note: "Install qmd for vault search."
- Given the user switches articles in CMS mode, then chat history clears.
- Given the user is in Chat layout, when they press a key, then the keystroke goes to the chat input box, not to vim. Esc returns focus to vim.
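The focus and history rules in the criteria above can be modelled as a small state machine. This is a sketch under assumptions: ChatState, Key, Outcome, and the method names are all hypothetical, history is reduced to a list of submitted questions, and how focus returns to the chat input after Esc is not specified here, so the sketch only sets it when the Chat layout is entered.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Focus {
    Vim,
    ChatInput,
}

#[derive(Debug, Clone, Copy)]
enum Key {
    Char(char),
    Enter,
    Esc,
}

#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    Consumed,          // handled by the chat input
    Submitted(String), // a question is ready to send to the LLM
    PassToVim,         // chat does not own this keystroke
}

struct ChatState {
    article: Option<String>,
    history: Vec<String>,
    input: String,
    focus: Focus,
}

impl ChatState {
    fn new() -> Self {
        ChatState { article: None, history: Vec::new(), input: String::new(), focus: Focus::Vim }
    }

    /// Cycling to the Chat layout gives the input box focus.
    fn enter_chat_layout(&mut self) {
        self.focus = Focus::ChatInput;
    }

    /// Switching to a different article clears the conversation;
    /// reopening the same article keeps it.
    fn open_article(&mut self, id: &str) {
        if self.article.as_deref() != Some(id) {
            self.history.clear();
            self.input.clear();
            self.article = Some(id.to_string());
        }
    }

    fn handle_key(&mut self, key: Key) -> Outcome {
        if self.focus == Focus::Vim {
            return Outcome::PassToVim;
        }
        match key {
            Key::Char(c) => {
                self.input.push(c);
                Outcome::Consumed
            }
            Key::Esc => {
                self.focus = Focus::Vim; // Esc hands control back to vim
                Outcome::Consumed
            }
            Key::Enter => {
                let question = self.input.trim().to_string();
                self.input.clear();
                if question.is_empty() {
                    Outcome::Consumed
                } else {
                    self.history.push(question.clone());
                    Outcome::Submitted(question)
                }
            }
        }
    }
}
```

Routing every key through handle_key first keeps the vim bindings untouched: only PassToVim results are forwarded to the editor.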
Overmind Removal
- Given a fresh DevTUI build, when the user runs the editor, then overmind is not required and not referenced in any process or error message.
- Given the scribe pane is active, when the editor exits, then no overmind kill process is spawned.