prd: scribe v2 — writing companion redesign #7

@leandronsp

PRD: Scribe v2. Writing Companion Redesign

Date: 2026-04-12
Status: Draft

Problem

The scribe writing companion takes 8-15 seconds to return grammar annotations. The bottleneck is not the AI model but the delivery path: DevTUI shells out to the overmind CLI, which manages a session, which calls the Claude API, which streams back through a subscriber process. That path adds three process boundaries, several IPC hops, and a newline-escaping hack that works around overmind dropping multiline arguments.

The user writes blog posts in both English and Portuguese. The current scribe only provides grammar/spelling checks. There is no way to interact with the AI, ask questions about the draft, or pull context from the user's Obsidian vault (which has 10+ blog drafts, 44 TILs, and a structured idea backlog already indexed by qmd).

Writing is a conversation. The scribe today is a one-way annotation feed.

Background

The scribe was built as a proof of concept using overmind (the user's own Elixir-based AI agent platform). It works but the architecture adds latency that makes it feel sluggish. The user writes ~1 post per week, alternating between English and Portuguese.

Three insights from recent discussion:

  1. Grammar/spelling is a solved problem. harper-core (pure Rust, Automattic, 10K+ stars) lints English in milliseconds. No LLM needed for typos.
  2. The overmind middleman is the bottleneck. A direct HTTP call to Claude Haiku takes 2-3s. Through overmind it takes 8-15s.
  3. The real value is the conversation. The user has a rich Obsidian vault with ideas, TILs, and drafts. A writing companion that can search the vault and discuss the draft is more valuable than one that just flags typos.

Current architecture: DevTUI → overmind CLI → overmind daemon → Claude API → overmind daemon → overmind subscribe CLI → DevTUI.

Target architecture: DevTUI → harper-core (in-process) + sync HTTP client (ureq or reqwest blocking) → LLM API (direct).

Requirements

Must Have

  • Instant English grammar/spelling. harper-core runs on every content change. Annotations appear in <50ms. No network, no API key, no external process.
  • LLM writing hints for EN and PT-BR. Style, phrasing, factual suggestions via direct API call. Configurable provider (Groq or Claude). Triggers on demand or after idle. Works for both languages.
  • Remove overmind dependency. Delete start_scribe_session, send_to_scribe, run_subscriber, escape_for_overmind, kill_scribe_session from ops.rs. No sessions, no subscriber threads, no daemon management.
  • Provider configuration. The user can set which LLM provider and model to use. API key read from environment variable. Defaults to Groq (Llama 3.1 8B) for speed.
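The provider configuration requirement could be sketched roughly as below. This is a minimal illustration, not the DevTUI implementation: the `SCRIBE_PROVIDER` and `SCRIBE_MODEL` variable names, the `LlmConfig` type, and the default model strings are all assumptions made for the example. Only `GROQ_API_KEY`, `ANTHROPIC_API_KEY`, and the Groq default come from this document.

```rust
use std::env;

/// The two providers named in the requirements.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Provider {
    Groq,
    Claude,
}

impl Provider {
    /// Environment variable that holds the API key for this provider.
    pub fn key_var(self) -> &'static str {
        match self {
            Provider::Groq => "GROQ_API_KEY",
            Provider::Claude => "ANTHROPIC_API_KEY",
        }
    }

    /// Placeholder model names, NOT verified provider identifiers.
    pub fn default_model(self) -> &'static str {
        match self {
            Provider::Groq => "llama-3.1-8b",
            Provider::Claude => "claude-haiku",
        }
    }

    /// Unknown or missing names fall back to Groq, the stated default.
    pub fn from_name(name: Option<&str>) -> Provider {
        match name.map(|n| n.trim().to_ascii_lowercase()).as_deref() {
            Some("claude") | Some("anthropic") => Provider::Claude,
            _ => Provider::Groq,
        }
    }
}

/// Hypothetical resolved configuration for one LLM call.
#[derive(Debug, PartialEq, Eq)]
pub struct LlmConfig {
    pub provider: Provider,
    pub model: String,
    pub api_key: String,
}

/// Resolve the configuration. `lookup` abstracts the environment so the
/// function can be exercised without touching real variables.
pub fn resolve_config(
    provider_name: Option<&str>,
    model: Option<&str>,
    lookup: impl Fn(&str) -> Option<String>,
) -> Result<LlmConfig, String> {
    let provider = Provider::from_name(provider_name);
    // An empty key is treated the same as a missing one.
    let api_key = lookup(provider.key_var())
        .filter(|k| !k.trim().is_empty())
        .ok_or_else(|| "No API key. Set GROQ_API_KEY or ANTHROPIC_API_KEY.".to_string())?;
    Ok(LlmConfig {
        provider,
        model: model.unwrap_or(provider.default_model()).to_string(),
        api_key,
    })
}

/// Convenience wrapper reading the (assumed) SCRIBE_* variables.
pub fn resolve_from_env() -> Result<LlmConfig, String> {
    let provider = env::var("SCRIBE_PROVIDER").ok();
    let model = env::var("SCRIBE_MODEL").ok();
    resolve_config(provider.as_deref(), model.as_deref(), |k| env::var(k).ok())
}
```

Passing the lookup as a closure keeps the missing-key path easy to test, and the error string matches the message the scribe pane is expected to show.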

Should Have

  • Chat pane. Interactive writing companion as a new layout in the Ctrl+G cycle: Preview → Scribe → Chat → Editor only. The user types a question, AI responds with context from the current draft. Conversation history preserved while editing the same article.
  • Vault context in chat. Before sending a chat message to the LLM, run qmd query -c vault "<question>" and include the top 3-5 results as context. The AI sees the draft + related vault notes + the question.
  • Portuguese spelling via hunspell. Basic PT-BR spell check using hunspell with the pt_BR dictionary. Spelling only, not grammar (no good pure-Rust option exists). Complements the LLM path for PT-BR posts.
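As a rough sketch of the vault-context step, the code below shells out to `qmd query -c vault "<question>"` (the invocation given above) and folds the result into the prompt. The `VaultSearch` type, the section headings in the prompt, and the choice to pass qmd's output through as opaque text are assumptions. The qmd output format is not described here, so trimming to the top 3-5 results is left out.

```rust
use std::io::ErrorKind;
use std::process::Command;

/// Outcome of a vault search (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
pub enum VaultSearch {
    /// Raw qmd stdout, treated as opaque text.
    Found(String),
    /// The qmd binary was not found on PATH.
    Unavailable,
    /// qmd ran but failed, or could not be started for another reason.
    Failed(String),
}

/// Run `qmd query -c vault "<question>"` and classify the outcome.
pub fn search_vault(question: &str) -> VaultSearch {
    match Command::new("qmd")
        .args(["query", "-c", "vault", question])
        .output()
    {
        Ok(out) if out.status.success() => {
            VaultSearch::Found(String::from_utf8_lossy(&out.stdout).into_owned())
        }
        Ok(out) => VaultSearch::Failed(String::from_utf8_lossy(&out.stderr).into_owned()),
        Err(e) if e.kind() == ErrorKind::NotFound => VaultSearch::Unavailable,
        Err(e) => VaultSearch::Failed(e.to_string()),
    }
}

/// Assemble draft + optional vault notes + question into one prompt.
/// The "## ..." headings are an arbitrary layout chosen for the example.
pub fn build_chat_prompt(draft: &str, vault_notes: Option<&str>, question: &str) -> String {
    let mut prompt = String::new();
    prompt.push_str("## Draft\n");
    prompt.push_str(draft.trim());
    // Skip the section entirely when there is nothing to include.
    if let Some(notes) = vault_notes.map(str::trim).filter(|n| !n.is_empty()) {
        prompt.push_str("\n\n## Related vault notes\n");
        prompt.push_str(notes);
    }
    prompt.push_str("\n\n## Question\n");
    prompt.push_str(question.trim());
    prompt
}
```

A caller would map `VaultSearch::Found(text)` to `Some(&text)` and every other variant to `None`, which gives the graceful fallback: chat still works, just without vault notes.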

Out of Scope

  • Full PT-BR grammar checking without an LLM. No pure-Rust Portuguese grammar checker exists. LanguageTool (Java, 400MB+ RAM) is too heavy for a TUI. The LLM path covers this adequately.
  • Streaming responses in scribe. The annotation format (JSON array) doesn't benefit from streaming. Chat pane streaming is a future enhancement.
  • Local LLM inference (Ollama). The M4 16GB runs Llama 3.1 8B at ~30-50 tok/s (6-10s per response). Groq cloud delivers the same model in <1s. Local inference doesn't improve the experience at this hardware tier.
  • Multiple vault collections. Chat searches the single vault qmd collection. Multi-collection support is unnecessary.
  • Persisting chat history across articles. Chat resets when switching articles in CMS mode. The conversation is about the current draft.

Constraints

  • harper-core is English-only. Supports US, UK, CA, AU dialects. No Portuguese, no Spanish. This is a library limitation, not a design choice.
  • qmd must be installed for vault search. Chat works without qmd (just no vault context), but the full experience requires it. Graceful fallback: if qmd is not found, skip vault search and note it in the chat.
  • API key required for LLM features. Grammar (harper) works offline. Hints and chat require a configured API key. Clear error message if missing.
  • Sync HTTP in a thread. The editor uses no async runtime. LLM calls run in thread::spawn with a sync HTTP client (ureq or reqwest blocking), writing results into the existing Arc<Mutex<Option<Result>>> slot. This matches the current pattern.
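The sync-HTTP-in-a-thread constraint might look something like the following. It is a sketch under assumptions: `HintSlot`, `spawn_check`, and `take_result` are illustrative names, the `Result<String, String>` payload stands in for whatever result type the editor really uses, and the blocking HTTP call is represented by a closure so the example needs no HTTP crate.

```rust
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Shared slot the worker writes into and the UI loop polls.
/// The payload type is an assumption for this sketch.
pub type HintSlot = Arc<Mutex<Option<Result<String, String>>>>;

/// Run a blocking request on a worker thread and store its outcome.
/// `request` stands in for the sync HTTP call (ureq or reqwest blocking).
pub fn spawn_check<F>(slot: HintSlot, request: F) -> JoinHandle<()>
where
    F: FnOnce() -> Result<String, String> + Send + 'static,
{
    thread::spawn(move || {
        // The slow call runs before the lock is taken, so the UI thread
        // is never blocked on the mutex for the duration of the request.
        let outcome = request();
        if let Ok(mut guard) = slot.lock() {
            *guard = Some(outcome);
        }
    })
}

/// Called from the render loop: take the result if one has arrived.
/// Returns None while the request is still in flight.
pub fn take_result(slot: &HintSlot) -> Option<Result<String, String>> {
    slot.lock().ok().and_then(|mut guard| guard.take())
}
```

Because `take_result` empties the slot, each response is consumed exactly once, and an error from the provider travels the same path as a success.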

Acceptance Criteria

Instant Grammar (English)

  • Given an English post with a typo ("teh"), when content changes, then a red annotation appears on the affected line within 100ms.
  • Given a Portuguese post, when content changes, then harper does not run (the language is determined by automatic detection or a manual toggle).
  • Given a post with code blocks, when content changes, then grammar checking skips fenced code blocks.
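One way to satisfy the fenced-code-block criterion is to filter lines before they reach the linter. The helper below is a simplified sketch (the name `lintable_lines` is hypothetical): it toggles on any line starting with three backticks or three tildes and does not match opening and closing fence lengths the way a full Markdown parser would.

```rust
/// Return (1-based line number, line) pairs that sit outside fenced code
/// blocks. Fence lines themselves are skipped as well.
pub fn lintable_lines(content: &str) -> Vec<(usize, &str)> {
    let mut in_fence = false;
    let mut out = Vec::new();
    for (idx, line) in content.lines().enumerate() {
        let trimmed = line.trim_start();
        // Simplification: any ``` or ~~~ line flips the fence state.
        if trimmed.starts_with("```") || trimmed.starts_with("~~~") {
            in_fence = !in_fence;
            continue;
        }
        if !in_fence {
            out.push((idx + 1, line));
        }
    }
    out
}
```

Keeping the original line numbers lets annotations from the linter land on the right line of the draft even though code lines were never checked.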

LLM Writing Hints

  • Given any post (EN or PT-BR), when the user triggers a check (Ctrl+T or idle >10s), then annotations appear within 3s (Groq) or 5s (Claude).
  • Given no API key configured, when the user triggers a check, then the scribe pane shows "No API key. Set GROQ_API_KEY or ANTHROPIC_API_KEY."
  • Given the LLM provider is unreachable, when a check is triggered, then the error appears in the status log and harper annotations remain visible.
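The idle trigger ("idle >10s") can be modeled as a small state machine driven by timestamps. This is an illustrative sketch: `IdleTrigger` and its methods are hypothetical names, and firing only once per idle period is an assumption about the desired behavior rather than something stated above.

```rust
use std::time::{Duration, Instant};

/// Fires once after the user has stopped editing for `idle_after`.
pub struct IdleTrigger {
    idle_after: Duration,
    /// Time of the last edit not yet followed by a check.
    last_edit: Option<Instant>,
}

impl IdleTrigger {
    pub fn new(idle_after: Duration) -> Self {
        IdleTrigger { idle_after, last_edit: None }
    }

    /// Record a content change; this restarts the idle countdown.
    pub fn on_edit(&mut self, now: Instant) {
        self.last_edit = Some(now);
    }

    /// Poll from the event loop. Returns true at most once per idle
    /// period, so an untouched draft is not re-sent to the LLM.
    pub fn should_check(&mut self, now: Instant) -> bool {
        match self.last_edit {
            Some(t) if now.duration_since(t) > self.idle_after => {
                self.last_edit = None;
                true
            }
            _ => false,
        }
    }
}
```

Taking `now` as a parameter instead of calling `Instant::now()` inside keeps the timing logic deterministic under test. A manual Ctrl+T check would bypass this type entirely.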

Chat Pane

  • Given the user presses Ctrl+G to cycle to Chat, then an input area appears at the bottom of the right pane with conversation history above.
  • Given the user types a question and presses Enter, then the question appears in the conversation, a "thinking..." indicator shows, and the AI response appears within 3-5s.
  • Given qmd is installed, when the user sends a chat message, then the AI response reflects knowledge from related vault notes (visible as "Found N related notes" in the response context).
  • Given qmd is not installed, when the user sends a chat message, then the chat works without vault context and shows a one-time note: "Install qmd for vault search."
  • Given the user switches articles in CMS mode, then chat history clears.
  • Given the user is in Chat layout, when they press a key, then the keystroke goes to the chat input box, not to vim. Esc returns focus to vim.
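The layout cycle and focus rules in these criteria could be expressed as below. All type and method names (`Layout`, `Focus`, `Key`, `Ui`) are hypothetical, and two behaviors are assumptions: entering the Chat layout focuses the chat input, and the Esc that hands focus back to vim is consumed instead of being forwarded.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Layout {
    Preview,
    Scribe,
    Chat,
    EditorOnly,
}

impl Layout {
    /// Ctrl+G cycle: Preview → Scribe → Chat → Editor only → Preview.
    pub fn next(self) -> Layout {
        match self {
            Layout::Preview => Layout::Scribe,
            Layout::Scribe => Layout::Chat,
            Layout::Chat => Layout::EditorOnly,
            Layout::EditorOnly => Layout::Preview,
        }
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Focus {
    Vim,
    ChatInput,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Key {
    Char(char),
    Enter,
    Esc,
}

pub struct Ui {
    pub layout: Layout,
    pub focus: Focus,
}

impl Ui {
    /// Handle Ctrl+G. Assumption: Chat takes focus when it appears,
    /// and every other layout leaves focus with vim.
    pub fn cycle_layout(&mut self) {
        self.layout = self.layout.next();
        self.focus = if self.layout == Layout::Chat {
            Focus::ChatInput
        } else {
            Focus::Vim
        };
    }

    /// Decide which component receives `key`. Returns None when the key
    /// is consumed here (Esc leaving the chat input).
    pub fn route(&mut self, key: Key) -> Option<Focus> {
        if self.focus == Focus::ChatInput && key == Key::Esc {
            self.focus = Focus::Vim;
            return None;
        }
        Some(self.focus)
    }
}
```

With this shape the event loop only asks `route` where a keystroke goes, so vim never sees text typed into the chat box.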

Overmind Removal

  • Given a fresh DevTUI build, when the user runs the editor, then overmind is not required and not referenced in any process or error message.
  • Given the scribe pane is active, when the editor exits, then no overmind kill process is spawned.
