Merged
8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,14 @@

## [Unreleased]

### Ask this book — conversational, streaming web chat — backend (AI-028) (2026-06-19)

Backend for the conversational "Ask this book" upgrade: **model bump + multi-turn memory + warm-companion prompt + SSE streaming**, with grounding, citations, and the spoiler gate intact. `rag.ask` now routes to a dedicated keyed provider `openai-rag` on **gpt-4.1-mini** (was gpt-4.1-nano), mirroring `openai-explain` (`OpenAI:RagAsk:Model`, `Ai:Routes:rag.ask → openai-rag`, decorator-loop entry, `ModelRegistrySeeder` row). The system prompt is rewritten from "answer ONLY from excerpts else refuse" to a **warm reading companion** that is still strictly grounded — every book-fact claim must come from the numbered excerpts and cite `[n]` (citation contract + parser unchanged), but greetings/meta ("hi", "what can you do") get a warm invite with **no forced citation and no refusal**, and a genuine question with no matching excerpt gets a graceful "I don't see that in what you've read so far" rather than an invented fact. **Multi-turn**: `AskRequest` gains `History: AskTurnDto[]` (role `"user"`/`"assistant"`); the server defensively clamps to the **last 6 turns**, caps each turn at 4000 chars, normalizes roles, and assembles a real chat (system → numbered-excerpts context block → prior turns → new question last). Retrieval still runs on the latest question only, so the grounding eval is byte-identical with `[]` history. **SSE** (content-negotiated, mirrors Explain): `Accept: text/event-stream` → `delta` events (token fragments) then a terminal `done` carrying `{ citations, lastReadOrd, insufficient }` (camelCase, identical citation shape to the JSON path); empty-chunks → one friendly `delta` + `done {insufficient:true}` with **no model call**; provider/mid-stream failure → terminal `error`. JSON path returns the unchanged `AskResponse` (eval + mobile keep working). Ask `MaxOutputTokens` raised 320 → 400 for conversational length. 
`dotnet build -c Release` clean; 868 unit tests green (history clamp, multi-turn message assembly, SSE event sequencing over a fake delta stream, companion greeting-vs-content prompt structure) + integration (catalog spoiler-gate, owner-404, SSE content-type + framing, JSON history passthrough — skip-on-unavailable). **Note: the grounding golden eval (`RagEvalRunner`) MUST be re-run on mini post-deploy** — the companion prompt loosened the refusal rule, so this is the real hallucination-risk gate (paid; not runnable in CI). Frontend = parallel agent (AI-026e).
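
The history clamp and message assembly described above can be sketched roughly as follows. This is an illustrative TypeScript sketch, not the backend's actual code (the backend is .NET): `AskTurn`, `ChatMessage`, `normalizeRole`, `clampHistory`, and `assembleMessages` are hypothetical names, and both the rule "anything that is not `assistant` becomes `user`" and the choice to send the excerpts block as a `user` message are assumptions.

```typescript
// Hypothetical shapes for illustration; the real DTO is AskTurnDto.
type Role = 'user' | 'assistant'
interface AskTurn { role: string; content: string }
interface ChatMessage { role: 'system' | Role; content: string }

const MAX_TURNS = 6 // "last 6 turns" from the changelog entry
const MAX_TURN_CHARS = 4000 // per-turn character cap from the changelog entry

// Assumption: any role other than "assistant" (case-insensitive) is treated as "user".
function normalizeRole(role: string): Role {
  return role.trim().toLowerCase() === 'assistant' ? 'assistant' : 'user'
}

// Keep only the most recent turns, truncate each one, and normalize its role.
function clampHistory(history: AskTurn[]): ChatMessage[] {
  return history.slice(-MAX_TURNS).map(t => ({
    role: normalizeRole(t.role),
    content: t.content.slice(0, MAX_TURN_CHARS),
  }))
}

// Order: system -> numbered excerpts -> prior turns -> new question last.
function assembleMessages(
  systemPrompt: string,
  excerpts: string[],
  history: AskTurn[],
  question: string,
): ChatMessage[] {
  const context = excerpts.map((e, i) => `[${i + 1}] ${e}`).join('\n')
  return [
    { role: 'system', content: systemPrompt },
    // Assumption: the context block is sent as a user message.
    { role: 'user', content: `Excerpts:\n${context}` },
    ...clampHistory(history),
    { role: 'user', content: question },
  ]
}
```

Because retrieval runs on the latest question only, the excerpts passed in do not depend on `history`; with an empty history the assembled chat reduces to system, excerpts, question.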

### Ask this book — conversational, streaming web chat (AI-026e) (2026-06-19)

"Ask this book" goes from one-shot Q&A to a streaming, multi-turn reading companion on the web. Answers now stream **token-by-token**: the in-progress assistant turn grows live with a subtle blinking caret, citation chips render under it once the stream's `done` event lands. The panel sends the **last 6 turns** of history (bounded client-side) on each request for follow-up context, and shows **3–4 suggested starter questions** ("Summarize what I've read so far", "Who are the main characters?", "Explain the key idea so far", "What should I pay attention to?") on an empty, Ready thread — one click submits. Reuses the existing `postSse` SSE-over-POST client (same path Explain streams on) — `askStream(target, question, history, {onDelta,onDone,onError,signal})` POSTs `{question, history, currentChapterId?}` with `Accept: text/event-stream`, routing by `target.kind` to catalog `/books/{id}/ask` or userbook `/me/books/{id}/ask`. Browsers that can't stream fall back to the JSON single-turn endpoint; 401 → sign-in. Grounding, citation chips + jump-to-passage, and the Prepare → Preparing N/M → Ready/Failed index state machine are **unchanged** (server owns grounding). Shared `AskTurnDto` added to `@textstack/shared` (web + mobile consume; mobile keeps its JSON path). tsc + web build clean; mobile tsc clean; 564 web tests green (extended `useAsk` for delta accumulation → done-sets-citations, history bounded to 6, abort-on-unmount; `AskPanel` for starters render + submit). Backend = parallel agent.

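The "delta accumulation, then done sets citations" behaviour above can be pictured as a small pure reducer over stream events. This is a minimal sketch under assumptions, not the actual `useAsk` hook: `Turn` mirrors the turn shape used in the panel tests, while `Citation` is trimmed to two fields and `StreamEvent` and `applyEvent` are hypothetical names.

```typescript
// Trimmed, hypothetical citation shape (the real AskCitation has more fields).
interface Citation { marker: number; chunkId: string }

interface Turn {
  question: string
  answer: string
  citations: Citation[]
  insufficient: boolean
  streaming: boolean
}

type StreamEvent =
  | { event: 'delta'; data: string } // token fragment
  | { event: 'done'; data: string } // JSON payload: { citations, lastReadOrd, insufficient }

// Apply one SSE event to the in-progress assistant turn.
function applyEvent(turn: Turn, e: StreamEvent): Turn {
  if (e.event === 'delta') {
    // Fragments are appended in arrival order; the turn stays in streaming mode.
    return { ...turn, answer: turn.answer + e.data }
  }
  // Terminal event: citations land only now, and streaming ends.
  const done = JSON.parse(e.data) as { citations?: Citation[]; insufficient?: boolean }
  return {
    ...turn,
    citations: done.citations ?? [],
    insufficient: Boolean(done.insufficient),
    streaming: false,
  }
}
```

Folding a sequence of events with `events.reduce(applyEvent, initialTurn)` yields the final turn, which is why citation chips can only render once `done` has been applied.
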
### Ask this book — on-demand indexing, Phase 2 (user-uploaded books) — backend (2026-06-19)

Backend for "Ask this book" over **user-uploaded books** (`UserBook`/`UserChapter`), mirroring the P1 catalog path with **per-user isolation as a hard requirement** (a user must never retrieve another user's chunks). New **isolated** `user_chapter_chunk` table (NOT polymorphic with `chapter_chunk`) carries a **denormalized `user_id`** alongside `user_book_id`; retrieval filters on **both** in SQL (defense in depth), in both the vector and lexical branches. `UserBook` gains the same `rag_status/rag_chunk_count/rag_embedded_count/rag_indexed_at/rag_error` fields as `Edition`. `BookChunkingService.ChunkUserBookAsync` chunks `UserChapter.PlainText` into the new table (stamping the owner id from the book); the existing `ChapterEmbeddingWorker` now runs a **second batch poll** over `user_chapter_chunk` on the **same single OpenAI drain** (no second worker) and flips `user_books.rag_status → Ready` when `embedded == chunk`. `IRagService.RetrieveUserBookAsync` reuses the identical RRF hybrid (vector NN + lexical FTS) via a shared private helper, so fusion/vector-format/timeout stay byte-identical to the catalog path. Owner-scoped endpoints: `POST/GET /me/books/{id}/index` (atomic claim `WHERE id=@id AND user_id=@uid AND rag_status IN (0,3)`, clears stale chunks before re-chunk, rate-limited `rag.index`) and `POST /me/books/{id}/ask` (**no spoiler gate** — full-book retrieval over the user's own document, no private-notes corpus; reuses `RagAskService.AskFromChunksAsync`; 404 if not the owner's book). `GET /me/books/{id}` detail DTO gains `ragStatus`/counts. P1 catalog path unchanged. Migration `AddUserBookRagIndex` (creates `user_chapter_chunk` + HNSW/GIN/`(user_id, user_book_id)` indexes + generated `search_vector` + cascade FKs from both `user_books` and `user_chapters`, plus the `user_books.rag_*` columns; Up/Down verified against pgvector). 852 unit tests green (incl. 
`ChunkUserBookAsync` row shape + a SQL-level isolation guard asserting both `user_id`/`user_book_id` filters in both retrievers) + integration tests (unauth→401, non-owner→404, owner→202/200/answer). Frontend = parallel agent.
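
For readers unfamiliar with the RRF hybrid mentioned above, reciprocal rank fusion can be sketched in a few lines. This is a generic illustration in TypeScript, not the project's shared helper: `rrfFuse` is a hypothetical name, and the constant `k = 60` is a commonly used default assumed here, not a value taken from this changelog.

```typescript
// Reciprocal rank fusion: each list contributes 1 / (k + rank) per item,
// with rank starting at 1; items are then ordered by their summed score.
// Assumption: k = 60 (a common default; the project's value is not stated).
function rrfFuse(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>()
  for (const list of rankings) {
    list.forEach((id, idx) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + idx + 1))
    })
  }
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1]) // highest fused score first
    .map(([id]) => id)
}

// Example: chunk ids ranked by a vector search and by a lexical search.
const vectorHits = ['a', 'b', 'c']
const lexicalHits = ['b', 'd', 'a']
const fused = rrfFuse([vectorHits, lexicalHits])
```

In this example `b` and `a` appear in both lists and therefore outrank `d` and `c`, which appear in only one. Per the entry above, the per-user isolation filter applies inside both SQL branches before fusion, so fusion only ever sees the owner's chunks.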
71 changes: 69 additions & 2 deletions apps/web/src/api/ask.ts
@@ -1,10 +1,77 @@
import { authFetch } from './client'
import type { AskResponse } from '@textstack/shared'
import type { AskResponse, AskCitation, AskTurnDto } from '@textstack/shared'
import { postSse } from '../lib/sse'
import type { RagIndexState, RagIndexStatus } from '../types/api'

export type { AskResponse, AskCitation } from '@textstack/shared'
export type { AskResponse, AskCitation, AskTurnDto } from '@textstack/shared'
export type { RagIndexState, RagIndexStatus } from '../types/api'

// API_BASE: host (dev) or '' (prod, nginx strips /api). Backend route has no prefix.
const API_BASE = import.meta.env.VITE_API_URL ?? ''

/** Most recent conversation turns to send back for multi-turn context (AI-026e). */
export const MAX_HISTORY_TURNS = 6

/** Terminal `done` payload of a streamed ask (AI-026e). */
export interface AskDone {
citations: AskCitation[]
lastReadOrd: number
insufficient: boolean
}

interface AskStreamCallbacks {
onDelta: (fragment: string) => void
onDone: (done: AskDone) => void
onError: (message: string) => void
signal?: AbortSignal
}

/**
* Streaming "Ask this book" (AI-026e). POSTs the question + bounded `history` with
* `Accept: text/event-stream` and consumes SSE via {@link postSse}: `delta` fragments append to the
* answer, `done` carries citations/insufficient, `error` surfaces a message. URL routes by
* `target.kind` (catalog vs userbook). Throws `SseUnauthorizedError`/`SseUnsupportedError` from
* `postSse` for the caller to handle (sign-in / JSON fallback).
*/
export function askStream(
target: AskTarget,
question: string,
history: AskTurnDto[],
{ onDelta, onDone, onError, signal }: AskStreamCallbacks,
currentChapterId?: string,
): Promise<void> {
const url =
target.kind === 'userbook'
? `${API_BASE}/me/books/${target.id}/ask`
: `${API_BASE}/books/${target.id}/ask`
const body = {
question,
history: history.slice(-MAX_HISTORY_TURNS),
...(currentChapterId ? { currentChapterId } : {}),
}
return postSse(
url,
body,
e => {
if (signal?.aborted) return
if (e.event === 'delta') onDelta(e.data)
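else if (e.event === 'done') {
// Parse first and call onDone exactly once, so an exception thrown by onDone
// itself is not mistaken for a malformed payload and onDone is never invoked twice.
let parsed: Partial<AskDone> = {}
try {
parsed = (JSON.parse(e.data) ?? {}) as Partial<AskDone>
} catch {
// Malformed payload: fall through with empty defaults.
}
onDone({
citations: parsed.citations ?? [],
lastReadOrd: parsed.lastReadOrd ?? 0,
insufficient: Boolean(parsed.insufficient),
})
} else if (e.event === 'error') onError(e.data || 'Ask failed')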
},
signal,
)
}

/**
* Identifies what the "Ask this book" panel is pointed at (AI-027 P2). A catalog `edition`
* routes to `/books/{id}/...`; a user-uploaded `userbook` routes to `/me/books/{id}/...`.
43 changes: 35 additions & 8 deletions apps/web/src/components/reader/AskPanel.tsx
@@ -52,6 +52,15 @@ export function AskPanel({
setInput('')
}

const submitStarter = (q: string) => {
if (isLoading) return
ask(q)
}

// Suggested starter questions, shown only on an empty, Ready thread (AI-026e).
const starterKeys = ['summary', 'characters', 'keyIdea', 'attention'] as const
const showStarters = history.length === 0 && status === 'Ready' && isAuthenticated && !isLoading

return (
<>
<div className="reader-drawer-backdrop" onClick={onClose} />
@@ -69,11 +78,35 @@ export function AskPanel({
{history.length === 0 && !isLoading && (
<p className="ask-panel__empty">{t('reader.ask.empty')}</p>
)}
{showStarters && (
<div className="ask-panel__starters">
<p className="ask-panel__starters-title">{t('reader.ask.startersTitle')}</p>
{starterKeys.map(key => (
<button
key={key}
type="button"
className="ask-panel__starter"
onClick={() => submitStarter(t(`reader.ask.starters.${key}`))}
>
{t(`reader.ask.starters.${key}`)}
</button>
))}
</div>
)}
{history.map((turn, i) => (
<div key={i} className="ask-panel__turn">
<p className="ask-panel__question">{turn.question}</p>
<p className="ask-panel__answer">{turn.answer}</p>
{turn.citations.length > 0 && (
<p className="ask-panel__answer">
{turn.answer}
{turn.streaming && <span className="ask-panel__cursor" aria-hidden="true" />}
</p>
{turn.streaming && !turn.answer && (
<div className="ask-panel__loading">
<span className="ask-panel__spinner" />
{t('reader.ask.thinking')}
</div>
)}
{!turn.streaming && turn.citations.length > 0 && (
<div className="ask-panel__citations">
{turn.citations.map(c => (
<button
@@ -89,12 +122,6 @@ export function AskPanel({
)}
</div>
))}
{isLoading && (
<div className="ask-panel__loading">
<span className="ask-panel__spinner" />
{t('reader.ask.thinking')}
</div>
)}
{error && error !== 'auth' && <p className="ask-panel__error">{error}</p>}
</div>

31 changes: 30 additions & 1 deletion apps/web/src/components/reader/__tests__/AskPanel.test.tsx
@@ -38,6 +38,7 @@ const baseProps = {
afterEach(() => {
cleanup()
askState.history = []
askState.ask = vi.fn()
ragState.status = 'Ready'
ragState.chunkCount = 0
ragState.embeddedCount = 0
@@ -96,7 +97,7 @@ describe('AskPanel', () => {

it('renders a citation chip and navigates on click', () => {
const citation = { marker: 1, chunkId: 'c1', chapterId: 'ch1', chapterOrd: 4, charStart: 0, charEnd: 1, preview: 'snippet' }
askState.history = [{ question: 'q', answer: 'a [1]', citations: [citation], insufficient: false }]
askState.history = [{ question: 'q', answer: 'a [1]', citations: [citation], insufficient: false, streaming: false }]
const onNavigateToCitation = vi.fn()

render(<AskPanel {...baseProps} isAuthenticated={true} onNavigateToCitation={onNavigateToCitation} />)
@@ -105,4 +106,32 @@ describe('AskPanel', () => {

expect(onNavigateToCitation).toHaveBeenCalledWith(citation)
})

it('shows starter questions on an empty, Ready thread and submits one on click', () => {
askState.history = []
ragState.status = 'Ready'
const ask = vi.fn()
askState.ask = ask

render(<AskPanel {...baseProps} isAuthenticated={true} />)

expect(screen.getByText('reader.ask.startersTitle')).toBeTruthy()
const starter = screen.getByText('reader.ask.starters.summary')
fireEvent.click(starter)
expect(ask).toHaveBeenCalledWith('reader.ask.starters.summary')
})

it('hides starters once the thread has a turn', () => {
askState.history = [{ question: 'q', answer: 'a', citations: [], insufficient: false, streaming: false }]
ragState.status = 'Ready'
render(<AskPanel {...baseProps} isAuthenticated={true} />)
expect(screen.queryByText('reader.ask.startersTitle')).toBeNull()
})

it('does not show starters until the index is Ready', () => {
askState.history = []
ragState.status = 'NotIndexed'
render(<AskPanel {...baseProps} isAuthenticated={true} />)
expect(screen.queryByText('reader.ask.startersTitle')).toBeNull()
})
})