8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,14 @@

## [Unreleased]

### Ask this book — on-demand indexing, Phase 2 (user-uploaded books) — backend (2026-06-19)

Backend for "Ask this book" over **user-uploaded books** (`UserBook`/`UserChapter`), mirroring the P1 catalog path with **per-user isolation as a hard requirement** (a user must never retrieve another user's chunks). New **isolated** `user_chapter_chunk` table (NOT polymorphic with `chapter_chunk`) carries a **denormalized `user_id`** alongside `user_book_id`; retrieval filters on **both** in SQL (defense in depth), in both the vector and lexical branches. `UserBook` gains the same `rag_status/rag_chunk_count/rag_embedded_count/rag_indexed_at/rag_error` fields as `Edition`. `BookChunkingService.ChunkUserBookAsync` chunks `UserChapter.PlainText` into the new table (stamping the owner id from the book); the existing `ChapterEmbeddingWorker` now runs a **second batch poll** over `user_chapter_chunk` on the **same single OpenAI drain** (no second worker) and flips `user_books.rag_status → Ready` when `embedded == chunk`. `IRagService.RetrieveUserBookAsync` reuses the identical RRF hybrid (vector NN + lexical FTS) via a shared private helper, so fusion/vector-format/timeout stay byte-identical to the catalog path. Owner-scoped endpoints: `POST/GET /me/books/{id}/index` (atomic claim `WHERE id=@id AND user_id=@uid AND rag_status IN (0,3)`, clears stale chunks before re-chunk, rate-limited `rag.index`) and `POST /me/books/{id}/ask` (**no spoiler gate** — full-book retrieval over the user's own document, no private-notes corpus; reuses `RagAskService.AskFromChunksAsync`; 404 if not the owner's book). `GET /me/books/{id}` detail DTO gains `ragStatus`/counts. P1 catalog path unchanged. Migration `AddUserBookRagIndex` (creates `user_chapter_chunk` + HNSW/GIN/`(user_id, user_book_id)` indexes + generated `search_vector` + cascade FKs from both `user_books` and `user_chapters`, plus the `user_books.rag_*` columns; Up/Down verified against pgvector). 852 unit tests green (incl. 
`ChunkUserBookAsync` row shape + a SQL-level isolation guard asserting both `user_id`/`user_book_id` filters in both retrievers) + integration tests (unauth→401, non-owner→404, owner→202/200/answer). Frontend = parallel agent.
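
The RRF hybrid mentioned above merges the vector-NN ranking and the lexical-FTS ranking into one list. A minimal sketch of reciprocal rank fusion follows, assuming the commonly used constant `k = 60` and a hypothetical `rrfFuse` helper; the backend's real fusion code, constant, and names are not shown in this PR.

```typescript
// Hypothetical sketch of reciprocal rank fusion (RRF); not the backend's actual code.
// k = 60 is an assumption (a widely used default), not a value taken from this PR.
function rrfFuse(rankedLists: string[][], k = 60): string[] {
  const scores = new Map<string, number>()
  for (const list of rankedLists) {
    list.forEach((chunkId, index) => {
      // Ranks are 1-based: the top hit of each list contributes 1 / (k + 1).
      const rank = index + 1
      scores.set(chunkId, (scores.get(chunkId) ?? 0) + 1 / (k + rank))
    })
  }
  // Highest fused score first; ties are broken by id so the output is deterministic.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .map(([chunkId]) => chunkId)
}

// Illustrative chunk ids only.
const fused = rrfFuse([
  ['c1', 'c2', 'c3'], // vector nearest-neighbour order
  ['c2', 'c4'],       // lexical full-text order
])
```

A chunk that appears in both lists (`c2` here) accumulates two contributions, which is why it can outrank a chunk that is first in only one list.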

### Ask this book — on-demand indexing, Phase 2 (user-uploaded books) — web (2026-06-19)

P1 shipped on-demand "Ask this book" for catalog editions; P2 unhides it for **user-uploaded books** in the web reader — the priority case (users upload books to read *and* ask about them). The Ask toolbar button + panel now render for `mode==='userbook'`, routing index status / prepare / ask through the owner-scoped `/me/books/{id}/{index,ask}` endpoints (no spoiler gate — it's the user's own document, so answers draw from the whole book). The P1 catalog path is unchanged. The reader builds a single `AskTarget` (`{ kind: 'edition'|'userbook', id, ragStatus, ragChunkCount, ragEmbeddedCount }`) from whichever book it loaded and threads it through `AskPanel` → `useAsk` / `useRagIndex`; the hooks select the endpoint family by `kind`, so the index state-machine (Prepare → Preparing N/M → Ready → Ask) and styling are reused verbatim. The user-book detail (`GET /me/books/{id}`) gains `ragStatus`/counts, surfaced via `NormalizedBook` to seed the panel with no extra fetch. (StudyBuddy stays catalog-only.) tsc + web build clean; 559 web tests green (extended `useRagIndex`/`useAsk`/`AskPanel` for the userbook target hitting `/me/books/{id}/...`). Backend = parallel agent. (Phase 3 = observability.)
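
The endpoint-family selection by `kind` can be pictured as a single path builder. This is only a sketch: `askEndpoint` is a hypothetical helper, and the PR itself selects between separate API functions (`ask` vs `askUserBook`, and so on) rather than building paths in one place.

```typescript
// Hypothetical helper illustrating the routing rule; it does not exist in the PR.
type AskTargetKind = 'edition' | 'userbook'
type AskAction = 'index' | 'ask'

function askEndpoint(kind: AskTargetKind, id: string, action: AskAction): string {
  // Catalog editions live under /books, user uploads under the owner-scoped /me/books.
  const base = kind === 'userbook' ? '/me/books' : '/books'
  return `${base}/${id}/${action}`
}

const catalogAsk = askEndpoint('edition', 'ed-1', 'ask')       // "/books/ed-1/ask"
const userIndex = askEndpoint('userbook', 'ub-1', 'index')     // "/me/books/ub-1/index"
```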

### Ask this book — on-demand indexing, Phase 1 (catalog) (2026-06-19)

"Ask this book" returned a misleading "you haven't read enough" for **every** catalog book — because none were RAG-indexed (the 614 editions were imported before RAG; `chapter_chunk` was empty). Books are now indexed **on demand, per book**: a reader clicks **"Prepare this book for questions"** → `POST /books/{editionId}/index` **atomically claims** the edition (`UPDATE … WHERE rag_status IN (NotIndexed, Failed)` — DB-level dedup, concurrent triggers index once, a Ready book is a no-op so OpenAI is never re-billed), chunks it (`BookChunkingService`, extracted from ingestion + reused), and the existing `ChapterEmbeddingWorker` fills embeddings and flips `rag_status → Ready` once `embedded == chunk`. `GET /books/{editionId}/index` polls progress; the reader shows Prepare → "Preparing… N/M" → Ask. **No bulk indexing** — only books someone actually asks, one-time embed per book, rate-limited (`rag.index`, 20/hr/IP). The misleading message now only appears for a genuinely-indexed book hitting the spoiler gate; un-indexed books show the Prepare CTA. New `editions.rag_status/rag_chunk_count/rag_embedded_count/rag_indexed_at/rag_error` (migration `AddEditionRagIndexState`, with a backfill that marks already-chunked editions Ready at $0). architect → backend + frontend → adversarial QA (P1 double-chunk/re-embed-on-legacy + unmount-poll fixed). 845 unit + 555 web tests green. (Phase 2 = user-uploaded books; Phase 3 = observability.)
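
The claim-then-embed lifecycle described above can be modelled in memory as follows. This is a sketch under assumptions: `'Indexing'` is an assumed name for the in-progress state (the changelog names only NotIndexed, Failed, and Ready), and `tryClaim`, `recordEmbedded`, and `panelLabel` are hypothetical helpers standing in for the SQL claim, the embedding worker, and the reader UI.

```typescript
// 'Indexing' is an assumed state name; only NotIndexed, Failed and Ready appear in the changelog.
type RagStatus = 'NotIndexed' | 'Indexing' | 'Ready' | 'Failed'

interface IndexState {
  status: RagStatus
  chunkCount: number
  embeddedCount: number
}

// In-memory stand-in for `UPDATE … WHERE rag_status IN (NotIndexed, Failed)`:
// only a caller that finds the book unclaimed gets `true`; a Ready book is a no-op.
function tryClaim(state: IndexState, chunkCount: number): boolean {
  if (state.status !== 'NotIndexed' && state.status !== 'Failed') return false
  state.status = 'Indexing'
  state.chunkCount = chunkCount
  state.embeddedCount = 0
  return true
}

// Stand-in for the worker: counts embedded chunks and flips to Ready once embedded == chunk.
function recordEmbedded(state: IndexState, newlyEmbedded: number): void {
  if (state.status !== 'Indexing') return
  state.embeddedCount = Math.min(state.chunkCount, state.embeddedCount + newlyEmbedded)
  if (state.embeddedCount === state.chunkCount) state.status = 'Ready'
}

// What the reader shows for each state: Prepare, then "Preparing… N/M", then Ask.
function panelLabel(state: IndexState): string {
  if (state.status === 'Ready') return 'Ask'
  if (state.status === 'Indexing') return `Preparing… ${state.embeddedCount}/${state.chunkCount}`
  return 'Prepare this book for questions'
}
```

In the real system the dedup guarantee comes from the database executing the conditional `UPDATE` atomically; the model above only illustrates which callers win and which transitions are allowed.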
49 changes: 48 additions & 1 deletion apps/web/src/api/ask.ts
@@ -1,10 +1,23 @@
import { authFetch } from './client'
import type { AskResponse } from '@textstack/shared'
import type { RagIndexState } from '../types/api'
import type { RagIndexState, RagIndexStatus } from '../types/api'

export type { AskResponse, AskCitation } from '@textstack/shared'
export type { RagIndexState, RagIndexStatus } from '../types/api'

/**
* Identifies what the "Ask this book" panel is pointed at (AI-027 P2). A catalog `edition`
* routes to `/books/{id}/...`; a user-uploaded `userbook` routes to `/me/books/{id}/...`.
* The reader builds this from whichever book it loaded and threads it through the panel/hooks.
*/
export interface AskTarget {
kind: 'edition' | 'userbook'
id: string
ragStatus?: RagIndexStatus
ragChunkCount?: number
ragEmbeddedCount?: number
}

/**
* On-demand RAG index (AI-027 P1). Reads the current index state for a catalog edition.
* Cookie auth via {@link authFetch}; throws `ApiError` on failure.
@@ -39,3 +52,37 @@ export function ask(
signal,
})
}

/**
* User-uploaded book variant of {@link getIndexStatus} (AI-027 P2). Owner-scoped via cookie auth.
*/
export function getUserIndexStatus(id: string, signal?: AbortSignal): Promise<RagIndexState> {
return authFetch<RagIndexState>(`/me/books/${id}/index`, { method: 'GET', signal })
}

/**
* User-uploaded book variant of {@link prepareIndex} (AI-027 P2). Owner-scoped.
*/
export function prepareUserIndex(id: string, signal?: AbortSignal): Promise<RagIndexState> {
return authFetch<RagIndexState>(`/me/books/${id}/index`, { method: 'POST', signal })
}

/**
* User-uploaded book variant of {@link ask} (AI-027 P2). No spoiler gate — it's the user's own
* document, so answers draw from the whole book; `currentChapterId` is still passed for citation
* context. Owner-scoped via cookie auth; throws `ApiError` on failure.
*/
export function askUserBook(
id: string,
question: string,
k?: number,
signal?: AbortSignal,
currentChapterId?: string,
): Promise<AskResponse> {
return authFetch<AskResponse>(`/me/books/${id}/ask`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ question, k, ...(currentChapterId ? { currentChapterId } : {}) }),
signal,
})
}
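
The request body built in `askUserBook` relies on two behaviours: a conditional spread that adds `currentChapterId` only when it is truthy, and `JSON.stringify` dropping properties whose value is `undefined` (so an omitted `k` never reaches the wire). A small sketch, using a hypothetical `buildAskBody` helper that is not part of this PR:

```typescript
// Hypothetical helper mirroring the body expression used in askUserBook.
function buildAskBody(question: string, k?: number, currentChapterId?: string): string {
  return JSON.stringify({
    question,
    k, // dropped by JSON.stringify when undefined
    // Spread an empty object when there is no chapter id, so the key is absent entirely.
    ...(currentChapterId ? { currentChapterId } : {}),
  })
}

const minimal = buildAskBody('why?')
const withChapter = buildAskBody('why?', undefined, 'chap-guid-9')
const withK = buildAskBody('why?', 5)
```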
5 changes: 5 additions & 0 deletions apps/web/src/api/userBooks.ts
@@ -1,5 +1,6 @@
import { authFetch, API_BASE } from './client'
import { trackBookUploaded } from '../lib/analytics'
import type { RagIndexStatus } from '../types/api'

export interface UserBook {
id: string
@@ -61,6 +62,10 @@ export interface UserBookDetail {
createdAt: string
updatedAt: string
completedAt: string | null
// On-demand RAG index for "Ask this book" (AI-027 P2). Absent on older payloads → NotIndexed.
ragStatus?: RagIndexStatus
ragChunkCount?: number
ragEmbeddedCount?: number
}

export interface UserChapter {
28 changes: 10 additions & 18 deletions apps/web/src/components/reader/AskPanel.tsx
@@ -3,45 +3,37 @@ import { useFocusTrap } from '../../hooks/useFocusTrap'
import { useTranslation } from '../../hooks/useTranslation'
import { useAsk } from '../../hooks/useAsk'
import { useRagIndex } from '../../hooks/useRagIndex'
import type { AskCitation } from '../../api/ask'
import type { RagIndexStatus } from '../../types/api'
import type { AskCitation, AskTarget } from '../../api/ask'

interface Props {
open: boolean
editionId: string
/**
* What this panel asks against (AI-027 P2). Carries the kind (catalog edition vs user upload),
* the id, and the seeded RAG index state/counts so the panel renders correctly with no extra
* fetch and routes status/prepare/ask through the right endpoints.
*/
askTarget: AskTarget
/** GUID of the chapter the user is actively reading — gates the RAG spoiler check. */
currentChapterId?: string
isAuthenticated: boolean
/** Seed RAG index state from `publicBook` so the panel renders correctly with no extra fetch. */
initialRagStatus?: RagIndexStatus
initialChunkCount?: number
initialEmbeddedCount?: number
onSignIn: () => void
onNavigateToCitation: (citation: AskCitation) => void
onClose: () => void
}

export function AskPanel({
open,
editionId,
askTarget,
currentChapterId,
isAuthenticated,
initialRagStatus,
initialChunkCount,
initialEmbeddedCount,
onSignIn,
onNavigateToCitation,
onClose,
}: Props) {
const { t } = useTranslation()
const containerRef = useFocusTrap(open)
const { history, isLoading, error, ask } = useAsk(editionId, currentChapterId)
const { status, chunkCount, embeddedCount, preparing, prepare } = useRagIndex(
editionId,
initialRagStatus,
initialChunkCount,
initialEmbeddedCount,
)
const { history, isLoading, error, ask } = useAsk(askTarget, currentChapterId)
const { status, chunkCount, embeddedCount, preparing, prepare } = useRagIndex(askTarget)
const [input, setInput] = useState('')
const historyRef = useRef<HTMLDivElement>(null)

28 changes: 23 additions & 5 deletions apps/web/src/components/reader/__tests__/AskPanel.test.tsx
@@ -6,21 +6,30 @@ vi.mock('../../../hooks/useTranslation', () => ({
}))
vi.mock('../../../hooks/useFocusTrap', () => ({ useFocusTrap: () => ({ current: null }) }))

const { askState } = vi.hoisted(() => ({
const { askState, useAskSpy } = vi.hoisted(() => ({
askState: { history: [] as unknown[], isLoading: false, error: null as string | null, ask: vi.fn() },
useAskSpy: vi.fn(),
}))
vi.mock('../../../hooks/useAsk', () => ({
useAsk: (...args: unknown[]) => { useAskSpy(...args); return askState },
}))
vi.mock('../../../hooks/useAsk', () => ({ useAsk: () => askState }))

const { ragState } = vi.hoisted(() => ({
const { ragState, useRagIndexSpy } = vi.hoisted(() => ({
ragState: { status: 'Ready', chunkCount: 0, embeddedCount: 0, preparing: false, prepare: vi.fn() },
useRagIndexSpy: vi.fn(),
}))
vi.mock('../../../hooks/useRagIndex', () => ({
useRagIndex: (...args: unknown[]) => { useRagIndexSpy(...args); return ragState },
}))
vi.mock('../../../hooks/useRagIndex', () => ({ useRagIndex: () => ragState }))

import { AskPanel } from '../AskPanel'
import type { AskTarget } from '../../../api/ask'

const editionTarget: AskTarget = { kind: 'edition', id: 'ed-1' }

const baseProps = {
open: true,
editionId: 'ed-1',
askTarget: editionTarget,
onSignIn: vi.fn(),
onNavigateToCitation: vi.fn(),
onClose: vi.fn(),
@@ -33,6 +42,8 @@ afterEach(() => {
ragState.chunkCount = 0
ragState.embeddedCount = 0
ragState.prepare = vi.fn()
useAskSpy.mockReset()
useRagIndexSpy.mockReset()
})

describe('AskPanel', () => {
@@ -76,6 +87,13 @@ describe('AskPanel', () => {
expect(prepare).toHaveBeenCalled()
})

it('threads a userbook askTarget through to both hooks (AI-027 P2)', () => {
const userTarget: AskTarget = { kind: 'userbook', id: 'ub-1', ragStatus: 'NotIndexed' }
render(<AskPanel {...baseProps} askTarget={userTarget} isAuthenticated={true} />)
expect(useRagIndexSpy).toHaveBeenCalledWith(userTarget)
expect(useAskSpy).toHaveBeenCalledWith(userTarget, undefined)
})

it('renders a citation chip and navigates on click', () => {
const citation = { marker: 1, chunkId: 'c1', chapterId: 'ch1', chapterOrd: 4, charStart: 0, charEnd: 1, preview: 'snippet' }
askState.history = [{ question: 'q', answer: 'a [1]', citations: [citation], insufficient: false }]
35 changes: 27 additions & 8 deletions apps/web/src/hooks/useAsk.test.ts
@@ -1,22 +1,29 @@
import { describe, it, expect, vi, beforeEach } from 'vitest'
import { renderHook, act, waitFor } from '@testing-library/react'

vi.mock('../api/ask', () => ({ ask: vi.fn() }))
import { ask as askApi } from '../api/ask'
vi.mock('../api/ask', () => ({ ask: vi.fn(), askUserBook: vi.fn() }))
import { ask as askApi, askUserBook as askUserBookApi, type AskTarget } from '../api/ask'
import { useAsk } from './useAsk'

const mockAsk = askApi as unknown as ReturnType<typeof vi.fn>
const mockAskUserBook = askUserBookApi as unknown as ReturnType<typeof vi.fn>

const edition: AskTarget = { kind: 'edition', id: 'ed-1' }
const userbook: AskTarget = { kind: 'userbook', id: 'ub-1' }

const citation = {
marker: 1, chunkId: 'c1', chapterId: 'ch1', chapterOrd: 2, charStart: 0, charEnd: 5, preview: 'preview',
}

describe('useAsk', () => {
beforeEach(() => mockAsk.mockReset())
beforeEach(() => {
mockAsk.mockReset()
mockAskUserBook.mockReset()
})

it('appends a turn on success', async () => {
mockAsk.mockResolvedValueOnce({ answer: 'Because [1].', citations: [citation], lastReadOrd: 3, insufficient: false })
const { result } = renderHook(() => useAsk('ed-1'))
const { result } = renderHook(() => useAsk(edition))

await act(() => result.current.ask('why?'))

@@ -28,7 +35,7 @@ describe('useAsk', () => {

it('sets error and keeps history empty on failure', async () => {
mockAsk.mockRejectedValueOnce(new Error('boom'))
const { result } = renderHook(() => useAsk('ed-1'))
const { result } = renderHook(() => useAsk(edition))

await act(() => result.current.ask('why?'))

@@ -38,7 +45,7 @@ describe('useAsk', () => {

it('flags insufficient turns', async () => {
mockAsk.mockResolvedValueOnce({ answer: 'Read more first.', citations: [], lastReadOrd: 0, insufficient: true })
const { result } = renderHook(() => useAsk('ed-1'))
const { result } = renderHook(() => useAsk(edition))

await act(() => result.current.ask('why?'))

@@ -48,17 +55,29 @@ describe('useAsk', () => {

it('forwards currentChapterId to the api when provided', async () => {
mockAsk.mockResolvedValueOnce({ answer: 'ok', citations: [], lastReadOrd: 0, insufficient: false })
const { result } = renderHook(() => useAsk('ed-1', 'chap-guid-4'))
const { result } = renderHook(() => useAsk(edition, 'chap-guid-4'))

await act(() => result.current.ask('why?'))

await waitFor(() => expect(mockAsk).toHaveBeenCalled())
expect(mockAsk).toHaveBeenCalledWith('ed-1', 'why?', undefined, expect.anything(), 'chap-guid-4')
})

it('no-ops without an editionId', async () => {
it('routes a userbook target to askUserBook (not the catalog ask)', async () => {
mockAskUserBook.mockResolvedValueOnce({ answer: 'ok', citations: [], lastReadOrd: 0, insufficient: false })
const { result } = renderHook(() => useAsk(userbook, 'chap-guid-9'))

await act(() => result.current.ask('why?'))

await waitFor(() => expect(mockAskUserBook).toHaveBeenCalled())
expect(mockAskUserBook).toHaveBeenCalledWith('ub-1', 'why?', undefined, expect.anything(), 'chap-guid-9')
expect(mockAsk).not.toHaveBeenCalled()
})

it('no-ops without a target', async () => {
const { result } = renderHook(() => useAsk(undefined))
await act(() => result.current.ask('why?'))
expect(mockAsk).not.toHaveBeenCalled()
expect(mockAskUserBook).not.toHaveBeenCalled()
})
})
16 changes: 11 additions & 5 deletions apps/web/src/hooks/useAsk.ts
@@ -1,5 +1,5 @@
import { useState, useCallback, useRef, useEffect } from 'react'
import { ask as askApi, type AskCitation } from '../api/ask'
import { ask as askApi, askUserBook as askUserBookApi, type AskCitation, type AskTarget } from '../api/ask'
import { ApiError } from '../api/client'

export interface AskTurn {
@@ -12,8 +12,13 @@ export interface AskTurn {
/**
* Session "Ask this book" state (AI-026a): an in-memory Q&A history (not persisted), plus loading
* and error. `ask` appends a turn; in-flight requests are aborted on a new question / unmount.
*
* `target.kind` (AI-027 P2) routes the POST — catalog `/books/{id}/ask` vs user-upload
* `/me/books/{id}/ask`.
*/
export function useAsk(editionId: string | undefined, currentChapterId?: string) {
export function useAsk(target: AskTarget | undefined, currentChapterId?: string) {
const id = target?.id
const kind = target?.kind
const [history, setHistory] = useState<AskTurn[]>([])
const [isLoading, setIsLoading] = useState(false)
const [error, setError] = useState<string | null>(null)
@@ -24,7 +29,7 @@ export function useAsk(editionId: string | undefined, currentChapterId?: string)
const ask = useCallback(
async (question: string) => {
const q = question.trim()
if (!q || !editionId || isLoading) return
if (!q || !id || isLoading) return

abortRef.current?.abort()
const ctrl = new AbortController()
@@ -33,7 +38,8 @@ export function useAsk(editionId: string | undefined, currentChapterId?: string)
setError(null)

try {
const res = await askApi(editionId, q, undefined, ctrl.signal, currentChapterId)
const fn = kind === 'userbook' ? askUserBookApi : askApi
const res = await fn(id, q, undefined, ctrl.signal, currentChapterId)
setHistory(prev => [
...prev,
{ question: q, answer: res.answer, citations: res.citations, insufficient: res.insufficient },
@@ -46,7 +52,7 @@ export function useAsk(editionId: string | undefined, currentChapterId?: string)
if (abortRef.current === ctrl) setIsLoading(false)
}
},
[editionId, isLoading, currentChapterId],
[id, kind, isLoading, currentChapterId],
)

return { history, isLoading, error, ask }