The chat UI currently waits for the full Gemini response before displaying anything, which for a multi-paragraph answer means 3–8 seconds of spinner. Users expect chat apps to stream tokens as they are generated. Streaming over Server-Sent Events (SSE) on the existing HTTP route is a far smaller change than migrating to WebSockets.
Current state:
/chat blocks until Gemini returns, then returns the full answer.
Proposed implementation:
- Switch Gemini call to streamGenerateContent endpoint.
- Flask route returns Response(generator, mimetype='text/event-stream') emitting data: {token}\n\n chunks.
- Frontend uses EventSource, or more likely fetch + ReadableStream: EventSource is GET-only and the query is in the POST body, so it is easier to POST and parse the chunks manually.
- Source-attribution footer is sent as a final SSE event after the tokens.
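A minimal sketch of the framing described in the list above, using only the standard library. The helper names (sse_event, stream_answer), the "sources" event name, and the JSON payload keys are assumptions for illustration, not existing code in this repo. Payloads are JSON-encoded because a raw token containing a newline would otherwise break the single-line data: field.

```python
import json
from typing import Iterable, Iterator, List, Optional


def sse_event(data: dict, event: Optional[str] = None) -> str:
    """Format one SSE frame: optional event name, one data line, blank line."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # json.dumps escapes newlines, so the payload always fits on one data: line.
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"


def stream_answer(
    tokens: Iterable[str], sources: List[str], chunks_found: int
) -> Iterator[str]:
    """Yield one frame per token, then a final frame carrying the sources footer."""
    for token in tokens:
        yield sse_event({"token": token})
    # Hypothetical final event; the frontend would render the footer from it.
    yield sse_event(
        {"sources": sources, "chunks_found": chunks_found}, event="sources"
    )
```

In the Flask route, a generator like this would be what gets passed to Response(generator, mimetype='text/event-stream'), with tokens coming from the streaming Gemini call.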
Files likely affected:
app.py (chat route)
backend/rag_utils.py (LLMIntegration.generate_answer becomes a generator)
frontend/src/components/ChatInterface.js
frontend/src/utils/api.js
Acceptance criteria:
- First token visible in the browser within ~500 ms of sending the question.
- Sources and chunks_found still render correctly after the stream completes.
- Aborting the request (user navigates away) closes the upstream Gemini connection.
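One possible way to meet the abort criterion is a try/finally inside the generator. PEP 3333 requires a WSGI server to call close() on the response iterable when the request ends, and closing a Python generator runs its finally block. The sketch below demonstrates only that mechanism. FakeUpstream and relay_tokens are hypothetical names, and it is an assumption that the real Gemini streaming response exposes a close() method or an equivalent way to cancel.

```python
from typing import Iterable, Iterator


class FakeUpstream:
    """Stand-in for a streaming Gemini response (hypothetical, illustration only)."""

    def __init__(self, tokens: Iterable[str]) -> None:
        self._tokens = iter(tokens)
        self.closed = False

    def __iter__(self) -> Iterator[str]:
        return self._tokens

    def close(self) -> None:
        self.closed = True


def relay_tokens(upstream) -> Iterator[str]:
    """Yield tokens from upstream and always release it when iteration stops."""
    try:
        for token in upstream:
            yield token
    finally:
        # Runs on normal completion and when the generator is closed early,
        # for example after the client disconnects mid-stream.
        upstream.close()


upstream = FakeUpstream(["a", "b", "c"])
gen = relay_tokens(upstream)
first = next(gen)  # the client received one token
gen.close()        # simulates the server closing the response iterable
```

Servers usually notice a disconnect only when the next write fails, so the upstream connection may stay open until one more token arrives. That behavior depends on the WSGI server in use and is worth checking manually.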