diff --git a/docs/configuration.md b/docs/configuration.md
index 6f04882..2d128a7 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -156,4 +156,4 @@ Common variables (see `app.rag.*` defaults in `application.properties`):
 - `RAG_TOP_K`
 - `RAG_RETURN_K`
 - `RAG_CITATIONS_K`
-- `RAG_RERANKER_TIMEOUT` (default `12s`) — timeout budget for LLM reranking calls
+- `RAG_RERANKER_TIMEOUT` (default `30s`) — timeout budget for LLM reranking calls
diff --git a/docs/retrieval-pipeline.md b/docs/retrieval-pipeline.md
index cccd94a..6dd6aae 100644
--- a/docs/retrieval-pipeline.md
+++ b/docs/retrieval-pipeline.md
@@ -172,7 +172,7 @@ Each document is presented as `[index] title | url` followed by the first 500 ch
 
 - Model: same provider as chat (OpenAI/GitHub Models)
 - Temperature: `0.0` (deterministic)
-- Timeout: configurable via `app.rag.reranker-timeout` (default 12s)
+- Timeout: configurable via `app.rag.reranker-timeout` (default 30s)
 - Response format: `{"order": [0, 3, 1, 2, ...]}` (0-based indices)
 
 ### Response parsing
@@ -292,7 +292,7 @@ All four must be non-blank and distinct (validated on startup).
 |---|---|---|---|
 | `app.rag.search-top-k` | `12` | Must be > 0 | Candidates fetched from hybrid search before reranking |
 | `app.rag.search-return-k` | `6` | Must be > 0, must be <= `search-top-k` | Results returned to the LLM after reranking |
-| `app.rag.reranker-timeout` | `12s` | Must be positive | Timeout for LLM reranking call |
+| `app.rag.reranker-timeout` | `30s` | Must be positive | Timeout for LLM reranking call |
 | `app.rag.search-citations` | `3` | Must be >= 0 | Citation references included in the response |
 | `app.rag.search-mmr-lambda` | `0.5` | Must be in [0.0, 1.0] | MMR lambda (higher = relevance, lower = diversity) |
 | `app.rag.chunk-max-tokens` | `900` | Must be > 0 | Max tokens per ingested chunk (see [ingestion.md](ingestion.md)) |