Conversation
Introduces `codesearch explain <symbol>`, which combines impact analysis with an LLM to produce a structured explanation of a symbol's purpose, data flow, and business requirements.

How it works:
- Runs BFS impact analysis (same engine as `impact`) to build the call graph up to a configurable depth (default: 3).
- Resolves the root symbol's source file via its callees, then reads a 40-line window around each reference site directly from disk.
- Sends a structured Markdown prompt — root source + depth-1 caller sources + summary of deeper nodes — to the configured LLM backend.
- Supports both Anthropic (/v1/messages) and OpenAI-compatible (/v1/chat/completions, e.g. LM Studio) endpoints via `--llm`.
- Opens the database read-only (no write lock) so it can run alongside concurrent indexing or search processes.

New files:
- src/connector/api/controller/explain_controller.rs

Changed files:
- src/cli/mod.rs — Explain command variant
- src/connector/api/controller/mod.rs — export ExplainController
- src/connector/api/router.rs — route Commands::Explain
- src/main.rs — read-only flag for Explain

Usage:
  codesearch explain authenticate
  codesearch explain MyStruct::new --depth 5 --llm open-ai
  codesearch explain process_payment --repository my-repo

https://claude.ai/code/session_01494XuGs5Ez5SvRxtaN8dFy
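The `--llm` dispatch between the two endpoint styles mentioned above can be pictured as a small match; this is a hedged sketch (the `LlmTarget` name anticipates the rename later in this PR, and `endpoint_path` is an illustrative helper, not the actual API):

```rust
// Which provider an LLM call goes to; both backends are described in the
// commit message above. `OpenAi` covers OpenAI-compatible servers like
// LM Studio.
enum LlmTarget {
    Anthropic,
    OpenAi,
}

// Illustrative helper: map the target to the request path the commit
// message names for each backend.
fn endpoint_path(target: &LlmTarget) -> &'static str {
    match target {
        LlmTarget::Anthropic => "/v1/messages",
        LlmTarget::OpenAi => "/v1/chat/completions",
    }
}
```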
Add a new trait method for retrieving all chunks for a given file path without performing a similarity search. Useful for snippet-lookup use cases (e.g. TUI reference navigation).

- Trait default: no-op returning an empty vec for backwards compatibility
- DuckDB adapter: SQL query with optional repository_id filter, ordered by start_line
- InMemory adapter: HashMap filter with matching sort
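The backwards-compatible default plus the in-memory filter can be sketched as follows; `Chunk`, the signature, and the adapter struct are simplified stand-ins for the real types, not the project's actual definitions:

```rust
// Simplified stand-in for the real chunk type.
#[derive(Clone, Debug, PartialEq)]
struct Chunk {
    file_path: String,
    start_line: u32,
}

trait VectorRepository {
    // Default implementation is a no-op returning an empty vec, so
    // existing adapters keep compiling without changes.
    fn find_chunks_by_file(&self, _repository_id: Option<&str>, _file_path: &str) -> Vec<Chunk> {
        Vec::new()
    }
}

// Sketch of the in-memory adapter's override: filter by path, then sort
// to match the DuckDB adapter's ORDER BY start_line contract.
struct InMemory {
    chunks: Vec<Chunk>,
}

impl VectorRepository for InMemory {
    fn find_chunks_by_file(&self, _repository_id: Option<&str>, file_path: &str) -> Vec<Chunk> {
        let mut out: Vec<Chunk> = self
            .chunks
            .iter()
            .filter(|c| c.file_path == file_path)
            .cloned()
            .collect();
        out.sort_by_key(|c| c.start_line);
        out
    }
}
```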
Align method ordering in DuckdbVectorRepository impl block with the convention established in claude/fix-duckdb-tui-state-scA0C: flush and count precede find_chunks_by_file.
The enum controls which LLM provider is used for any LLM call (query expansion, explain command, etc.), so the narrower name was misleading.

- cli: QueryExpansionTarget → LlmTarget
- container: query_expansion_target field → llm_target
- main: expand_query_target field → llm_target
- lib: update re-export
- explain_controller: update import and match arms
Drop the max_depth cap so the call graph walk runs until all reachable callers are visited. Removes the --depth flag from the impact and explain CLI commands and the depth field from the MCP ImpactToolInput schema.
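An uncapped walk of this shape can be sketched as a plain BFS that terminates only via its visited set; the adjacency map stands in for the real call-graph queries, and all names here are illustrative:

```rust
use std::collections::{HashMap, HashSet, VecDeque};

// Walk the caller graph from `root` until every reachable caller has been
// visited, recording each caller with the depth at which it was found.
// There is no max_depth check anymore: only the visited set stops the walk.
fn collect_callers(root: &str, callers_of: &HashMap<&str, Vec<&str>>) -> Vec<(String, usize)> {
    let mut visited: HashSet<&str> = HashSet::from([root]);
    let mut queue: VecDeque<(&str, usize)> = VecDeque::from([(root, 0)]);
    let mut out = Vec::new();
    while let Some((sym, depth)) = queue.pop_front() {
        for &caller in callers_of.get(sym).map(|v| v.as_slice()).unwrap_or(&[]) {
            if visited.insert(caller) {
                out.push((caller.to_string(), depth + 1));
                queue.push_back((caller, depth + 1));
            }
        }
    }
    out
}
```

The visited set is what makes the unbounded traversal safe on cyclic call graphs: recursion loops are visited once and never re-enqueued.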
Replace the flat depth-1-only snippet approach with path-based source gathering.

reconstruct_paths() traces each leaf back to the root via via_symbol links (same algorithm as ImpactController), producing one Vec<&ImpactNode> per call chain, ordered outermost-caller-first.

build_prompt() then:
- Collects unique symbols across all paths and reads a source window for each (capped at MAX_UNIQUE_SYMBOLS_WITH_SOURCE=20 to bound prompt size; no depth limit is applied to the traversal itself).
- Renders every path as a chain header (A → B → … → root_symbol) with an inline source block per node, giving the LLM full context for each call chain rather than just the first five depth-1 callers.
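The leaf-to-root trace can be sketched as below; this is an assumption-laden sketch, not the controller's code: `ImpactNode` is simplified, and it assumes `via_symbol` names the node one level shallower through which this node was discovered:

```rust
// Simplified impact node: `via_symbol` is the symbol this node was reached
// through during the caller-graph walk.
#[derive(Clone)]
struct ImpactNode {
    symbol: String,
    depth: usize,
    via_symbol: String,
}

// Trace one leaf back toward the root by repeatedly looking up the node
// one depth shallower that matches the current node's via_symbol. The
// returned chain starts at the leaf (outermost caller); the root symbol
// itself is appended when the chain header is rendered.
fn reconstruct_path<'a>(leaf: &'a ImpactNode, nodes: &'a [ImpactNode]) -> Vec<&'a ImpactNode> {
    let mut chain = vec![leaf];
    let mut current = leaf;
    while current.depth > 1 {
        match nodes
            .iter()
            .find(|n| n.depth == current.depth - 1 && n.symbol == current.via_symbol)
        {
            Some(parent) => {
                chain.push(parent);
                current = parent;
            }
            None => break, // broken link: keep the partial chain
        }
    }
    chain
}
```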
Read source for every unique symbol across all call paths with no artificial limit. The exploration is now fully unbounded end to end.
📝 Walkthrough

The PR introduces a file-scoped chunk lookup capability via a new VectorRepository method, removes depth constraints from impact analysis traversal, renames the LLM-provider enum for generality, adds a new Explain command with an accompanying ExplainController for LLM-based call-flow analysis, and removes depth parameters from the CLI Impact command and related method signatures.
Sequence Diagram

```mermaid
sequenceDiagram
    participant CLI as CLI/Router
    participant EC as ExplainController
    participant IA as ImpactAnalysis
    participant VR as VectorRepository
    participant LLM as LLM Service
    CLI->>EC: explain(symbol, repo, llm_target)
    EC->>IA: analyze(symbol, repo)
    IA->>VR: query for symbol calls
    VR-->>IA: return call graph
    IA-->>EC: return ImpactAnalysis
    alt No callers found
        EC-->>CLI: return "no callers" message
    else Callers found
        EC->>VR: find_chunks_by_file(repo, file)
        VR-->>EC: return code chunks
        EC->>EC: reconstruct_paths(impact_nodes)
        EC->>EC: build_prompt(symbol, paths, sources)
        EC->>LLM: query with prompt + system context
        LLM-->>EC: return explanation
        EC->>EC: format with metrics
        EC-->>CLI: return formatted explanation
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 inconclusive
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/connector/api/controller/explain_controller.rs`:
- Around lines 139-143: The node_by_depth_symbol map (and the similar map at
lines ~199-213) uses (node.depth, node.symbol) keys, which can collide across
files/repos. Expand the key with a unique source identifier (e.g.
node.source_id, node.file_path, repo_id, or node.id) so keys become
(node.depth, node.symbol.as_str(), node.source_id.as_str()) or an equivalent
unique field. Update the HashMap type signatures and all .entry(...) calls
and lookups (both where node_by_depth_symbol is built and in the later map at
~199-213) to use the expanded tuple key, so path reconstruction and snippet
lookup use the full identity rather than symbol+depth only.
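The collision the comment describes can be demonstrated in isolation; this sketch is illustrative (tuples stand in for the real node type, and file_path is used as one of the unique fields the comment suggests):

```rust
use std::collections::HashMap;

// Build two indexes over (depth, symbol, file_path) triples: one keyed the
// narrow way the comment flags, one with the widened key. Later inserts
// under the narrow key silently overwrite earlier same-named entries.
fn index_nodes<'a>(
    nodes: &'a [(usize, &'a str, &'a str)],
) -> (
    HashMap<(usize, &'a str), &'a str>,
    HashMap<(usize, &'a str, &'a str), &'a str>,
) {
    let mut narrow = HashMap::new();
    let mut wide = HashMap::new();
    for &(depth, symbol, file) in nodes {
        narrow.insert((depth, symbol), file); // collides across files
        wide.insert((depth, symbol, file), file); // unique per file
    }
    (narrow, wide)
}
```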
- Around lines 75-79: The call to
call_graph.find_callees(...).await.unwrap_or_default() in explain_controller.rs
swallows errors. Capture the Result, log the error with tracing::warn! or
tracing::error! (including the error value and context such as
analysis.root_symbol and cg_query), and only then fall back to an empty Vec.
Locate the invocation around the `callees` variable (the call_graph.find_callees
call) and replace the unwrap_or_default() with explicit error handling (e.g.
match or if let Err(e) = ...) that logs the error before assigning the default.
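The shape of the suggested fix can be sketched dependency-free; here eprintln! stands in for tracing::warn!, and find_callees is a stub with a Result signature assumed to resemble the real call-graph query:

```rust
// Stub standing in for the real call-graph lookup.
fn find_callees(symbol: &str) -> Result<Vec<String>, String> {
    if symbol.is_empty() {
        Err("empty symbol".into())
    } else {
        Ok(vec!["callee".into()])
    }
}

// Previously: find_callees(..).unwrap_or_default() discarded the error.
// Now the error is logged with context before falling back to an empty Vec.
fn callees_with_logging(root_symbol: &str) -> Vec<String> {
    match find_callees(root_symbol) {
        Ok(callees) => callees,
        Err(e) => {
            eprintln!("find_callees failed for {root_symbol}: {e}");
            Vec::new()
        }
    }
}
```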
- Around lines 252-256: Clamp the computed center index to the valid range
before slicing. Replace the existing `let center = center_line.saturating_sub(1)
as usize` with a value clamped to `lines.len().saturating_sub(1)` so that
`center <= last_index`; then recompute `start` and `end` as before, and add an
early return (e.g. None) when `start >= end` to avoid slicing with
out-of-range bounds in the function that builds the source window (the block
using `center`, `half`, `start`, `end`, and `lines[start..end].join("\n")`).
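Applied to a standalone version of the window logic, the clamp looks like this; the function is a sketch following the variable names in the comment, not the controller's actual code:

```rust
// Build a `window`-line source window around a 1-based `center_line`.
// Returns None for empty input or a degenerate range.
fn source_window(lines: &[&str], center_line: u32, window: usize) -> Option<String> {
    if lines.is_empty() {
        return None;
    }
    // Clamp so a stored line number past end-of-file cannot index out of
    // range when the file on disk is shorter than the indexed version.
    let center = (center_line.saturating_sub(1) as usize).min(lines.len().saturating_sub(1));
    let half = window / 2;
    let start = center.saturating_sub(half);
    let end = (center + half).min(lines.len());
    if start >= end {
        return None; // early return instead of slicing an empty/invalid range
    }
    Some(lines[start..end].join("\n"))
}
```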
ℹ️ Review info
- Configuration used: defaults
- Review profile: CHILL
- Plan: Pro
- Run ID: fd915bd3-c7e4-43de-bab1-d2203c0313a0
📒 Files selected for processing (13)
- src/application/interfaces/vector_repository.rs
- src/application/use_cases/impact_analysis.rs
- src/cli/mod.rs
- src/connector/adapter/duckdb_vector_repository.rs
- src/connector/adapter/in_memory_vector_repository.rs
- src/connector/adapter/mcp/server.rs
- src/connector/api/container.rs
- src/connector/api/controller/explain_controller.rs
- src/connector/api/controller/impact_controller.rs
- src/connector/api/controller/mod.rs
- src/connector/api/router.rs
- src/lib.rs
- src/main.rs
💤 Files with no reviewable changes (1)
- src/application/use_cases/impact_analysis.rs
…urce window

- Expand the node_by_depth_symbol key from (depth, symbol) to (depth, symbol, repository_id) so same-named symbols at the same depth in different repositories no longer clobber each other.
- Expand source_cache and dedup seen-set keys from symbol to (symbol, file_path) for the same reason.
- Replace unwrap_or_default() on find_callees with an explicit match that logs the error via tracing::warn! before falling back to an empty Vec.
- Clamp the center index to lines.len()-1 in read_source_window and add an early None return when start >= end to prevent an out-of-range slice panic when a stored line number exceeds the actual file length.