feat: llm exploration by ArtemisMucaj · Pull Request #105 · ArtemisMucaj/codesearch

ArtemisMucaj · 2026-03-05T16:10:53Z

Summary by CodeRabbit

Release Notes

New Features
- Added "Explain" command to analyze code symbols with AI-powered insights (supports Anthropic and OpenAI backend selection)
- Added ability to retrieve code chunks by file path

Improvements
- Impact analysis now performs unrestricted traversal without depth limitations
- Simplified Impact command by removing the depth parameter

Introduces `codesearch explain <symbol>`, which combines impact analysis with an LLM to produce a structured explanation of a symbol's purpose, data flow, and business requirements.

How it works:
- Runs BFS impact analysis (same engine as `impact`) to build the call graph up to a configurable depth (default: 3).
- Resolves the root symbol's source file via its callees, then reads a 40-line window around each reference site directly from disk.
- Sends a structured Markdown prompt (root source + depth-1 caller sources + summary of deeper nodes) to the configured LLM backend.
- Supports both Anthropic (/v1/messages) and OpenAI-compatible (/v1/chat/completions, e.g. LM Studio) endpoints via `--llm`.
- Opens the database read-only (no write lock) so it can run alongside concurrent indexing or search processes.

New files:
- src/connector/api/controller/explain_controller.rs

Changed files:
- src/cli/mod.rs: Explain command variant
- src/connector/api/controller/mod.rs: export ExplainController
- src/connector/api/router.rs: route Commands::Explain
- src/main.rs: read-only flag for Explain

Usage:
    codesearch explain authenticate
    codesearch explain MyStruct::new --depth 5 --llm open-ai
    codesearch explain process_payment --repository my-repo

https://claude.ai/code/session_01494XuGs5Ez5SvRxtaN8dFy

Add a new trait method for retrieving all chunks for a given file path without performing a similarity search. Useful for snippet-lookup use cases (e.g. TUI reference navigation).
- Trait default: no-op returning empty vec for backwards compatibility
- DuckDB adapter: SQL query with optional repository_id filter, ordered by start_line
- InMemory adapter: HashMap filter with matching sort
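To make the default-method approach concrete, here is a minimal sketch of how such a trait method and an in-memory adapter could look. The `CodeChunk` fields, the synchronous signature, the parameter order, and `LegacyRepository` are assumptions made for illustration; the actual trait in `vector_repository.rs` may well be async and return a `Result`.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified chunk type; the real one likely carries more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct CodeChunk {
    pub repository_id: String,
    pub file_path: String,
    pub start_line: u32,
    pub content: String,
}

pub trait VectorRepository {
    /// Default implementation is a no-op returning an empty vec, so adapters
    /// that predate the method keep compiling unchanged.
    fn find_chunks_by_file(
        &self,
        _file_path: &str,
        _repository_id: Option<&str>,
    ) -> Vec<CodeChunk> {
        Vec::new()
    }
}

/// In-memory adapter; the map key is a hypothetical chunk id.
#[derive(Default)]
pub struct InMemoryVectorRepository {
    pub chunks: HashMap<String, CodeChunk>,
}

impl VectorRepository for InMemoryVectorRepository {
    fn find_chunks_by_file(
        &self,
        file_path: &str,
        repository_id: Option<&str>,
    ) -> Vec<CodeChunk> {
        let mut found: Vec<CodeChunk> = self
            .chunks
            .values()
            .filter(|c| c.file_path == file_path)
            // With no repository filter, every repository matches.
            .filter(|c| repository_id.map_or(true, |r| c.repository_id == r))
            .cloned()
            .collect();
        // HashMap iteration order is arbitrary, so sort to mirror the
        // ORDER BY start_line of the DuckDB adapter.
        found.sort_by_key(|c| c.start_line);
        found
    }
}

/// An adapter that does not override the method and inherits the no-op.
pub struct LegacyRepository;
impl VectorRepository for LegacyRepository {}
```

Calling `find_chunks_by_file` on `LegacyRepository` returns an empty vec, which is what keeps the change backwards compatible for existing implementors.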

Align method ordering in DuckdbVectorRepository impl block with the convention established in claude/fix-duckdb-tui-state-scA0C: flush and count precede find_chunks_by_file.

The enum controls which LLM provider is used for any LLM call (query expansion, explain command, etc.), so the narrower name was misleading.
- cli: QueryExpansionTarget → LlmTarget
- container: query_expansion_target field → llm_target
- main: expand_query_target field → llm_target
- lib: update re-export
- explain_controller: update import and match arms
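As an illustration of why a provider-wide name fits, the sketch below maps each provider to the endpoint path named in the first commit message. The variant names, `endpoint_path`, and `endpoint_url` are hypothetical; only the enum name `LlmTarget` and the two paths come from this PR.

```rust
/// Provider selection shared by every LLM call site.
/// Variant names are assumptions; the PR only confirms the enum name.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LlmTarget {
    Anthropic,
    OpenAi,
}

impl LlmTarget {
    /// Endpoint path for each backend, as listed in the PR description.
    pub fn endpoint_path(self) -> &'static str {
        match self {
            LlmTarget::Anthropic => "/v1/messages",
            LlmTarget::OpenAi => "/v1/chat/completions",
        }
    }

    /// Joins a base URL and the endpoint path without doubling the slash.
    pub fn endpoint_url(self, base_url: &str) -> String {
        format!("{}{}", base_url.trim_end_matches('/'), self.endpoint_path())
    }
}
```

Because query expansion and `explain` would both match on the same enum, adding a backend means touching one type rather than one type per feature.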

Drop the max_depth cap so the call graph walk runs until all reachable callers are visited. Removes the --depth flag from the impact and explain CLI commands and the depth field from the MCP ImpactToolInput schema.
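A traversal without a depth cap still has to terminate on cyclic call graphs. The sketch below shows one way to do that with a visited set; it is not the code from `impact_analysis.rs`. The `callers_of` map is a hypothetical stand-in for the call graph store, and `ImpactNode` is reduced to the fields needed here.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Simplified node: `via_symbol` is the callee through which this caller was reached.
#[derive(Debug, Clone, PartialEq)]
pub struct ImpactNode {
    pub symbol: String,
    pub depth: usize,
    pub via_symbol: Option<String>,
}

/// Breadth-first walk over callers with no depth limit.
/// `callers_of` maps a symbol to its direct callers (assumed input shape).
pub fn analyze(root: &str, callers_of: &HashMap<String, Vec<String>>) -> Vec<ImpactNode> {
    let mut visited: HashSet<String> = HashSet::new();
    visited.insert(root.to_string());

    let mut queue: VecDeque<(String, usize)> = VecDeque::new();
    queue.push_back((root.to_string(), 0));

    let mut nodes = Vec::new();
    while let Some((current, depth)) = queue.pop_front() {
        if let Some(callers) = callers_of.get(&current) {
            for caller in callers {
                // `insert` returns false for symbols already seen, which is
                // what guarantees termination once the depth cap is gone.
                if visited.insert(caller.clone()) {
                    nodes.push(ImpactNode {
                        symbol: caller.clone(),
                        depth: depth + 1,
                        via_symbol: Some(current.clone()),
                    });
                    queue.push_back((caller.clone(), depth + 1));
                }
            }
        }
    }
    nodes
}
```

With the cap removed, the cost of a run is bounded by the number of reachable callers rather than by a flag, so very widely used symbols produce correspondingly large results.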

Replace the flat depth-1-only snippet approach with path-based source gathering. reconstruct_paths() traces each leaf back to the root via via_symbol links (same algorithm as ImpactController), producing one Vec<&ImpactNode> per call chain ordered outermost-caller-first.

build_prompt() then:
- Collects unique symbols across all paths and reads a source window for each (capped at MAX_UNIQUE_SYMBOLS_WITH_SOURCE=20 to bound prompt size; no depth limit is applied to the traversal itself).
- Renders every path as a chain header (A → B → … → root_symbol) with an inline source block per node, giving the LLM full context for each call chain rather than just the first five depth-1 callers.
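The path reconstruction described above can be sketched roughly as follows. This is an approximation written from the commit message, not the code in `explain_controller.rs`: the `ImpactNode` fields are simplified, and the leaf test (no deeper node was reached through it) is an assumption about how leaves are identified.

```rust
use std::collections::{HashMap, HashSet};

/// Simplified node: `via_symbol` is the callee through which this caller was reached.
#[derive(Debug, Clone, PartialEq)]
pub struct ImpactNode {
    pub symbol: String,
    pub depth: usize,
    pub via_symbol: Option<String>,
}

/// Returns one chain per leaf, ordered outermost caller first and ending at
/// the depth-1 caller of the root symbol.
pub fn reconstruct_paths(nodes: &[ImpactNode]) -> Vec<Vec<&ImpactNode>> {
    // Lookup table from (depth, symbol) to the node itself.
    let by_key: HashMap<(usize, &str), &ImpactNode> = nodes
        .iter()
        .map(|n| ((n.depth, n.symbol.as_str()), n))
        .collect();

    // Every (depth, symbol) that some deeper node was reached through.
    let has_caller: HashSet<(usize, &str)> = nodes
        .iter()
        .filter_map(|n| {
            n.via_symbol
                .as_deref()
                .map(|via| (n.depth.saturating_sub(1), via))
        })
        .collect();

    let mut paths = Vec::new();
    for leaf in nodes
        .iter()
        .filter(|n| !has_caller.contains(&(n.depth, n.symbol.as_str())))
    {
        let mut path = vec![leaf];
        let mut current = leaf;
        // Follow via_symbol links one level at a time until depth 1.
        while current.depth > 1 {
            let parent = current
                .via_symbol
                .as_deref()
                .and_then(|via| by_key.get(&(current.depth - 1, via)));
            match parent {
                Some(p) => {
                    path.push(*p);
                    current = *p;
                }
                None => break, // broken link: keep the partial chain
            }
        }
        paths.push(path);
    }
    paths
}
```

Each returned chain lists callers only; a prompt builder would append the root symbol when rendering the `A → B → … → root_symbol` header.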

Read source for every unique symbol across all call paths with no artificial limit. The exploration is now fully unbounded end to end.

coderabbitai · 2026-03-05T16:11:14Z

📝 Walkthrough

The PR introduces a file-scoped chunk lookup capability via a new VectorRepository method, removes depth constraints from impact analysis traversal, renames the LLM-provider enum for generality, adds a new Explain command with an accompanying ExplainController for LLM-based call-flow analysis, and removes depth parameters from the CLI Impact command and related method signatures.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Vector Repository Interface & Adapters `src/application/interfaces/vector_repository.rs`, `src/connector/adapter/duckdb_vector_repository.rs`, `src/connector/adapter/in_memory_vector_repository.rs` | Added new `find_chunks_by_file` method to retrieve code chunks filtered by file path. DuckDB adapter uses conditional SQL queries based on repository_id presence; in-memory adapter filters and sorts by start_line. |
| Impact Analysis & Traversal `src/application/use_cases/impact_analysis.rs`, `src/connector/adapter/mcp/server.rs`, `src/connector/api/controller/impact_controller.rs` | Removed `max_depth` parameter from analyze method signature and all call sites, eliminating traversal depth limits. Depth field also removed from ImpactToolInput struct. |
| CLI & Command Definitions `src/cli/mod.rs`, `src/lib.rs`, `src/main.rs` | Renamed `QueryExpansionTarget` enum to `LlmTarget`; removed `depth` field from Impact command; added new Explain command variant with symbol, repository, and llm fields; updated public re-exports. |
| LLM Configuration & Container `src/connector/api/container.rs` | Updated ContainerConfig field from `query_expansion_target` to `llm_target` and adjusted routing logic to match renamed enum variants. |
| Explain Feature `src/connector/api/controller/explain_controller.rs`, `src/connector/api/controller/mod.rs` | New ExplainController (257 lines) implementing call-flow analysis via LLM. Includes impact analysis, path reconstruction, source window extraction, and prompt construction with call path details. Also introduced helper functions for path traversal and prompt building. |
| Router & Command Dispatch `src/connector/api/router.rs` | Added ExplainController field to Router; integrated new Commands::Explain branch to route explain requests; removed depth argument from Impact command dispatch. |

Sequence Diagram

sequenceDiagram
    participant CLI as CLI/Router
    participant EC as ExplainController
    participant IA as ImpactAnalysis
    participant VR as VectorRepository
    participant LLM as LLM Service
    
    CLI->>EC: explain(symbol, repo, llm_target)
    EC->>IA: analyze(symbol, repo)
    IA->>VR: query for symbol calls
    VR-->>IA: return call graph
    IA-->>EC: return ImpactAnalysis
    
    alt No callers found
        EC-->>CLI: return "no callers" message
    else Callers found
        EC->>VR: find_chunks_by_file(repo, file)
        VR-->>EC: return code chunks
        EC->>EC: reconstruct_paths(impact_nodes)
        EC->>EC: build_prompt(symbol, paths, sources)
        EC->>LLM: query with prompt + system context
        LLM-->>EC: return explanation
        EC->>EC: format with metrics
        EC-->>CLI: return formatted explanation
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

PR #87: Modifies impact analysis traversal logic in src/application/use_cases/impact_analysis.rs (augments ImpactNode with line and via_symbol fields) — directly related to the depth-removal and traversal changes.
PR #99: Updates CLI enum from QueryExpansionTarget to a more generic target and modifies ContainerConfig — shares the same enum rename and field refactoring pattern.
PR #48: Adds initial max_depth parameter to ImpactAnalysisUseCase::analyze and modifies VectorRepository — directly superseded by the depth-removal in this PR.

Poem

🐰 A new Explain sprouts forth with paths so clear,
Depth limits shed, traversal roams without fear,
LlmTarget's name more broadly does apply,
Chunks by file now fetchable on high,
The call-flow whispers secrets to the sky! 🌙

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Title check | ❓ Inconclusive | The title 'feat: llm exploration' is vague and generic, using non-descriptive terms that don't convey the specific changes made in the pull request. | Consider using a more descriptive title that captures the main feature being added, such as 'feat: add explain controller for LLM-based code analysis' or 'feat: introduce symbol explanation with LLM integration'. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/connector/api/controller/explain_controller.rs`:
- Around line 139-143: The current maps (node_by_depth_symbol and the similar
map at lines ~199-213) use keys of (node.depth, node.symbol) which can collide
across files/repos; change the key to include a unique source identifier (e.g.
node.source_id, node.file_path, repo_id or node.id) so keys become (node.depth,
node.symbol.as_str(), node.source_id.as_str()) or equivalent unique field;
update the HashMap type signatures and all .entry(...) calls and lookups (both
where node_by_depth_symbol is built and in the later map at ~199-213) to use
that expanded tuple key so path reconstruction and snippet lookup use the full
identity rather than symbol+depth only.
- Around line 75-79: The call to
call_graph.find_callees(...).await.unwrap_or_default() in explain_controller.rs
is swallowing errors; change it to capture the Result, log the error with
tracing::warn! or tracing::error! (including the error value and context like
analysis.root_symbol and cg_query) and then fallback to an empty Vec only after
logging; locate the invocation around the variable callees (the
call_graph.find_callees call) and replace the unwrap_or_default() with explicit
error handling (e.g., match or if let Err(e) = ...) that logs the error before
assigning the default.
- Around line 252-256: Clamp the computed center index to the valid range before
slicing: when computing `center` from `center_line` (the existing `let center =
center_line.saturating_sub(1) as usize`), replace that with a clamped value
using `lines.len().saturating_sub(1)` to ensure `center <= last_index`; then
recompute `start` and `end` (as you already do) and add an early-return check
(e.g., return None) if `start >= end` to avoid slicing with out-of-range bounds
in the function that builds the source window (the block using `center`, `half`,
`start`, `end`, and `lines[start..end].join("\n")`).
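The third finding, clamping the window before slicing, can be sketched as a pure function over already-loaded text. The function name, the `window` parameter, and the choice to take a `&str` instead of reading from disk are assumptions for illustration; only the `center`, `half`, `start`, `end` arithmetic follows the review comment.

```rust
/// Returns up to `window` lines of `source` around the 1-based `center_line`.
/// Yields None for empty input or when the computed range is empty
/// (which, in this sketch, includes any `window` below 2).
pub fn source_window(source: &str, center_line: u32, window: usize) -> Option<String> {
    let lines: Vec<&str> = source.lines().collect();
    if lines.is_empty() {
        return None;
    }

    // Clamp the 0-based center to the last valid index so a stale line
    // number larger than the file cannot push `start` past `end`.
    let last_index = lines.len() - 1;
    let center = (center_line.saturating_sub(1) as usize).min(last_index);

    let half = window / 2;
    let start = center.saturating_sub(half);
    let end = (center + half).min(lines.len());

    // Early return instead of slicing with an empty or inverted range.
    if start >= end {
        return None;
    }
    Some(lines[start..end].join("\n"))
}
```

With the clamp in place, a stored line number of 999 against a five-line file returns the tail of the file rather than panicking on an out-of-range slice.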

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fd915bd3-c7e4-43de-bab1-d2203c0313a0

📥 Commits

Reviewing files that changed from the base of the PR and between 8230ffd and 635ee6e.

📒 Files selected for processing (13)

src/application/interfaces/vector_repository.rs
src/application/use_cases/impact_analysis.rs
src/cli/mod.rs
src/connector/adapter/duckdb_vector_repository.rs
src/connector/adapter/in_memory_vector_repository.rs
src/connector/adapter/mcp/server.rs
src/connector/api/container.rs
src/connector/api/controller/explain_controller.rs
src/connector/api/controller/impact_controller.rs
src/connector/api/controller/mod.rs
src/connector/api/router.rs
src/lib.rs
src/main.rs

💤 Files with no reviewable changes (1)

src/application/use_cases/impact_analysis.rs

src/connector/api/controller/explain_controller.rs

…urce window
- Expand node_by_depth_symbol key from (depth, symbol) to (depth, symbol, repository_id) so the same symbol name at the same depth in different repositories no longer clobbers each other.
- Expand source_cache and dedup seen-set keys from symbol to (symbol, file_path) for the same reason.
- Replace unwrap_or_default() on find_callees with explicit match that logs the error via tracing::warn! before falling back to an empty Vec.
- Clamp center index to lines.len()-1 in read_source_window and add an early None return when start >= end to prevent an out-of-range slice panic when a stored line number exceeds the actual file length.
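The key expansion in the first two bullets can be illustrated with a small sketch. The struct fields and the helper names `index_nodes` and `unique_sources` are hypothetical; only the key shapes, (depth, symbol, repository_id) and (symbol, file_path), come from the commit message.

```rust
use std::collections::{HashMap, HashSet};

/// Simplified node with just the fields that take part in the keys.
#[derive(Debug, Clone, PartialEq)]
pub struct ImpactNode {
    pub symbol: String,
    pub depth: usize,
    pub repository_id: String,
    pub file_path: String,
}

/// Index keyed by (depth, symbol, repository_id): two repositories defining
/// the same symbol name at the same depth get separate entries.
pub fn index_nodes(nodes: &[ImpactNode]) -> HashMap<(usize, &str, &str), &ImpactNode> {
    nodes
        .iter()
        .map(|n| ((n.depth, n.symbol.as_str(), n.repository_id.as_str()), n))
        .collect()
}

/// Deduplicates source lookups on (symbol, file_path), preserving first-seen order.
pub fn unique_sources(nodes: &[ImpactNode]) -> Vec<(&str, &str)> {
    let mut seen = HashSet::new();
    nodes
        .iter()
        .map(|n| (n.symbol.as_str(), n.file_path.as_str()))
        // `insert` returns false for pairs already seen, dropping repeats.
        .filter(|key| seen.insert(*key))
        .collect()
}
```

With a (depth, symbol) key, a `handle` function in two repositories would collapse into a single map entry and one of the call chains would silently point at the wrong node.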

claude added 7 commits March 5, 2026 09:15

refactor: reorder find_chunks_by_file after flush and count

8dd2492


feat: remove depth limit from BFS traversal in impact analysis

daf599f


refactor: remove unique-symbol source cap in explain prompt builder

635ee6e


coderabbitai bot reviewed Mar 5, 2026


feat: llm exploration #105

ArtemisMucaj wants to merge 8 commits into main from claude/llm-onnx-exploration-LWsBv
