@Cprakhar Cprakhar commented Jan 20, 2026

Summary

Convert embedded LLM file citations (e.g. @file:{github.com/owner/repo::path/to/file.ext:10-20})
into portable, clickable URLs that point to a file on the originating code host.

This ensures that links emitted by the model resolve for anyone reading the copied Markdown, rather than pointing at local paths.
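
As a rough illustration of the citation format described above, the sketch below pulls the repo, path, and optional line range out of `@file:{...}` references. The `FileCitation` shape, the `parseFileCitations` name, and the regular expression are assumptions made for this example; the actual parsing in `utils.ts` may differ.

```typescript
// Hypothetical shape for a parsed citation (not taken from utils.ts).
interface FileCitation {
    repo: string;                  // e.g. "github.com/owner/repo"
    path: string;                  // e.g. "path/to/file.ext"
    startLine: number | undefined;
    endLine: number | undefined;   // equals startLine for single-line citations
}

// Assumed grammar: @file:{<repo>::<path>} with an optional
// :<start> or :<start>-<end> suffix before the closing brace.
function parseFileCitations(text: string): FileCitation[] {
    const pattern = /@file:\{([^}:]+)::([^}]+?)(?::(\d+)(?:-(\d+))?)?\}/g;
    const citations: FileCitation[] = [];
    let m: RegExpExecArray | null;
    while ((m = pattern.exec(text)) !== null) {
        const repo = m[1];
        const path = m[2];
        if (repo === undefined || path === undefined) continue;
        const startLine = m[3] !== undefined ? Number(m[3]) : undefined;
        const endLine = m[4] !== undefined ? Number(m[4]) : startLine;
        citations.push({ repo, path, startLine, endLine });
    }
    return citations;
}
```

A citation without a line suffix yields `startLine` and `endLine` of `undefined`, so a caller can decide whether to append an anchor at all.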

Closes #576

What I changed

  • Updated link conversion logic in packages/web/src/features/chat/utils.ts
    • Added buildCodeHostFileUrl() to construct file URLs for GitHub, GitLab, Bitbucket, Azure DevOps, Gitea, Gerrit and generic git hosts.
    • convertLLMOutputToPortableMarkdown() now accepts an optional sources array so it can use the indexed revision for a file when available, falling back to main.
  • Added/updated tests in packages/web/src/features/chat/utils.test.ts covering multiple hosts, line anchors, ranges, .md/.mdx handling, and branch resolution from sources.

Why

LLM responses embed file references for traceability. Previously those produced local paths (e.g., /path/to/file), which are not portable when copying as Markdown. This change converts them into full remote URLs matching the code host's expected blob/src format and preserves line anchors.

Notes on platform support

Supported hosts and behaviors:

  • GitHub (github.com and GH Enterprise): .../blob/{branch}/{path}#L{n} (adds ?plain=1 for raw markdown views where appropriate)
  • GitLab: .../-/blob/{branch}/{path}#L{n}
  • Bitbucket: .../src/{branch}/{path}#lines-{n}
  • Azure DevOps: .../_git/{repo}?path=/{path}&version=GB{branch}&line={n}
  • Gerrit (Gitiles): .../plugins/gitiles/{repo}/+/refs/heads/{branch}/{path}#{n}
  • Gitea and generic git hosts: fallback URL formats
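
To make the path-style formats in the list above concrete, here is a minimal sketch covering GitHub, GitLab, and Bitbucket only. The `FileRef` shape and the `sketchFileUrl` and `encodePath` names are hypothetical, line ranges are omitted, and the real `buildCodeHostFileUrl()` has a different signature and covers more hosts.

```typescript
type PathStyleHost = "github" | "gitlab" | "bitbucket";

// Hypothetical input shape for this example only.
interface FileRef {
    host: string;       // e.g. "github.com"
    ownerRepo: string;  // e.g. "owner/repo"
    path: string;       // repo-relative file path
    revision: string;   // branch name, e.g. "main"
    line?: number;      // optional single-line anchor
}

// Percent-encode each segment while keeping "/" separators.
function encodePath(value: string): string {
    return value.split("/").map((segment) => encodeURIComponent(segment)).join("/");
}

function sketchFileUrl(kind: PathStyleHost, ref: FileRef): string {
    const base = `https://${ref.host}/${encodePath(ref.ownerRepo)}`;
    const rev = encodePath(ref.revision);
    const path = encodePath(ref.path);
    switch (kind) {
        case "github":
            return `${base}/blob/${rev}/${path}` + (ref.line ? `#L${ref.line}` : "");
        case "gitlab":
            return `${base}/-/blob/${rev}/${path}` + (ref.line ? `#L${ref.line}` : "");
        case "bitbucket":
            return `${base}/src/${rev}/${path}` + (ref.line ? `#lines-${ref.line}` : "");
    }
}
```

Azure DevOps is left out here because it carries the path and branch in query parameters rather than in the URL path, which calls for different encoding.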

If a file source (from the LLM retrieval metadata) is provided, the code will prefer the revision from that source to construct the URL instead of defaulting to main.
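
The revision preference described above could look roughly like the following. `FileSourceLike` is a trimmed-down, hypothetical stand-in for the real `FileSource` type, and `resolveRevision` is a name invented for this sketch.

```typescript
// Hypothetical subset of the fields a file source might carry.
interface FileSourceLike {
    repo: string;
    path: string;
    revision?: string;
}

// Prefer the revision recorded on a matching source; otherwise fall back to "main".
function resolveRevision(
    repo: string,
    path: string,
    sources: FileSourceLike[] = [],
    fallback: string = "main",
): string {
    for (const source of sources) {
        if (source.repo === repo && source.path === path && source.revision) {
            return source.revision;
        }
    }
    return fallback;
}
```

Matching on both repo and path keeps a revision from one repository from being applied to a same-named file in another.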

Summary by CodeRabbit

  • New Features

    • Answer cards can now include and pass along file-based sources so file references are resolved more accurately.
    • Markdown conversion now resolves file references to full code-host URLs across many providers and honors revisions and line/range info.
  • Tests

    • Extensive tests added for link generation and markdown conversion across multiple code hosts and edge cases.

coderabbitai bot commented Jan 20, 2026

Walkthrough

Adds generation of absolute code-host file URLs for file references in assistant answers: new buildCodeHostFileUrl helper, convertLLMOutputToPortableMarkdown now accepts optional sources to resolve revisions, and AnswerCard components propagate sources so copied Markdown uses absolute links.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Component prop threading**<br>`packages/web/src/features/chat/components/chatThread/answerCard.tsx`, `packages/web/src/features/chat/components/chatThread/chatThreadListItem.tsx` | Added `sources?: FileSource[]` prop to AnswerCard, passed filtered file sources from chatThreadListItem, and updated copy logic to call `convertLLMOutputToPortableMarkdown(answerText, { sources })`. |
| **Core utility enhancements**<br>`packages/web/src/features/chat/utils.ts` | Added `buildCodeHostFileUrl(repo, fileName, revision, startLine?, endLine?)` and extended `convertLLMOutputToPortableMarkdown(text, { sources? })` to resolve file references to absolute code-host URLs using provided sources and inferred revisions. |
| **Tests and exports**<br>`packages/web/src/features/chat/utils.test.ts` | Exported and exercised `convertLLMOutputToPortableMarkdown` and `buildCodeHostFileUrl`; added comprehensive tests covering multiple code hosts, branches, line/range handling, encoding, and edge cases. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant User as User
  participant UI as AnswerCard (UI)
  participant Utils as convertLLMOutputToPortableMarkdown
  participant Builder as buildCodeHostFileUrl
  participant Host as CodeHost

  User->>UI: Click "Copy answer"
  UI->>Utils: convertLLMOutputToPortableMarkdown(answerText, { sources })
  Utils->>Builder: buildCodeHostFileUrl(repo, filePath, revision, start, end)
  Builder->>Host: Construct URL for provider (GitHub/GitLab/...)
  Host-->>Builder: URL string
  Builder-->>Utils: resolved URL(s)
  Utils-->>UI: portable Markdown with absolute links
  UI-->>User: clipboard contains converted Markdown

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • brendan-kellam
  • msukkari
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title clearly and accurately describes the main change: converting embedded file links to full code-host URLs. |
| Linked Issues check | ✅ Passed | Changes fully implement issue #576 objectives: file links are resolved to absolute URLs with host-specific formats, line references preserved, and robust across multiple code hosts. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to implementing absolute URL resolution for embedded file links; no unrelated modifications detected. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

@brendan-kellam
Contributor

@coderabbitai review

coderabbitai bot commented Jan 20, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@packages/web/src/features/chat/utils.ts`:
- Around line 403-412: Normalize the referenced file path before matching and
URL construction: strip any leading slashes (or run a POSIX normalize) on
fileName before using it to find matchingSource in options?.sources and before
calling buildCodeHostFileUrl so revision lookup doesn’t miss entries when paths
include a leading slash; update the use of fileName in the matching logic and
the url call (references: matchingSource, options?.sources, fileName, repo,
revision, buildCodeHostFileUrl).
- Around line 260-365: The buildCodeHostFileUrl function constructs URLs by
interpolating raw filePath, revision and ownerRepo which can break for spaces or
special chars; update buildCodeHostFileUrl to percent-encode path and revision
components before interpolation (e.g., encode each segment of filePath and
ownerRepo via split('/').map(encodeURIComponent).join('/'), and encode revision
with encodeURIComponent) and ensure query parameter values used for Azure DevOps
(version, path, line, lineEnd) are encoded with encodeURIComponent; keep
line/anchor logic (startLine/endLine) intact but do not percent-encode the
leading “#L” or “:”-style anchors themselves.
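
The two suggestions above (strip leading slashes before matching, and percent-encode each path segment) can be sketched as follows. Both helper names are hypothetical and are not taken from the PR.

```typescript
// Strip any leading slashes so "/src/a.ts" and "src/a.ts" compare equal
// when looking up a matching source.
function normalizeFilePath(fileName: string): string {
    return fileName.replace(/^\/+/, "");
}

// Percent-encode each segment individually so that "/" separators survive
// while spaces, "#", "?" and similar characters are escaped.
function encodePathSegments(path: string): string {
    return path
        .split("/")
        .map((segment) => encodeURIComponent(segment))
        .join("/");
}

// Example: build the path portion of a blob URL from a raw file name.
function toUrlPath(fileName: string): string {
    return encodePathSegments(normalizeFilePath(fileName));
}
```

Line anchors such as `#L10` would be appended after this step so that the `#` that introduces the fragment is not itself encoded.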

@Cprakhar
Author

@coderabbitai review

coderabbitai bot commented Jan 22, 2026

✅ Actions performed

Review triggered.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@packages/web/src/features/chat/utils.ts`:
- Around line 357-362: The Gerrit URL generation currently always prefixes the
revision with "refs/heads/", which breaks when the revision is a commit SHA or a
tag; update the host.includes('gerrit') branch in the URL builder to inspect
encodedRevision and choose the correct path: if encodedRevision matches a commit
SHA (e.g. 40 hex chars) use `/+/{encodedRevision}/{encodedFilePath}`, if it
represents a tag (e.g. starts with `refs/tags/` or otherwise detected as a tag)
use `/+/refs/tags/{tag}/{encodedFilePath}`, otherwise use
`/+/refs/heads/{encodedRevision}/{encodedFilePath}`; preserve the existing
startLine fragment logic (append `#${startLine}` when startLine is present) and
update any variable names (encodedRevision, encodedFilePath, startLine)
accordingly.
- Around line 330-347: The Azure DevOps branch/file query parameters are being
built using encodePathComponent (which preserves slashes), so values like
encodedFilePath and encodedRevision produce unencoded '/' in query params;
update the code that builds url (the Azure DevOps branch in the
host.includes('dev.azure.com') || host.includes('visualstudio.com') block) to
use full URL encoding for query parameter values (e.g., use encodeURIComponent
on the file path and revision instead of encodePathComponent) so that
encodedFilePath and encodedRevision become safe for query strings (slashes
become %2F) and then rebuild the url variable accordingly, preserving the
existing org/project/repoName assembly and the startLine/endLine logic.

Comment on lines +330 to +347
} else if (host.includes('dev.azure.com') || host.includes('visualstudio.com')) {
    // Azure DevOps Cloud and Server
    const repoParts = ownerRepo.split('/');
    if (repoParts.length >= 3) {
        const org = encodeURIComponent(repoParts[0]);
        const project = encodeURIComponent(repoParts[1]);
        const repoName = repoParts.slice(2).map(encodeURIComponent).join('/');
        // For Azure DevOps, encode the path preserving forward slashes
        url = `https://${host}/${org}/${project}/_git/${repoName}?path=/${encodedFilePath}&version=GB${encodedRevision}`;
        if (startLine) {
            url += `&line=${startLine}`;
            if (endLine && startLine !== endLine) {
                url += `&lineEnd=${endLine}`;
            }
        }
    } else {
        return fileName;
    }

⚠️ Potential issue | 🟠 Major

Query parameter values need full URL encoding, not path encoding.

The encodedFilePath and encodedRevision use encodePathComponent() which preserves / characters. However, for URL query parameter values, slashes must be encoded as %2F. Branch names like feature/my-branch will produce version=GBfeature/my-branch instead of the correct version=GBfeature%2Fmy-branch, which Azure DevOps may misinterpret.

🔧 Suggested fix
         } else if (host.includes('dev.azure.com') || host.includes('visualstudio.com')) {
             // Azure DevOps Cloud and Server
             const repoParts = ownerRepo.split('/');
             if (repoParts.length >= 3) {
                 const org = encodeURIComponent(repoParts[0]);
                 const project = encodeURIComponent(repoParts[1]);
                 const repoName = repoParts.slice(2).map(encodeURIComponent).join('/');
-                // For Azure DevOps, encode the path preserving forward slashes
-                url = `https://${host}/${org}/${project}/_git/${repoName}?path=/${encodedFilePath}&version=GB${encodedRevision}`;
+                // For Azure DevOps query params, use full URI encoding (slashes must be encoded)
+                const pathParam = encodeURIComponent('/' + filePath);
+                const versionParam = encodeURIComponent(revision);
+                url = `https://${host}/${org}/${project}/_git/${repoName}?path=${pathParam}&version=GB${versionParam}`;
                 if (startLine) {
                     url += `&line=${startLine}`;
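
For reference, a standalone version of the suggested query-parameter encoding might look like this. `azureFileUrl` and its flat parameter list are assumptions made for the sketch; in the PR this logic lives inside `buildCodeHostFileUrl`.

```typescript
// Build an Azure DevOps file URL, fully encoding query parameter values
// (so "/" becomes %2F) while path segments are encoded individually.
function azureFileUrl(
    host: string,
    org: string,
    project: string,
    repoName: string,
    filePath: string,
    branch: string,
    startLine?: number,
    endLine?: number,
): string {
    const pathParam = encodeURIComponent("/" + filePath);
    const versionParam = encodeURIComponent(branch);
    let url =
        `https://${host}/${encodeURIComponent(org)}/${encodeURIComponent(project)}` +
        `/_git/${encodeURIComponent(repoName)}?path=${pathParam}&version=GB${versionParam}`;
    if (startLine) {
        url += `&line=${startLine}`;
        if (endLine && startLine !== endLine) {
            url += `&lineEnd=${endLine}`;
        }
    }
    return url;
}
```

With a branch such as `feature/my-branch`, the `version` parameter becomes `GBfeature%2Fmy-branch`, matching the form the review comment asks for.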

Comment on lines +357 to +362
} else if (host.includes('gerrit')) {
    // Gerrit Self-Hosted
    url = `https://${host}/plugins/gitiles/${encodedOwnerRepo}/+/refs/heads/${encodedRevision}/${encodedFilePath}`;
    if (startLine) {
        url += `#${startLine}`;
    }

⚠️ Potential issue | 🟡 Minor

Gerrit URL assumes revision is always a branch.

The URL format hardcodes refs/heads/ which only works for branch names. If revision is a commit SHA (e.g., from a matched source), the generated URL will be invalid. Gerrit/Gitiles expects /+/{sha}/{path} for commits and /+/refs/tags/{tag}/{path} for tags.

Consider detecting the revision type or documenting this limitation.

🔧 Potential enhancement
         } else if (host.includes('gerrit')) {
             // Gerrit Self-Hosted
-            url = `https://${host}/plugins/gitiles/${encodedOwnerRepo}/+/refs/heads/${encodedRevision}/${encodedFilePath}`;
+            // Detect if revision looks like a commit SHA (40 hex chars) vs branch name
+            const isSha = /^[0-9a-f]{40}$/i.test(revision);
+            const refPath = isSha ? encodedRevision : `refs/heads/${encodedRevision}`;
+            url = `https://${host}/plugins/gitiles/${encodedOwnerRepo}/+/${refPath}/${encodedFilePath}`;
             if (startLine) {
                 url += `#${startLine}`;
             }
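
A slightly fuller sketch of the revision detection, also passing through refs that are already fully qualified (for example `refs/tags/v1.0`), is shown below. The helper names are hypothetical. A bare tag name cannot be told apart from a branch name by inspection alone, so this version treats any unqualified name as a branch.

```typescript
// Percent-encode each segment while keeping "/" separators.
function encodeSegments(value: string): string {
    return value.split("/").map((segment) => encodeURIComponent(segment)).join("/");
}

// Map a revision to the ref portion of a Gitiles URL.
function gitilesRef(revision: string): string {
    if (/^[0-9a-f]{40}$/i.test(revision)) {
        return revision; // full commit SHA: used as-is
    }
    if (/^refs\//.test(revision)) {
        return encodeSegments(revision); // already qualified, e.g. refs/tags/v1.0
    }
    return `refs/heads/${encodeSegments(revision)}`; // assume a branch name
}

function gitilesFileUrl(
    host: string,
    repo: string,
    revision: string,
    filePath: string,
    startLine?: number,
): string {
    let url =
        `https://${host}/plugins/gitiles/${encodeSegments(repo)}` +
        `/+/${gitilesRef(revision)}/${encodeSegments(filePath)}`;
    if (startLine) {
        url += `#${startLine}`;
    }
    return url;
}
```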

@brendan-kellam brendan-kellam self-requested a review January 22, 2026 19:32
Development

Successfully merging this pull request may close these issues.

[FR] Resolve links embedded in Ask responses to URLs rather than relative paths