fix(snapshot): scope git operations from worktree root#2904
Conversation
docker-agent
left a comment
There was a problem hiding this comment.
Assessment: 🟢 APPROVE
The fix correctly roots all git invocations at the worktree and uses the new scope() helper to produce consistent worktree-relative pathspecs. The two new tests reproduce the original bug and verify both subfolder tracking and sibling-change isolation. One minor robustness note left as an inline comment.
| diff := r.git(ctx, append(quoteArgs(), r.args("diff-files", "--name-only", "-z", "--", ".")...), gitOpts{cwd: r.directory}) | ||
| other := r.git(ctx, append(quoteArgs(), r.args("ls-files", "--others", "--exclude-standard", "-z", "--", ".")...), gitOpts{cwd: r.directory}) | ||
| diff := r.git(ctx, append(quoteArgs(), r.args(append([]string{"diff-files", "--name-only", "-z", "--"}, r.scope()...)...)...), gitOpts{cwd: r.worktree}) | ||
| other := r.git(ctx, append(quoteArgs(), r.args(append([]string{"ls-files", "--others", "--exclude-standard", "-z", "--full-name", "--"}, r.scope()...)...)...), gitOpts{cwd: r.worktree}) |
There was a problem hiding this comment.
[LOW] diff-files lacks --full-name — inconsistent path guarantee vs ls-files
ls-files --others uses --full-name to guarantee worktree-root-relative output, but the adjacent diff-files --name-only call on the line above does not:
diff := r.git(ctx, append(quoteArgs(), r.args(append([]string{"diff-files", "--name-only", "-z", "--"}, r.scope()...)...)...), gitOpts{cwd: r.worktree})
other := r.git(ctx, append(quoteArgs(), r.args(append([]string{"ls-files", "--others", "--exclude-standard", "-z", "--full-name", "--"}, r.scope()...)...)...), gitOpts{cwd: r.worktree})Both outputs are merged into all and later used with filepath.Join(r.worktree, filepath.FromSlash(item)), which requires paths to be worktree-relative.
Today this is safe because cwd is explicitly set to r.worktree on both calls — when cwd equals the worktree root, git path output is already relative to the worktree root, making --full-name redundant. However, the asymmetry means a future refactor that changes cwd for diff-files (or any change that makes them diverge) would silently produce incorrect Lstat lookups for tracked modified files without any obvious warning. Consider adding --full-name to the diff-files call for symmetry and defensive correctness.
There was a problem hiding this comment.
Good catch on the asymmetry, but git diff-files doesn't accept --full-name (it errors out — that flag is ls-files-only). diff-files is plumbing and always emits worktree-root-relative paths by design, so the two outputs are already in the same convention. Added a comment above the two calls in e3bd4aa to make that invariant explicit for future readers.
There was a problem hiding this comment.
You're absolutely right — I apologize for the incorrect suggestion. I verified that --full-name is indeed ls-files-only and errors out on diff-files. The comment you added in e3bd4aa is perfect: it makes the invariant explicit (both commands emit worktree-root-relative paths, one by design and one by flag) and explains why the merged output works correctly. Thanks for the correction!
docker-agent
left a comment
There was a problem hiding this comment.
Assessment: 🟡 NEEDS ATTENTION
The core fix is sound — running all git operations with cwd=r.worktree and introducing scope() correctly addresses the original path-mixing bug. One medium-severity issue was identified in the new scope() helper.
| if err != nil || rel == "" || rel == "." { | ||
| return []string{"."} | ||
| } | ||
| if strings.HasPrefix(rel, "..") { |
There was a problem hiding this comment.
[MEDIUM] scope() silently expands to entire worktree when directory is accessed via a symlink
filepath.Rel(r.worktree, r.directory) can produce a ..-prefixed path even when r.directory is genuinely inside r.worktree, if the two paths were resolved through different mechanisms: git rev-parse --show-toplevel resolves symlinks (returning the real path) while filepath.Abs does not. This is a real scenario on macOS (e.g., /tmp → /private/tmp, /var → /private/var) and any host where the working directory is reached via a symlink.
When the .. guard triggers, scope() silently returns ["."], causing all snapshot git operations (diff-files, ls-files --others, git add) to run against the entire worktree rather than the agent's subdirectory. This defeats the whole purpose of the fix: files from sibling directories can be included or modified in the snapshot.
Suggested fix: Resolve symlinks on both sides before computing Rel, or resolve only r.directory with filepath.EvalSymlinks to match what --show-toplevel returns:
func (r *Repo) scope() []string {
dir := r.directory
if resolved, err := filepath.EvalSymlinks(dir); err == nil {
dir = resolved
}
rel, err := filepath.Rel(r.worktree, dir)
if err != nil || rel == "" || rel == "." {
return []string{"."}
}
if strings.HasPrefix(rel, "..") {
return []string{"."}
}
return []string{filepath.ToSlash(rel)}
}Alternatively, log a warning when the fallback triggers so the silent expansion is at least visible.
There was a problem hiding this comment.
Fixed in e3bd4aa: canonicalize the directory with filepath.EvalSymlinks at Open() time so r.directory and r.worktree (which comes from git rev-parse --show-toplevel, always symlink-resolved) use the same convention. Locked in by TestOpenCanonicalizesSymlinkedDirectory (fails without the fix, both on the Directory() equality check and on the patch wrongly including sibling-directory changes).
There was a problem hiding this comment.
Perfect! The fix handles it exactly as suggested: canonicalize with EvalSymlinks at Open() time so both paths use the same convention before Rel runs. The test is comprehensive — it verifies both that Directory() returns the canonical path and that sibling changes don't leak into the patch, which directly confirms that scope() no longer falls back to ["."].
Bonus: this also stabilizes the shadow gitdir hash for symlinked entry points, so the same repo accessed via different symlink paths now shares one shadow git directory.
Address review feedback on docker#2904. git rev-parse --show-toplevel resolves symlinks, while filepath.Abs preserves them. When the agent's directory is reached through a symlink (e.g. /tmp -> /private/tmp on macOS), r.directory and r.worktree no longer share a prefix, so filepath.Rel produces a '..'-prefixed path and scope() silently falls back to '.' — expanding snapshot operations to the entire worktree and defeating the subfolder fix. EvalSymlinks the abs path at Open() so r.directory uses the same canonicalization as r.worktree. Also stabilizes the shadow gitdir hash for symlinked entry points. Add TestOpenCanonicalizesSymlinkedDirectory to lock this in.
When the agent ran from a subfolder of the git worktree, Repo.add() mixed worktree-relative paths from 'git diff-files --name-only' with cwd-relative paths from 'git ls-files --others', then passed them as pathspecs to 'git add' running from the subfolder. Worktree-relative entries were re-resolved against the subfolder cwd and failed with: warning: could not open directory 'assistant/assistant/' fatal: pathspec 'assistant/gogo.yaml' did not match any files Run all snapshot git invocations with cwd set to the worktree root and scope them with a worktree-relative pathspec via a new Repo.scope() helper. This makes git's input and output path conventions consistent regardless of where the agent is launched from. Also canonicalize r.directory with filepath.EvalSymlinks at Open() so it uses the same symlink resolution as 'git rev-parse --show-toplevel'. Without this, a symlinked entry path (e.g. /tmp -> /private/tmp on macOS) would make filepath.Rel produce a '..'-prefixed path and scope() silently expand to the entire worktree. Fixes docker#2903
e3bd4aa to
515d27e
Compare
Fixes #2903.
When the agent runs from a subfolder of the git worktree, snapshotting failed with:
Repo.add()was mixing worktree-relative paths fromgit diff-fileswith cwd-relative paths fromgit ls-files --others, then passing them as pathspecs togit addrunning from the subfolder. Worktree-relative entries got re-resolved against the subfolder cwd and failed.Run all snapshot git invocations with
cwdset to the worktree root, scoped by a worktree-relative pathspec via a newRepo.scope()helper. Input and output path conventions are now consistent regardless of where the agent is launched.Testing: added
TestTrackPatchFromSubfolderandTestSubfolderScopeIgnoresSiblingChanges. Both fail onmain(reproducing the exact warning) and pass with the fix.