Skip to content

add NormalizePath to ensure IG paths are always full#723

Open
matthyx wants to merge 1 commit intomainfrom
fix/normalize-path
Open

add NormalizePath to ensure IG paths are always full#723
matthyx wants to merge 1 commit intomainfrom
fix/normalize-path

Conversation

@matthyx
Copy link
Contributor

@matthyx matthyx commented Feb 19, 2026

Fixes #721

Summary by CodeRabbit

  • Bug Fixes

    • Paths are now normalized consistently across system operations, ensuring proper handling of absolute paths, relative paths, and special process directory references with correct slash formatting.
  • Tests

    • Added comprehensive test suite validating path normalization across multiple scenarios, including empty paths, relative paths, and edge cases.

@coderabbitai
Copy link

coderabbitai bot commented Feb 19, 2026

📝 Walkthrough

Walkthrough

Three files introduce a NormalizePath utility function and integrate it into five datasource event getters. The function normalizes paths by prepending "/proc" to headless process paths, ensuring absolute paths, and cleaning redundant separators. This directly addresses missing root nodes in path captures.

Changes

Cohort / File(s) Summary
Path Normalization Utility
pkg/utils/path.go
New utility function NormalizePath that restores missing "/proc" prefixes for headless paths (matching `^/\d+/(task
DatasourceEvent Path Getters
pkg/utils/datasource_event.go
Five getter methods (GetExePath, GetFullPath, GetNewPath, GetOldPath, GetPath) now wrap return values with NormalizePath to normalize paths before returning.
Path Normalization Tests
pkg/utils/normalize_path_test.go
Comprehensive table-driven test suite covering empty paths, ".", absolute/relative paths, headless /proc paths, and path cleaning edge cases with assertion validation.

Sequence Diagram(s)

This section is skipped because the changes are primarily a utility function addition with straightforward integration into existing getters without significant new control flow or multi-component interactions.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~20 minutes

Poem

🐰 A path was lost, its root mislaid,
/proc had vanished without aid,
But now with NormalizePath's might,
Headless paths are made complete and right,
Tested thoroughly, clean and bright! 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'add NormalizePath to ensure IG paths are always full' directly and clearly describes the main change: introducing a NormalizePath function to convert relative/headless paths to absolute full paths.
Linked Issues check ✅ Passed The PR successfully implements the requirements from issue #721 by introducing NormalizePath function to convert headless paths (missing /proc prefix) and relative paths (empty or '.') to absolute full paths for OPEN events.
Out of Scope Changes check ✅ Passed All changes are directly related to the PR objective: NormalizePath implementation, its application to DatasourceEvent getters, and comprehensive test coverage for the normalization logic.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/normalize-path

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
pkg/utils/path.go (2)

21-23: Redundant special case for ".".

The "." early-return is dead logic. Without it, "." proceeds through: no headless-proc match → !strings.HasPrefix(".", "/")p = "/."path.Clean("/.")"/". The result is identical. The special case can be safely removed.

♻️ Proposed simplification
-	if p == "." {
-		return "/"
-	}
-
 	if headlessProcRegex.MatchString(p) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/utils/path.go` around lines 21 - 23, Remove the redundant special-case
branch that returns "/" when p == "." in the path normalization logic: the
early-return is dead code because the subsequent logic (the headless-proc match,
the strings.HasPrefix check that adds "/" to p, and the final path.Clean call)
already yields "/" for "."; delete the if p == "." { return "/" } block and
leave the remaining code (including the strings.HasPrefix handling and
path.Clean) intact so behavior is unchanged but simpler.

4-4: Prefer path/filepath over path for OS filesystem paths.

The path package is documented for slash-separated paths in URLs; path/filepath is the correct package for OS filesystem paths. On Linux both are functionally identical, but using path here is non-idiomatic.

♻️ Proposed fix
 import (
-	"path"
+	"path/filepath"
 	"regexp"
 	"strings"
 )
 
 ...
 
-	return path.Clean(p)
+	return filepath.Clean(p)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/utils/path.go` at line 4, The code imports the net/URL-oriented "path"
package for filesystem paths; replace the import with "path/filepath" and update
all usages (e.g., path.Join, path.Clean, path.Dir, path.Base, path.Split, etc.)
to use filepath.Join, filepath.Clean, filepath.Dir, filepath.Base,
filepath.Split accordingly so the code uses the OS-specific filepath behavior;
ensure any tests/imports compile after switching the package name.
pkg/utils/normalize_path_test.go (1)

14-63: Consider adding edge-case test for "/" and explicit documentation of the headless-path false-positive behavior.

Two cases worth covering:

  1. Input "/" — already covered by general logic but having an explicit test guards against regression.
  2. Input like "/1234/fd/socket" (a legitimate app-level path matching the headless regex) — adding this as a documented test case makes it explicit that such paths will be prefixed with /proc, alerting future readers to the known trade-off.
🧪 Suggested additions
+		{
+			name:     "root path",
+			input:    "/",
+			expected: "/",
+		},
+		{
+			name:     "only dotdot relative",
+			input:    "../etc/passwd",
+			expected: "/etc/passwd",
+		},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/utils/normalize_path_test.go` around lines 14 - 63, Add explicit test
cases to the table in pkg/utils/normalize_path_test.go: include an input of "/"
with expected "/" to guard regressions, and include an input like
"/1234/fd/socket" with expected "/proc/1234/fd/socket" (or whatever the current
NormalizePath behavior is) to document the known headless-path false-positive;
update the test table (the slice of test structs used to exercise NormalizePath)
and add a short comment next to the second case explaining this is a documented
trade-off of the headless-path prefixing logic so future readers see this
intended behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/chart/templates/node-agent/daemonset.yaml`:
- Around line 119-125: Render only one of `.value` or `.valueFrom` for each
`nodeAgent.env` entry by making the template mutually exclusive: change the
current independent `if .value` / `if .valueFrom` checks so that `.value` is
rendered only when `.value` is set and `.valueFrom` is not, and `.valueFrom` is
rendered only when `.valueFrom` is set and `.value` is not (or use an if/else-if
structure) to ensure exactly one of the two blocks appears for any env entry.

---

Nitpick comments:
In `@pkg/utils/normalize_path_test.go`:
- Around line 14-63: Add explicit test cases to the table in
pkg/utils/normalize_path_test.go: include an input of "/" with expected "/" to
guard regressions, and include an input like "/1234/fd/socket" with expected
"/proc/1234/fd/socket" (or whatever the current NormalizePath behavior is) to
document the known headless-path false-positive; update the test table (the
slice of test structs used to exercise NormalizePath) and add a short comment
next to the second case explaining this is a documented trade-off of the
headless-path prefixing logic so future readers see this intended behavior.

In `@pkg/utils/path.go`:
- Around line 21-23: Remove the redundant special-case branch that returns "/"
when p == "." in the path normalization logic: the early-return is dead code
because the subsequent logic (the headless-proc match, the strings.HasPrefix
check that adds "/" to p, and the final path.Clean call) already yields "/" for
"."; delete the if p == "." { return "/" } block and leave the remaining code
(including the strings.HasPrefix handling and path.Clean) intact so behavior is
unchanged but simpler.
- Line 4: The code imports the net/URL-oriented "path" package for filesystem
paths; replace the import with "path/filepath" and update all usages (e.g.,
path.Join, path.Clean, path.Dir, path.Base, path.Split, etc.) to use
filepath.Join, filepath.Clean, filepath.Dir, filepath.Base, filepath.Split
accordingly so the code uses the OS-specific filepath behavior; ensure any
tests/imports compile after switching the package name.

Comment on lines +119 to +125
{{- if .value }}
value: "{{ .value }}"
{{- end }}
{{- if .valueFrom }}
valueFrom:
{{ toYaml .valueFrom | indent 16 }}
{{- end }}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Missing mutual exclusion guard between value and valueFrom.

If a nodeAgent.env entry sets both .value and .valueFrom, both blocks will be rendered, producing invalid Kubernetes YAML (the pod spec requires exactly one of value or valueFrom, not both). Kubernetes will reject the resulting pod spec.

🛡️ Proposed guard
-            {{- if .value }}
-              value: "{{ .value }}"
-            {{- end }}
-            {{- if .valueFrom }}
-              valueFrom:
-{{ toYaml .valueFrom | indent 16 }}
-            {{- end }}
+            {{- if and .value (not .valueFrom) }}
+              value: "{{ .value }}"
+            {{- else if .valueFrom }}
+              valueFrom:
+{{ toYaml .valueFrom | indent 16 }}
+            {{- end }}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/chart/templates/node-agent/daemonset.yaml` around lines 119 - 125,
Render only one of `.value` or `.valueFrom` for each `nodeAgent.env` entry by
making the template mutually exclusive: change the current independent `if
.value` / `if .valueFrom` checks so that `.value` is rendered only when `.value`
is set and `.valueFrom` is not, and `.valueFrom` is rendered only when
`.valueFrom` is set and `.value` is not (or use an if/else-if structure) to
ensure exactly one of the two blocks appears for any env entry.

Signed-off-by: Matthias Bertschy <matthias.bertschy@gmail.com>

# Conflicts:
#	pkg/utils/datasource_event.go
#	pkg/utils/struct_event.go
@matthyx matthyx force-pushed the fix/normalize-path branch from 5febde9 to eafe066 Compare March 3, 2026 15:58
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/utils/path.go`:
- Around line 9-31: The headless /proc detection in NormalizePath misses cases
because headlessProcRegex requires a trailing slash and runs before
leading-slash normalization; update headlessProcRegex to match either a trailing
slash or end-of-string (use pattern like ^/\d+/(task|fd)(/|$)) and move the
MatchString check to after the leading-slash normalization in NormalizePath so
inputs like "46/task/46/fd", "/46/fd" and "/46/task" are correctly detected and
have "/proc" prepended.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5febde9 and eafe066.

📒 Files selected for processing (3)
  • pkg/utils/datasource_event.go
  • pkg/utils/normalize_path_test.go
  • pkg/utils/path.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/utils/normalize_path_test.go

Comment on lines +9 to +31
var headlessProcRegex = regexp.MustCompile(`^/\d+/(task|fd)/`)

// NormalizePath normalizes a path by:
// 1. Prepending "/proc" to "headless" proc paths (e.g. /46/task/46/fd -> /proc/46/task/46/fd)
// 2. Ensuring it starts with "/" if it's not empty
// 3. Converting "." to "/"
// 4. Cleaning the path (removing redundant slashes, dot-dots, etc.)
func NormalizePath(p string) string {
if p == "" {
return ""
}

if p == "." {
return "/"
}

if headlessProcRegex.MatchString(p) {
p = "/proc" + p
}

if !strings.HasPrefix(p, "/") {
p = "/" + p
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n pkg/utils/path.go | head -40

Repository: kubescape/node-agent

Length of output: 940


🏁 Script executed:

#!/bin/bash
python - <<'PY'
import re, posixpath

# Current regex in the code
current_regex = re.compile(r"^/\d+/(task|fd)/")

# Proposed regex in the fix
proposed_regex = re.compile(r"^/\d+/(task|fd)(/|$)")

def test_current(p):
    """Simulates current implementation"""
    if p == "": return ""
    if p == ".": return "/"
    if current_regex.match(p):
        p = "/proc" + p
    if not p.startswith("/"):
        p = "/" + p
    return posixpath.normpath(p)

def test_proposed(p):
    """Simulates proposed implementation"""
    if p == "": return ""
    if p == ".": return "/"
    if not p.startswith("/"):
        p = "/" + p
    if proposed_regex.match(p):
        p = "/proc" + p
    return posixpath.normpath(p)

test_cases = [
    "/46/task/46/fd",      # Full headless proc path with content
    "/46/fd",              # Headless proc path without trailing slash
    "/46/task",            # Headless proc path without trailing slash
    "46/task/46/fd",       # Headless path without leading slash
    "/46/task/",           # With trailing slash
    "/46/fd/",             # With trailing slash
    "/normal/path",        # Regular path
    "/123/notask/456",     # Should not match
]

print("Test Case                    | Current Result            | Proposed Result           | Issue")
print("-" * 100)
for tc in test_cases:
    curr = test_current(tc)
    prop = test_proposed(tc)
    issue = "❌ Missing /proc" if curr.startswith("/") and not curr.startswith("/proc") and any(x in tc for x in ["task", "fd"]) and curr == "/" + tc.lstrip("/") else ""
    print(f"{tc:28} | {curr:25} | {prop:25} | {issue}")
PY

Repository: kubescape/node-agent

Length of output: 1021


Headless /proc detection misses valid edge cases due to regex requiring trailing slash.

The regex ^/\d+/(task|fd)/ requires a trailing slash after task or fd, causing valid headless proc paths like /46/fd, /46/task, and 46/task/46/fd (before leading slash normalization) to bypass /proc prepending. This breaks the "always full" guarantee.

Fix by changing the regex to ^/\d+/(task|fd)(/|$) to match paths ending with either "/" or end-of-string, and move the regex check after the leading slash normalization so 46/task/46/fd is correctly recognized after becoming /46/task/46/fd.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/utils/path.go` around lines 9 - 31, The headless /proc detection in
NormalizePath misses cases because headlessProcRegex requires a trailing slash
and runs before leading-slash normalization; update headlessProcRegex to match
either a trailing slash or end-of-string (use pattern like ^/\d+/(task|fd)(/|$))
and move the MatchString check to after the leading-slash normalization in
NormalizePath so inputs like "46/task/46/fd", "/46/fd" and "/46/task" are
correctly detected and have "/proc" prepended.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: WIP

Development

Successfully merging this pull request may close these issues.

Absolute paths partially changed to relative paths for OPEN_calls in Rule0002

1 participant