Skip to content

fix(contrib/trivy): dedup package names and add dup warnings#2491

Open
shino wants to merge 5 commits intomasterfrom
shino/fix-pkg-dup
Open

fix(contrib/trivy): dedup package names and add dup warnings#2491
shino wants to merge 5 commits intomasterfrom
shino/fix-pkg-dup

Conversation

@shino
Copy link
Copy Markdown
Collaborator

@shino shino commented Mar 26, 2026

Problem

On Debian/Ubuntu, Trivy's dpkg analyzer reads both status and /var/lib/dpkg/status.d/*. Because the applier's dedup key includes FilePath, entries from different paths survive deduplication and appear as duplicate packages in Trivy's JSON output.

The previous converter code was also inconsistent in its handling of duplicates: Packages (binary) used last-wins (unconditional overwrite), while SrcPackages used first-wins (only the first occurrence locked in the version, subsequent duplicates only accumulated BinaryNames). Combined with Trivy's sort.Sort — which sorts packages lexicographically by (Name, Version), not by semantic version — this meant the result depended on input order and string sort, silently dropping the correct version in some cases.

This led to false-positive vulnerability detections — for example, a host running openssl 3.5.5-1~deb13u1 (patched) would be reported as running 3.5.4-1~deb13u1 (vulnerable) because the stale entry from status.d/ was picked up by the first-wins SrcPackages logic.

Fix

Replaced the inconsistent first-wins / last-wins with explicit version comparison that always keeps the newer version:

  • Debian/Ubuntu: Uses go-deb-version (github.com/knqyf263/go-deb-version, already a dependency) for proper dpkg version semantics. The newer version always wins regardless of input order.
  • Other OS types: Falls back to lexicographic string comparison for deterministic dedup. (In practice, duplicate packages only occur with dpkg.)

The helper function compareVersions(osType, a, b) int dispatches between dpkg and lexicographic comparison, returning a positive value if a > b, zero if equal, and a negative value if a < b. If dpkg version parsing fails for either operand, it falls back to lexicographic comparison.

For SrcPackages, BinaryNames are accumulated across all duplicates before the version comparison, preserving the complete list of binary packages.

When duplicate packages are detected, a per-package warning is emitted via ScanResult.Warnings listing the package name and all versions seen.

Note: This is a workaround. Ideally Trivy itself should deduplicate packages from status and status.d/.

Changes

  • contrib/trivy/pkg/converter.go:
    • Added compareVersions() helper (returns int) with dpkg vs. lexicographic dispatch
    • Changed pkgs dedup: only overwrite when new entry has a newer or equal version (first occurrence always kept)
    • Changed srcPkgs dedup: same logic, with BinaryNames always accumulated
    • Added per-package duplicate warning via ScanResult.Warnings
  • contrib/trivy/pkg/converter_test.go:
    • Added TestConvert with 5 test cases:
      • No duplicate, single source
      • Duplicate packages, newer wins
      • Duplicate packages reverse order, newer wins
      • Multiple binary packages with duplicates
      • Non-dpkg OS with lexicographic fallback (unrealistic scenario for coverage)

Limitations

This PR deduplicates Packages and SrcPackages by always keeping the newest version, making the converter output deterministic and correct for package metadata. However, it does not address the following issues that stem from Trivy itself emitting duplicate packages:

  • False-positive vulnerabilities remain in ScannedCves. When Trivy detects a CVE against the stale (older) duplicate package from status.d/, that vulnerability entry is passed through to the converter as-is. The converter does not filter vulnerabilities, so the false-positive CVE will still appear in the output.

  • AffectedPackages may reference versions not present in Packages or SrcPackages. Because the converter keeps only the newest version in Packages/SrcPackages but preserves all vulnerability entries from Trivy, a CVE's AffectedPackages.FixedIn may refer to a version older than the installed version — a version that no longer exists in the package maps. This can result in an inconsistent state where a CVE appears to affect a package whose installed version already exceeds the fix version.

The root cause is that Trivy's dpkg analyzer reads both status and status.d/ without deduplication. The recommended fix for affected users is to remove /var/lib/dpkg/status.d/ from their container images when status is present, as status.d/ is intended only for distroless images that lack a full dpkg installation.

@shino shino self-assigned this Mar 26, 2026
@shino shino requested a review from Copilot March 26, 2026 07:18
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates the Trivy JSON-to-Vuls conversion logic to deterministically deduplicate OS packages (notably for Debian/Ubuntu dpkg where Trivy can emit duplicates) by keeping the newer version using dpkg-aware comparison, and adds unit tests covering the duplicate scenarios.

Changes:

  • Add versionGreaterThan() to compare Debian/Ubuntu versions using dpkg semantics (with a lexicographic fallback).
  • Change OS package (Packages / SrcPackages) aggregation to keep the newest version while accumulating BinaryNames for source packages.
  • Add TestConvert table tests covering duplicate ordering and non-dpkg fallback behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
contrib/trivy/pkg/converter.go Deduplicate packages by version comparison (dpkg-aware for Debian/Ubuntu) and accumulate source BinaryNames.
contrib/trivy/pkg/converter_test.go Add conversion tests to validate de-duplication and source/binary aggregation behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@shino shino requested review from MaineK00n and removed request for MaineK00n March 26, 2026 08:04
@shino shino marked this pull request as draft March 26, 2026 08:17
@shino
Copy link
Copy Markdown
Collaborator Author

shino commented Mar 26, 2026

Ooops, detected vulns by trivy remains after trivy-to-vuls... 🤷

@shino shino marked this pull request as ready for review March 26, 2026 09:39
@shino shino requested a review from Copilot March 26, 2026 09:41
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@shino shino requested a review from MaineK00n March 26, 2026 09:48
@shino shino changed the title fix(contrib/trivy): dedup package names, just last-wins fix(contrib/trivy): dedup package names and add dup warnings Mar 26, 2026
dupPkgs[p.Name] = append(versions, pv)
}
}
if compareVersions(trivyResult.Type, pv, existing.Version) >= 0 {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: compareVersions(...) >= 0 means the map entry is overwritten even when the two versions are equal. Changing the condition to > 0 would skip the redundant reassignment and make the intent clearer — we only want to replace when the new version is strictly newer.

Suggested change
if compareVersions(trivyResult.Type, pv, existing.Version) >= 0 {
if compareVersions(trivyResult.Type, pv, existing.Version) > 0 {

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is tricky part (maybe I should have written comment). 🤔

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this make sense? d3e5edb


if existing, ok := srcPkgs[p.SrcName]; ok {
existing.AddBinaryName(p.Name)
if compareVersions(trivyResult.Type, sv, existing.Version) >= 0 {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: compareVersions(...) >= 0 means the map entry is overwritten even when the two versions are equal. Changing the condition to > 0 would skip the redundant reassignment and make the intent clearer — we only want to replace when the new version is strictly newer.

Suggested change
if compareVersions(trivyResult.Type, sv, existing.Version) >= 0 {
if compareVersions(trivyResult.Type, sv, existing.Version) > 0 {

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly speaking, = does not needed here.
I added it to make bin/src collection seems as same as the possible.
But it can be removed. It may be a matter of taste? Which do you prefer?

@shino shino requested a review from MaineK00n March 27, 2026 01:46
@shino shino force-pushed the shino/fix-pkg-dup branch from d3e5edb to 2c4d940 Compare March 27, 2026 04:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants