sbom: add opt-in SBOM generation and SBOM-based vulnerability scanning #27455

Draft
bhouse-nexthop wants to merge 3 commits into sonic-net:master from bhouse-nexthop:bhouse.sbom-generation

@bhouse-nexthop bhouse-nexthop commented May 20, 2026

Why I did it

SONiC currently ships no Software Bill of Materials. There is no inventory of what is in each .bin, no record of the patches and forks SONiC carries on top of upstream sources, no per-artifact vulnerability surface, and no machine-readable provenance. This makes CVE response, license attribution, supply-chain audits (SLSA, SBOM regulations like EO 14028 / CRA), and reproducibility checks expensive or impossible.

This PR adds opt-in SBOM generation and SBOM-based vulnerability scanning. When ENABLE_SBOM=y is passed at build time, every artifact (.bin, .img, .swi, plus the standalone test-container .gzs) gets a CycloneDX 1.6 SBOM sidecar covering Debian packages, Python wheels, language-ecosystem lockfile contents (Rust, Go, npm, pnpm, yarn), Docker image layers, vendor blobs, and SONiC-built sources. The SBOMs preserve fork-of-upstream provenance (e.g. the FRR submodule pinned to a specific upstream tag plus SONiC's patch set) via pedigree.ancestors[], so a CVE in upstream FRR can be tied back to the SONiC build that carries the corresponding patch. A separate set of standalone scripts consume the SBOMs to produce vulnerability reports (grype + OpenVEX suppressions) and reproducibility diffs.
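
As a rough illustration of the pedigree.ancestors[] idea described above, the following Python sketch builds one CycloneDX-style component for a forked package. The helper name make_forked_component, the upstream repository and tag, and the patch path are all hypothetical; this is not the output of scripts/sbom_fragment.py, only the general shape a fork-of-upstream entry could take.

```python
import json


def make_forked_component(name, sonic_version, upstream_repo, upstream_tag, patch_paths):
    """Build a CycloneDX-style component dict for a fork of an upstream project.

    The SONiC-built package is the component itself; the upstream release it
    was forked from is recorded under pedigree.ancestors, and the carried
    patches under pedigree.patches.
    """
    return {
        "type": "library",
        "name": name,
        "version": sonic_version,
        "purl": f"pkg:deb/sonic/{name}@{sonic_version}",
        "pedigree": {
            "ancestors": [
                {
                    "type": "library",
                    "name": name,
                    "version": upstream_tag,
                    # GitHub PURLs are conventionally lowercase.
                    "purl": f"pkg:github/{upstream_repo.lower()}@{upstream_tag}",
                }
            ],
            "patches": [
                {"type": "unofficial", "diff": {"url": path}} for path in patch_paths
            ],
        },
    }


# Hypothetical values, for illustration only.
fragment = make_forked_component(
    name="frr",
    sonic_version="10.5.4-sonic-0",
    upstream_repo="FRRouting/frr",
    upstream_tag="frr-10.5.4",
    patch_paths=["src/sonic-frr/patch/0001-example.patch"],
)
print(json.dumps(fragment, indent=2))
```

A scanner that understands pedigree can then match the ancestor's identity against upstream advisories while the top-level component keeps the SONiC-specific version string.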

The default build path is unchanged. With ENABLE_SBOM unset (the default), all SBOM-related work is short-circuited in slave.mk via a make-level $(if) guard, so non-SBOM builds incur zero overhead — no extra subprocesses, no Python startup cost, no apt-cache calls. The 5 added build dependencies (syft, grype, cyclonedx-cli, plus their installer) are only fetched when SBOM is enabled, through the existing sonic-build-hooks wget shim so versions-web tracking still applies.

The Azure Pipelines side replaces the recently-merged Trivy scans (PR #27079) with the new SBOM-based scanner so CI vulnerability reports come from the same data that ships in the artifact, not a separate re-scan of the image tarball.

Comparison with the trivy-based PRs this would supersede

The Azure Pipelines half of this PR replaces / overlaps with two recent trivy PRs:

Trivy is a perfectly reasonable image scanner. But trivy alone solves only the vulnerability scanning slice of the supply-chain problem, and it does so by re-discovering what's in the artifact at scan time rather than capturing what the build actually put there. Crucially, the SBOM-based approach finds more CVEs, not the same set — because the input components carry richer provenance, the scanner can match them against CVE databases that trivy can't reach from a plain rootfs scan.

| | Trivy (PR #27079, #27322) | SBOM + scanner (this PR) |
|---|---|---|
| CVE database coverage | NVD + GitHub Advisory DB + distro feeds | NVD + GitHub Advisory DB + distro feeds (same coverage at the DB level) |
| CVEs surfaced for unmodified upstream Debian packages | ✓ matches name@version against the trivy DB | ✓ matches against grype's equivalent feeds |
| CVEs surfaced for SONiC-patched / forked upstream packages | ✗ false negatives: trivy keys on name@version. SONiC-built debs carry custom version strings (frr_10.5.4-sonic-0, openssh-server_1:9.2p1-2+fips, etc.) that don't match upstream CVE DB entries. CVEs in those packages are silently missed | ✓ pedigree.ancestors[] records the upstream tag/commit. Grype matches against the upstream identity, and OpenVEX statements (auto-extracted from src/*/patches/) suppress the CVEs that SONiC's carried patches have already fixed. So real unfixed CVEs surface; already-patched CVEs don't generate noise |
| CVEs surfaced for Rust crate transitive deps | ✗ false negatives: rustc embeds no crate info; trivy sees only the wrapping .deb and cannot enumerate crates compiled into the binary. Any CVE in a Rust crate (RustSec advisories, GHSA-Rust entries) is unreachable | ✓ slave Dockerfiles install cargo-auditable and route cargo build through a wrapper, so every Rust binary embeds its full resolved Cargo.lock in a .dep-v0 ELF section. Syft's Rust binary cataloger reads it into the SBOM, and grype CVE-matches every crate. This is the largest CVE-coverage gain in the PR for SONiC's increasingly-Rust-heavy components (sonic-swss, sonic-swss-common, sonic-dash-ha, sonic-host-services, sonic-ctrmgrd-rs, …) |
| CVEs surfaced for build-time deps that don't survive to the runtime FS | ✗ trivy only sees the assembled rootfs | ✓ collect_version_files harvests lockfiles inside each build container before cleanup phases run, capturing deps that were pulled, used, and discarded during the build |
| CVEs surfaced for vendor binary blobs (Broadcom/Mellanox/Marvell SDKs) | partial: depends on whether the blob is identifiable to trivy | ✓ recipe-emit fragments label these with pkg:generic/<vendor>/... PURLs and explicit license metadata. Vendors that publish CVE advisories can be matched; for the rest, the component is at least visible in the inventory for triage |
| False-positive suppression | trivy supports .trivyignore: per-rule, per-line, not signed, detached from the source code that justifies the suppression | OpenVEX v0.2.0 JSON: each suppression carries status, justification, impact_statement, and a reference to the source patch that fixes the CVE. Consumed by grype natively, machine-readable, and the auto-extractor regenerates them from src/*/patches/ on every build (gitignored, can't drift) |
| Reproducibility check | ✗ trivy output drifts as its CVE DB updates | ✓ the SBOM is a deterministic function of source state; scripts/sbom_diff.py does build-vs-build diffs to detect non-deterministic build inputs |
| SLSA / in-toto provenance | | ✓ unsigned SLSA v1.0 in-toto attestation per .bin (subject SHA-256 + materials). Release engineering signs with cosign later |
| PURL precision | generic Debian PURLs only | SONiC-specific PURLs (pkg:deb/sonic/... vs pkg:deb/debian/...), plus pkg:github/, pkg:cargo/, pkg:golang/, pkg:npm/, pkg:oci/, pkg:generic/<vendor>/ |
| License attribution | partial: trivy detects licenses but coverage varies | ✓ DEP-5 debian/copyright parser + licensecheck fallback + 104-entry SPDX header→identifier map. Matches the structured format SONiC actually ships |
| Output for downstream consumers | trivy report (scan artifact only) | CycloneDX 1.6 + SPDX 2.3 + in-toto provenance, shippable alongside the .bin. Customers can run their own scans, do license audits, satisfy EU CRA / EO 14028 reporting |
| CI cost | downloads multi-GB artifacts (PR #27322 pulls sonic-buildimage.<platform> per platform, extracts squashfs, scans every docker-*.gz in parallel; measured in many minutes per platform) | downloads ~5 MB of *.cdx.json sidecars total, scans them in one parallel job. Cheap enough to cover all platforms, not just broadcom + mellanox |
| Build-time overhead with ENABLE_SBOM=n | n/a (trivy runs post-build) | zero: the make-level $(if $(filter y,$(ENABLE_SBOM)),...,:) short-circuit means no Python startup, no apt-cache calls, no extra subprocesses on non-SBOM builds |
| Build-time overhead with ENABLE_SBOM=y | n/a | ~2–3 minutes added to a full broadcom build (measured: 2h33min total including SBOM emission for 3 .bin variants) |
| Scope | docker containers + (PR #27322) host rootfs from squashfs extraction | every shipping artifact: .bin, .img, .swi, plus standalone SBOMs for docker-ptf, docker-ptf-sai, docker-sonic-mgmt |

TL;DR: trivy answers "what CVEs match the packages I can identify by name+version in this rootfs?" SONiC's SBOM tooling answers a strictly larger question — "what CVEs match the components actually built into this artifact, including SONiC-patched forks (with already-fixed CVEs suppressed via VEX), Rust crates embedded in binaries, and build-time deps that were consumed and discarded?" — at lower CI cost, and produces shippable CycloneDX/SPDX/SLSA artifacts as a side effect.

Work item tracking
  • Microsoft ADO (number only):

How I did it

The implementation is split across three commits.

1. SBOM generation

  • Build-flag plumbing: ENABLE_SBOM, SBOM_FORMAT, SBOM_SCAN_TOOL, SBOM_INCLUDE_LICENSES added to rules/config. Makefile.work and build_debian.sh propagate the flag into the slave container and the chroot. prepare_docker_buildinfo.sh injects ENV ENABLE_SBOM into every container's Dockerfile so per-container post-build hooks can branch on it.
  • Recipe-emit fragments: A new sbom_emit_fragment helper in slave.mk is invoked from every recipe site that produces an artifact in scope: SONIC_DPKG_DEBS, SONIC_MAKE_DEBS, SONIC_DERIVED_DEBS, SONIC_EXTRA_DEBS, SONIC_ONLINE_DEBS, SONIC_PYTHON_STDEB_DEBS, SONIC_PYTHON_WHEELS, and the docker-image-save step. The helper is wrapped in a make-level $(if $(filter y,$(ENABLE_SBOM)),...,:) short-circuit so non-SBOM builds never invoke Python. Each recipe produces a small per-component CycloneDX fragment; scripts/sbom_fragment.py detects four fork-of-upstream patterns (sonic-net submodules, dget+patches, nested submodule à la FRR, direct-upstream-submodule + sidecar patches) and emits pedigree.ancestors[] accordingly.
  • Observation harvest: src/sonic-build-hooks/scripts/collect_version_files is extended (gated on ENABLE_SBOM=y) to emit a 9-column TSV per container (Package, Version, Architecture, Source, SourceVersion, Maintainer, Homepage, Filename, SHA256) using one bulk xargs apt-cache show call plus an awk parser — one fork per container rather than one per package — and to snapshot /usr/share/doc/*/copyright plus any language lockfiles (Cargo.lock, go.sum, package-lock.json, pnpm-lock.yaml, yarn.lock) found in the container.
  • Rust crate visibility via cargo-auditable: every slave Dockerfile (sonic-slave-{bookworm,trixie,bullseye,buster}) installs cargo-auditable and drops a wrapper at /usr/local/bin/cargo that routes cargo build through cargo auditable build. The resulting binaries embed the resolved Cargo.lock into a .dep-v0 ELF section. Syft's rust-binary cataloger reads that section at scan time, so the SBOM lists exactly the crates the active feature set actually pulls in — not a Cargo.lock lockfile superset. Zero changes required to any individual debian/rules.
  • Lockfile expansion: scripts/sbom_parse_lockfiles.py parses lockfiles harvested from runtime containers and emits pkg:cargo/, pkg:golang/, pkg:npm/ PURLs so transitive dependencies compiled into the image are visible. Mostly relevant for ecosystems other than Rust (which is already covered precisely by cargo-auditable) and Go (already covered natively by syft).
  • Image-level aggregator: scripts/build_sbom.py merges recipe fragments + observation TSVs + lockfile components + scanner output. Dedupe runs on three keys (PURL, name+version+arch, name+normalized-version+arch); the normalizer strips epoch prefixes (1:) and suffixes (+fips, +sonic, +bN, +debNuM) so e.g. openssh-server@10.0p1-7 and openssh-server@1:10.0p1-7+fips collapse to one component. The aggregator runs once per .bin/.img/.swi, and again in --container mode to produce standalone sidecars for docker-ptf, docker-ptf-sai, and docker-sonic-mgmt.
  • License resolution: scripts/sbom_resolve_licenses.py parses DEP-5 debian/copyright files (the structured format) and falls back to licensecheck heuristic scanning for free-form headers. A bundled scripts/sbom_license_map.json translates 104 common Debian License header strings to SPDX identifiers.
  • Outputs: CycloneDX 1.6 JSON is the canonical format; SPDX 2.3 is produced via cyclonedx-cli convert. scripts/sbom_emit_provenance.py produces an unsigned in-toto/SLSA v1.0 provenance attestation per artifact. scripts/sbom_diff.py is provided for two-build reproducibility checks.
  • Tool fetch: scripts/install_sbom_tool.sh pins syft 1.44.0, grype 0.112.0, and cyclonedx-cli 0.32.0 with SHA-256 checksums for amd64 and arm64, fetched through the existing wget shim so versions-web records the downloads.
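
The version normalization and three-key dedupe described in the image-level aggregator bullet above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in scripts/build_sbom.py: the regular expressions, the function names, and the "first component seen wins" rule are assumptions chosen to mirror the prose, so a caller would pass recipe-emit fragments ahead of observed packages.

```python
import re

# Assumed patterns: a leading Debian epoch ("1:") and the downstream
# suffixes named in the description (+fips, +sonic, +bN, +debNuM).
_EPOCH = re.compile(r"^\d+:")
_SUFFIX = re.compile(r"\+(?:fips|sonic|b\d+|deb\d+u\d+)$")


def normalize_version(version):
    """Strip the epoch prefix and any stacked downstream suffixes."""
    version = _EPOCH.sub("", version)
    while True:
        stripped = _SUFFIX.sub("", version)
        if stripped == version:
            return version
        version = stripped


def dedupe(components):
    """Keep the first component for each PURL, exact, or normalized key."""
    seen = set()
    kept = []
    for comp in components:
        name, version, arch = comp["name"], comp["version"], comp.get("arch", "")
        keys = {
            ("exact", name, version, arch),
            ("normalized", name, normalize_version(version), arch),
        }
        if comp.get("purl"):
            keys.add(("purl", comp["purl"]))
        if keys & seen:
            continue  # an earlier, higher-priority entry already covers this one
        seen |= keys
        kept.append(comp)
    return kept


# Recipe-emit entry first, observed package second (hypothetical data).
merged = dedupe([
    {"name": "openssh-server", "version": "1:10.0p1-7+fips", "arch": "amd64"},
    {"name": "openssh-server", "version": "10.0p1-7", "arch": "amd64"},
])
print(len(merged), merged[0]["version"])
```

Ordering the input by source priority keeps the tie-break rule out of the dedupe loop itself, which is one simple way to let the entry with pedigree data survive.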

2. Vulnerability scanning + VEX

  • Scanner: scripts/sbom_vuln_scan.py runs grype against a CycloneDX SBOM, applies VEX statements under vex/, and emits a CycloneDX VEX-annotated report plus a human-readable table. Policy is configurable (--min-severity, --fail-on, --ignore-unfixed). Standalone Python — not a make target — so engineers and CI can run it against any published SBOM without re-driving the build system.
  • Diff: scripts/sbom_vuln_diff.py shows CVE drift between two scan reports (added / fixed / unchanged).
  • VEX auto-extraction: scripts/sbom_extract_vex_from_patches.py scans src/*/patches/ for CVE markers (CVE-YYYY-NNNN) in patch headers and emits OpenVEX v0.2.0 JSON statements claiming fixed status with the patch as evidence. The output directory vex/auto/ is gitignored; the extractor runs automatically at the start of scripts/build_sbom.sh so the suppression set always reflects the current state of src/*/patches/.
  • Triage workflow: vex/README.md documents the OpenVEX schema, the auto-vs-manual directory split, and the engineer workflow for adding not_affected / false_positive / under_investigation statements with justification and impact_statement fields.
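
To make the VEX auto-extraction step above more concrete, here is a small Python sketch of the general technique: collect CVE identifiers from a patch header and wrap them in an OpenVEX v0.2.0 document with status fixed. It is not the implementation of scripts/sbom_extract_vex_from_patches.py. The function names, the header-boundary heuristic, the use of status_notes to cite the patch, and every sample value (author, document ID, PURL, patch path) are assumptions made for illustration.

```python
import json
import re

CVE_RE = re.compile(r"\bCVE-\d{4}-\d{4,}\b")
# Assumed heuristic: the header ends where the first diff begins.
DIFF_START = ("diff --git ", "--- ", "Index: ")


def cves_in_patch_header(patch_text):
    """Return the sorted, de-duplicated CVE IDs found before the first diff."""
    header = []
    for line in patch_text.splitlines():
        if line.startswith(DIFF_START):
            break
        header.append(line)
    return sorted(set(CVE_RE.findall("\n".join(header))))


def fixed_statement(cve_id, product_purl, patch_path):
    """Build one OpenVEX statement claiming the CVE is fixed by a patch."""
    return {
        "vulnerability": {"name": cve_id},
        "products": [{"@id": product_purl}],
        "status": "fixed",
        "status_notes": f"Fixed by carried patch {patch_path}",
    }


def vex_document(statements, author, timestamp, doc_id):
    """Wrap statements in a minimal OpenVEX v0.2.0 document."""
    return {
        "@context": "https://openvex.dev/ns/v0.2.0",
        "@id": doc_id,
        "author": author,
        "timestamp": timestamp,
        "version": 1,
        "statements": statements,
    }


# Hypothetical patch and identifiers, for illustration only.
sample_patch = """From: Example Author <author@example.invalid>
Subject: [PATCH] Fix overflow (CVE-2024-00001)

---
diff --git a/lib/x.c b/lib/x.c
+/* unrelated mention of CVE-2020-99999 inside the diff body */
"""
statements = [
    fixed_statement(cve, "pkg:deb/sonic/example@1.0", "src/example/patches/0001-fix.patch")
    for cve in cves_in_patch_header(sample_patch)
]
doc = vex_document(
    statements,
    author="example-extractor",
    timestamp="2026-01-01T00:00:00Z",
    doc_id="https://example.invalid/vex/auto/example",
)
print(json.dumps(doc, indent=2))
```

Restricting the search to the header avoids treating a CVE that is merely mentioned in changed code or comments as one the patch claims to fix.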

3. Azure Pipelines integration

  • .azure-pipelines/azure-pipelines-build.yml: appends ENABLE_SBOM=y to BUILD_OPTIONS so every existing make $BUILD_OPTIONS … line picks it up.
  • .azure-pipelines/build-template.yml: same ENABLE_SBOM=y propagation, plus a new inline SBOM vulnerability scan step that iterates over every target/sonic-*.bin.cdx.json and target/docker-*.gz.sbom.cdx.json produced by the build, runs sbom_vuln_scan.py against each, and publishes sbom-vuln-scan-results.<platform>.
  • azure-pipelines.yml (top-level): the Test-stage [OPTIONAL] Trivy vulnerability scan (docker-ptf) job from PR #27079 (ci: add Trivy vulnerability scan for docker-ptf and docker-sonic-mgmt) is replaced with a unified [OPTIONAL] SBOM-based vulnerability scan (all artifacts) job that downloads only the *.cdx.json sidecars from every sonic-buildimage.<platform> artifact (a few MB instead of multi-GB images) and scans them in parallel groups. continueOnError: true mirrors the original: the job is informational and doesn't gate merges.
  • .azure-pipelines/docker-sonic-mgmt.yml: same trivy → sbom replacement for the docker-sonic-mgmt pipeline.

Documentation: README.sbom.md (~150 lines, with table of contents) covers the build flag, the four data sources, dedupe priority, vulnerability scanning quick-start, VEX workflow, reproducibility, and limitations.

Out of scope — transient build fixes (two trailing [temp]: commits)

These are not part of the SBOM feature; they unblock builds in my local environment and are independent of this work. They are included so the branch builds end-to-end and can be dropped once the upstream fixes land:

  • [temp]: Fix sonic-sysmgr build race (upstream PR #27447): backport of a fix already submitted upstream for a protoc-generated .pb.cc race in src/sonic-sysmgr/Makefile.am. Remove once PR #27447 (Add a dependency in sonic-sysmgr makefile) merges.
  • [temp]: Fix Micas plat_sysfs parallel build race — adds a dev_sysfs: dev_cfg ordering dependency in platform/broadcom/sonic-platform-modules-micas/common/modules/plat_sysfs/Makefile so dev_sysfs's modpost can see dev_cfg/Module.symvers under -j. Belongs upstream in the Micas platform repo.

How to verify it

Default path — non-SBOM builds are unchanged:

make configure PLATFORM=broadcom
make target/sonic-broadcom.bin
ls target/*.cdx.json target/*.spdx.json 2>&1   # expected: no files

There should be no SBOM artifacts, no extra Python invocations during the build, and no increase in build wall time.

Enable SBOM:

make configure PLATFORM=broadcom ENABLE_SBOM=y
make ENABLE_SBOM=y target/sonic-broadcom.bin
ls target/sonic-broadcom.bin.cdx.json target/sonic-broadcom.bin.spdx.json target/sonic-broadcom.bin.intoto.json
jq '.components | length' target/sonic-broadcom.bin.cdx.json   # expected: thousands
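
Beyond the raw component count, it can help to see which ecosystems the SBOM covers. The Python sketch below tallies components by PURL type; it assumes only the standard CycloneDX JSON layout (a top-level components array whose entries may carry a purl), and the helper names are hypothetical rather than part of this PR.

```python
import json
from collections import Counter


def purl_type_counts(bom):
    """Count components per PURL type (deb, cargo, golang, ...)."""
    counts = Counter()
    for comp in bom.get("components", []):
        purl = comp.get("purl", "")
        if purl.startswith("pkg:"):
            # "pkg:deb/debian/foo@1.0" -> "deb"
            counts[purl[4:].split("/", 1)[0]] += 1
        else:
            counts["(no purl)"] += 1
    return counts


def load_bom(path):
    """Load a CycloneDX JSON document from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Inline sample so the sketch runs without a build tree.
sample = {
    "components": [
        {"name": "bash", "purl": "pkg:deb/debian/bash@5.2"},
        {"name": "serde", "purl": "pkg:cargo/serde@1.0.0"},
        {"name": "blob"},
    ]
}
for purl_type, count in sorted(purl_type_counts(sample).items()):
    print(f"{purl_type}\t{count}")
```

For a real build, purl_type_counts(load_bom("target/sonic-broadcom.bin.cdx.json")) would give the same breakdown for the generated sidecar.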

For per-container SBOMs (the test containers not embedded in any .bin):

make ENABLE_SBOM=y target/docker-ptf.gz target/docker-sonic-mgmt.gz
ls target/docker-ptf.gz.sbom.cdx.json target/docker-sonic-mgmt.gz.sbom.cdx.json

Vulnerability scan:

python3 scripts/sbom_vuln_scan.py \
    --vex vex \
    --min-severity medium \
    target/sonic-broadcom.bin.cdx.json

Expected: a table of MEDIUM+ findings with VEX-suppressed CVEs filtered out.

Reproducibility:

# Build twice with identical inputs
python3 scripts/sbom_diff.py build1/sonic-broadcom.bin.cdx.json build2/sonic-broadcom.bin.cdx.json

Expected: an empty diff (any differences indicate non-deterministic build inputs).
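
The idea behind a build-vs-build comparison can be sketched in a few lines of Python. This is a simplified illustration, not scripts/sbom_diff.py: it compares only component identities (PURL, or name@version when no PURL is present), and the function names are hypothetical. A real reproducibility check would likely also compare hashes and metadata.

```python
def component_keys(bom):
    """Return the set of identity keys for every component in a BOM."""
    keys = set()
    for comp in bom.get("components", []):
        keys.add(comp.get("purl") or f"{comp.get('name')}@{comp.get('version')}")
    return keys


def diff_boms(old, new):
    """Report components present in only one of the two BOMs."""
    old_keys, new_keys = component_keys(old), component_keys(new)
    return {
        "added": sorted(new_keys - old_keys),
        "removed": sorted(old_keys - new_keys),
    }


# Hypothetical BOMs from two builds of the same source.
build1 = {"components": [{"purl": "pkg:deb/debian/bash@5.2"},
                         {"name": "blob", "version": "1.0"}]}
build2 = {"components": [{"purl": "pkg:deb/debian/bash@5.2"},
                         {"name": "blob", "version": "1.1"}]}
print(diff_boms(build1, build2))
```

Two identical builds should yield empty added and removed lists; anything else points at an input that changed between runs.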

VEX auto-extraction:

python3 scripts/sbom_extract_vex_from_patches.py --output vex/auto
ls vex/auto/*.json

Expected: one OpenVEX JSON file per CVE referenced in any src/*/patches/ patch header.

Azure Pipelines: the existing [OPTIONAL] Trivy vulnerability scan (docker-ptf) and [OPTIONAL] Trivy vulnerability scan (docker-sonic-mgmt) jobs are gone from the Test stage; replacement [OPTIONAL] SBOM-based vulnerability scan jobs run on the cheap sonic-ubuntu-1c pool with the same continueOnError semantics, and emit sbom-vuln-scan-results.* artifacts.

Which release branch to backport (provide reason below if selected)

  • 202305
  • 202311
  • 202405
  • 202411
  • 202505
  • 202511

This is a feature, not a fix — no backport requested.

Tested branch (Please provide the tested image version)

  • master / sonic-broadcom.bin (verified end-to-end: 5959 components, 5.1 MB SBOM, 3 .bin variants)

Description for the changelog

Add opt-in SBOM generation and SBOM-based vulnerability scanning (ENABLE_SBOM=y); replaces the Trivy CI jobs from PR #27079 with the new scanner.

Link to config_db schema for YANG module changes

N/A — no YANG schema changes.

A picture of a cute animal (not mandatory but encouraged)

@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@bhouse-nexthop bhouse-nexthop force-pushed the bhouse.sbom-generation branch from 6f8e36f to 22032d4 Compare May 20, 2026 01:15
@bhouse-nexthop
Collaborator Author

/azpw run

@mssonicbld
Collaborator

⚠️ Notice: /azpw run only runs failed jobs now. If you want to trigger a whole pipeline run, please rebase your branch or close and reopen the PR.
💡 Tip: You can also use /azpw retry to retry failed jobs directly.

Retrying failed(or canceled) jobs...

@mssonicbld
Collaborator

No Azure DevOps builds found for #27455.

@bhouse-nexthop bhouse-nexthop force-pushed the bhouse.sbom-generation branch from 22032d4 to 82a7905 Compare May 20, 2026 01:23

@bhouse-nexthop bhouse-nexthop force-pushed the bhouse.sbom-generation branch from 82a7905 to afa0659 Compare May 20, 2026 03:44

@bhouse-nexthop bhouse-nexthop force-pushed the bhouse.sbom-generation branch 2 times, most recently from 1950da6 to f36637c Compare May 20, 2026 08:12

@bhouse-nexthop
Collaborator Author

/azpw run

@mssonicbld
Collaborator

No failed(or canceled) stages or jobs found in the most recent build 1118358.

@bhouse-nexthop bhouse-nexthop force-pushed the bhouse.sbom-generation branch from f36637c to b8e93cc Compare May 20, 2026 08:53

@bhouse-nexthop bhouse-nexthop force-pushed the bhouse.sbom-generation branch from b8e93cc to decd2be Compare May 20, 2026 09:00

@bhouse-nexthop bhouse-nexthop force-pushed the bhouse.sbom-generation branch from 25debd0 to 9f81eb4 Compare May 20, 2026 19:25

@bhouse-nexthop bhouse-nexthop force-pushed the bhouse.sbom-generation branch from 9f81eb4 to 673f611 Compare May 20, 2026 19:30

@bhouse-nexthop bhouse-nexthop force-pushed the bhouse.sbom-generation branch from 94c2afd to 673f611 Compare May 20, 2026 23:01

@bhouse-nexthop bhouse-nexthop force-pushed the bhouse.sbom-generation branch from 673f611 to 239d27c Compare May 20, 2026 23:31

@bhouse-nexthop bhouse-nexthop force-pushed the bhouse.sbom-generation branch 2 times, most recently from c8fbb94 to 0654d98 Compare May 20, 2026 23:37

@bhouse-nexthop bhouse-nexthop force-pushed the bhouse.sbom-generation branch from 0654d98 to d950167 Compare May 21, 2026 12:26

Contributor

@securely1g securely1g left a comment

Thorough and well-documented PR — the comparison table against trivy alone justifies the approach. A few observations:

Structure

5 commits — consider squashing. SONiC convention is single-commit PRs. At minimum, the three [sbom] commits should be squashed into one.

The [build] and [temp] commits should be separate PRs. The build_mirror_config.sh race fix and the Micas plat_sysfs ordering fix are real bugs worth fixing, but bundling them here makes this PR harder to review and merge. Split them out — they can land independently and faster.

Design (positive)

  • Zero-overhead opt-in via make-level $(if) short-circuit is the right call. Non-SBOM builds pay nothing.
  • cargo-auditable wrapper that transparently embeds Cargo.lock into ELF .dep-v0 sections without touching any debian/rules — clever.
  • Recipe-emit-wins dedupe with version normalization (stripping epoch, +fips, +sonic suffixes) is well thought out.
  • VEX auto-extraction from src/*/patches/ gives you CVE suppression that tracks source state rather than a manually-maintained ignore file.
  • CI cost: downloading ~5 MB of .cdx.json sidecars vs multi-GB image tarballs is a big win.

Concerns

cargo-wrapper error path

The wrapper does exec "$REAL" auditable "$@" — if cargo-auditable is not installed (e.g. a custom slave image without the Dockerfile change), this silently fails the build rather than falling back. Consider checking for cargo-auditable on PATH and falling through to plain cargo build if absent:

build)
    if command -v cargo-auditable >/dev/null 2>&1; then
        export CARGO_AUDITABLE_WRAPPED=1
        exec "$REAL" auditable "$@"
    fi
    exec "$REAL" "$@"
    ;;

cargo-wrapper duplication

The same cargo-wrapper file is copied into all 4 slave dirs (bookworm, bullseye, buster, trixie). Consider putting it in a shared location (e.g. files/) and COPYing from there to avoid 4-way drift.

|| true in slave.mk vs SBOM_STRICT

Every sbom_emit_fragment call in slave.mk is wrapped with || true, which means even with SBOM_STRICT=y, fragment emission failures are silently swallowed at the make level before build_sbom.py ever sees them. This is fine for the "SBOM cannot break a build" philosophy, but worth documenting that SBOM_STRICT only governs the aggregation phase, not the per-recipe emit phase.

find / in collect_version_files

The lockfile harvest does find / ... with prune exclusions. This is broad — on a large slave container it could be slow or pick up unexpected lockfiles. Consider scoping to known source paths (/sonic/src/, /usr/, etc.) rather than root.
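
A scoped harvest along those lines might look like the Python sketch below. It is only an illustration of the suggestion: collect_version_files is not written this way as far as this thread shows, and the function name, the pruned directory names, and the example roots are assumptions.

```python
import os

LOCKFILE_NAMES = {
    "Cargo.lock", "go.sum", "package-lock.json", "pnpm-lock.yaml", "yarn.lock",
}


def find_lockfiles(roots, prune=(".git", "node_modules")):
    """Walk only the given roots and return sorted lockfile paths.

    Directories named in `prune` are skipped entirely; roots that do not
    exist are ignored, because os.walk yields nothing for them.
    """
    found = []
    for root in roots:
        for dirpath, dirnames, filenames in os.walk(root):
            # Editing dirnames in place stops os.walk from descending.
            dirnames[:] = [d for d in dirnames if d not in prune]
            found.extend(
                os.path.join(dirpath, name)
                for name in filenames
                if name in LOCKFILE_NAMES
            )
    return sorted(found)


# Example roots are assumptions; adjust to the container layout.
for path in find_lockfiles(["/sonic/src", "/usr"]):
    print(path)
```

Listing the roots explicitly keeps the cost proportional to the trees that can actually contain build inputs, instead of the whole container filesystem.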

Reproducibility of SBOM timestamps

now_iso() in build_sbom.py respects SOURCE_DATE_EPOCH — good. But is SOURCE_DATE_EPOCH actually set during SONiC builds? If not, SBOMs will have non-deterministic timestamps and sbom_diff.py will report false diffs. Worth noting in the README or setting it from the build system.
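
For reference, honoring SOURCE_DATE_EPOCH usually amounts to something like the sketch below. This is not the now_iso() from build_sbom.py, which is not shown in this thread; the function name and the output format are assumptions used to illustrate the behavior being discussed.

```python
import os
from datetime import datetime, timezone


def reproducible_now_iso(environ=None):
    """Return a UTC timestamp, pinned to SOURCE_DATE_EPOCH when it is set."""
    environ = os.environ if environ is None else environ
    epoch = environ.get("SOURCE_DATE_EPOCH", "").strip()
    if epoch:
        # Deterministic: the same source state gives the same timestamp.
        moment = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    else:
        # Fallback: wall-clock time, which differs between builds.
        moment = datetime.now(timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


print(reproducible_now_iso({"SOURCE_DATE_EPOCH": "0"}))
print(reproducible_now_iso({}))
```

If the build system never exports the variable, every call takes the wall-clock branch, which is exactly the case that would produce spurious differences between two otherwise identical SBOMs.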

Minor

  • README.sbom.md at 507 lines is excellent documentation. Consider linking it from the main README.md.
  • The scripts/sbom_license_map.json with 104 entries — is this manually maintained? If so, consider documenting how to add entries (or auto-generating from a source).

Overall this is solid work. The main ask is splitting out the unrelated build fixes and squashing the SBOM commits.

@bhouse-nexthop
Collaborator Author

The [build] and [temp] commits should be separate PRs. The build_mirror_config.sh race fix and the Micas plat_sysfs ordering fix are real bugs worth fixing, but bundling them here makes this PR harder to review and merge. Split them out — they can land independently and faster.

Yes, these are separate PRs I filed after running into them:

I needed them in there for building and testing and they will be removed from this PR once those PRs merge.

This PR adds opt-in CycloneDX 1.6 SBOM generation, SPDX 2.3
conversion, SLSA v1.0 in-toto provenance, and SBOM-based
vulnerability scanning for SONiC images. Default builds are
unchanged; SBOM emission is opt-in via ENABLE_SBOM=y at build time.

Design summary
==============

Hybrid SBOM aggregation pulls from four independent sources and
deduplicates by PURL + (name, version, arch) + (name, normalized
version, arch). Recipe-emit wins ties because it carries pedigree
and patch-set provenance that scanners can't recover from the
shipped binary alone.

* Recipe-emit fragments: slave.mk hooks every recipe that produces
  an artifact in scope (SONIC_DPKG_DEBS, SONIC_MAKE_DEBS,
  SONIC_DERIVED_DEBS, SONIC_EXTRA_DEBS, SONIC_ONLINE_DEBS,
  SONIC_PYTHON_STDEB_DEBS, SONIC_PYTHON_WHEELS, docker-image-save).
  scripts/sbom_fragment.py detects four fork-of-upstream patterns
  (sonic-net submodules, dget+patches, nested submodule a la FRR,
  direct-upstream-submodule + sidecar patches) and emits
  pedigree.ancestors[] accordingly.

* Observation harvest: src/sonic-build-hooks/scripts/collect_version_files
  is extended (gated on ENABLE_SBOM=y) to emit a 9-column TSV per
  container and snapshot /usr/share/doc/*/copyright + language
  lockfiles found in known source roots (/sonic /usr /etc /root
  /home /opt). Avoids walking the entire filesystem.

* Rust crate visibility via cargo-auditable: every slave Dockerfile
  installs cargo-auditable and a transparent /usr/local/bin/cargo
  shim (files/build/cargo-wrapper, staged into each slave's build
  context by scripts/prepare_docker_buildinfo.sh). Every Rust
  binary embeds its resolved Cargo.lock in a .dep-v0 ELF section;
  syft reads it at scan time. The shim fails hard with an
  actionable error if cargo-auditable is missing, rather than
  silently degrading to plain cargo build.

* Lockfile expansion: scripts/sbom_parse_lockfiles.py parses
  lockfiles harvested from runtime containers (Cargo.lock, go.sum,
  package-lock.json, pnpm-lock.yaml, yarn.lock) and emits the
  appropriate PURL types.

* Image-level aggregator: scripts/build_sbom.py merges fragments +
  observations + lockfiles + scanner output. Version normalization
  strips epoch prefixes (1:) and downstream suffixes (+fips,
  +sonic, +bN, +debNuM) so the recipe-emit and observation entries
  collapse to one component.

* License resolution: scripts/sbom_resolve_licenses.py parses DEP-5
  debian/copyright; falls back to licensecheck for free-form
  copyrights. A bundled scripts/sbom_license_map.json (~100
  entries) translates Debian License header strings to SPDX
  identifiers. Maintenance instructions in README.sbom.md.

* SOURCE_DATE_EPOCH is set from the HEAD git commit time in
  Makefile.work and exported through to the slave container, so
  two builds of the same source produce byte-identical SBOMs.

* Strict mode (SBOM_STRICT=y, default when ENABLE_SBOM=y) fails
  the build when a critical input is missing (host rootfs, declared
  installer dockers, scanner binary). Per-recipe sbom_emit_fragment
  and sbom_emit_per_container honor SBOM_STRICT consistently: when
  set, fragment-emit failures abort the recipe; when n, failures
  are swallowed (defensive against script bugs only).

Output filenames track the actual installer artifact via the
SBOM_TARGET_ARTIFACT env var ($* from the slave.mk recipe), so .bin,
.swi, and .img.gz all get correctly-named sidecars:
  target/sonic-broadcom.bin.cdx.json + .spdx.json + .intoto.json
  target/sonic-aboot-broadcom.swi.cdx.json + ...
  target/sonic-vs.img.gz.cdx.json + ...

Vulnerability scanning
======================

scripts/sbom_vuln_scan.py runs grype against a CycloneDX SBOM,
applies VEX statements under vex/, and emits a CycloneDX
VEX-annotated report plus a human-readable table. Policy is
configurable (--min-severity, --fail-on, --ignore-unfixed).
Standalone Python; not a make target.

* scripts/sbom_extract_vex_from_patches.py scans src/*/patches/
  for CVE markers and auto-emits OpenVEX v0.2.0 statements with
  the patch as evidence. vex/auto/ is gitignored and regenerated
  by scripts/build_sbom.sh so the suppression set always tracks
  src/*/patches/.

* The human-readable table lists only actionable findings (those
  with an upstream fix available); not-fixed and wont-fix entries
  are summarized at the top so the table is action-oriented. The
  --fail-on gate is restricted to actionable findings so CI
  doesn't fail on things the team can't act on. The CycloneDX
  JSON sidecar preserves grype's exact fix.state per finding so
  downstream tooling can still distinguish not-fixed from
  wont-fix.

Azure Pipelines integration
===========================

* .azure-pipelines/azure-pipelines-build.yml: ENABLE_SBOM=y
  appended to BUILD_OPTIONS so existing make $(BUILD_OPTIONS) ...
  lines pick it up without per-line edits.

* .azure-pipelines/build-template.yml: ENABLE_SBOM=y propagation
  + a new inline SBOM vulnerability scan step iterating over each
  aggregate SBOM (sonic-*.bin / .swi / .img.gz, docker-*.gz.sbom)
  with results published as sbom-vuln-scan-results.<platform>.
  Uses ##[group]/##[endgroup] for collapsible per-SBOM log
  sections.

* azure-pipelines.yml (top level): the Test-stage [OPTIONAL]
  Trivy vulnerability scan (docker-ptf) job from PR sonic-net#27079 is
  replaced with a unified [OPTIONAL] SBOM-based vulnerability
  scan (all artifacts) job. Downloads only *.cdx.json sidecars
  (a few MB instead of multi-GB images) and scans every aggregate
  SBOM the build produced. continueOnError: true mirrors the
  original.

* .azure-pipelines/docker-sonic-mgmt.yml: same trivy -> sbom
  replacement for the docker-sonic-mgmt pipeline.

Documentation
=============

README.sbom.md (~570 lines) covers configuration, scope,
architecture, license resolution and license-map maintenance,
tools, reproducibility, attestation/signing notes, vulnerability
scanning quick start, VEX workflow, verification, known
limitations, and a file map. Linked from README.md.

Signed-off-by: Brad House <bhouse@nexthop.ai>
scripts/build_mirror_config.sh writes $CONFIG_PATH/sources.list.<arch>
with `j2 $TEMPLATE | sed ... > path`. The `>` redirect truncates the
destination first, so during the j2|sed pipeline lifetime the file is
in a 0-byte / partial state.

build_debian.sh:117 runs this with CONFIG_PATH=files/apt, a path
shared across the whole tree. SONiC builds multiple rootfs variants
in parallel under `make -j` (e.g. broadcom, broadcom-dnx,
broadcom-legacy-th — all amd64, all writing
files/apt/sources.list.amd64). One parallel build_debian.sh can `cp`
the file at the exact instant another's `>` has just truncated it,
and end up with an empty sources.list in its chroot.

The chroot's apt-get update then succeeds trivially (no sources to
fetch, exit 0, no Get:/Hit:/Err: output). apt-get -y install
eatmydata later fails with `E: Unable to locate package eatmydata`.
The failure is flaky — only one of the three rootfs builds hits the
window per affected run.

Container builds are immune because prepare_docker_buildinfo.sh:47
calls build_mirror_config.sh with a per-container DOCKERFILE_PATH,
so each container has its own non-shared sources.list.amd64.

Fix: write via mktemp + atomic mv. Apply the same pattern to the
apt-retries-count file write that follows, which has the same race
shape. mv within the same filesystem is atomic — readers either see
the old content or the new content, never a torn write.
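
The actual fix is in the shell script (mktemp + mv). The same
write-then-rename pattern, sketched in Python for illustration only;
atomic_write is a hypothetical helper and is not part of this
change:

```python
import os
import tempfile


def atomic_write(path, content):
    """Write content to path so readers never see a partial file.

    The data goes to a temporary file in the same directory (and so
    the same filesystem), then os.replace() renames it over the
    destination in a single step.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(
        dir=directory, prefix=os.path.basename(path) + "."
    )
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(content)
        # mkstemp creates the file with mode 0600; widen it here with
        # os.chmod if other users need to read the result.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave a stray temporary file behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "sources.list.amd64")
        atomic_write(target, "deb http://example.invalid/debian stable main\n")
        with open(target) as handle:
            print(handle.read(), end="")
```

Creating the temporary file next to the destination matters: a
rename across filesystems is not atomic.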

Signed-off-by: Brad House <bhouse@nexthop.ai>
The plat_sysfs/Makefile lists dev_cfg and dev_sysfs as prerequisites
of 'all' but declares no ordering between them. Under make -j
(SONiC uses -j12 by default) the two subdirs can build in parallel
and the modpost step in dev_sysfs fails with:

  dev_cfg/Module.symvers: No such file or directory

because dev_sysfs/Makefile's KBUILD_EXTRA_SYMBOLS references that
file. Adding 'dev_sysfs: dev_cfg' forces dev_cfg to finish first.

Temporary patch on this branch so verification builds for the SBOM
work complete reliably. Drop this commit (git rebase --onto) once
an equivalent upstream fix lands.

Signed-off-by: Brad House <bhouse@nexthop.ai>
@bhouse-nexthop bhouse-nexthop force-pushed the bhouse.sbom-generation branch from d950167 to 369da06 Compare May 21, 2026 21:26
@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@bhouse-nexthop
Collaborator Author

The wrapper does exec "$REAL" auditable "$@" — if cargo-auditable is not installed (e.g. a custom slave image without the Dockerfile change), this silently fails the build rather than falling back. Consider checking for cargo-auditable on PATH and falling through to plain cargo build if absent:

I'm going to reject this: since the SBOM exists for security, it would be bad if we couldn't extract this information. Instead, I made a change that gives the user a clear error message.

The same cargo-wrapper file is copied into all 4 slave dirs (bookworm, bullseye, buster, trixie). Consider putting it in a shared location (e.g. files/) and COPYing from there to avoid 4-way drift.

Done, though I had to make a temporary copy during the build in a location Docker can access.

Every sbom_emit_fragment call in slave.mk is wrapped with || true, which means even with SBOM_STRICT=y, fragment emission failures are silently swallowed at the make level before build_sbom.py ever sees them. This is fine for the "SBOM cannot break a build" philosophy, but worth documenting that SBOM_STRICT only governs the aggregation phase, not the per-recipe emit phase.

Fixed: the per-recipe emit now behaves differently for SBOM_STRICT=y and SBOM_STRICT=n.

The lockfile harvest does find / ... with prune exclusions. This is broad — on a large slave container it could be slow or pick up unexpected lockfiles. Consider scoping to known source paths (/sonic/src/, /usr/, etc.) rather than root.

done

now_iso() in build_sbom.py respects SOURCE_DATE_EPOCH — good. But is SOURCE_DATE_EPOCH actually set during SONiC builds? If not, SBOMs will have non-deterministic timestamps and sbom_diff.py will report false diffs. Worth noting in the README or setting it from the build system.

The timestamp is now taken from the git commit time automatically.

  • README.sbom.md at 507 lines is excellent documentation. Consider linking it from the main README.md.

done

  • The scripts/sbom_license_map.json with 104 entries — is this manually maintained? If so, consider documenting how to add entries (or auto-generating from a source).

It's intentionally manual; I added a section to README.sbom.md.
