
parse() silently accepts non-SBOM / wrong-format input, producing a false "pass" for CI gates #21

Description

@dmchaledev

Summary

parse() never validates its input. When given JSON that is not a recognized SBOM (a wrong-format file, a package.json passed by mistake, a truncated/garbage document), it does not throw; it silently returns an empty CycloneDX SBOM. Because the README sells this tool as "perfect for CI/CD gates and audit trails," this is a silent false negative: a broken or wrong input sails through the pipeline as if "nothing changed."

This is distinct from the in-flight parser/CLI PRs (#5 --fail-on, #18 CVSS) — none of them validate that the input is actually a parseable SBOM.

Reproduction

import { parse, diff, renderReport } from '@hailbytes/sbom-diff';

// A non-SBOM file (e.g. a package.json passed by mistake, or a corrupt export)
const notAnSbom = JSON.stringify({ name: 'my-app', dependencies: { lodash: '^4.17.21' } });

const result = parse(notAnSbom);
console.log(result.format, result.components.length); // -> "cyclonedx" 0   (no error!)

Two concrete failure modes

1. Unknown input is mislabeled as CycloneDX.
detectFormat() correctly returns 'unknown' (src/parser.ts:6-16), but parse()'s default branch falls through to parseCycloneDX (src/parser.ts:93-96), which always stamps format: 'cyclonedx' (src/parser.ts:46). So the caller is told the garbage is a valid CycloneDX document with zero components.

2. A wrong/garbage new SBOM passes the gate.
Diffing a real SBOM against a wrong-format file yields a clean report with exit 0:

$ sbom-diff real.cdx.json oops-wrong-file.json
Summary:
  Added:       0
  Removed:     1     <- the only signal, and it's misleading
  Upgraded:    0
  New CVEs:    0     <- gate sees "no new CVEs"
  Fixed CVEs:  0

totalNewCVEs / totalAdded are 0, so a --fail-on new-cves,added,critical gate (proposed in #5) passes even though the comparison is meaningless. A typo'd path, a wrong artifact name in CI, or a scanner that emitted an unexpected format all silently produce a green check. For a package keyworded supply-chain-security / vulnerability-management, a gate that passes on broken input is the worst kind of failure.
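
For illustration, failure mode 1 can be reproduced with a simplified model of the dispatch. This is a sketch written for this issue, not the actual contents of src/parser.ts: the detection rules, the SPDX branch, and the name parseModel are assumptions. Only the shape of the fall-through (unknown input reaching parseCycloneDX, which stamps the format unconditionally) is taken from the description above.

```typescript
// Simplified model of the dispatch described above; NOT the real src/parser.ts.
type SbomFormat = 'cyclonedx' | 'spdx' | 'unknown';

interface ParsedSbom {
  format: SbomFormat;
  components: unknown[];
}

// Assumed detection rules: key off bomFormat / spdxVersion.
function detectFormat(doc: Record<string, unknown>): SbomFormat {
  if (doc.bomFormat === 'CycloneDX') return 'cyclonedx';
  if (typeof doc.spdxVersion === 'string') return 'spdx';
  return 'unknown';
}

function parseCycloneDX(doc: Record<string, unknown>): ParsedSbom {
  const components = Array.isArray(doc.components) ? doc.components : [];
  // The format is stamped unconditionally, whatever the input looked like.
  return { format: 'cyclonedx', components };
}

function parseSpdx(doc: Record<string, unknown>): ParsedSbom {
  const packages = Array.isArray(doc.packages) ? doc.packages : [];
  return { format: 'spdx', components: packages };
}

function parseModel(input: string): ParsedSbom {
  const doc = JSON.parse(input) as Record<string, unknown>;
  switch (detectFormat(doc)) {
    case 'spdx':
      return parseSpdx(doc);
    case 'cyclonedx':
    default:
      // 'unknown' lands here and is treated as CycloneDX.
      return parseCycloneDX(doc);
  }
}

// A package.json-shaped document comes back labeled as an empty CycloneDX SBOM.
const mislabeled = parseModel(
  JSON.stringify({ name: 'my-app', dependencies: { lodash: '^4.17.21' } })
);
console.log(mislabeled.format, mislabeled.components.length); // -> "cyclonedx" 0
```

In this model the detected 'unknown' is discarded at the default branch, so nothing downstream can tell an empty SBOM from a file that was never an SBOM.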

Proposed change

Make unrecognized / unparseable input loud instead of silent. Suggested design (happy to adjust):

  1. Throw on unknown format by default. When detectFormat() returns 'unknown', parse() should throw a clear error:
    Error: Unrecognized SBOM format. Expected CycloneDX (bomFormat: "CycloneDX") or SPDX (spdxVersion: ...).
    The CLI already prints err.message and exits non-zero (src/cli.ts:86-89), so this surfaces as a clean failure with a useful message rather than a green pipeline.

  2. Guard the happy path too. If a detected SBOM has zero components and zero vulnerabilities, that almost always means a malformed document; at minimum emit a warning to stderr (Warning: parsed SBOM contains 0 components — is this the right file?).

  3. Optional escape hatch for the current best-effort behavior, e.g. parse(input, { strict: false }) or a CLI --allow-unknown flag, so anyone relying on lenient parsing can opt back in.

If a hard throw is considered too breaking, (2)+(3) alone (warn + opt-in strictness) still close the silent-gate hole.
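
A rough sketch of how (1), (2) and (3) could fit together, to make the proposal concrete. Everything here is hypothetical: parseStrict, ParseOptions, the warn hook, and the detection rules are illustrative names and assumptions, not the package's current API. In lenient mode the sketch reports format 'unknown' rather than mislabeling the input as CycloneDX, which is one possible choice rather than a settled design.

```typescript
// Hypothetical sketch of the proposed behavior; names and option shape are assumptions.
type SbomFormat = 'cyclonedx' | 'spdx' | 'unknown';

interface ParsedSbom {
  format: SbomFormat;
  components: unknown[];
  vulnerabilities: unknown[];
}

interface ParseOptions {
  /** Defaults to true; set to false to opt back into lenient parsing. */
  strict?: boolean;
  /** Warning sink; defaults to console.warn (stderr in Node). */
  warn?: (message: string) => void;
}

function detectFormat(doc: unknown): SbomFormat {
  if (typeof doc !== 'object' || doc === null) return 'unknown';
  const record = doc as Record<string, unknown>;
  if (record.bomFormat === 'CycloneDX') return 'cyclonedx';
  if (typeof record.spdxVersion === 'string') return 'spdx';
  return 'unknown';
}

function asArray(value: unknown): unknown[] {
  return Array.isArray(value) ? value : [];
}

function parseStrict(input: string, options: ParseOptions = {}): ParsedSbom {
  const { strict = true, warn = (m: string) => console.warn(m) } = options;

  let doc: unknown;
  try {
    doc = JSON.parse(input);
  } catch {
    throw new Error('Input is not valid JSON.');
  }

  // (1) Unknown format is an error unless the caller opted out via (3).
  const format = detectFormat(doc);
  if (format === 'unknown' && strict) {
    throw new Error(
      'Unrecognized SBOM format. Expected CycloneDX (bomFormat: "CycloneDX") or SPDX (spdxVersion: ...).'
    );
  }

  const record = (typeof doc === 'object' && doc !== null ? doc : {}) as Record<string, unknown>;
  const components = asArray(format === 'spdx' ? record.packages : record.components);
  const vulnerabilities = asArray(record.vulnerabilities);

  // (2) An SBOM with nothing in it is suspicious even when the format is recognized.
  if (components.length === 0 && vulnerabilities.length === 0) {
    warn('Warning: parsed SBOM contains 0 components. Is this the right file?');
  }

  return { format, components, vulnerabilities };
}

// Lenient mode: no throw, but the caller still gets a warning and an honest format.
const warnings: string[] = [];
const lenient = parseStrict(JSON.stringify({ name: 'my-app' }), {
  strict: false,
  warn: (m) => warnings.push(m),
});
console.log(lenient.format, warnings.length);
```

With the default options, the same package.json-shaped input throws the "Unrecognized SBOM format" error, which the CLI would print before exiting non-zero.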

Why this is high-leverage

I'm happy to open a focused PR implementing option (1) + (2) (with tests for unknown-format throw, empty-SBOM warning, and the existing CycloneDX/SPDX happy paths) once there's agreement on whether the default should throw.
