Support CycloneDX XML input (advertised as supported, but parse() is JSON-only and fails with a cryptic error) #27

@dmchaledev

Summary

src/types.ts:4 declares the package "Supports CycloneDX (JSON/XML) and SPDX (JSON)", but XML is never parsed. parse() unconditionally calls JSON.parse() (src/parser.ts:87), so any CycloneDX XML document — the default output of several mainstream generators — throws an unhelpful SyntaxError that mentions neither XML nor which file failed.

CycloneDX XML is not a niche format: cyclonedx-maven-plugin and cyclonedx-gradle-plugin emit bom.xml by default, and many enterprise/Java toolchains standardize on it. For a package keyworded cyclonedx / supply-chain-security whose README targets "security engineers, DevSecOps teams, and supply-chain risk analysts," silently rejecting a first-class CycloneDX serialization is a real adoption blocker.

This is distinct from every open PR/issue — they all operate on the post-JSON.parse object (diff keying #20, downgrades #24, ordering #25, CVSS #18, markdown escaping #23, --fail-on #5, license #9, hashes #22, unknown-JSON-format validation #21). None add an XML input path.

Reproduction (current main)

A minimal, valid CycloneDX 1.5 XML SBOM:

<?xml version="1.0" encoding="UTF-8"?>
<bom xmlns="http://cyclonedx.org/schema/bom/1.5" version="1">
  <components>
    <component type="library">
      <name>lodash</name>
      <version>4.17.20</version>
      <purl>pkg:npm/lodash@4.17.20</purl>
    </component>
  </components>
</bom>

Diffing two such files fails:

$ sbom-diff old.xml new.xml
Unexpected token '<', "<?xml vers"... is not valid JSON
exit code: 1

The error doesn't say XML is unsupported, doesn't name the offending file, and directly contradicts the documented "JSON/XML" support.

Evidence in source

  • src/types.ts:4: "Supports CycloneDX (JSON/XML) and SPDX (JSON)" (the claim).
  • src/parser.ts:86-97: parse() does JSON.parse(input), then dispatches on detectFormat(); there is no XML branch anywhere.
  • detectFormat() (src/parser.ts:6-16) only inspects a parsed JS object, so it can't see XML at all.

Proposed change

1. Add a real XML input path (the actual feature)

  • Detect XML before JSON. In parse(input: string | object), when input is a string whose first non-whitespace char is <, route to an XML parser instead of JSON.parse. (Object input stays JSON-only, as today.)
  • Parse CycloneDX XML into the existing canonical SBOM. Add parseCycloneDXXML(xml: string): SBOM that maps the XML shape onto the same Component / CVEEntry model the JSON path already produces, so diff() and renderReport() need zero changes. Field mapping mirrors the JSON parser.
  • Dependency: use a small, well-maintained, zero-native-dep parser such as fast-xml-parser (MIT, no transitive bloat, already common in the SBOM tooling ecosystem). This keeps the install footprint small — relevant given the bundle-size badge in the README.
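
The detection and mapping steps above could look roughly like the sketch below. It is illustrative only: the Component and SBOM shapes, looksLikeXml, asArray, and mapCycloneDXTree are hypothetical names, not the package's actual types or functions. The tree passed to mapCycloneDXTree is assumed to be what an XML parser returns for a <bom> document, with a single child element as an object, repeated children as an array, and element text kept as strings (a parser that coerces numeric-looking text would need that turned off so versions survive intact). Only name, version, and purl are mapped here; licenses, hashes, and vulnerabilities would follow the same pattern.

```typescript
// Hypothetical canonical model; the real Component/SBOM types in src/types.ts may differ.
interface Component {
  name: string;
  version: string;
  purl?: string;
}
interface SBOM {
  format: "cyclonedx";
  components: Component[];
}

// Assumed shape of an already-parsed <bom> document: one child element is an
// object, repeated children are an array, element text is kept as a string.
interface XmlComponent {
  name?: string;
  version?: string;
  purl?: string;
}
interface XmlBomTree {
  bom?: { components?: { component?: XmlComponent | XmlComponent[] } };
}

// True when the first non-whitespace character is '<'; meant to run before any JSON.parse.
function looksLikeXml(input: string): boolean {
  return /^\s*</.test(input);
}

// Normalize "one element or many" into an array.
function asArray<T>(value: T | T[] | undefined): T[] {
  if (value === undefined) return [];
  return Array.isArray(value) ? value : [value];
}

// Map the parsed XML tree onto the same canonical shape the JSON path produces.
function mapCycloneDXTree(tree: XmlBomTree): SBOM {
  const components = asArray(tree.bom?.components?.component).map((c) => {
    const component: Component = { name: c.name ?? "", version: c.version ?? "" };
    if (c.purl !== undefined) component.purl = c.purl;
    return component;
  });
  return { format: "cyclonedx", components };
}
```

Keeping the mapping separate from the XML parsing step means the mapper can be unit-tested against plain objects, and the parser dependency stays swappable.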

2. Interim safety net + honesty (ship even before full XML lands)

Independently valuable and tiny:

  • When JSON.parse throws and the input looks like XML, throw a clear, actionable error instead of the raw SyntaxError, e.g.:
    Error: Input looks like XML. CycloneDX XML is not yet supported — convert to JSON (e.g. 'cyclonedx convert --output-format json') or see #<this issue>.
  • Until XML parsing exists, correct the claim in src/types.ts:4 to "CycloneDX (JSON) and SPDX (JSON)" so the docs don't promise a format the code rejects. (Flip it back when chore: npm pkg fix — normalize package.json fields #1 lands.)
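
A minimal sketch of the interim safety net, assuming a wrapper around JSON.parse: parseJsonOrExplain and its source parameter (a label such as the input's file path) are hypothetical and not part of the current code, and the message wording is only an example. Including the source label is an addition to the message proposed above, intended to address the complaint that the current error does not name the offending file.

```typescript
// Hypothetical helper; `source` is an assumed label for the input (for example its file path).
function parseJsonOrExplain(input: string, source: string): unknown {
  try {
    return JSON.parse(input);
  } catch (err) {
    // Only rewrite the failure when the input plausibly is XML; rethrow anything else untouched.
    if (err instanceof SyntaxError && /^\s*</.test(input)) {
      throw new Error(
        `${source}: input looks like XML. CycloneDX XML is not yet supported; convert the SBOM to JSON first.`
      );
    }
    throw err;
  }
}
```

Malformed JSON that does not start with < still surfaces as the original SyntaxError, so existing behavior for JSON inputs is unchanged.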

3. Tests

  • parseCycloneDXXML maps components/versions/purls/licenses/hashes/vulns onto the canonical SBOM identically to the JSON fixture in parser.test.ts (parameterize the existing assertions over both serializations).
  • parse() auto-routes a leading-< string to the XML path and a leading-{ string to JSON.
  • End-to-end: diff(parse(oldXml), parse(newXml)) produces the same ChangeReport as the equivalent JSON inputs — proving diff/reporter stay format-agnostic.
  • Interim: malformed/non-SBOM XML throws the clear, actionable error (not a raw SyntaxError).

Why this is high-leverage

  • Closes a gap between advertised and actual capability for a primary CycloneDX serialization — the kind of mismatch that erodes trust in a security tool.
  • Unblocks a large class of real users (Maven/Gradle/enterprise pipelines that emit XML by default) with no workflow change on their side.
  • Surgical blast radius: all changes live in parser.ts (+ one small dependency). The canonical SBOM model is unchanged, so diff.ts, reporter.ts, and cli.ts are untouched — it composes cleanly with every in-flight PR.
  • The interim error + doc fix (step 2) is a few lines and can land immediately, even if full XML support is staged separately.

Happy to open a focused PR implementing the XML parser + tests (and the interim error/doc fix as a first commit) if the direction and the fast-xml-parser dependency are acceptable.
