feat(scan): external CycloneDX SBOM ingest endpoint #406
Open
haksungjang wants to merge 4 commits into
Add POST /v1/projects/{id}/sbom-ingest so external tools (CI, cdxgen-based
scanners) can upload an already-generated CycloneDX SBOM; TRUSCA runs the
back half of the scan pipeline against it — persist components → trivy sbom
matching → findings — reusing the Scan model so ingested scans get ref-keyed
retention, the per-project active-scan guard, and the existing
Components/Vulnerabilities/Licenses UI and build gate for free.
This is NOT a Dependency-Track compatible surface: it is a TRUSCA-native
endpoint (Authorization: Bearer, field `sbom`, no autoCreate), not DT's
/api/v1/bom + X-Api-Key.
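For CI authors, a minimal client-side sketch of the upload described above, built with the standard library only. The path, the Bearer scheme, and the field names `sbom`, `ref` and `release` come from this PR; the base URL, the `application/json` part content type, the default filename, and the helper name `build_ingest_request` are assumptions for illustration, not part of the TRUSCA codebase.

```python
import urllib.request
import uuid
from typing import Optional


def build_ingest_request(
    base_url: str,
    project_id: str,
    token: str,
    sbom_bytes: bytes,
    ref: str,
    release: Optional[str] = None,
    filename: str = "bom.json",  # assumed; the server applies its own allow-list
) -> urllib.request.Request:
    """Build (but do not send) a multipart POST to the sbom-ingest endpoint."""
    boundary = uuid.uuid4().hex

    def text_field(name: str, value: str) -> bytes:
        return (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        ).encode("utf-8")

    body = bytearray()
    body += text_field("ref", ref)
    if release is not None:
        body += text_field("release", release)
    # The SBOM itself travels in the file field named `sbom`.
    body += (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="sbom"; filename="{filename}"\r\n'
        "Content-Type: application/json\r\n\r\n"
    ).encode("utf-8")
    body += sbom_bytes + b"\r\n"
    body += f"--{boundary}--\r\n".encode("utf-8")

    return urllib.request.Request(
        f"{base_url}/v1/projects/{project_id}/sbom-ingest",
        data=bytes(body),
        method="POST",
        headers={
            # Bearer auth, not Dependency-Track's X-Api-Key header.
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; a 202 response carries the scan to poll.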
Endpoint / service (services/sbom_ingest_service.py, api/v1/sbom.py):
- multipart sbom + ref + release; 202 ScanPublic (kind="sbom").
- require_role_or_api_key("developer"); project-scoped key must match.
- Reuses trigger_scan's guards via an extracted prepare_scan_target
(existence/team 404/403 before archived 409 / cap 429 — authz before state).
- Synchronous adversarial validation of untrusted input: bounded read
(SBOM_INGEST_MAX_BYTES, 32 MiB → 413), content-type/filename allow-list
(415), JSON + CycloneDX structure whitelist (422), component cap
(SBOM_INGEST_MAX_COMPONENTS, 50k → 422), and an O(n) string-aware byte
nesting-depth pre-check so a deeply nested document is a clean 422 instead
of a RecursionError → 500 from json.loads. RFC 7807 throughout.
- Atomic: flush wins the active-scan race before the file is written; a 409
loser writes no file; commit-race deletes the file; enqueue failure → 503.
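The nesting-depth pre-check in the list above can be sketched as a single pass over the raw bytes. This is an illustrative version, not the code in services/sbom_ingest_service.py: the function names and the limit of 64 are assumptions. Scanning bytes is safe for UTF-8 input because multi-byte sequences never contain ASCII bytes such as quotes or brackets.

```python
MAX_NESTING_DEPTH = 64  # hypothetical limit; the real value is not stated in this PR


def max_json_nesting_depth(raw: bytes) -> int:
    """Return the deepest [ / { nesting in raw, ignoring brackets inside strings."""
    depth = 0
    deepest = 0
    in_string = False
    escaped = False
    for byte in raw:
        if in_string:
            if escaped:
                escaped = False          # the escaped byte is consumed as-is
            elif byte == 0x5C:           # backslash starts an escape
                escaped = True
            elif byte == 0x22:           # unescaped quote closes the string
                in_string = False
        elif byte == 0x22:               # quote opens a string
            in_string = True
        elif byte in (0x5B, 0x7B):       # [ or {
            depth += 1
            deepest = max(deepest, depth)
        elif byte in (0x5D, 0x7D):       # ] or }
            depth = max(depth - 1, 0)
    return deepest


def exceeds_depth_limit(raw: bytes, limit: int = MAX_NESTING_DEPTH) -> bool:
    """True when the document should be rejected before json.loads sees it."""
    return max_json_nesting_depth(raw) > limit
```

Running this before `json.loads` means a depth bomb such as `b"[" * 100000` is rejected in linear time and maps to a 422 rather than reaching the recursive parser.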
Celery task (tasks/ingest_sbom.py, enqueue branch + include):
- ingest_sbom_task reuses persist_sbom_components → run_trivy_sbom →
persist_trivy_findings → mark_succeeded (ref-keyed supersede). Preserves the
uploaded SBOM as a durable sbom_cyclonedx ScanArtifact for the signature
surface; containment-guards the path under workspace_root().
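A minimal sketch of a containment guard like the one mentioned above, assuming the workspace root is available as a `Path` (for example, the value returned by `workspace_root()`). The helper name `is_under_root` is hypothetical and the real task may check this differently.

```python
from pathlib import Path


def is_under_root(candidate: Path, root: Path) -> bool:
    """True only if candidate resolves to a location strictly inside root."""
    resolved_root = root.resolve()
    # Joining first makes relative candidates relative to the root; an absolute
    # candidate replaces the root entirely and is then judged on its own.
    resolved = (resolved_root / candidate).resolve()
    # resolve() collapses ".." segments and follows symlinks, so traversal
    # attempts end up outside resolved_root and fail the parents check.
    return resolved_root in resolved.parents
```

A task would typically refuse to read the SBOM and fail the scan when this returns False, instead of opening the path.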
Security (Producer-Reviewer findings addressed):
- bind_audit_team before the scan INSERT so the audit row carries team_id.
- disk-write failure → 503 SbomIngestStorageError (retryable), not 422.
- release / original_filename length-capped + control-byte stripped.
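The length cap and control-byte stripping for `release` / `original_filename` could look roughly like the sketch below. The helper name and the 128-character cap are assumptions; the PR does not state the actual limits, and the real code may share logic with `mask_pii`.

```python
import unicodedata


def sanitize_label(value: str, max_len: int = 128) -> str:
    """Drop control characters (Unicode category Cc), then cap the length."""
    cleaned = "".join(ch for ch in value if unicodedata.category(ch) != "Cc")
    return cleaned[:max_len]
```

Stripping before truncating keeps the cap from being spent on characters that would be removed anyway.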
Tests: pure adversarial validator unit suite (incl. depth-bomb regression),
endpoint permission×state matrix + new existence-hide-state 409 rows,
realistic multi-CVE fixture pipeline test. Docs: EN/KO ci-integration/sbom-upload.
The OpenAPI contract snapshot test (test_openapi_no_drift) flagged the new
POST /v1/projects/{project_id}/sbom-ingest path. Add it to the committed
snapshot — path param project_id only (sbom/ref/release are requestBody).
…e-pkg layer image-scan (worker) HARD-failed on 3 node-pkg findings — lodash 4.17.19 (CVE-2021-23337, CVE-2026-4800) and minimist 1.2.5 (CVE-2021-44906) — that live under @cyclonedx/cdxgen/node_modules. Reproduction in node:20-bookworm shows cdxgen 11.x bundles both, while 12.3.3 AND 12.5.1 ship neither: a clean build already lacks them, so the failure was a stale type=gha scope=worker cache layer serving the pre-12.x install tree (same class as the earlier php-symfony image-scan incident). Bumping the version interpolated into the global npm install changes that layer's cache key, forcing a fresh (clean) install — root-cause removal, not a .trivyignore suppression (suppressing a package absent from a clean build would wrongly mute a future regression). cdxgen invocation is unchanged across 12.3.3→12.5.1 and engines.node still allows ^20, so no scan regression. Fixes main too (shared cache) once merged.
image-scan kept HARD-failing on lodash 4.17.19 (CVE-2021-23337, CVE-2026-4800) and minimist 1.2.5 (CVE-2021-44906) even after the cdxgen 12.3.3→12.5.1 bump, which only rebuilt the cdxgen layer. A fresh local install of cdxgen 12.5.1 and of npm 11.14.1 — the image's only two npm-package installers — pulls neither package, and these CVEs were never in .trivyignore, yet image-scan passed on #404/#405. The vulnerable copies therefore live in a stale, earlier `scope=worker` cache layer (a non-deterministic npm-install resolution cached long ago), not in anything the current Dockerfile produces. Bumping the buildx GHA cache scope (worker → worker-v2) abandons the poisoned cache and forces a single clean rebuild; the new namespace caches the clean tree. Keeps the cdxgen 12.5.1 bump (latest 12.x, verified lodash/minimist-free).
What
Adds `POST /v1/projects/{project_id}/sbom-ingest` so external tools (CI, cdxgen-based scanners) can upload an already-generated CycloneDX SBOM. TRUSCA runs the back half of the scan pipeline against it (persist components → `trivy sbom` matching → findings), reusing the `Scan` model so ingested scans get ref-keyed retention, the per-project active-scan guard, and the existing Components/Vulnerabilities/Licenses UI and build gate for free.
Builds on #404 (`sbom` scan kind) and #405 (shared pipeline helpers).
Contract
Poll `GET /v1/scans/{id}` to completion (same as the GitHub Action flow).
Endpoint / service
- Reuses `trigger_scan`'s guards via an extracted `prepare_scan_target` (behavior-preserving): existence/team 404/403 before archived 409 / cap 429; authz/existence always before state (CLAUDE.md §2 rule 1).
- Synchronous adversarial validation of untrusted input: bounded read (`SBOM_INGEST_MAX_BYTES`, 32 MiB → 413), content-type/filename allow-list (415), JSON + CycloneDX structure whitelist (422), component cap (`SBOM_INGEST_MAX_COMPONENTS`, 50k → 422), and an O(n) string-aware byte nesting-depth pre-check so a deeply nested document is a clean 422 instead of `RecursionError` → 500 from `json.loads`. RFC 7807 throughout.
- Atomic: `flush` wins the active-scan race before the file is written; a 409 loser writes no file; commit-race deletes the file; enqueue failure flips the row to `failed` → 503.
Celery task
`ingest_sbom_task` reuses `persist_sbom_components` → `run_trivy_sbom` → `persist_trivy_findings` → `mark_succeeded` (ref-keyed supersede). Preserves the uploaded SBOM as a durable `sbom_cyclonedx` `ScanArtifact` (so the signature/bundle surface works) and containment-guards the path under `workspace_root()`.
Filled: components, vulnerabilities (Trivy), declared licenses, dependency graph, build gate.
Not filled (documented): scancode-detected / registry-concluded licenses, cosign signing, source preservation; these need a source tree.
Security — Producer-Reviewer
`security-reviewer` ran (CLAUDE.md §7): 0 Critical/High, 2 Medium, 2 Low, 1 Info. Addressed in this PR:
- `bind_audit_team(project.team_id)` before the scan INSERT so the audit row carries `team_id` (was NULL, dropping ingest mutations out of team-scoped audit views).
- Disk-write failure → 503 `SbomIngestStorageError` (retryable), not a misleading 422.
- `release` / `original_filename` length-capped + control-byte stripped (parity with `trigger_scan`'s `mask_pii`).
Deferred follow-ups (tracked, non-blocking):
- `maxRequestBodyBytes` devops change covering both upload surfaces.
- `persist_sbom_components` (contained: fails only the attacker's own scan). Harden the shared persist path to skip-and-log; affects the cdxgen pipeline too, so separate.
- `type` URIs use `docs.trustedoss.io` (consistent with all 33 existing ones); fold into the broader TRUSCA rebrand sweep.
Tests
Docs
EN + KO `ci-integration/sbom-upload.md` (ko-style lint 0 findings, Docusaurus builds clean): contract, curl, auth (Bearer not `X-Api-Key`), filled/not-filled, limits, RFC 7807 errors, DT-incompat caution.
Verification
- `mypy .` (full): clean (447 files).
- `ruff check`: clean.
- `test (backend)` (needs Postgres+Redis).
Follow-up (separate PR)
Frontend `kind="sbom"` badge label + EN/KO i18n.