feat(sbom): accept SPDX (JSON / Tag-Value) on the ingest endpoint (model 3) #411
Merged
…del 3)
#406 ingest accepted CycloneDX-JSON only; the SPDX→CycloneDX converter landed in #409 but was unwired. This enables SPDX end-to-end:
- sbom_ingest_service.validate_uploaded_sbom: format-dispatch validator. The O(n) byte-depth pre-check runs BEFORE any json.loads (incl. detect_format's) so a deeply-nested document is a clean 422, never a RecursionError → 500. CycloneDX keeps its existing structural gate; SPDX-JSON bounds its packages[] array (same cap as components[]); SPDX Tag-Value is bounded by the read cap. Content-type / filename allow-list gains SPDX media types + .spdx/.tag (NOT the over-broad text/plain). unknown / RDF / XML → 422.
- tasks/ingest_sbom._load_uploaded_sbom: maps the upload to a CycloneDX dict via sbom_convert.to_cyclonedx (CycloneDX passes through; SPDX JSON/TV is mapped) for persist_sbom_components. The ORIGINAL bytes stay on disk and are handed to Trivy, which auto-detects CycloneDX vs SPDX, so there is no lossy round-trip for matching.
- Tests: validate_uploaded_sbom unit cases (real syft SPDX fixtures + adversarial RDF/XML/depth/cap, all local-runnable); pipeline test ingests a real SPDX-JSON → components persisted + conformance source_format='spdx-json'; API tests assert SPDX-JSON and SPDX Tag-Value uploads return 202 (the fake-bomFormat:SPDX 422 case is unchanged; real SPDX uses spdxVersion).
- docs (EN/KO): the SBOM-upload guide now documents SPDX acceptance, the media type / filename allow-list, and SPDX RDF/XML being unsupported.
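To illustrate why the depth pre-check has to run before any json.loads, here is a minimal sketch of a single-pass byte scan. It is not the PR's code: the names check_nesting_depth, parse_bounded, SbomTooDeepError and the MAX_DEPTH value are hypothetical, and the sketch assumes UTF-8 input, where bracket and quote bytes can never appear inside a multi-byte character.

```python
import json

MAX_DEPTH = 64  # hypothetical limit; the PR does not state the real value


class SbomTooDeepError(ValueError):
    """Hypothetical error type; the endpoint would translate it into a 422."""


def check_nesting_depth(raw: bytes, max_depth: int = MAX_DEPTH) -> None:
    """Scan the bytes once, tracking {/[ nesting outside JSON strings."""
    depth = 0
    in_string = False
    escaped = False
    for byte in raw:
        if in_string:
            if escaped:
                escaped = False          # this byte is escaped, skip it
            elif byte == 0x5C:           # backslash starts an escape
                escaped = True
            elif byte == 0x22:           # unescaped quote closes the string
                in_string = False
        elif byte == 0x22:               # quote opens a string
            in_string = True
        elif byte in (0x7B, 0x5B):       # '{' or '['
            depth += 1
            if depth > max_depth:
                raise SbomTooDeepError(f"nesting deeper than {max_depth}")
        elif byte in (0x7D, 0x5D):       # '}' or ']'
            depth = max(depth - 1, 0)    # malformed input is left to json.loads


def parse_bounded(raw: bytes) -> object:
    """Run the depth gate first, then hand the bytes to the real parser."""
    check_nesting_depth(raw)
    return json.loads(raw)
```

Because the scan is iterative, its cost is linear in the upload size and it cannot itself hit the interpreter's recursion limit, which is the property the commit message relies on.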
Model 3 (received SBOM): enable SPDX input
The #406 ingest accepted only CycloneDX-JSON, and the SPDX→CycloneDX converter had existed since #409 but was not wired in. This PR enables SPDX (JSON and Tag-Value) end-to-end.
Included
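- sbom_ingest_service.validate_uploaded_sbom: format-dispatch validator. The O(n) byte-depth pre-check runs before detect_format (which calls json.loads internally), so a deeply nested document yields a clean 422 rather than a RecursionError (500). CycloneDX keeps its existing structural gate; SPDX-JSON limits packages[] to the same cap as components[]; SPDX Tag-Value limits the number of PackageName: entries to that same cap (this addresses the security review's Medium finding: the byte cap alone would let roughly 2.4M packages through). The content-type / filename allow-list gains the SPDX types plus .spdx/.tag (the overly broad text/plain is excluded). unknown, RDF, and XML → 422.
- tasks/ingest_sbom._load_uploaded_sbom: normalizes the upload into a CycloneDX dict via sbom_convert.to_cyclonedx (for component persistence). The original bytes stay on disk untouched and are passed to Trivy; because Trivy auto-detects CycloneDX vs SPDX, matching is lossless.
- validate_uploaded_sbom unit tests (real syft SPDX fixtures + adversarial RDF/XML/depth/count cap, all runnable locally); the pipeline test ingests a real SPDX-JSON → components persisted + conformance source_format='spdx-json'; the API tests confirm that SPDX-JSON and Tag-Value uploads return 202 (the fake bomFormat:"SPDX" 422 case is unchanged; real SPDX uses spdxVersion).
Security review (Producer-Reviewer)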
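The Tag-Value count cap (the fix for the Medium finding) could be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: count_tag_value_packages, TooManyPackagesError and the MAX_PACKAGES value are hypothetical names and numbers, and the real cap is whatever components[] already uses.

```python
import io

MAX_PACKAGES = 50_000  # hypothetical; stands in for the existing components[] cap


class TooManyPackagesError(ValueError):
    """Hypothetical error type; the endpoint would translate it into a 422."""


def count_tag_value_packages(raw: bytes, cap: int = MAX_PACKAGES) -> int:
    """Count lines that start a package section, failing fast once past the cap.

    The match is deliberately conservative: a line inside a multi-line
    <text> value that happens to start with "PackageName:" is counted too,
    so the check can over-count but never under-count.
    """
    count = 0
    for line in io.BytesIO(raw):  # iterates line by line, no full split in memory
        if line.lstrip().startswith(b"PackageName:"):
            count += 1
            if count > cap:
                raise TooManyPackagesError(f"more than {cap} packages")
    return count
```

Stopping at the first line past the cap keeps the work bounded even when the upload sits right at the byte limit, which is the case the byte cap alone did not cover.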
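security-reviewer: PASS (Critical 0 / High 0). The one Medium finding (missing Tag-Value count cap) is fixed in this PR. The two Low findings are documented and accepted: the missing depth guard on the worker's re-parse is covered by the task catch-all, and the Tag-Value single-line sniff is accepted under the model in which Trivy is the authority on content. Permission × state ordering, path container escape, ReDoS, and secret logging all PASS.
Verification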