Skip to content

[REVIEW] api-security: add multipart file-upload parser and archive-bomb gates #1215

@alejandrorivas-pixel

Description

@alejandrorivas-pixel

Skill Being Reviewed

Skill name: api-security
Skill path: skills/appsec/api-security/

False Positive Analysis

Benign API behavior that can be over-reported:

endpoint: POST /api/documents/upload
request:
  content_type: multipart/form-data
controls:
  max_request_body_size: 10MB
  max_file_count: 3
  extension_allowlist: [pdf, png]
  magic_number_validation: enabled
  content_type_header_trusted: false
  archive_extraction: disabled
  storage:
    location: object_storage
    filename_source: server_generated_uuid
    executable_permissions: false
  malware_scan: async_quarantine_before_release

Why this is a false positive:
The current skill maps API4 resource consumption and API8 parser misconfiguration well, but it does not provide a clear evidence path for multipart/file-upload endpoints. A reviewer may over-report every file upload as unsafe when it already has size, count, signature, storage, and quarantine controls. The skill needs explicit benign evidence fields so secure upload endpoints are classified accurately.

Coverage Gaps

Missed variant 1: trusting multipart Content-Type and filename extension

app.post('/api/upload', upload.single('file'), async (req, res) => {
  if (req.file.mimetype !== 'image/png') return res.status(400).end();
  await fs.promises.writeFile(`/var/www/uploads/${req.file.originalname}`, req.file.buffer);
  res.json({ url: `/uploads/${req.file.originalname}` });
});

Why it should be caught:
Attackers can spoof Content-Type, use polyglot files, inject path traversal through filenames, or upload active content that is later served from the application origin. The current main skill does not require content signature validation, server-generated names, storage isolation, executable-permission checks, or download response hardening.

Missed variant 2: archive extraction accepts ZIP bombs or path traversal entries

@app.post('/api/import')
def import_zip():
    z = zipfile.ZipFile(request.files['archive'])
    z.extractall('/srv/imports')
    return {'imported': len(z.infolist())}

Why it should be caught:
Archive extraction can cause resource exhaustion, overwrite files through ../ entries, or expand far beyond upload size limits. API4 guidance covers resource consumption generally, but the skill should explicitly require compressed/uncompressed size ratio limits, entry count/depth limits, canonical path checks, and extraction into an isolated workspace.

Missed variant 3: parser differential between gateway, framework, and downstream scanner

Gateway limit: 10MB request body
Application parser: accepts unlimited multipart parts after buffering
AV scanner: scans only first file part
Business logic: processes all parts and stores original filename

Why it should be caught:
Multipart and parser-boundary issues often come from inconsistent limits across layers. The API review should bind gateway, application parser, scanner/quarantine, and storage behavior into one evidence gate rather than accepting a single gateway limit as proof.

Edge Cases

  • Some APIs accept uploads but never parse or serve the file. The report should allow low-risk classification when files are encrypted, quarantined, and only passed to a trusted offline workflow.
  • Image resizing or document conversion can be the real vulnerable parser, not the upload endpoint itself. Review should include downstream processors.
  • Asynchronous malware scanning is acceptable only if the file remains quarantined and inaccessible until the scan completes.
  • Object storage is safer than app-server storage only if bucket policy, random object keys, content-disposition, and executable-content serving controls are verified.
  • Content-type validation should not rely solely on Content-Type or user-provided filename extension.

Remediation Quality

  • Fix resolves the vulnerability
  • Fix doesn't introduce new security issues
  • Fix doesn't break functionality
  • Issues found: Add a file-upload/parser-boundary gate to api-security requiring evidence for size/count limits, content signature validation, extension allowlist, archive extraction safety, parser-layer consistency, quarantine/scanning, server-generated filenames, storage isolation, and safe download headers.

Severity guidance should distinguish:

  • High: unauthenticated or broadly authenticated upload endpoint stores attacker-controlled filenames/content under a web-served path, trusts Content-Type, or extracts archives without bounds/path checks.
  • Medium: upload controls exist at one layer but parser/scanner/storage limits are inconsistent or not evidenced.
  • Low: secure controls exist but report lacks complete evidence.
  • Informational: uploads are quarantined/offline with documented no-serve/no-extract behavior.

Comparison to Other Tools

Tool / Framework Catches this? Notes
Semgrep Partial Can detect direct extractall, originalname, or unsafe writes, but cannot prove layered upload limits and storage controls alone.
CodeQL Partial Can model path traversal and unsafe archive extraction in some languages, but API upload review needs architecture evidence.
OWASP API Top 10:2023 Partial API4 covers resource consumption and API8 covers misconfiguration, but the skill needs concrete file-upload parser gates.
OWASP File Upload Cheat Sheet Yes Provides concrete controls for extension validation, content-type/signature checks, storage, antivirus/sandboxing, and malicious file risks.
Manual API review Yes Can bind gateway, framework, parser, scanner, storage, and download behavior together.

Overall Assessment

Strengths:

  • Strong mapping to OWASP API Top 10:2023.
  • Good GraphQL coverage for introspection, query depth, field authorization, and alias abuse.
  • Good C# supplement note that upload endpoints need size/type/count validation.

Needs improvement:

  • Main skill does not call out file upload or multipart parser boundaries.
  • Checklist lacks archive-bomb, path traversal, quarantine, object-storage, content signature, and safe download evidence.
  • Output format has no field to record parser-layer consistency across gateway, framework, scanner, storage, and downstream processors.

Priority recommendations:

  1. Add a File Upload and Multipart Parser Boundary section to the main skill.
  2. Add checklist fields under API4/API8 for upload limits, magic-number validation, archive extraction safety, server-generated names, quarantine/scanning, storage isolation, and safe download headers.
  3. Add vulnerable and benign fixtures for spoofed MIME/original filename writes, unsafe ZIP extraction, and controlled upload handling.

Sources checked:

Bounty Info

  • I have read and agree to the CONTRIBUTING.md bounty terms
  • Preferred payment method: Payment details can be provided privately after maintainer acceptance.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions