Debian PURLs fail to resolve in ScanCode.io / FetchCode #191

@mallurivikas

While working with ScanCode.io and FetchCode over the past week, I ran into an issue with Debian PURL resolution.

Debian PURLs (e.g. pkg:deb/debian/bash@5.2.15) consistently fail to resolve.
This happens both locally and on a ScanCode.io server instance.

To reproduce in ScanCode.io

  1. Open ScanCode.io
  2. Create a new project
  3. In Inputs, add: "pkg:deb/debian/bash@5.2.15"
  4. Select the scan_single_package pipeline
  5. Click Create

Result
The pipeline fails during download_missing_inputs.

Error screenshots
(two images, not reproduced here)

To reproduce in FetchCode
In WSL, activate your venv in the fetchcode directory.

In Python:

>>> from fetchcode import fetch
>>> fetch("pkg:deb/debian/bash@5.2.15")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/vikas/fetchcode/src/fetchcode/__init__.py", line 148, in fetch
    url, scheme = get_resolved_url(url, scheme)
  File "/home/vikas/fetchcode/src/fetchcode/__init__.py", line 117, in get_resolved_url
    url, scheme = resolution_handler(url)
  File "/home/vikas/fetchcode/src/fetchcode/__init__.py", line 128, in resolve_url_from_purl
    raise ValueError("Could not resolve PURL to a valid URL.")
ValueError: Could not resolve PURL to a valid URL.

>>> from fetchcode.download_urls import download_url
>>> download_url("pkg:pypi/requests@2.31.0")
'https://files.pythonhosted.org/.../requests-2.31.0.tar.gz'
>>> download_url("pkg:deb/debian/bash@5.2.15")
(no output)

FetchCode cannot resolve Debian PURLs.
The failure happens at the PURL → URL resolution step.
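
To make the comparison above repeatable across several PURLs, here is a minimal sketch of a helper that records which ones resolve. The name resolution_report is hypothetical and not part of FetchCode. The resolver is passed in as a parameter so the helper does not depend on any particular FetchCode function. It assumes a resolver signals failure either by returning nothing or by raising ValueError, as seen in the session above.

```python
def resolution_report(purls, resolver):
    """Map each PURL to the URL the resolver returns, or None on failure.

    `resolver` is any callable taking a PURL string. A ValueError or an
    empty return value is treated as "could not resolve".
    """
    report = {}
    for purl in purls:
        try:
            # Normalise empty results (None, "") to None.
            report[purl] = resolver(purl) or None
        except ValueError:
            # fetch() raised ValueError for the Debian PURL in the traceback.
            report[purl] = None
    return report


def unresolved(report):
    """Return the PURLs that did not resolve, in input order."""
    return [purl for purl, url in report.items() if url is None]
```

With FetchCode installed, passing download_url (imported as in the session above) as the resolver should list the Debian PURL as unresolved and the PyPI one as resolved.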

Observations
FetchCode currently does not resolve Debian PURLs to concrete .deb download URLs
ScanCode.io relies on this resolution during download_missing_inputs
As a result, Debian PURLs fail early in the pipeline
Debian packages commonly appear in SBOMs, so this limits coverage

Related context
There are prior discussions and partially related work around Debian support in this repository (#89),
but based on current behavior, Debian PURL resolution is still incomplete or not fully wired through FetchCode → ScanCode.io.

Proposed Approach
I'll start by identifying the appropriate Debian metadata sources and then add support for resolving Debian PURLs to .deb download URLs.
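
As a rough illustration of what the URL-building step could look like, here is a minimal sketch that derives a candidate .deb URL from a Debian PURL using the conventional Debian archive pool layout (pool/<component>/<prefix>/<source>/<binary>_<version>_<arch>.deb). This is not FetchCode's implementation. The function names, the deb.debian.org mirror, and the default "main" component are all assumptions. It also assumes the PURL carries the full Debian version and an arch qualifier, and that the source package name equals the binary name unless given. A real resolver would need archive metadata to confirm these and to check that the file exists.

```python
from urllib.parse import parse_qsl, unquote

# Assumption: a standard Debian mirror; not taken from FetchCode.
DEBIAN_MIRROR = "https://deb.debian.org/debian"


def parse_deb_purl(purl):
    """Split a pkg:deb PURL into (namespace, name, version, qualifiers)."""
    if not purl.startswith("pkg:deb/"):
        raise ValueError(f"Not a deb PURL: {purl!r}")
    remainder = purl[len("pkg:deb/"):]
    remainder, _, _subpath = remainder.partition("#")
    remainder, _, query = remainder.partition("?")
    path, _, version = remainder.partition("@")
    namespace, _, name = path.rpartition("/")
    return namespace, unquote(name), unquote(version), dict(parse_qsl(query))


def pool_prefix(source_name):
    """Pool directory prefix: 'libx' for libxml2, 'b' for bash."""
    return source_name[:4] if source_name.startswith("lib") else source_name[:1]


def candidate_deb_url(purl, component="main", source_name=None):
    """Build a candidate pool URL; it is not checked against the archive."""
    namespace, name, version, qualifiers = parse_deb_purl(purl)
    if namespace != "debian":
        raise ValueError("This sketch only handles the debian namespace")
    arch = qualifiers.get("arch")
    if not version or not arch:
        raise ValueError("A version and an arch qualifier are required")
    source = source_name or name  # assumption: source name == binary name
    file_version = version.split(":", 1)[-1]  # epochs are not in file names
    return (
        f"{DEBIAN_MIRROR}/pool/{component}/{pool_prefix(source)}/"
        f"{source}/{name}_{file_version}_{arch}.deb"
    )
```

Note that the PURL from this report, pkg:deb/debian/bash@5.2.15, has no arch qualifier, so this sketch rejects it rather than guessing. Deciding how to fill in such missing details is part of the metadata lookup described above.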
