@tamilari (Contributor) commented Nov 7, 2025

feat: add Vcs information to sbom

For SPDX, add the Vcs information to the 'download_location' field, formatted according to the SPDX documentation. The URL of the web interface for browsing the repository cannot be added, since SPDX has no field for this information.

For CDX, add the Vcs information and the URL of the web interface to the 'external_refs' object with the type set to 'vcs'.

This Vcs information can be useful for analysing the packages (e.g. how well a package is maintained and what the risk factor is).

vcs_browser: str | None = None,
vcs_git: str | None = None,
vcs: str | None = None,
vcs_type: VCS_Type | None = None,

Member

Can we please combine this into a single field with a custom type?

if subdir:
    download_location += f"#{subdir.group(1)}"

return download_location

Member

Are the entries in the .dsc files (and the like) usually precise, or do they just link to the upstream repo? My interpretation of https://spdx.github.io/spdx-spec/v2.3/package-information/#77-package-download-location-field is that the field is generally allowed to be imprecise, but I would also like to hear the opinion of @gernot-h.

Collaborator

IMO this should link to the snapshot download location of the .dsc file. We are representing a source package, and I would not consider a link to the repository a location where I can download it.

Contributor Author

I also thought that this field wasn't the best fit for VCS information, but the description and examples provide such detailed instructions on how to format the VCS information. I couldn't find a better place for it, and I assume that we need this information in the SPDX format as well. Alternatively, maybe it could go in the source information field (https://spdx.github.io/spdx-spec/v2.3/package-information/#712-source-information-field)?

Collaborator

What speaks against using the external reference field?

Collaborator

While we are at it, there is documentation now about what fields are modeled how. See the Mapping of Debian Source Packages to SBOM Packages/Components section in docs/source/design-decisions.rst, you need to add the additional information there too.

Contributor Author

> What speaks against using the external reference field?

I can't find a suitable category and type. The only option there would be to use the 'Other' category.

Contributor Author

> While we are at it, there is documentation now about what fields are modeled how. See the Mapping of Debian Source Packages to SBOM Packages/Components section in docs/source/design-decisions.rst, you need to add the additional information there too.

Thanks for the reminder. I'll do that once we've agreed on a field.

Collaborator

Yes, I would do something like category = Other and type = vcs

Member

Sorry, I'm having a hard time completely following the theoretical discussion here, so I can't provide a final opinion. Can you please provide two or three short examples of what the final SPDX and CDX output would look like?

Contributor Author

For CDX, it is pretty clear where to add the Debian Version Control System fields Vcs-Browser and Vcs-<type>: there is a vcs type for the externalReferences field. So a component would look something like this:

{
    "externalReferences": [
        {
            "comment": "URL of a web interface for browsing the repository",
            "type": "vcs",
            "url": "https://salsa.debian.org/perl-team/modules/packages/libxml-sax-expat-perl"
        },
        {
            "comment": "Version control system of type Git",
            "type": "vcs",
            "url": "https://salsa.debian.org/perl-team/modules/packages/libxml-sax-expat-perl.git"
        },
        {
            "comment": "homepage",
            "type": "website",
            "url": "https://metacpan.org/release/XML-SAX-Expat"
        }
    ],
    "name": "libxml-sax-expat-perl",
    "version": "0.51-2"
}

The only problem there is that you can't differentiate between the Vcs-Browser field and the regular Vcs-<type> field based on the type alone.

For SPDX, this is less clear, since there isn't a suitable category or type for this in the externalRefs field. Therefore, we would use the Other category. There, we can use any type, so a package could look like this:

{
    "downloadLocation": "NOASSERTION",
    "externalRefs": [
        {
            "referenceCategory": "PACKAGE_MANAGER",
            "referenceLocator": "pkg:deb/debian/libxml-sax-expat-perl@0.51-2?arch=source",
            "referenceType": "purl"
        },
        {
            "comment": "URL of a web interface for browsing the repository",
            "referenceCategory": "OTHER",
            "referenceLocator": "https://salsa.debian.org/perl-team/modules/packages/libxml-sax-expat-perl",
            "referenceType": "vcsBrowser"
        },
        {
            "comment": "Version control system of type Git",
            "referenceCategory": "OTHER",
            "referenceLocator": "https://salsa.debian.org/perl-team/modules/packages/libxml-sax-expat-perl.git",
            "referenceType": "vcs"
        }
    ],
    "name": "libxml-sax-expat-perl",
    "versionInfo": "0.51-2"
}

Previously, I had it in the download location field. This is partly because there is such a detailed explanation on how to format the string for all the different version control systems. However, we feel that the name of this field is not appropriate.

{
    "downloadLocation": "git+https://salsa.debian.org/perl-team/modules/packages/libxml-sax-expat-perl.git",
    "externalRefs": [
        {
            "referenceCategory": "PACKAGE_MANAGER",
            "referenceLocator": "pkg:deb/debian/libxml-sax-expat-perl@0.51-2?arch=source",
            "referenceType": "purl"
        }
    ],
    "name": "libxml-sax-expat-perl",
    "versionInfo": "0.51-2"
}
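
For reference, a sketch of how a Debian Vcs-Git value could be turned into the SPDX locator form `<vcs_tool>+<url>[@<branch>][#<sub_path>]` used in the download-location variant. The function name is hypothetical, and the `-b <branch>` handling is an assumption added here; only the `#<subdir>` suffix appears in the reviewed diff.

```python
import re


def spdx_download_location(vcs_tool: str, vcs_value: str) -> str:
    """Format a Debian Vcs-<type> value as an SPDX VCS download location.

    Debian allows "URL [-b BRANCH] [[SUBDIR]]" for Vcs-Git, i.e. an optional
    branch after "-b" and an optional subdirectory in square brackets.
    """
    url = vcs_value.split()[0]
    branch = re.search(r"\s-b\s+(\S+)", vcs_value)
    subdir = re.search(r"\[(.+?)\]", vcs_value)

    download_location = f"{vcs_tool}+{url}"
    if branch:
        download_location += f"@{branch.group(1)}"
    if subdir:
        download_location += f"#{subdir.group(1)}"
    return download_location
```

For the package above, `spdx_download_location("git", "https://salsa.debian.org/perl-team/modules/packages/libxml-sax-expat-perl.git")` yields the `downloadLocation` string shown in the last example.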

@tamilari force-pushed the fm/vcs-link branch 2 times, most recently from 89bc06e to 88dedcf on November 7, 2025 17:46
),
],
)
def test_format_vcs(input, expected):

Collaborator

Could we also have a test that verifies the parsing works as expected? The setup should be available in the apt-sources root; from there you could simply check whether it parsed correctly into a package.
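
A self-contained sketch of the kind of check meant here. It does not use the project's parser or its apt-sources test setup: `parse_vcs_fields` and the hand-written stanza are illustrative assumptions, and the real test would assert on the package object produced by the existing parsing code.

```python
# Hand-written, abbreviated stanza for illustration only (not a real .dsc file).
DSC_SNIPPET = """\
Source: libxml-sax-expat-perl
Version: 0.51-2
Homepage: https://metacpan.org/release/XML-SAX-Expat
Vcs-Browser: https://salsa.debian.org/perl-team/modules/packages/libxml-sax-expat-perl
Vcs-Git: https://salsa.debian.org/perl-team/modules/packages/libxml-sax-expat-perl.git
"""


def parse_vcs_fields(stanza: str) -> dict[str, str]:
    """Collect all Vcs-* fields of a deb822-style stanza (simplified)."""
    fields = {}
    for line in stanza.splitlines():
        # Skip continuation lines and anything that is not "Key: value".
        if line[:1] in (" ", "\t") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        if key.lower().startswith("vcs-"):
            fields[key] = value.strip()
    return fields


def test_parse_vcs_fields():
    fields = parse_vcs_fields(DSC_SNIPPET)
    assert set(fields) == {"Vcs-Browser", "Vcs-Git"}
    assert fields["Vcs-Git"].endswith("libxml-sax-expat-perl.git")
    assert fields["Vcs-Browser"].endswith("libxml-sax-expat-perl")
```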

For CDX, add the Vcs information and the URL of the web interface to the
'external_refs' object with the type set to 'vcs'.

For SPDX, there isn't a suitable category or type for this in the
externalRefs field. Therefore, we use the Other category with the
custom types 'vcs' and 'vcsBrowser'.

This Vcs information can be useful for analysing the packages (e.g. how
well a package is maintained and what the risk factor is).

Signed-off-by: Tamino Larisch <[email protected]>