@chinyeungli
Contributor

@chinyeungli chinyeungli commented Apr 15, 2025

… golang

Signed-off-by: Chin Yeung Li <tli@nexb.com>
… of test code that will need to be removed).

Signed-off-by: Chin Yeung Li <tli@nexb.com>
```
pkg:golang/github.com/*
pkg:golang/gitlab.com/*
pkg:golang/bitbucket.org/*
```

Signed-off-by: Chin Yeung Li <tli@nexb.com>
Signed-off-by: Chin Yeung Li <tli@nexb.com>
 * Collect metadata from the API for the following namespaces:
 ```
 pkg:golang/github.com/*
 pkg:golang/gitlab.com/*
 pkg:golang/bitbucket.org/*
 ```
 * Add tests
 * Add "golang" to the "supported_ecosystems" list in api.py

Signed-off-by: Chin Yeung Li <tli@nexb.com>
@chinyeungli chinyeungli requested a review from JonoYang April 15, 2025 22:40
@chinyeungli chinyeungli changed the title from "596 add on demand package data collection for golang" to "add on demand package data collection for golang #596" Apr 15, 2025
Member

@JonoYang JonoYang left a comment

@chinyeungli I am looking at https://github.com/package-url/purl-spec/blob/main/PURL-SPECIFICATION.rst#rules-for-each-purl-component and I am not sure whether we can put gitlab.com in the package namespace. Otherwise, the code looks good.


if from_go_lang:
    packages[0].type = "golang"
    packages[0].namespace = "github.com/" + packages[0].namespace
Member

@chinyeungli could there be golang packages not from github?

Contributor Author

Yes. Only golang packages from GitHub go through this map_fetchcode_supported_package function.
The others use map_golang_package().

version = ""
if "@" in purl_str:
    version = purl_str.rpartition("@")[2]
subset = purl_str.partition("pkg:golang/gitlab.com/")[2].partition("@")[0]
Member

@pombredanne https://github.com/package-url/purl-spec/blob/main/PURL-SPECIFICATION.rst#rules-for-each-purl-component

Does this mean we cannot have things like gitlab.com in the namespace field?
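
To make the namespace question concrete, here is a minimal sketch of how the partition-based parsing in the snippet above splits a GitLab-hosted golang PURL. The function name `split_gitlab_purl` and the sample PURLs are made up for illustration; only the `partition`/`rpartition` logic comes from the diff.

```python
def split_gitlab_purl(purl_str):
    """Return (project_path, version) for a pkg:golang/gitlab.com/... PURL string.

    Hypothetical helper: it wraps the string handling shown in the diff,
    it is not a function from the pull request.
    """
    version = ""
    if "@" in purl_str:
        # Everything after the last "@" is treated as the version.
        version = purl_str.rpartition("@")[2]
    # Everything between the host prefix and the first "@" is the project path.
    subset = purl_str.partition("pkg:golang/gitlab.com/")[2].partition("@")[0]
    return subset, version


print(split_gitlab_purl("pkg:golang/gitlab.com/example-group/example-project@v1.2.3"))
print(split_gitlab_purl("pkg:golang/gitlab.com/example-group/example-project"))
```

With this approach the host is matched as a literal prefix of the string rather than read from a parsed namespace field. Note that a PURL carrying qualifiers or a subpath after the version would leave them attached to `version`, so a real implementation may prefer a proper PURL parser.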

chinyeungli and others added 6 commits April 16, 2025 15:04
Signed-off-by: Chin Yeung Li <tli@nexb.com>

Co-authored-by: Jono Yang <JonoYang@users.noreply.github.com>
Signed-off-by: Chin Yeung Li <tli@nexb.com>
Signed-off-by: Chin Yeung Li <tli@nexb.com>
Signed-off-by: Chin Yeung Li <tli@nexb.com>
license_text = package_data.get("licenses")
extracted_license_statement = [license_text]

download_url = (
Contributor

Go has some weird rules to encode upper case in these strings. See https://github.com/aboutcode-org/go-inspector/blob/442bc5b83d5aeff2b7a27937ec82b63277bc8f7c/src/go_inspector/utils.py#L211

We are adding support for getting the golang download URL in the PURL library. @pombredanne @chinyeungli I think we can reuse that here?
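
For context, the upper-case rule mentioned above is the Go module proxy's path escaping: each upper-case ASCII letter is written as `!` followed by its lower-case form. The sketch below illustrates that rule only; it is not the go-inspector or packageurl-python implementation, and the names `escape_module_path` and `proxy_zip_url` are hypothetical.

```python
def escape_module_path(path: str) -> str:
    """Escape upper-case ASCII letters as "!" + lower-case, as the Go module proxy expects."""
    return "".join("!" + ch.lower() if "A" <= ch <= "Z" else ch for ch in path)


def proxy_zip_url(module: str, version: str, base: str = "https://proxy.golang.org") -> str:
    """Build a module zip URL of the form <base>/<module>/@v/<version>.zip.

    Hypothetical helper; the default base URL is an assumption and can be
    swapped for another proxy.
    """
    return f"{base}/{escape_module_path(module)}/@v/{escape_module_path(version)}.zip"


print(escape_module_path("github.com/BurntSushi/toml"))
print(proxy_zip_url("github.com/BurntSushi/toml", "v1.3.2"))
```

Once the upstream helper is available, calling it instead of a local copy keeps the escaping rule in one place.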

Contributor

Here is the PR for that: package-url/packageurl-python#195

Contributor Author

@TG1999 Thanks. We can use the code there once it's merged.

chinyeungli and others added 5 commits July 29, 2025 15:30
Signed-off-by: Chin Yeung Li <tli@nexb.com>
…ages #596

Signed-off-by: Chin Yeung Li <tli@nexb.com>
    * This is so we can use the updated packageurl-python library

Signed-off-by: Jono Yang <jyang@nexb.com>
 * purldb depends on scancodeio which depends on sctk 32.4.0 (scancodeio 35.1.0 depends on scancode-toolkit==32.4.0)

Signed-off-by: Chin Yeung Li <tli@nexb.com>
Signed-off-by: Chin Yeung Li <tli@nexb.com>
@pombredanne pombredanne changed the title from "add on demand package data collection for golang #596" to "add on demand package data collection for golang, gitlab and bitbucket #596" Sep 2, 2025
for item in data["values"]:
    version = item["name"]
    author = ""
    if "target" in item and item["target"]:
Contributor

What about using .get here?

target = item.get("target") or {}
author = target.get("author") or {}
if author.get("type") == "author":
    user = author.get("user") or {}
    author_display_name = user.get("author")
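
A self-contained sketch of the same idea, showing that the `.get(...) or {}` chain tolerates both missing keys and explicit `None` values. The function name `author_display_name` and the sample payloads are made up for illustration, and the final `"display_name"` key is an assumption about the upstream response, so adjust it to the real payload.

```python
def author_display_name(item):
    """Return the tag author's display name, or "" when any level is missing.

    Hypothetical helper; "display_name" is an assumed key, not confirmed
    by the pull request.
    """
    # "or {}" covers both a missing key and a key whose value is None.
    target = item.get("target") or {}
    author = target.get("author") or {}
    if author.get("type") != "author":
        return ""
    user = author.get("user") or {}
    return user.get("display_name") or ""


complete = {
    "name": "v1.0.0",
    "target": {"author": {"type": "author", "user": {"display_name": "Jane Doe"}}},
}
print(author_display_name(complete))
print(repr(author_display_name({"name": "v1.0.1", "target": None})))
```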

break

for tag in data:
    version_list.append(tag["name"])
Contributor

If the "name" property does not exist on a tag, it should not crash. We should log and continue.
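
One way the log-and-continue behavior could look, as a rough sketch. The function name `collect_tag_names`, the logger setup, and the sample data are assumptions for illustration, not code from the pull request.

```python
import logging

logger = logging.getLogger(__name__)


def collect_tag_names(tags):
    """Collect tag names, logging and skipping entries without a usable "name"."""
    names = []
    for tag in tags:
        name = tag.get("name")
        if not name:
            # A missing name hints that the upstream contract changed; record it and move on.
            logger.warning("Skipping tag without a 'name': %r", tag)
            continue
        names.append(name)
    return names


print(collect_tag_names([{"name": "v1.0.0"}, {"commit": {}}, {"name": "v1.1.0"}]))
```

The warning keeps a trace of malformed entries without aborting the collection of the remaining versions.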

data = response.json()
version_author_list = []
# Get all available versions
for item in data:
Contributor

IMO .get would be a better option.

]
data = response.json()
# Search for license files in the root directory
for item in data["values"]:
Contributor

IMO .get would be a better choice here too.

Contributor

@TG1999 TG1999 left a comment

@chinyeungli This mostly looks good to me. IMO we should use .get instead of directly getting an item with dict["foo"], so we can log a missing key and know whenever the upstream contract changes.

Signed-off-by: Chin Yeung Li <tli@nexb.com>
Signed-off-by: Chin Yeung Li <tli@nexb.com>
Signed-off-by: Chin Yeung Li <tli@nexb.com>
Contributor

@TG1999 TG1999 left a comment

LGTM! Thanks

@TG1999 TG1999 merged commit edff9e1 into main Sep 9, 2025
3 of 6 checks passed
@pombredanne pombredanne deleted the 596_add_on-demand_package_data_collection_for_golang branch January 23, 2026 16:48
Development

Successfully merging this pull request may close these issues.

Add on-demand package data collection for golang

3 participants