Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace stereoscope with using go-containerregistry directly #836

Merged
merged 27 commits into from
Mar 6, 2024

Conversation

another-rex
Copy link
Collaborator

@another-rex another-rex commented Mar 4, 2024

Replace stereoscope with using the lower level go-containerregistry library directly.

Stereoscope brings in a large list of dependencies to support functions that is not necessary for osv-scanner. This also gives us more low level control of the extraction and scanning.

It's possible to switch to a new fs.FS lockfile interface without too much work.

internal/image/image.go Dismissed Show resolved Hide resolved
Copy link
Collaborator

@oliverchang oliverchang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

@@ -41,61 +41,30 @@ require (

require (
dario.cat/mergo v1.0.0 // indirect
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome to see so many dependencies gone!

cmd/osv-scanner/main.go Show resolved Hide resolved
internal/image/image.go Outdated Show resolved Hide resolved
internal/image/image.go Outdated Show resolved Hide resolved
}

type FileMap struct {
fileNodeTrie *trie.PathTrie
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, how much memory do we expect to save on average with using a trie structure here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can check, but I doubt it's a significant amount, this is mainly used to allow us to iterate through directories.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain a bit more what you mean by "iterate through directories"?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E.g. if we want to scan a node_modules folder, we need to iterate through the contents of the directory.

It's mainly used for when we want to refactor and implement the fs.FS interface.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it! this trie dependency is fine for now, but if memory usage differences is negligible I wonder if this can be just simplified to a sorted array of paths that we can binary search on to enable fast lookups and directory listings.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like most of the memory usage is from extracting the tar.

@codecov-commenter
Copy link

codecov-commenter commented Mar 4, 2024

Codecov Report

Attention: Patch coverage is 69.54023% with 53 lines in your changes are missing coverage. Please review.

Project coverage is 60.40%. Comparing base (b28c1c8) to head (24c1a97).
Report is 17 commits behind head on main.

Files Patch % Lines
internal/image/image.go 67.67% 18 Missing and 14 partials ⚠️
internal/image/extractor.go 71.05% 7 Missing and 4 partials ⚠️
internal/image/scan.go 71.42% 3 Missing and 3 partials ⚠️
cmd/osv-scanner/main.go 0.00% 2 Missing ⚠️
internal/image/filemap.go 85.71% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #836      +/-   ##
==========================================
+ Coverage   59.78%   60.40%   +0.61%     
==========================================
  Files         136      142       +6     
  Lines       11268     9032    -2236     
==========================================
- Hits         6737     5456    -1281     
+ Misses       4102     3109     -993     
- Partials      429      467      +38     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

}

type FileMap struct {
fileNodeTrie *trie.PathTrie
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain a bit more what you mean by "iterate through directories"?

internal/image/image.go Outdated Show resolved Hide resolved
internal/image/image.go Outdated Show resolved Hide resolved
@@ -124,7 +124,7 @@
"position": {
"filename": "\u003cAny value\u003e",
"offset": -1,
"line": 840,
"line": 839,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to set the line number to a fixed value? so we don't need to update it again when go version changes.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's alright for now, this allows us to check whether the line result in govulncheck is returning correctly. And we shouldn't be updating go that often anyway, so updating it when we do should be fine.

@another-rex another-rex merged commit 095d41e into google:main Mar 6, 2024
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants