Remove potential Go module versions from shortened names #571
Codecov Report
@@ Coverage Diff @@
## master #571 +/- ##
==========================================
- Coverage 67.14% 67.13% -0.01%
==========================================
Files 78 78
Lines 14072 14074 +2
==========================================
Hits 9449 9449
- Misses 3788 3789 +1
- Partials 835 836 +1
Thank you for this PR!
internal/graph/graph.go
Outdated
@@ -34,6 +34,8 @@ var (
// Removes package name and method arguments for Go function names.
// See tests for examples.
goRegExp = regexp.MustCompile(`^(?:[\w\-\.]+\/)+(.+)`)
// Checks for a package name that could be a module version.
goVerRegExp = regexp.MustCompile(`^v[2-9]+\.`)
I think that a regex like `(?:^(?:[\w\-\.]+\/)+((?:[\w\-\.]+\/v[0-9]+)(?:\.[^.\n]+){2})$)|(?:^(?:[\w\-\.]+\/)+(.+))` could be used for `goRegExp` rather than adding `goVerRegExp`. Though, I think @aalexand should make the call as to whether we'd prefer to add code or add a more complex regexp.
Otherwise, I think `goVerRegExp` would miss "v14" and "v10". Perhaps `^v[0-9]+\.` or `^v[2-9][0-9]+\.` would match better?
Oh, you're right, I brainfarted on that one. Should have been `^v([2-9]|[1-9][0-9]+)\.`.
I switched it to a corrected regex, but if there's some more complicated one that works better it can be used. I know my brain shuts down trying to read that long one... 🙂
The key thing is to allow v2, v3, v10, v1234, etc, but not v0 or v1.
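For illustration only (not part of the PR), the two patterns quoted in this thread can be compared side by side. The variable names `firstVerRegExp` and `fixedVerRegExp` and the sample inputs are assumptions made for this sketch; only the two pattern strings come from the discussion above.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Pattern from the first revision of the diff.
	firstVerRegExp = regexp.MustCompile(`^v[2-9]+\.`)
	// Corrected pattern proposed later in the thread.
	fixedVerRegExp = regexp.MustCompile(`^v([2-9]|[1-9][0-9]+)\.`)
)

func main() {
	// Hypothetical shortened names, one per version prefix of interest.
	samples := []string{
		"v0.Foo", "v1.Foo", "v2.Foo", "v3.Foo",
		"v10.Foo", "v14.Foo", "v1234.Foo",
	}
	for _, s := range samples {
		fmt.Printf("%-10s first=%-5t fixed=%t\n",
			s, firstVerRegExp.MatchString(s), fixedVerRegExp.MatchString(s))
	}
}
```

The first pattern only accepts digits 2 through 9, so it rejects `v10` and `v14`; the corrected one accepts any single digit from 2 to 9 or any multi-digit number without a leading zero, which excludes `v0` and `v1`.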
`(?:^(?:[\w\-\.]+\/)+((?:[\w\-\.]+\/v(?:[2-9]|[1-9][0-9]+)+)(?:\.[^.\n]+){2})$)|(?:^(?:[\w\-\.]+\/)+(.+))` is the long one with the restricted version selection; that'd be a one-line change and it appears to pass the tests. I'm happy to use it instead.
internal/graph/graph_test.go
Outdated
@@ -451,6 +451,18 @@ func TestShortenFunctionName(t *testing.T) {
"github.com/blah/blah/vendor/gopkg.in/redis.v3.(*baseClient).(github.com/blah/blah/vendor/gopkg.in/redis.v3.process)-fm",
"redis.v3.(*baseClient).(github.com/blah/blah/vendor/gopkg.in/redis.v3.process)-fm",
},
{
"github.com/jackc/pgx/v4.(*Conn).Query",
Would you consider using more abstract test case names? (To be in keeping with the style of existing tests)
Sure thing.
internal/graph/graph.go
Outdated
end = idx
}
}

Perhaps remove this line break.
Sure, if you want. I view the above block as calculating `end`, so that was my thinking.
@aalexand -- Did some initial review; wanted to know your thoughts on the approach; specifically if it makes sense to try to use a single regexp or if adding an additional function (as is done here) is the right approach.
@@ -451,6 +451,30 @@ func TestShortenFunctionName(t *testing.T) {
"github.com/blah/blah/vendor/gopkg.in/redis.v3.(*baseClient).(github.com/blah/blah/vendor/gopkg.in/redis.v3.process)-fm",
"redis.v3.(*baseClient).(github.com/blah/blah/vendor/gopkg.in/redis.v3.process)-fm",
},
{
"github.com/foo/bar/v4.(*Foo).Bar",
Curious: why is the version string sometimes a separate subdirectory and sometimes a prefix of the package name? Is this something that the package owners choose? Are the options restricted to these two, or are there more?
Oh, I guess it's a function of how deep below the versioning level the actual symbol is?
I think you're referring to the tests; some of the tests I've added are where the "version" isn't a version at all. The only valid versions in paths are "v2", "v3", ... "v1234", etc. So you'd have `github.com/foo/bar`, `github.com/foo/bar/v2`, `github.com/foo/bar/v3`, and so on, then subpackages like `github.com/foo/bar/v3/baz`. Custom domains mean you can have things like `gotest.tools/assert`, `gotest.tools/v3/assert`. But a package can be at the level where the version appears, so when `github.com/jackc/pgx` was bumped to `github.com/jackc/pgx/v4`, it's still referred to as `pgx` in the code.
But if it isn't a valid version part, then I don't want to treat it as one naively (i.e. "something.com/hello/v123xyz" isn't versioned, "something.com/hello/v123/xyz" is because the version is its own element).
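As a rough sketch of the rule described above (not code from the PR), a path can be split on `/` and each element checked on its own. The function name `hasVersionElement` and the variable `verElemRegExp` are hypothetical; the pattern reuses the `v2`, `v3`, ... `v1234` rule from this thread, anchored to a whole element.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// verElemRegExp matches a whole path element such as "v2" or "v1234",
// but not "v0", "v1", or "v123xyz". Hypothetical helper for this sketch.
var verElemRegExp = regexp.MustCompile(`^v([2-9]|[1-9][0-9]+)$`)

// hasVersionElement reports whether any slash-separated element of an
// import path looks like a module major-version element.
func hasVersionElement(path string) bool {
	for _, elem := range strings.Split(path, "/") {
		if verElemRegExp.MatchString(elem) {
			return true
		}
	}
	return false
}

func main() {
	for _, p := range []string{
		"github.com/foo/bar",
		"github.com/foo/bar/v3/baz",
		"gotest.tools/v3/assert",
		"something.com/hello/v123xyz",
		"something.com/hello/v123/xyz",
	} {
		fmt.Printf("%-32s versioned=%t\n", p, hasVersionElement(p))
	}
}
```

This works on plain import paths; a symbol name such as `github.com/foo/bar/v4.(*Foo).Bar` also needs the `.` boundary that the patterns in this thread handle.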
internal/graph/graph.go
Outdated
return name
}

// The shortened name could start with a module version (like "v2"). Go back one slash.
Keep comments in 80 columns please.
Sure.
internal/graph/graph.go
Outdated
return strings.Join(matches[1:], "")
name := strings.Join(matches[1:], "")
if re == goRegExp {
return shortenGoFunc(f, name)
I feel it might be simpler to first remove the version substring from the name, and then handle it just like before.
Not sure if you saw the previous review comments, but if preferred this all can be removed and replaced with a single regex change.
> I feel it might be simpler to first remove the version substring from the name, and then handle it just like before.

I'm not sure how this is possible; the name here is extracted from the regex directly. If we remove the version suffix, you get the empty string.
Oh, I see, you mean that if the name matches a version, remove the suffix from the whole path and then try again. It wouldn't distinguish two versions of the same module, but I guess it's no worse than any other name aliasing within the same graph. Would be short; I can do that if preferred.
I think I misinterpreted again (sorry!), so I'll wait for clarification.
I think you meant just starting with something like `github.com/jackc/pgx/v4/foo.bar`, then replacing the first instance of `/v4/` with `/`, then running the regex again. Not quite a suffix, but functional enough. This is all heuristics after all.
Sorry, `/v[1-9][0-9]*[./]`.
And not all occurrences but at most one occurrence (assuming there can't be two version substrings in the name).
All instances wouldn't work, because it's legal for me to write `github.com/foo/bar/v4/something/v8` or similar. First instance I believe would work as intended, though.
I'll give it a try. Would be `v([2-9]|[1-9][0-9]+)\.`, though, as `v0` and `v1` don't exist.
Done, and retitled to match the new fix. `regexp` doesn't have a nice "just replace once", so I used the replace all with two capture groups method.
I just realized I didn't add a test for the multi-version case; I can add one quick if that's alright.
Ah, yeah, my replacement was overzealous. I'm not sure off the top of my head how to construct my regex to capture as little text on the left side as possible.
Doh; one character. Sorry to dismiss the review.
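A minimal sketch of the "replace all with two capture groups" idea discussed above, assuming a pattern with a lazy leading group so that only the first version element is dropped. The names `verStripRegExp`, `stripFirstVersion`, and `shorten` are hypothetical and this is not presented as the merged code; `goRegExp` is copied from the diff context quoted earlier.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Assumed pattern: a lazy prefix, one "/vN" element (N >= 2), then the
	// rest starting at "." or "/". Anchoring both ends means the
	// replace-all call below can substitute at most once.
	verStripRegExp = regexp.MustCompile(`^(.*?)/v(?:[2-9]|[1-9][0-9]+)([./].*)$`)
	// goRegExp as quoted in the diff context.
	goRegExp = regexp.MustCompile(`^(?:[\w\-\.]+\/)+(.+)`)
)

// stripFirstVersion drops the first module-version element from name.
func stripFirstVersion(name string) string {
	return verStripRegExp.ReplaceAllString(name, "${1}${2}")
}

// shorten strips the version first, then applies goRegExp as before.
func shorten(name string) string {
	name = stripFirstVersion(name)
	if m := goRegExp.FindStringSubmatch(name); m != nil {
		return m[1]
	}
	return name
}

func main() {
	for _, n := range []string{
		"github.com/foo/bar/v4.(*Foo).Bar",
		"github.com/foo/bar/v4/something/v8.Foo",
		"something.com/hello/v123xyz.Foo",
	} {
		fmt.Printf("%s\n  stripped:  %s\n  shortened: %s\n",
			n, stripFirstVersion(n), shorten(n))
	}
}
```

With a greedy `(.*)` in place of `(.*?)`, the leading group would swallow as much as possible and the last version element would be removed instead of the first, which is the kind of one-character difference mentioned above.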
* Update pprof to latest revision

  Bump from 20191205061153 => 20201109224723

  My personal interest is to pull in google/pprof#564, which adds support for displaying names with `"` in them, which julia functions sometimes have (e.g. `var"#foo#23"`)

  Includes:
  - google/pprof#564
  - google/pprof#575
  - google/pprof#574
  - google/pprof#571
  - google/pprof#572
  - google/pprof#570
  - google/pprof#562
  - google/pprof#561
  - google/pprof#565
  - google/pprof#560
  - google/pprof#563
  - google/pprof#557
  - google/pprof#554
  - google/pprof#552
  - google/pprof#545
  - google/pprof#549
  - google/pprof#547
  - google/pprof#541
  - google/pprof#534
  - google/pprof#542
  - google/pprof#535
  - google/pprof#531
  - google/pprof#530
  - google/pprof#528
  - google/pprof#522
  - google/pprof#525
  - google/pprof#527
  - google/pprof#519
  - google/pprof#520
  - google/pprof#517
  - google/pprof#518
  - google/pprof#514
  - google/pprof#513
  - google/pprof#510
  - google/pprof#508
  - google/pprof#506
  - google/pprof#509
  - google/pprof#504
* Update P/pprof/build_tarballs.jl - use a real version number

  Co-authored-by: Mosè Giordano <giordano@users.noreply.github.com>
* Remove now unused `timestamp`
* [pprof] Use `GitSource`

Co-authored-by: Mosè Giordano <giordano@users.noreply.github.com>
* Expand shortened name if package name appears to be a module version
* Correct version regexp, use more generic names in tests, remove an empty line
* 80 character columns
* Remove first matching Go module version from path
* Test and fix multi-version case

Co-authored-by: Maggie Nolan <nolanmar@google.com>
Fixes #515.
Remove potential module path versions (v2, v3, v4, etc) from the input string before extracting a shortened name. This makes it easier to tell which packages are which if the versions happen to match.
An example similar to my report: