Remove potential Go module versions from shortened names #571

Merged
merged 6 commits into from
Oct 16, 2020
25 changes: 24 additions & 1 deletion internal/graph/graph.go
@@ -34,6 +34,8 @@ var (
// Removes package name and method arguments for Go function names.
// See tests for examples.
goRegExp = regexp.MustCompile(`^(?:[\w\-\.]+\/)+(.+)`)
// Checks for a package name that could be a module version.
goVerRegExp = regexp.MustCompile(`^v[2-9]+\.`)
Contributor

@nolanmar511 nolanmar511 Oct 9, 2020

I think that a regex like (?:^(?:[\w\-\.]+\/)+((?:[\w\-\.]+\/v[0-9]+)(?:\.[^.\n]+){2})$)|(?:^(?:[\w\-\.]+\/)+(.+)) could be used for goRegExp rather than adding goVerRegExp. Though, I think @aalexand should make the call as to whether we'd prefer to add code or add a more complex regexp.

Otherwise, I think goVerRegExp would miss "v14" and "v10". Perhaps ^v[0-9]+\. or ^v[2-9][0-9]+\. would match better?

Contributor Author

@zikaeroh zikaeroh Oct 9, 2020

Oh, you're right, I brainfarted on that one. Should have been ^v([2-9]|[1-9][0-9]+)\..

Contributor Author

I switched it to a corrected regex, but if there's some more complicated one that works better it can be used. I know my brain shuts down trying to read that long one... 🙂

Contributor Author

The key thing is to allow v2, v3, v10, v1234, etc, but not v0 or v1.
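
To make the requirement above concrete, here is a minimal sketch that runs the first-commit pattern and the corrected pattern from this thread against a few candidate names. The helper `looksLikeVersion` and the sample names are hypothetical and exist only for illustration; they are not part of pprof.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Pattern from the first commit: one or more digits, each in 2-9,
	// so any version containing a 0 or 1 digit is missed.
	firstAttempt = regexp.MustCompile(`^v[2-9]+\.`)
	// Corrected pattern from this thread: a single digit 2-9, or a
	// multi-digit number without a leading zero.
	corrected = regexp.MustCompile(`^v([2-9]|[1-9][0-9]+)\.`)
)

// looksLikeVersion is a hypothetical helper that reports whether a
// shortened name starts with something shaped like a module version.
func looksLikeVersion(re *regexp.Regexp, name string) bool {
	return re.MatchString(name)
}

func main() {
	names := []string{"v0.Foo", "v1.Foo", "v2.Foo", "v10.Foo", "v1234.Foo", "v2xyz.Foo"}
	fmt.Println("name\tfirst\tcorrected")
	for _, n := range names {
		fmt.Printf("%s\t%t\t%t\n", n,
			looksLikeVersion(firstAttempt, n),
			looksLikeVersion(corrected, n))
	}
}
```

Under these patterns, only the corrected one accepts v10 and v1234, while both reject v0, v1, and a package that merely starts with "v2" such as v2xyz.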

Contributor Author

(?:^(?:[\w\-\.]+\/)+((?:[\w\-\.]+\/v(?:[2-9]|[1-9][0-9]+)+)(?:\.[^.\n]+){2})$)|(?:^(?:[\w\-\.]+\/)+(.+))

That is the long one with the version selector restricted; it'd be a one-line change and it appears to pass the tests. I'm happy to use it instead.
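
For readers who want to try the long pattern, the sketch below copies it verbatim and applies it the same way the existing code does, by joining the capture groups. The function `shorten` is a hypothetical stand-in for the Go branch of ShortenFunctionName, not the real implementation.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// combined is the single-regexp alternative quoted in this thread,
// copied verbatim. The first alternative keeps "<pkg>/vN" in the name;
// the second is the original goRegExp behaviour.
var combined = regexp.MustCompile(`(?:^(?:[\w\-\.]+\/)+((?:[\w\-\.]+\/v(?:[2-9]|[1-9][0-9]+)+)(?:\.[^.\n]+){2})$)|(?:^(?:[\w\-\.]+\/)+(.+))`)

// shorten is a hypothetical stand-in for the Go branch of
// ShortenFunctionName: it joins whichever capture group matched.
func shorten(f string) string {
	if m := combined.FindStringSubmatch(f); len(m) >= 2 {
		return strings.Join(m[1:], "")
	}
	return f
}

func main() {
	for _, f := range []string{
		"github.com/jackc/pgx/v4.(*Conn).Query",
		"github.com/jackc/pgx/v4/stdlib.connector.Connect",
		"example.org/v2xyz.Foo",
		"example.org/mod/v2.Foo",
	} {
		fmt.Printf("%s -> %s\n", f, shorten(f))
	}
}
```

One property worth noticing: the first alternative requires exactly two dot-separated parts after the version, so a plain function such as example.org/mod/v2.Foo falls through to the second alternative and still shortens to v2.Foo.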

// Strips C++ namespace prefix from a C++ function / method name.
// NOTE: Make sure to keep the template parameters in the name. Normally,
// template parameters are stripped from the C++ names but when
@@ -442,12 +444,33 @@ func ShortenFunctionName(f string) string {
f = cppAnonymousPrefixRegExp.ReplaceAllString(f, "")
for _, re := range []*regexp.Regexp{goRegExp, javaRegExp, cppRegExp} {
if matches := re.FindStringSubmatch(f); len(matches) >= 2 {
return strings.Join(matches[1:], "")
name := strings.Join(matches[1:], "")
if re == goRegExp {
return shortenGoFunc(f, name)
Collaborator

I feel it might be simpler to first remove the version substring from the name, and then handle it just like before.

Contributor Author

Not sure if you saw the previous review comments, but if preferred this all can be removed and replaced with a single regex change.

Contributor Author

> I feel it might be simpler to first remove the version substring from the name, and then handle it just like before.

I'm not sure how this is possible; the name here is extracted from the regex directly. If we remove the version suffix, you get the empty string.

Contributor Author

@zikaeroh zikaeroh Oct 16, 2020

Oh, I see, you mean that if the name matches a version, remove the suffix from the whole path and then try again. It wouldn't distinguish two versions of the same module, but I guess it's no worse than any other name aliasing within the same graph. Would be short; I can do that if preferred.

Contributor Author

@zikaeroh zikaeroh Oct 16, 2020

I think I misinterpreted again (sorry!), so I'll wait for clarification.

I think you meant just starting with something like github.com/jackc/pgx/v4/foo.bar, then replacing the first instance of /v4/ with /, then running the regex again. Not quite a suffix, but functional enough. This is all heuristics after all.

Collaborator

Sorry, /v[1-9][0-9]*[./].

Collaborator

And not all occurrences but at most one occurrence (assuming there can't be two version substrings in the name).

Contributor Author

All instances wouldn't work, because it's legal for me to write github.com/foo/bar/v4/something/v8 or similar. First instance I believe would work as intended, though.

Contributor Author

I'll give it a try. It would be v([2-9]|[1-9][0-9]+)\., though, since Go module paths never carry a /v0 or /v1 major-version suffix.

Contributor Author

Done, and retitled to match the new fix. regexp doesn't have a nice "just replace once", so I used the replace all with two capture groups method.
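
The "replace all with two capture groups" idea can be sketched as follows. Because the pattern is anchored at both ends, ReplaceAllString can match at most once, and the lazy prefix group makes that single match land on the first version segment. The pattern and the name `stripFirstVersion` are assumptions made for this illustration and may differ from what was actually merged.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstVersion is an assumed pattern for illustration only. Group 1 is a
// lazy prefix, so the version segment it stops at is the first one in the
// path; group 2 is everything after it, starting at "." or "/".
var firstVersion = regexp.MustCompile(`^(.*?)/v(?:[2-9]|[1-9][0-9]+)([./].*)$`)

// stripFirstVersion (hypothetical name) drops at most one "/vN" segment.
// Names without a version segment are returned unchanged.
func stripFirstVersion(f string) string {
	return firstVersion.ReplaceAllString(f, "${1}${2}")
}

func main() {
	for _, f := range []string{
		"github.com/jackc/pgx/v4.(*Conn).Query",
		"github.com/jackc/pgx/v4/stdlib.connector.Connect",
		"github.com/foo/bar/v4/something/v8.Func",
		"example.org/v2xyz.Foo",
	} {
		fmt.Printf("%s -> %s\n", f, stripFirstVersion(f))
	}
}
```

With the version segment removed up front, the result can be passed through the unchanged goRegExp, which is the "handle it just like before" flow suggested earlier in this thread. Only the first /v4 is dropped from github.com/foo/bar/v4/something/v8, so a package that is legitimately named v8 survives.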

}
return name
}
}
return f
}

func shortenGoFunc(f string, name string) string {
if !goVerRegExp.MatchString(name) {
return name
}

// The shortened name could start with a module version (like "v2"). Go back one slash.
Collaborator

Keep comments in 80 columns please.

Contributor Author

Sure.

end := len(f) - len(name) - 1
if end >= 0 {
prefix := f[:end]
if idx := strings.LastIndex(prefix, "/"); idx >= 0 {
end = idx
}
}

Contributor

Perhaps remove this line break.

Contributor Author

Sure, if you want. I view the above block as calculating end, so that was my thinking.

return f[end+1:]
}

// TrimTree trims a Graph in forest form, keeping only the nodes in kept. This
// will not work correctly if even a single node has multiple parents.
func (g *Graph) TrimTree(kept NodePtrSet) {
12 changes: 12 additions & 0 deletions internal/graph/graph_test.go
@@ -451,6 +451,18 @@ func TestShortenFunctionName(t *testing.T) {
"github.com/blah/blah/vendor/gopkg.in/redis.v3.(*baseClient).(github.com/blah/blah/vendor/gopkg.in/redis.v3.process)-fm",
"redis.v3.(*baseClient).(github.com/blah/blah/vendor/gopkg.in/redis.v3.process)-fm",
},
{
"github.com/jackc/pgx/v4.(*Conn).Query",
Contributor

Would you consider using more abstract test case names? (To be in keeping with the style of existing tests)

Contributor Author

Sure thing.

"pgx/v4.(*Conn).Query",
},
{
"github.com/jackc/pgx/v4/stdlib.connector.Connect",
"stdlib.connector.Connect",
},
{
"example.org/v2xyz.Foo",
"v2xyz.Foo",
},
{
"java.util.concurrent.ThreadPoolExecutor$Worker.run",
"ThreadPoolExecutor$Worker.run",