Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GODRIVER-2361 Refactor pseudo-random number and UUID generation with "golang.org/x/exp/rand" #945

Merged
merged 6 commits into from
May 23, 2022

Conversation

qingyang-hu
Copy link
Collaborator

@qingyang-hu qingyang-hu commented May 11, 2022

GODRIVER-2361 Refactor pseudo-random number and UUID generation with "golang.org/x/exp/rand".

Third-party packages are imported into "internal/uuid/uuid" and "internal/randutil/rand" in commit 7633425.
The refactoring is in commit d4defe6.

There is not any modification to the third-party code.

Two files are impacted after the refactoring:
x/mongo/driver/connstring/connstring.go
x/mongo/driver/topology/topology.go

I put the LICENSE files (3-Clause BSD License copied from the original source repo) with the third-party packages. Please let me know if we follow a different practice.

@qingyang-hu qingyang-hu changed the title GODRIVER-2361 Refactor pseudo-random number and UUID generation with "golang .org/x/exp/rand" GODRIVER-2361 Refactor pseudo-random number and UUID generation with "golang.org/x/exp/rand" May 11, 2022
Copy link
Collaborator

@prestonvasquez prestonvasquez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!

Comment on lines 27 to 29
var wg sync.WaitGroup
for i := 1; i < 1000000; i++ {
wg.Add(1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
var wg sync.WaitGroup
for i := 1; i < 1000000; i++ {
wg.Add(1)
iterations := 1e6
wg := sync.WaitGroup{}.Add(iterations)
for i := 1; i < threshold; i++ {

I think this will help ensure this subtest is consistent with the other subtests. I think decoupling Add from the loop will also make the test safer, although it's probably fine as is.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! Thank you!

Comment on lines 46 to 49
var wg sync.WaitGroup
uuids := new(sync.Map)
for i := 0; i < iterations; i++ {
wg.Add(1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
var wg sync.WaitGroup
uuids := new(sync.Map)
for i := 0; i < iterations; i++ {
wg.Add(1)
wg := sync.WaitGroup{}.Add(iterations)
uuids := new(sync.Map)
for i := 0; i < iterations; i++ {

Comment on lines 69 to 71
var wg sync.WaitGroup
for i := 1; i < 1000000; i++ {
wg.Add(1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
var wg sync.WaitGroup
for i := 1; i < 1000000; i++ {
wg.Add(1)
iterations := 1e6
wg := sync.WaitGroup{}.Add(iterations)
for i := 1; i < iterations; i++ {

Comment on lines 90 to 93
var wg sync.WaitGroup
uuids := new(sync.Map)
for i := 0; i < iterations; i++ {
wg.Add(1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
var wg sync.WaitGroup
uuids := new(sync.Map)
for i := 0; i < iterations; i++ {
wg.Add(1)
wg := sync.WaitGroup{}.Add(iterations)
uuids := new(sync.Map)
for i := 0; i < iterations; i++ {

Comment on lines 42 to 44
func Equal(a, b UUID) bool {
return a == b
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading through the google/uuid package, I notice that they don't actually have a routine for equality. That's kind of curious, and I really like that we are adding this. What are your thoughts on making it a receiver for the UUID type like what the time pacakge does?

Suggested change
func Equal(a, b UUID) bool {
return a == b
}
func (u UUID) Equal(b UUID) bool {
return u == b
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually the Equal function seems unnecessary because users can compare UUIDs directly with the equality operator (the Equal code just uses the == operator). We should consider removing that function.

import "fmt"

// MarshalText implements encoding.TextMarshaler.
func (uuid UUID) MarshalText() ([]byte, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we use these UUID marshalers and unmarshalers anywhere? If we don't use them anywhere, can we exclude this file?

// globalSource is a package-global pseudo-random UUID generator.
var globalSource = func() *source {
uuid.SetRand(randutil.GlobalRand)
uuid.EnableRandPool()
Copy link
Collaborator

@matthewdale matthewdale May 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're using a pseudo-random number generator for the UUID randomness source, are there any benefits to enable random value pooling?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. A new benchmark test showed pooling did not bring any benefit in this scenario. Thus, I removed the github.com/google/uuid package as well.

@@ -461,7 +457,7 @@ func (t *Topology) SelectServer(ctx context.Context, ss description.ServerSelect
// provided, pick2 will panic.
func pick2(ds []description.Server) (description.Server, description.Server) {
// Select a random index from the input slice and keep the server description from that index.
idx := random.Intn(len(ds))
idx := randutil.GlobalRand.Intn(len(ds))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any performance impact to using the same randutil.GlobalRand for selecting servers as generating UUIDs? Since GlobalRand uses a mutex to ensure sequential calls, it seems like it could add lock contention to different parts of the driver.

@@ -0,0 +1,221 @@
// Copyright 2009 The Go Authors. All rights reserved.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment somewhere at the top of all files copied from the "golang.org/x/exp/rand" package to indicate where they're from and what version they are.

E.g.

// Copied from https://cs.opensource.google/go/x/exp/+/24438e51023af3bfc1db8aed43c1342817e8cfcd:rand/exp.go

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👌

@@ -0,0 +1,27 @@
Copyright (c) 2009 The Go Authors. All rights reserved.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned that vendoring the newer version of the full "math/bits" is introducing a lot of code and dependency complexity (e.g. "How does the nested vendor directory work with different versions of Go and different dependency strategies?") just to add the Add64 and Mul64 functions.

There are a few alternatives that could allow us to use the "golang.org/x/exp/rand" pRNG with less dependency complexity:

  1. Copy in only Add64 and Mul64 functions from the latest "math/bits" library to "internal/math/bits" and update the relevant code in rng.go to use the copied-in versions of Add64 and Mul64.
  2. Only use "golang.org/x/exp/rand" when building with Go 1.12+. Requires adding versions of the randutil package that are built for Go 1.12+ and Go <1.12 using Go build constraints (e.g. // +build go1.12), using the "golang.org/x/exp/rand" pRNG for Go 1.12+ and "math/rand" pRNG for Go <1.12.

Go build constraints require a lot of testing with different Go versions to validate, so I usually try to avoid them. Because of that, I think option 1 is easier and more robust, although it does require lightly modifying copied-in code. Does that seem like a positive change?

@@ -0,0 +1,27 @@
Copyright (c) 2009 The Go Authors. All rights reserved.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current pattern for license/copyright information is to copy the license into THIRD-PARTY-NOTICES with a reference to the code or package it references. Is that sufficient in this case or do we need to retain this LICENSE file in the package?

// GODRIVER-2349
// Test that initializing many package-global UUID sources concurrently never leads to any duplicate
// UUIDs being generated.
func TestGlobalSource(t *testing.T) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice tests and comment referencing the GODRIVER ticket!

Copy link
Collaborator

@prestonvasquez prestonvasquez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, just a few suggestions.

internal/uuid/uuid_test.go Outdated Show resolved Hide resolved
uuid, err := New()
require.NoError(t, err, "new() error")
_, ok := uuids.Load(uuid)
require.Falsef(t, ok, "New returned a duplicate UUID on iteration %d: %v", i, uuid)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
require.Falsef(t, ok, "New returned a duplicate UUID on iteration %d: %v", i, uuid)
require.Falsef(t, ok, "goroutine %d returned a duplicate UUID on iteration %d: %v", i, j, uuid)

Can we return the same message from the other subtest?

Copy link
Collaborator

@matthewdale matthewdale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pRNG updates look great! One outstanding concern about the tests.

for i := 0; i < iterations; i++ {
go func(i int) {
defer wg.Done()
uuid, err := New()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This currently doesn't test that initializing the pRNG randomness source many times always results in a distinct sequence of values. We either need to re-initialize the package-global variable globalRandom on each iteration (which would necessitate doing all UUID generation sequentially, i.e. no goroutines) or make a UUID source that can be instantiated many times.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right.

@qingyang-hu qingyang-hu requested a review from matthewdale May 20, 2022 22:11
Copy link
Collaborator

@matthewdale matthewdale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great 👍

@qingyang-hu qingyang-hu merged commit ef8ad53 into mongodb:master May 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants