Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace gogo/protobuf with google/protobuf #5032

Merged
merged 13 commits into from
Nov 21, 2023

Conversation

tdeebswihart
Copy link
Contributor

@tdeebswihart tdeebswihart commented Oct 24, 2023

What changed?

gogo/protobuf has been replaced with Google's official go compiler.

Why?

gogo/protobuf has been deprecated for some time and the community is moving on, building new tools (like vtproto) atop google's v2 compiler.

How did you test it?

make test

Potential risks

  1. The change from embedded gogo-generated-structs to google-generated-pointers-to-structs created a risk of nil pointer exceptions. I've fixed all the ones our tests found but it's possible there are more lurking in the new code.
  2. This change may cause our performance to decrease. Certainly encoding/deconding of proto objects will become slower, but the overuse of pointers by the google compiler may negatively affect our overall performance. We'll need to keep an eye on the GC stats
  3. This breaks the HTTP API. We will not support shortand payload encoding in this first pass; that will come once this initial work is in testing.

Breaking changes for developers

  • *time.Time in proto structs will now be timestamppb.Timestamp
  • *time.Duration will now be durationpb.Duration
  • V2-generated structs embed locks, so you cannot dereference them. go vet will scream at you about this. If you need a copy, use proto.Clone.
    • If the performance of this sucks then I will either update our code generator to add shallow-clone methods or hand-roll the ones we need
  • Proto enums will, when formatted to JSON, now be in SCREAMING_SNAKE_CASE rather than PascalCase. We decided (in discussion with the SDK team) that now was as good a time as any to rip the bandage off.
  • Proto objects, or objects embedding protos, cannot be compared using reflect.DeepEqual or anything that uses it. This includes testify and mock equality testers!
    • You will need to use the common/testing/protorequire, common/testing/protoassert, or common/testing/protomock packages instead. I've implemented proto-compatible matchers and assertions there for all cases I've encountered
    • If you need reflect.DeepEqual for any reason you can use go.temporal.io/api/temporalproto.DeepEqual instead

Note that history loading will not be impacted by the JSON changes: I rewrote history loading to dynamically fix incoming history JSON data (like all our other sdks); you can find this code in my fork of our go API alongside its tests.

🚨Sharp Edges Introduced🚨

Beware *timestamppb.Timestamp.AsTime(). If you need to extract a time value from a proto time (timestamppb) always make sure to check whether it's nil first. When the proto object is nil AsTime() will return a non-zero time at the proto epoch: UTC midnight on January 1, 1970.

I've made this mistake multiple times during this transition and each time it's been a pain to debug

Is hotfix candidate?

Haha. No.


Release Plan

Our goal is to release our API, Go API, Go SDK, and Server all at once, then update our UI (and ui-server), CLI, and HTTP API in a second pass. Until the second pass happens no changes to our JSON formatting will be apparent to users

@tdeebswihart tdeebswihart changed the title Nomo gogo Replace gogo/protobuf with google/protobuf Oct 24, 2023
@tdeebswihart tdeebswihart force-pushed the nomo-gogo branch 3 times, most recently from e87a0ae to b393589 Compare October 25, 2023 23:05
@tdeebswihart tdeebswihart force-pushed the nomo-gogo branch 3 times, most recently from 44cde0e to a0d20c6 Compare October 27, 2023 22:47
@tdeebswihart tdeebswihart marked this pull request as ready for review October 27, 2023 22:56
@tdeebswihart tdeebswihart requested a review from a team as a code owner October 27, 2023 22:56
@tdeebswihart
Copy link
Contributor Author

Please ignore golangci-lint: I changed enough code that it's complaining about existing issues that I happen to brush the dust off of

Comment on lines +151 to 146
@go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
@go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious why we're installing latest as opposed to using the modfile.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason. I'd be happy to change that if the team cares

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like latest. I would use it everywhere. Yes, build can be broken because of breaking changes in dependencies (and it happened few times already) but I think it is "ok" trade-off for keeping these build dependencies up to date.

Copy link
Member

@dnr dnr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not done yet, but I got through proto/ (going in alphabetical order) and figured I should send what I had so far

@@ -188,6 +187,10 @@ func (s *queryParserSuite) TestParsePrecision() {
}
}

func ptr[t any](v t) *t {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want this in the standard library... maybe common/util/util.go though?
Also, use capital T for a type variable for consistency

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it used anywhere else?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I swapped t for T but didn't put this into our utility library. I'd prefer that we address things regularly (using &stackVar) instead of this in our normal code; I added this to make tests simpler to write

@@ -27,23 +27,24 @@ package timestamp
import (
"time"

"github.com/gogo/protobuf/types"
"google.golang.org/protobuf/types/known/durationpb"
"google.golang.org/protobuf/types/known/timestamppb"
)

// Timestamp provides easy conversions and utility comparison functions
// making go to proto time comparison straightforward
type Timestamp struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

afaict this is unused.. why not delete it? (or at least don't touch it in this PR and delete it later)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would love to see this go away!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm planning to remove it in a separate PR after the whole proto adventure is over

@@ -65,6 +68,7 @@ require (
)

require (
github.com/gogo/protobuf v1.3.2 // indirect
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's still an indirect dep? is that from sdk and it'll go away soon?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SDK supports Gogo for unmarshaling user-supplied protos. I have no idea if we need to continue supporting it; @cretz do we still need to support gogo in our payload converters or can it be removed entirely?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove it. It sort of breaking change but I doubt that anyone use gogo to generate in/out params.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SDK team has mentioned we need to keep this as some of our customers use Gogo/protobuf

// rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty);
// }
//
message Empty {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can't get this through a dependency with the other well-known types?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I would add it to api repo.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, why do you need it all? I don't see it's being used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only repo that uses it so I didn't bother vendoring it in our API repo. I can if you'd prefer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would vote for api repo. @dnr why did you use it at all? Why not bool or enum?

Comment on lines +151 to 146
@go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
@go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like latest. I would use it everywhere. Yes, build can be broken because of breaking changes in dependencies (and it happened few times already) but I think it is "ok" trade-off for keeping these build dependencies up to date.

if expirationDuration <= 0 {
var expirationDuration *durationpb.Duration
expirationTime := task.event.Data.ExpiryTime.AsTime()
if task.event.Data.ExpiryTime != nil && !expirationTime.IsZero() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that line above would work even if ExpiryTime is nil. But this check here seems to be not needed. If you want to keep it, move it higher.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is needed, and skipping the nil check causes a test failure because of how the zero protobuf time (not a zero go time) is handled.

we probably only need to check whether ExpiryTime is nil, honestly. I'll see what happens if I do

strctV := reflect.ValueOf(strct)
strctT := strctV.Type()

ret := map[string]any{}

for i := 0; i < strctV.NumField(); i++ {
field := strctV.Field(i)
var fieldData any
// Skip private members
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"public" => "exported",
"private" => "unexported".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We compare the equality of the History structs formatted using this against concrete strings (search for EqualHistory). Before I added this the naive reflection we used here dumped the unexported members that the google compiler generated, polluting our tests. You can comment that line out and see how bad it was

@@ -662,6 +662,7 @@ func (s *functionalSuite) TestUpdateWorkflow_NormalScheduledWorkflowTask_AcceptC
case 1:
return nil, nil
case 2:
s.Require().True(len(task.Messages) > 0, "Task has no messages", task)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is that? Why not s.True()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without the require we'd immediately access the first element of a zero-length slice on the next line, crashing and dumping a useless stack trace

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should always have them, but ok. I will check it later.

@@ -65,6 +68,7 @@ require (
)

require (
github.com/gogo/protobuf v1.3.2 // indirect
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove it. It sort of breaking change but I doubt that anyone use gogo to generate in/out params.

Copy link
Member

@dnr dnr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comments through service/frontend

@@ -70,6 +70,8 @@ var _ OperatorHandler = (*OperatorHandlerImpl)(nil)
type (
// OperatorHandlerImpl - gRPC handler interface for operator service
OperatorHandlerImpl struct {
operatorservice.UnimplementedOperatorServiceServer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this make it so you can leave out handlers and they'll just return unimplemented? if so I don't think we should do this. requiring someone to write stub (or real) implementations seems like a good thing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes and I've replaced that with the Unsafe variant generated by the GRPC package which will cause a compilation error if we leave out an impl

return proxy.NewGRPCGatewayJSONPBMarshaler(m, u)
}
return newProtoJsonMarshaler(indent)
// TODO: reintroduce shorthand JSON marshaling here
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and is the intention to update go.temporal.io/api/proxy and move this back? or do it in the new marshaller in this package? I don't really understand what's going on in this area

@@ -104,6 +106,7 @@ var (
type (
// WorkflowHandler - gRPC handler interface for workflowservice
WorkflowHandler struct {
workflowservice.UnimplementedWorkflowServiceServer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to embed either this or (as I just now discovered) the UnsafeFOOServiceServer generated by the GRPC code. It's a requirement as we must implement an internal-only interface...

I'm going to swap these to use the Unsafe variant as that will cause compilation errors when new methods are added

@@ -2091,7 +2094,7 @@ func (wh *WorkflowHandler) ListArchivedWorkflowExecutions(ctx context.Context, r

// special handling of ExecutionTime for cron or retry
for _, execution := range archiverResponse.Executions {
if timestamp.TimeValue(execution.GetExecutionTime()).IsZero() {
if execution.ExecutionTime == nil || execution.GetExecutionTime().AsTime().IsZero() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't need Get if you just checked nil

Suggested change
if execution.ExecutionTime == nil || execution.GetExecutionTime().AsTime().IsZero() {
if execution.ExecutionTime == nil || execution.ExecutionTime.AsTime().IsZero() {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

d'oh, thanks.

@@ -4145,9 +4147,10 @@ func (wh *WorkflowHandler) unaliasStartWorkflowExecutionRequestSearchAttributes(
}

// Shallow copy request and replace SearchAttributes fields only.
newRequest := *request
// FIXME actually shallow copy
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in this pr or later?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left these in here in case we needed shallow copies for performance. Once I've gone through the performance testing I've kicked off I'll see if these are necessary (they probably aren't) and address them appropriately at that point.

I'm going to change these to // NOTE instead of fixmes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually just deleted my note entirely. I don't think it's necessary, and were this actually a problem we'd find it easily in a profile

newRequest.SearchAttributes = unaliasedSas
return &newRequest, nil
return newRequest, nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, definitely not in this PR, but I think this pattern of replacing the SA field in SWERequest and other messages is error-prone (see the recent bug that Roey fixed). we should use the type system to differentiate between aliased and unaliased names. (which I would call external and internal since "alias" could be read both ways.) all user-facing api messages should have only external names, and internal names should be present in internal structs only. ideally confined to visibility, but that may be hard.

anyway, that's a big refactoring that can happen later.

@@ -182,7 +182,8 @@ func (wh *WorkflowHandler) getWorkflowExecutionHistory(
ctx,
wh.metricsScope(ctx),
namespaceID,
*execution,
// FIXME shallow copy
common.CloneProto(execution),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. WorkflowExecution has only simple fields, so this already is a shallow copy
  2. I would do it like this: just pass the pointer here. then on the four lines that set execution.RunId, have those lines do the copy like execution = &commonpb.WorkflowExecution{WorkflowId: execution.WorkflowId, RunId: newRunId}. that way we treat WorkflowExecution as immutable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't actually figure out why we shallow copied these in the first place; I don't see any mutations in these calls

@tdeebswihart tdeebswihart force-pushed the nomo-gogo branch 6 times, most recently from 7a4bd20 to a99f8b1 Compare November 20, 2023 19:32
@tdeebswihart tdeebswihart enabled auto-merge (squash) November 21, 2023 22:48
@tdeebswihart tdeebswihart merged commit 1be76e3 into temporalio:main Nov 21, 2023
@tdeebswihart tdeebswihart deleted the nomo-gogo branch November 21, 2023 23:19
nichtverstehen pushed a commit to nscloud-demo/temporalio-features that referenced this pull request Nov 22, 2023
**What changed?**

gogo/protobuf has been replaced with Google's official go compiler. 

**Why?**

gogo/protobuf has been deprecated for some time and the community is moving on, building new tools (like vtproto) atop google's v2 compiler.

**How did you test it?**

This branch is actively used by my [SDK](temporalio/sdk-go#1256) and [server](temporalio/temporal#5032) PRs so you can see it in use there

**Potential risks**

None I can think of.

**Breaking changes for developers**

- `*time.Time` in proto structs will now be [timestamppb.Timestamp](https://pkg.go.dev/google.golang.org/protobuf@v1.31.0/types/known/timestamppb#section-documentation)
- `*time.Duration` will now be [durationpb.Duration](https://pkg.go.dev/google.golang.org/protobuf/types/known/durationpb)
- V2-generated structs embed locks, so you cannot dereference them. `go vet` will scream at you about this. If you need a copy, use `proto.Clone`.
- Proto objects, or objects embedding protos, cannot be compared using `reflect.DeepEqual` or _anything_ that uses it. This includes `testify` and `mock` equality testers! 
  - You will need to use the `protorequire` ro `protoassert`packages instead. I've implemented proto-compatible assertions there for all cases I've encountered
  - If you need `reflect.DeepEqual` for any reason you can use `go.temporal.io/api/temporalproto.DeepEqual` instead
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants