
Created GRPC handlers and internal clients #4057

Merged (17 commits) on Apr 15, 2021

Conversation

@vytautas-karpavicius (Contributor) commented Mar 16, 2021

What changed?
This change introduces GRPC handlers and internal clients. It brings simultaneous support for TChannel/Thrift and GRPC/protobuf within the Cadence server.

For this to work, each service opens a new GRPC port. For consistency, all GRPC ports are offset by -100 from their Thrift counterparts. For example:

frontend: 7933 -> 7833
history: 7934 -> 7834
matching: 7935 -> 7835
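
The -100 convention above can be sketched as a trivial helper (illustrative only; this function is not part of the PR):

```go
package main

import "fmt"

// grpcPort derives a service's gRPC port from its TChannel/Thrift port
// using the -100 offset convention described above.
func grpcPort(thriftPort int) int {
	return thriftPort - 100
}

func main() {
	fmt.Println(grpcPort(7933)) // frontend -> 7833
	fmt.Println(grpcPort(7934)) // history  -> 7834
	fmt.Println(grpcPort(7935)) // matching -> 7835
}
```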

On this new port, a new YARPC inbound is configured to accept GRPC traffic. New GRPC handlers were created to accept proto-encoded requests. The requests are converted to internal types and passed along to the main handler for processing.

Outbound GRPC communication between server components is off by default and can be enabled with the dynamic config flag system.enableGRPCOutbound. When enabled, all requests are encoded as proto messages within the newly added GRPC clients. Additionally, those clients are created with a separate GRPC dispatcher.
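
Assuming the file-based dynamic config client, the flag could be turned on with an entry like the following (file path and format assumed from Cadence's sample dynamic config files, not taken from this PR):

```yaml
# config/dynamicconfig/development.yaml (path assumed)
system.enableGRPCOutbound:
  - value: true
    constraints: {}
```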

When making an outbound call, we need to know the GRPC port. This is a bit tricky, as all addresses are resolved via ringpop, which automatically carries the TChannel port. We therefore drop that port and append the GRPC port, looked up from additionally passed service configuration.
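
A minimal sketch of that address rewrite, assuming a per-service port lookup (the function and map here are hypothetical; the PR's actual helper is GetGRPCAddress in common/config/rpc.go):

```go
package main

import (
	"fmt"
	"net"
)

// Assumed per-service gRPC port configuration.
var grpcPorts = map[string]string{
	"cadence-frontend": "7833",
	"cadence-history":  "7834",
	"cadence-matching": "7835",
}

// toGRPCAddress takes a ringpop-resolved address (host:tchannelPort),
// drops the TChannel port, and appends the service's configured gRPC port.
func toGRPCAddress(service, ringpopAddr string) (string, error) {
	host, _, err := net.SplitHostPort(ringpopAddr) // drop the TChannel port
	if err != nil {
		return "", err
	}
	port, ok := grpcPorts[service]
	if !ok {
		return "", fmt.Errorf("no gRPC port configured for %s", service)
	}
	return net.JoinHostPort(host, port), nil
}

func main() {
	addr, _ := toGRPCAddress("cadence-history", "10.0.0.5:7934")
	fmt.Println(addr)
}
```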

Why?
To migrate Cadence to GRPC.

How did you test it?
Integration tests were updated to use the GRPC configuration by default.
The Cadence canary also passes locally with GRPC communication enabled.

Potential risks
There may be some mapping issues between proto and internal types that have not been noticed yet. Some were already caught with integration and canary tests. To reduce risk, proto communication is off by default and will be enabled via dynamic config. Additionally, all frontend metrics are tagged with a grpc/thrift flag, which makes it possible to observe any error increase when switching to GRPC dynamically. Mitigation is easy: switch back to Thrift.

@coveralls commented Mar 26, 2021

Pull Request Test Coverage Report for Build f6aa2ce7-9d0b-4405-9180-5b7c75d6d6a3

  • 342 of 1066 (32.08%) changed or added relevant lines in 15 files are covered.
  • 961 unchanged lines in 16 files lost coverage.
  • Overall coverage decreased (-0.8%) to 67.07%

Changes missing coverage (covered / changed or added lines, %):
cmd/server/cadence/server.go: 0/1 (0.0%)
client/matching/grpcClient.go: 31/39 (79.49%)
service/matching/grpcHandler.go: 38/50 (76.0%)
host/onebox.go: 30/48 (62.5%)
common/config/rpc.go: 18/54 (33.33%)
client/clientfactory.go: 16/56 (28.57%)
client/history/grpcClient.go: 87/159 (54.72%)
service/frontend/adminGrpcHandler.go: 6/78 (7.69%)
client/admin/grpcClient.go: 0/75 (0.0%)
service/history/grpcHandler.go: 94/170 (55.29%)
Files with coverage reduction (new missed lines, %):
common/config/rpc.go: 1 (17.14%)
host/signalworkflowTest.go: 1 (95.53%)
service/history/task/task.go: 1 (83.2%)
common/task/fifoTaskScheduler.go: 2 (85.57%)
service/history/execution/mutable_state_builder.go: 2 (69.85%)
common/cache/lru.go: 3 (90.73%)
service/history/task/transfer_standby_task_executor.go: 4 (90.32%)
common/persistence/cassandra/cassandraPersistence.go: 6 (50.96%)
common/types/mapper/thrift/errors.go: 6 (34.88%)
host/onebox.go: 9 (81.34%)
Totals:
Change from base Build 219ec098-aeec-4736-8195-1bb17cb761d4: -0.8%
Covered Lines: 98274
Relevant Lines: 146525

💛 - Coveralls

@vytautas-karpavicius vytautas-karpavicius marked this pull request as ready for review March 26, 2021 12:36
@vytautas-karpavicius vytautas-karpavicius requested a review from a team March 26, 2021 12:36
Review thread on common/config/rpc.go (outdated, resolved):
	serviceName string,
	hostName string,
) (*yarpc.Dispatcher, error) {
	grpcAddress, err := d.grpcPorts.GetGRPCAddress(serviceName, hostName)
Contributor:

Are we assuming that all instances use the same grpc port on different hosts?

Contributor (Author):

Yes, at least for open source version here.

@vytautas-karpavicius vytautas-karpavicius merged commit d0a8f7e into uber:master Apr 15, 2021
yux0 pushed a commit to yux0/cadence that referenced this pull request May 4, 2021
* Created GRPC handlers and internal clients

* GRPC port config for remaining config files

* Fix metric tags for grpc handlers

* Proper mapping for WorkflowExecutionCloseStatus

* Return yarpcerrors.Status with no match

* Updated changelog

* Fix after merge

* Do not fatal on outbound dispatcher failure

* Updated integration tests

* Minor rename

* Fix after merge
pregnor added 8 commits to banzaicloud/banzai-charts that referenced this pull request May 31, 2021
4 participants