Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: DirectPath calls do not duplicate the request metadata #3663

Draft
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

rockspore
Copy link
Member

@rockspore rockspore commented Feb 25, 2025

This is achieved by:

  1. InstantiatingGrpcChannelProvider always uses the given credentials when creating the gRPC ManagedChannel.
  2. GrpcCallContext does not put call credentials into the call options during withCredentials() calls.

So far the DirectPath flow is verified to work properly in local test setups.

@product-auto-label product-auto-label bot added the size: m Pull request size is medium. label Feb 25, 2025
Comment on lines 678 to 681
// Use default TLS credentials if we cannot initialize channel credentials via DCA or S2A.
channelCredentials =
CompositeChannelCredentials.create(
TlsChannelCredentials.create(), MoreCallCredentials.from(credentials));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

qq, for this flow, shouldn't it be just TlsChannelCredentials.create() for the channelCredentials?

Why does it need to be done via CompositeChannelCredentials?

Copy link
Contributor

@lqiu96 lqiu96 Feb 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait.. This is because we were sending the GoogleCredentials as part of CallOptions. NVM I think this is correct.

@@ -185,21 +184,8 @@ public GrpcCallContext nullToSelf(ApiCallContext inputContext) {

@Override
public GrpcCallContext withCredentials(Credentials newCredentials) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the one main concern about this is that this is a behavior breaking change if there are uses that manually create a GrpcCallContext and use withCredentials. I don't think there are really any valid uses cases for this, but I will need to check for this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that makes sense.

What's also concerning is, by looking at the CI failures, it seems many tests have been building the channel from InstantiatingGrpcChannelProvider without actually giving it a credentials. And MoreCallCredentials.from() throws exception for null credentials.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those showcase tests seems to be fixable within this PR. But I'm unsure about the downstream compatibility tests and those on Cloud Build (I lack permission to view logs).

Copy link
Contributor

@lqiu96 lqiu96 Feb 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The showcase tests were following this: https://github.com/googleapis/gapic-showcase?tab=readme-ov-file#example-for-java-grpc

Essentially it's a local server that is not expected to do any authentication. I think we just need a way to be able to hit the local server with no credentials provided.

The downstream tests for our partner teams is a bit concerning. I think we'll need to investigate them to see if they're just a testing configuration issue or a logic issue.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Essentially it's a local server that is not expected to do any authentication. I think we just need a way to be able to hit the local server with no credentials provided.

Would it make sense if we allow null credentials in the case of one-way TLS? I believe there are no valid use cases to go without credentials here when mTLS or DirectPath are used.

The downstream tests for our partner teams is a bit concerning. I think we'll need to investigate them to see if they're just a testing configuration issue or a deeper issue.

Agreed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize a tricky thing we missed earlier.

The API setChannelConfigurator takes an option that gets applied to the channel builder. In various tests (downstream or showcase), ManagedChannelBuilder::usePlainText is given. While users are not supposed to use insecure channels in production, this is reasonable and often convenient in tests when the server is turned on locally. This change is breaking that behavior.

Basically, either we have to change all tests (setting up TLS can be difficult), or we need a way to allow insecure channels w/ credentials, and TlsChannelCredentials.create() doesn't support that. For the second approach, IIUC, it goes back to the need of supporting adding call credentials to a managed channel builder that is possibly configured to build insecure channels. cc: @ejona86

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using InsecureChannelCredentials.create() instead of TlsChannelCredentials would be fine for those specific tests. But the existing sdk API works poorly for that.

You can add CallCredentials per-RPC, either on the stub, in the CallOptions, or with an interceptor (that adds it to the CallOptions).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes unfortunately the existing SDK API doesn't take InsecureChannelCredentials explicilty.

The client SDK has been attaching call credentials to the CallOptions. Now the problem is that DirectPath (via GoogleDefaultChannelCredentials) gets or creates its own call credentials. So currently, this is causing the same request metadata to appear twice. Because we want to adopt bound credentials in DirectPath, it's not ideal to change this GoogleDefaultChannelCredentials to build without call credentials. Nor would it be good for backward compatibility reasons.

So as this PR is trying to do, we want to stop attaching call credentials to CallOptions in all gRPC cases. But then we need to make sure all gRPC cases can get the call credentials from the channel. ManagedChannelBuilder doesn't allow it. This is where we are stuck.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can add an interceptor to the channel that attaches CallCreds to the CallOptions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can add an interceptor to the channel that attaches CallCreds to the CallOptions.

Ah. I didn't realize this from your first comment. Will try it. Thanks!

@lqiu96
Copy link
Contributor

lqiu96 commented Feb 25, 2025

/gcbrun

@rockspore
Copy link
Member Author

@lqiu96 Thanks for the patch. I applied it and it was able to solve almost all the test failures except for one in java-bigtable. I was guessing maybe the test there wasn't giving GoogleCredentials so it didn't get intercepted. Attempting to fix this, I changed the logic so that it always uses the interceptor to inject the call credentials, except for DirectPath and S2A w/ hard bound token use cases in the latest commit. But the failure still persists.

Now I suspect that the failing case may be truly relying on the call credentials attached to the client context, and then call options, which this PR stops doing. The investigation is still WIP.

@rockspore
Copy link
Member Author

Perhaps it's just this one test file that's configuring the channel with proxy: https://github.com/googleapis/java-bigtable/blob/cd7f82e4b66dc3c34262c73b26afc2fdfd1deed7/google-cloud-bigtable/src/test/java/com/google/cloud/bigtable/data/v2/stub/metrics/BuiltinMetricsTracerTest.java#L256

We could try fixing that test by attaching credentials to the channel provider there if that's the only failure.

And we can talk more about the last commit (add interceptor with GoogleCredentials only or not) off line.

@lqiu96
Copy link
Contributor

lqiu96 commented Feb 26, 2025

Hmm, I think this may be a new edge case we haven't considered. For the test, BigTable is passing in a FixedTransportChannel with a gRPC Transport Channel already configured (it's plaintext and no CallCredentials is used during channel creation). They also pass in a FixedCredentialProvider with a credential to be used.

The test expects that the FixedCredentialProvider Cred value will be passed to the CallOptions (or be used), but since we changed the logic so that only the channel creds are used, it ends up not using the creds at all (no bearer token).

@lqiu96
Copy link
Contributor

lqiu96 commented Feb 26, 2025

Actually after another look, I think this test is a bit confusing and there is probably more going on in the BigTable library.

FixedTransportChannelProvider's needCredentials() is always false and it doesn't support taking Credentials. I'm not sure how BigTable's test is expecting a Bearer token since the TransportChannel they created doesn't have any creds and it wouldn't be passed from ClientContext. This may be just a testing configuration issue more than anything.

@rockspore
Copy link
Member Author

Actually after another look, I think this test is a bit confusing and there is probably more going on in the BigTable library.

FixedTransportChannelProvider's needCredentials() is always false and it doesn't support taking Credentials. I'm not sure how BigTable's test is expecting a Bearer token since the TransportChannel they created doesn't have any creds and it wouldn't be passed from ClientContext. This may be just a testing configuration issue more than anything.

If their test is relying on the existing behavior (prior to this PR) that the credentials will be passed to the GrpcCallContext, we are indeed breaking them, right? So, other than FixedTransportChannelProvider and InstantiatingGrpcChannelProvider, is there any other providers that can be used with GrpcCallContext?

@lqiu96
Copy link
Contributor

lqiu96 commented Feb 28, 2025

If their test is relying on the existing behavior (prior to this PR) that the credentials will be passed to the GrpcCallContext, we are indeed breaking them, right?

Yes, unfortunately I believe so. At least for the test, it looks like they have FixedTransportChannel (with no Creds) and a FixedCredentialProvider with the creds that they want to apply. I think there are certain ways we can work with this inside Gax to pass this over. I think I'll need to talk to a BigTable expert to confirm this to ensure we're not breaking anything.

So, other than FixedTransportChannelProvider and InstantiatingGrpcChannelProvider, is there any other providers that can be used with GrpcCallContext?

I don't believe so. There are other TransportChannelProviders, but they're used in testing so I don't think we should be impacted by them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size: m Pull request size is medium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants