-
Notifications
You must be signed in to change notification settings - Fork 3.4k
HBASE-25051 DIGEST based auth broken for rpc based ConnectionRegistry #5631
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
Good, all green! |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
@bbeaudreault Mind taking a look at this? It fixes one of the security problems for RpcConnectionRegistry. And I will also fix HBASE-28321 based on this patch. Hope we could get this done in 2.6.0 release. Thanks. |
@Apache9 This is on my list, trying to get to it soon |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good, but I had some questions and suggestions around the organization of the StubsHolder.
It might make sense to get another review here, since the code is quite complex. Maybe @bharathv if he's around, since he commented on the jira in the past?
} | ||
LOG.debug("Going to refresh stubs"); | ||
assert rpcClient != null; | ||
addr2Stub = createStubs(rpcClient, fetchEndpoints.get()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we not need to synchronize the setting of addr2Stub? Looks like we are doing it in 2 places, here and above, and we're also reading it in a synchronized block above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I now see above that we are synchronizing in the callback of createStubs, but still seems like we need it here too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we don't. It is the different createStubs method, where we just update the addr2Stubs field without setting up any real rpc connections. It is a bit confusing as we use the same method name. Let me see if we can change this a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My feeling is that if we're going to synchronize writes to addr2Stub
, we should synchronize all writes to it. I also wonder if we should do something like this:
if (addr2Stub == null) {
Log.debug(...);
return;
}
Collection<ServerName> endpoints = fetchEndpoints.get();
synchronized(this) {
if (addr2Stub == null) {
return;
}
addr2Stubs = createStubs(rpcClient, endpoints);
}
Just to minimize the scope of synchronization, since fetchEndpoints is a network call. Also, rather than pass in an IOExceptionSupplier, should we just pass in a future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me check if we can change the code like this.
For future vs. IOExce[tionSupplier, the intention here is that we do not need asynchronous call here, we are in a background thread, we just fetch the result in a blocking way and update the addr2Stub, so we do not need a CompletableFuture here, the only thing is that we use an async client in the caller method so actually the implementation is a future.get.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not need to synchronize here.
The synchronization is for protecting the initialization of addr2Stub, and after initialization, addr2Stub will never be null, so we will never enter the synchronization block in getStubs method any more. And since addr2Stub is volatile, in refreshStubs, we just update the reference to addr2Stub at once, without changing its content, it is safe to not do any synchronization. Is this clear enough? Or we could introduce a initLock and synchronized on this object in getStubs method to make the logic more clear?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok thanks, that makes sense as is
hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionRegistryRpcStubHolder.java
Show resolved
Hide resolved
hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionRegistryRpcStubHolder.java
Outdated
Show resolved
Hide resolved
// out, only a preamble connection header, but if we pass null as user, there will be NPE in | ||
// some code paths... | ||
RpcChannel channel = | ||
getConnectionRegistryRpcClient.createRpcChannel(bootstrapServers.get(index), user, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if the rpcTimeout is not used, I think it's preferable to pass a real value in. We have a real value in this class, let's pass it in? Maybe something changes down the line, I don't like passing a 0 value for timeout
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intentional, I do not want to let developers think the timeout will take effect, as not everyone will look at the comment unless there is something unusual...
// cluster id, which is the old behavior | ||
LOG.debug("Failed to get connection registry info, should be an old server," | ||
+ " fallback to use null cluster id", controller.getFailed()); | ||
createStubsAndComplete(null, future); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder, do we want to try the next bootstrap server? We can fall through to old behavior if all bootstrap servers fail preamble?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could but this will make the code even more complicated, since we need to record all the failures, and have to consider what should be done if we got different exceptions from different servers...
And if a server returns us this exception, it means that the cluster is with old server, or at least still under upgrading, then we keep the old behavior is acceptable?
hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionUtils.java
Show resolved
Hide resolved
hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/NettyRpcConnection.java
Show resolved
Hide resolved
// exceptionCaught method can not help us finish the call, so here we need to catch the | ||
// exception and finish it | ||
// And in netty, the decoding the frame based, when reaching here we have already read a full | ||
// In netty, the decoding the frame based, when reaching here we have already read a full | ||
// frame, so hitting exception here does not mean the stream decoding is broken, thus we do | ||
// not need to throw the exception out and close the connection. | ||
call.setException(e); | ||
LOG.warn("failed to process response", e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow this change -- why is it ok to now remove the call.setException? was it never necessary, or what part of the other changes here negate the need?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is already done in RpcConnection.readResponse method, you can check the code there. It will catch the exception for finishCall method and call the setException method. And then it will throw the exception out, for BlockingRpcClient we may close the connetion and for netty, we will just ignore it.
} | ||
// on backup masters, this request may be blocked since we need to fetch it from filesystem, | ||
// but since it is just backup master, it is not a critical problem | ||
String clusterId = ((ConnectionRegistryEndpoint) rpcServer.server).getClusterId(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there no way to protect this from blocking? maybe in a followup since this PR is already large.
my concern is that backup masters may soon become active, we don't want them blocked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The request is blocked, not the backup masters are blocked.
Thanks for the review. I could see if I can split the ConnectionRegistryRpcStubHolder too. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
OK, good, all green. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one more common to resolve on my side. Thanks
} | ||
LOG.debug("Going to refresh stubs"); | ||
assert rpcClient != null; | ||
addr2Stub = createStubs(rpcClient, fetchEndpoints.get()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My feeling is that if we're going to synchronize writes to addr2Stub
, we should synchronize all writes to it. I also wonder if we should do something like this:
if (addr2Stub == null) {
Log.debug(...);
return;
}
Collection<ServerName> endpoints = fetchEndpoints.get();
synchronized(this) {
if (addr2Stub == null) {
return;
}
addr2Stubs = createStubs(rpcClient, endpoints);
}
Just to minimize the scope of synchronization, since fetchEndpoints is a network call. Also, rather than pass in an IOExceptionSupplier, should we just pass in a future?
…apache#5631) Signed-off-by: Bryan Beaudreault <bbeaudreault@apache.org> (cherry picked from commit 16de74c)
No description provided.