Commit bef1eb3

HBASE-23347 Allow custom authentication methods for RPCs
Decouple the HBase internals such that someone can implement their own SASL-based authentication mechanism and plug it into HBase RegionServers/Masters. Comes with a design doc in dev-support/design-docs and an example in hbase-examples known as "Shade" which uses a flat-password file for authenticating users.

Closes #884

Signed-off-by: Wellington Chevreuil <wchevreuil@apache.org>
Signed-off-by: Andrew Purtell <apurtell@apache.org>
Signed-off-by: Reid Chan <reidchan@apache.org>
1 parent 0935ba4 commit bef1eb3

53 files changed: +4102 -537 lines changed

Lines changed: 179 additions & 0 deletions
@@ -0,0 +1,179 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# Pluggable Authentication for HBase RPCs

## Background

As a distributed database, HBase must be able to authenticate users and HBase
services across an untrusted network. Clients and HBase services are treated
equivalently in terms of authentication (and this is the only time we will
draw such a distinction).

HBase currently supports three modes of authentication, selected via the
configuration property `hbase.security.authentication`:

1. `SIMPLE`
2. `KERBEROS`
3. `TOKEN`

`SIMPLE` authentication is effectively no authentication; HBase assumes the user
is who they claim to be. `KERBEROS` authenticates clients via the Kerberos V5
protocol using the GSSAPI mechanism of the Java Simple Authentication and Security
Layer (SASL) protocol. `TOKEN` is a username-password based authentication protocol
which uses short-lived passwords that can only be obtained via a `KERBEROS` authenticated
request. `TOKEN` authentication is synonymous with Hadoop-style [Delegation Tokens](https://steveloughran.gitbooks.io/kerberos_and_hadoop/content/sections/hadoop_tokens.html#delegation-tokens)
and uses the `DIGEST-MD5` SASL mechanism.

[SASL](https://docs.oracle.com/javase/8/docs/technotes/guides/security/sasl/sasl-refguide.html)
is a framework which specifies a network protocol that can authenticate a client
and a server using an arbitrary mechanism. SASL ships with a [number of mechanisms](https://www.iana.org/assignments/sasl-mechanisms/sasl-mechanisms.xhtml)
out of the box and it is possible to implement custom mechanisms. SASL effectively
decouples an RPC client-server model from the mechanism used to authenticate those
requests (e.g. the RPC code is identical whether username-password, Kerberos, or any
other method is used to authenticate the request).
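
To make the decoupling concrete, here is a minimal sketch using only the JDK's `javax.security.sasl` API. The class name `SaslMechanismSketch`, the protocol string `"hbase"`, the host `server.example.com`, and the credentials are illustrative assumptions and are not taken from HBase. The point is that the calling code only names a mechanism; everything else is the same regardless of which mechanism is chosen.

```java
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.auth.callback.UnsupportedCallbackException;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;
import javax.security.sasl.SaslException;

class SaslMechanismSketch {
  /** Creates a SASL client; only the mechanism name differs between methods. */
  static SaslClient newClient(String mechanism, String user, String password)
      throws SaslException {
    // The handler supplies credentials when the mechanism asks for them.
    CallbackHandler handler = callbacks -> {
      for (Callback cb : callbacks) {
        if (cb instanceof NameCallback) {
          ((NameCallback) cb).setName(user);
        } else if (cb instanceof PasswordCallback) {
          ((PasswordCallback) cb).setPassword(password.toCharArray());
        } else {
          throw new UnsupportedCallbackException(cb);
        }
      }
    };
    // "hbase" and the server name are placeholder values for this sketch.
    SaslClient client = Sasl.createSaslClient(new String[] { mechanism }, null,
        "hbase", "server.example.com", null, handler);
    if (client == null) {
      throw new SaslException("No SASL client available for " + mechanism);
    }
    return client;
  }

  /** Returns the first message the client would send, if the mechanism has one. */
  static byte[] firstMessage(SaslClient client) throws SaslException {
    return client.hasInitialResponse()
        ? client.evaluateChallenge(new byte[0]) : new byte[0];
  }

  public static void main(String[] args) throws SaslException {
    SaslClient plain = newClient("PLAIN", "alice", "secret");
    byte[] message = firstMessage(plain);
    System.out.println(plain.getMechanismName() + " sent " + message.length + " bytes");
  }
}
```

Swapping `"PLAIN"` for another mechanism name changes what goes over the wire, not the shape of the calling code. Mechanisms other than `PLAIN` may request callbacks that this toy handler does not support.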

RFCs define which [SASL mechanisms exist](https://www.iana.org/assignments/sasl-mechanisms/sasl-mechanisms.xml),
but these are a superset of the mechanisms that are
[implemented in Java](https://docs.oracle.com/javase/8/docs/technotes/guides/security/sasl/sasl-refguide.html#SUN).
This document limits discussion to SASL mechanisms in the abstract, focusing on those which are well-defined and
implemented in Java today by the JDK itself. However, a developer can implement
and register their own SASL mechanism. Writing a custom mechanism is outside of the scope of this document, but
not outside of the realm of possibility.

The `SIMPLE` implementation does not use SASL, but instead has its own RPC logic
built into the HBase RPC protocol. `KERBEROS` and `TOKEN` both use SASL to authenticate,
relying on the `Token` interface that is intertwined with the Hadoop `UserGroupInformation`
class. SASL decouples an RPC from the mechanism used to authenticate that request.

## Problem statement

Despite HBase already shipping authentication implementations which leverage SASL,
it is (effectively) impossible to add a new authentication implementation to HBase. The
use of the `org.apache.hadoop.hbase.security.AuthMethod` enum makes it impossible
to define a new method of authentication. Also, the RPC implementation is written
to only use the methods that are expressly shipped in HBase. Adding a new authentication
method would require copying and modifying the RpcClient implementation, in addition
to modifying the RpcServer to invoke the correct authentication check.

While it is possible to add a new authentication method to HBase, it cannot be done
cleanly or sustainably. This is what is meant by "impossible".

## Proposal

HBase should expose interfaces which allow for pluggable authentication mechanisms
such that HBase can authenticate against external systems. Because the RPC implementation
can already support SASL, HBase can standardize on SASL, allowing any authentication method
which is capable of using SASL to negotiate authentication. `KERBEROS` and `TOKEN` methods
will naturally fit into these new interfaces, but `SIMPLE` authentication will not (see the following
section for a tangent on `SIMPLE` authentication today).

### Tangent: on SIMPLE authentication

`SIMPLE` authentication in HBase today is treated as a special case. My impression is that
this stems from HBase not originally shipping an RPC solution that had any authentication.

Re-implementing `SIMPLE` authentication such that it also flows through SASL (e.g. via
the `PLAIN` SASL mechanism) would simplify the HBase codebase such that all authentication
occurs via SASL. This was not done for the initial implementation to reduce the scope
of the changeset. Changing `SIMPLE` authentication to use SASL may result in some
performance impact in setting up a new RPC. The same `if (sasl) ... else SIMPLE`
conditional logic is carried over into this implementation.

## Implementation Overview

HBASE-23347 includes a refactoring of HBase RPC authentication where all current methods
are ported to a new set of interfaces, and all RPC implementations are updated to use
the new interfaces. In the spirit of SASL, the expectation is that users can provide
their own authentication methods at runtime, and HBase should be capable of negotiating
with a client which tries to authenticate via that custom authentication method. The implementation
refers to this "bundle" of client and server logic as an "authentication provider".

### Providers

One authentication provider includes the following pieces:

1. Client-side logic (providing a credential)
2. Server-side logic (validating a credential from a client)
3. Client selection logic to choose a provider (from many that may be available)

A provider's client and server side logic are considered to be one-to-one. A `Foo` client-side provider
should never be used to authenticate against a `Bar` server-side provider.

We do expect that both clients and servers will have access to multiple providers. A server may
be capable of authenticating via methods which a client is unaware of. A client may attempt to authenticate
via a method which the server does not know how to process. In both cases, the RPC
should fail when a client and server do not have matching providers. The server identifies
client authentication mechanisms via a `byte authCode` (which is already sent today with HBase RPCs).
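
The server-side lookup described above can be sketched as a map keyed by the auth code. This is a simplified illustration and not the HBase implementation: `AuthCodeDispatchSketch`, `ServerProvider`, and `selectProvider` are hypothetical names, and the byte values used in `main` are illustrative.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

class AuthCodeDispatchSketch {
  /** Hypothetical stand-in for the server half of an authentication provider. */
  static final class ServerProvider {
    final String name;
    final byte authCode;

    ServerProvider(String name, int authCode) {
      this.name = name;
      this.authCode = (byte) authCode;
    }
  }

  private final Map<Byte, ServerProvider> providersByCode = new HashMap<>();

  /** Registers a provider, refusing two providers that claim the same code. */
  void register(ServerProvider provider) {
    ServerProvider previous = providersByCode.putIfAbsent(provider.authCode, provider);
    if (previous != null) {
      throw new IllegalStateException("Auth code " + (provider.authCode & 0xFF)
          + " is already used by " + previous.name);
    }
  }

  /** Finds the provider for the code sent by the client, or fails the RPC. */
  ServerProvider selectProvider(byte authCode) throws IOException {
    ServerProvider provider = providersByCode.get(authCode);
    if (provider == null) {
      throw new IOException("No provider for auth code " + (authCode & 0xFF));
    }
    return provider;
  }

  public static void main(String[] args) throws IOException {
    AuthCodeDispatchSketch server = new AuthCodeDispatchSketch();
    server.register(new ServerProvider("KERBEROS", 81)); // illustrative code values
    server.register(new ServerProvider("TOKEN", 82));
    System.out.println(server.selectProvider((byte) 82).name);
  }
}
```

A client that sends a code the server has never registered gets an `IOException` rather than a fallback to some other method, which matches the "RPC should fail" rule stated above.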

A client may also have multiple providers available for it to use in authenticating against
HBase. The client must have some logic to select which provider to use. Because we are
allowing custom providers, we must also allow custom selection logic such that the
correct provider can be chosen. This is a formalization of the logic already present
in `org.apache.hadoop.hbase.security.token.AuthenticationTokenSelector`.
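
One plausible selection policy is sketched below: prefer a token-based provider when the user holds a token of a matching kind, and otherwise fall back on Kerberos or `SIMPLE` depending on whether security is enabled. This policy, and the names `ProviderSelectorSketch`, `ClientProvider`, and `select`, are assumptions made for illustration; the real selector interface may take different inputs.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

class ProviderSelectorSketch {
  /** Hypothetical stand-in for the client half of an authentication provider. */
  static final class ClientProvider {
    final String name;
    final String tokenKind; // null for providers that do not use tokens

    ClientProvider(String name, String tokenKind) {
      this.name = name;
      this.tokenKind = tokenKind;
    }
  }

  /** Chooses one provider from those available for the given user state. */
  static ClientProvider select(List<ClientProvider> available, Set<String> heldTokenKinds,
      boolean securityEnabled) {
    // A token the user already holds is the cheapest way to authenticate.
    for (ClientProvider provider : available) {
      if (provider.tokenKind != null && heldTokenKinds.contains(provider.tokenKind)) {
        return provider;
      }
    }
    // Otherwise fall back on a built-in method by name.
    String wanted = securityEnabled ? "KERBEROS" : "SIMPLE";
    for (ClientProvider provider : available) {
      if (provider.name.equals(wanted)) {
        return provider;
      }
    }
    throw new IllegalStateException("No usable authentication provider");
  }

  public static void main(String[] args) {
    List<ClientProvider> available = Arrays.asList(
        new ClientProvider("SIMPLE", null),
        new ClientProvider("KERBEROS", null),
        new ClientProvider("TOKEN", "HBASE_AUTH_TOKEN"));
    Set<String> held = Collections.singleton("HBASE_AUTH_TOKEN");
    System.out.println(select(available, held, true).name);
  }
}
```

A custom provider would typically ship with its own selector, since only the provider's author knows which credentials make that provider the right choice.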

To enable the above, we have some new interfaces to support the user extensibility:

1. `interface SaslAuthenticationProvider`
2. `interface SaslClientAuthenticationProvider extends SaslAuthenticationProvider`
3. `interface SaslServerAuthenticationProvider extends SaslAuthenticationProvider`
4. `interface AuthenticationProviderSelector`

The `SaslAuthenticationProvider` shares logic which is common to the client and the
server (though it is up to the developer to guarantee this). The client and server
interfaces each have logic specific to the HBase RPC client and HBase RPC server
codebase, as their names imply. As described above, an implementation
of one `SaslClientAuthenticationProvider` must match exactly one implementation of
`SaslServerAuthenticationProvider`. Each Authentication Provider implementation is
a singleton and is intended to be shared across all RPCs. A provider selector is
chosen per client based on that client's configuration.
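
The shape of this hierarchy can be sketched with simplified stand-ins. The interfaces below are not the real HBase signatures: only `canRetry()` and `relogin()` are named after methods that appear in this change, and every other name (`ProviderSketch`, `accepts`, the toy password provider, the code value 15) is a hypothetical chosen for illustration.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.Map;

/** Attributes shared by both halves of a provider (simplified). */
interface ProviderSketch {
  String getName();
  byte getCode();
  String getSaslMechanism();
}

/** Client half: knows whether a failed attempt is worth retrying. */
interface ClientProviderSketch extends ProviderSketch {
  boolean canRetry();
  void relogin() throws IOException;
}

/** Server half: validates what the client presented (hypothetical method). */
interface ServerProviderSketch extends ProviderSketch {
  boolean accepts(String user, char[] password);
}

/** Common logic lives in one place so both halves agree on identity. */
abstract class ToyPasswordProvider implements ProviderSketch {
  public String getName() { return "TOY_PASSWORD"; }
  public byte getCode() { return (byte) 15; } // illustrative value
  public String getSaslMechanism() { return "DIGEST-MD5"; }
}

final class ToyPasswordClient extends ToyPasswordProvider implements ClientProviderSketch {
  public boolean canRetry() { return false; } // a password works or it does not
  public void relogin() throws IOException {
    throw new IOException("A password cannot be refreshed");
  }
}

final class ToyPasswordServer extends ToyPasswordProvider implements ServerProviderSketch {
  private final Map<String, String> passwords;

  ToyPasswordServer(Map<String, String> passwords) {
    this.passwords = passwords;
  }

  public boolean accepts(String user, char[] password) {
    String expected = passwords.get(user);
    return expected != null && Arrays.equals(expected.toCharArray(), password);
  }
}

class ProviderInterfacesDemo {
  public static void main(String[] args) {
    ClientProviderSketch client = new ToyPasswordClient();
    ServerProviderSketch server =
        new ToyPasswordServer(Collections.singletonMap("alice", "secret"));
    // Both halves report the same code because they share a base class.
    System.out.println(client.getCode() == server.getCode());
  }
}
```

Putting the name, code, and mechanism in a shared base class is one way for a developer to uphold the one-to-one guarantee between the two halves.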

A client authentication provider is uniquely identified among other providers
by the following characteristics:

1. A name, e.g. "KERBEROS", "TOKEN"
2. A byte (a value between 0 and 255)

In addition to these attributes, a provider also must define the following attributes:

3. The SASL mechanism being used.
4. The Hadoop AuthenticationMethod, e.g. "TOKEN", "KERBEROS", "CERTIFICATE"
5. The Token "kind", the name used to identify a TokenIdentifier, e.g. `HBASE_AUTH_TOKEN`

It is allowed (even expected) that there may be multiple providers that use `TOKEN` authentication.
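
These five attributes can be pictured as a small immutable value. The sketch below is hypothetical (`ProviderAttributesSketch` and its methods are not HBase classes, and the code values are illustrative). It also shows one detail worth keeping in mind: a Java `byte` is signed, so a code in the range 128 to 255 is stored as a negative number and must be masked to recover the 0 to 255 value.

```java
class ProviderAttributesSketch {
  final String name;
  final byte code;
  final String saslMechanism;
  final String authenticationMethod;
  final String tokenKind;

  ProviderAttributesSketch(String name, int code, String saslMechanism,
      String authenticationMethod, String tokenKind) {
    if (code < 0 || code > 255) {
      throw new IllegalArgumentException("Code must be between 0 and 255: " + code);
    }
    this.name = name;
    this.code = (byte) code; // values above 127 wrap to negative bytes
    this.saslMechanism = saslMechanism;
    this.authenticationMethod = authenticationMethod;
    this.tokenKind = tokenKind;
  }

  /** Recovers the 0..255 value from the signed byte. */
  int unsignedCode() {
    return code & 0xFF;
  }

  /** Two providers clash if they share either identifying attribute. */
  boolean conflictsWith(ProviderAttributesSketch other) {
    return name.equals(other.name) || code == other.code;
  }

  public static void main(String[] args) {
    ProviderAttributesSketch token = new ProviderAttributesSketch(
        "TOKEN", 82, "DIGEST-MD5", "TOKEN", "HBASE_AUTH_TOKEN");
    // A second TOKEN-method provider is fine as long as name and code differ.
    ProviderAttributesSketch custom = new ProviderAttributesSketch(
        "CUSTOM_TOKEN", 200, "DIGEST-MD5", "TOKEN", "CUSTOM_TOKEN_KIND");
    System.out.println(custom.code + " / " + custom.unsignedCode());
    System.out.println(token.conflictsWith(custom));
  }
}
```

Both example providers use the `TOKEN` authentication method, which is allowed because identity rests on the name and the byte, not on the method.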

N.b. Hadoop requires all `TokenIdentifier` implementations to have a no-args constructor and a `ServiceLoader`
entry in their packaging JAR file (e.g. `META-INF/services/org.apache.hadoop.security.token.TokenIdentifier`).
Otherwise, parsing the `TokenIdentifier` on the server-side end of an RPC from a Hadoop `Token` will return
`null` to the caller (often, in the `CallbackHandler` implementation).

### Factories

To ease development with this unknown set of providers, there are two classes which
find, instantiate, and cache the provider singletons.

1. Client side: `class SaslClientAuthenticationProviders`
2. Server side: `class SaslServerAuthenticationProviders`

These classes use [Java ServiceLoader](https://docs.oracle.com/javase/8/docs/api/java/util/ServiceLoader.html)
to find implementations available on the classpath. The HBase providers
for the three out-of-the-box authentication methods all register themselves via the `ServiceLoader`.

Each class also enables providers to be added via explicit configuration in hbase-site.xml.
This enables unit tests to define custom implementations that may be toy/naive/unsafe without
any worry that these may be inadvertently deployed onto a production HBase cluster.
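
The two discovery paths can be sketched together: `ServiceLoader` finds whatever is registered on the classpath, and an explicit list of class names (standing in for a value read from hbase-site.xml) adds the rest. The names `ProviderDiscoverySketch`, `Provider`, `ToyProvider`, and `discover` are hypothetical, and the sketch omits the caching that the real factory classes perform.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.ServiceLoader;

class ProviderDiscoverySketch {
  /** Hypothetical provider contract; not the HBase interface. */
  public interface Provider {
    String getName();
  }

  /** A toy provider that is never registered with ServiceLoader. */
  public static final class ToyProvider implements Provider {
    public ToyProvider() {
    }

    public String getName() {
      return "TOY";
    }
  }

  /** Combines classpath discovery with explicitly configured class names. */
  static List<Provider> discover(List<String> extraClassNames)
      throws ReflectiveOperationException {
    List<Provider> found = new ArrayList<>();
    // Path 1: implementations listed under META-INF/services on the classpath.
    for (Provider provider : ServiceLoader.load(Provider.class)) {
      found.add(provider);
    }
    // Path 2: implementations named explicitly, e.g. by a test's configuration.
    for (String className : extraClassNames) {
      Class<? extends Provider> clazz = Class.forName(className).asSubclass(Provider.class);
      found.add(clazz.getDeclaredConstructor().newInstance());
    }
    return found;
  }

  public static void main(String[] args) throws ReflectiveOperationException {
    List<String> extras = Collections.singletonList(ToyProvider.class.getName());
    for (Provider provider : discover(extras)) {
      System.out.println(provider.getName());
    }
  }
}
```

Because `ToyProvider` has no `META-INF/services` entry, it only appears when a configuration names it, which is the property that keeps unsafe test providers out of production deployments.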

hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/AbstractRpcClient.java

Lines changed: 0 additions & 14 deletions
@@ -41,8 +41,6 @@
 import java.net.SocketAddress;
 import java.net.UnknownHostException;
 import java.util.Collection;
-import java.util.HashMap;
-import java.util.Map;
 import java.util.concurrent.Executors;
 import java.util.concurrent.ScheduledExecutorService;
 import java.util.concurrent.ScheduledFuture;
@@ -58,17 +56,13 @@
 import org.apache.hadoop.hbase.client.MetricsConnection;
 import org.apache.hadoop.hbase.codec.Codec;
 import org.apache.hadoop.hbase.codec.KeyValueCodec;
-import org.apache.hadoop.hbase.protobuf.generated.AuthenticationProtos.TokenIdentifier.Kind;
 import org.apache.hadoop.hbase.security.User;
 import org.apache.hadoop.hbase.security.UserProvider;
-import org.apache.hadoop.hbase.security.token.AuthenticationTokenSelector;
 import org.apache.hadoop.hbase.util.EnvironmentEdgeManager;
 import org.apache.hadoop.hbase.util.PoolMap;
 import org.apache.hadoop.hbase.util.Threads;
 import org.apache.hadoop.io.compress.CompressionCodec;
 import org.apache.hadoop.ipc.RemoteException;
-import org.apache.hadoop.security.token.TokenIdentifier;
-import org.apache.hadoop.security.token.TokenSelector;
 
 import org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos;
 
@@ -104,14 +98,6 @@ public abstract class AbstractRpcClient<T extends RpcConnection> implements RpcC
   private static final ScheduledExecutorService IDLE_CONN_SWEEPER = Executors
       .newScheduledThreadPool(1, Threads.newDaemonThreadFactory("Idle-Rpc-Conn-Sweeper"));
 
-  @edu.umd.cs.findbugs.annotations.SuppressWarnings(value="MS_MUTABLE_COLLECTION_PKGPROTECT",
-      justification="the rest of the system which live in the different package can use")
-  protected final static Map<Kind, TokenSelector<? extends TokenIdentifier>> TOKEN_HANDLERS = new HashMap<>();
-
-  static {
-    TOKEN_HANDLERS.put(Kind.HBASE_AUTH_TOKEN, new AuthenticationTokenSelector());
-  }
-
   protected boolean running = true; // if client runs
 
   protected final Configuration conf;

hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/BlockingRpcConnection.java

Lines changed: 45 additions & 40 deletions
@@ -54,6 +54,7 @@
 import org.apache.hadoop.hbase.security.HBaseSaslRpcClient;
 import org.apache.hadoop.hbase.security.SaslUtil;
 import org.apache.hadoop.hbase.security.SaslUtil.QualityOfProtection;
+import org.apache.hadoop.hbase.security.provider.SaslClientAuthenticationProvider;
 import org.apache.hadoop.hbase.trace.TraceUtil;
 import org.apache.hadoop.hbase.util.EnvironmentEdgeManager;
 import org.apache.hadoop.hbase.util.ExceptionUtil;
@@ -361,9 +362,10 @@ private void disposeSasl() {
 
   private boolean setupSaslConnection(final InputStream in2, final OutputStream out2)
       throws IOException {
-    saslRpcClient = new HBaseSaslRpcClient(authMethod, token, serverPrincipal,
-        this.rpcClient.fallbackAllowed, this.rpcClient.conf.get("hbase.rpc.protection",
-          QualityOfProtection.AUTHENTICATION.name().toLowerCase(Locale.ROOT)),
+    saslRpcClient = new HBaseSaslRpcClient(this.rpcClient.conf, provider, token,
+        serverAddress, securityInfo, this.rpcClient.fallbackAllowed,
+        this.rpcClient.conf.get("hbase.rpc.protection",
+          QualityOfProtection.AUTHENTICATION.name().toLowerCase(Locale.ROOT)),
         this.rpcClient.conf.getBoolean(CRYPTO_AES_ENABLED_KEY, CRYPTO_AES_ENABLED_DEFAULT));
     return saslRpcClient.saslConnect(in2, out2);
   }
@@ -375,11 +377,10 @@ private boolean setupSaslConnection(final InputStream in2, final OutputStream ou
    * connection again. The other problem is to do with ticket expiry. To handle that, a relogin is
    * attempted.
    * <p>
-   * The retry logic is governed by the {@link #shouldAuthenticateOverKrb} method. In case when the
-   * user doesn't have valid credentials, we don't need to retry (from cache or ticket). In such
-   * cases, it is prudent to throw a runtime exception when we receive a SaslException from the
-   * underlying authentication implementation, so there is no retry from other high level (for eg,
-   * HCM or HBaseAdmin).
+   * The retry logic is governed by the {@link SaslClientAuthenticationProvider#canRetry()}
+   * method. Some providers have the ability to obtain new credentials and then re-attempt to
+   * authenticate with HBase services. Other providers will continue to fail if they failed the
+   * first time -- for those, we want to fail-fast.
    * </p>
    */
   private void handleSaslConnectionFailure(final int currRetries, final int maxRetries,
@@ -389,40 +390,44 @@ private void handleSaslConnectionFailure(final int currRetries, final int maxRet
     user.doAs(new PrivilegedExceptionAction<Object>() {
       @Override
       public Object run() throws IOException, InterruptedException {
-        if (shouldAuthenticateOverKrb()) {
-          if (currRetries < maxRetries) {
-            if (LOG.isDebugEnabled()) {
-              LOG.debug("Exception encountered while connecting to " +
-                "the server : " + StringUtils.stringifyException(ex));
-            }
-            // try re-login
-            relogin();
-            disposeSasl();
-            // have granularity of milliseconds
-            // we are sleeping with the Connection lock held but since this
-            // connection instance is being used for connecting to the server
-            // in question, it is okay
-            Thread.sleep(ThreadLocalRandom.current().nextInt(reloginMaxBackoff) + 1);
-            return null;
-          } else {
-            String msg = "Couldn't setup connection for "
-              + UserGroupInformation.getLoginUser().getUserName() + " to " + serverPrincipal;
-            LOG.warn(msg, ex);
-            throw new IOException(msg, ex);
+        // A provider which failed authentication, but doesn't have the ability to relogin with
+        // some external system (e.g. username/password, the password either works or it doesn't)
+        if (!provider.canRetry()) {
+          LOG.warn("Exception encountered while connecting to the server : " + ex);
+          if (ex instanceof RemoteException) {
+            throw (RemoteException) ex;
           }
-        } else {
-          LOG.warn("Exception encountered while connecting to " + "the server : " + ex);
-        }
-        if (ex instanceof RemoteException) {
-          throw (RemoteException) ex;
+          if (ex instanceof SaslException) {
+            String msg = "SASL authentication failed."
+              + " The most likely cause is missing or invalid credentials.";
+            throw new RuntimeException(msg, ex);
+          }
+          throw new IOException(ex);
         }
-        if (ex instanceof SaslException) {
-          String msg = "SASL authentication failed."
-            + " The most likely cause is missing or invalid credentials." + " Consider 'kinit'.";
-          LOG.error(HBaseMarkers.FATAL, msg, ex);
-          throw new RuntimeException(msg, ex);
+
+        // Other providers, like kerberos, could request a new ticket from a keytab. Let
+        // them try again.
+        if (currRetries < maxRetries) {
+          LOG.debug("Exception encountered while connecting to the server", ex);
+
+          // Invoke the provider to perform the relogin
+          provider.relogin();
+
+          // Get rid of any old state on the SaslClient
+          disposeSasl();
+
+          // have granularity of milliseconds
+          // we are sleeping with the Connection lock held but since this
+          // connection instance is being used for connecting to the server
+          // in question, it is okay
+          Thread.sleep(ThreadLocalRandom.current().nextInt(reloginMaxBackoff) + 1);
+          return null;
+        } else {
+          String msg = "Failed to initiate connection for "
+              + UserGroupInformation.getLoginUser().getUserName() + " to "
+              + securityInfo.getServerPrincipal();
+          throw new IOException(msg, ex);
         }
-        throw new IOException(ex);
       }
     });
   }
@@ -459,7 +464,7 @@ private void setupIOstreams() throws IOException {
           if (useSasl) {
             final InputStream in2 = inStream;
             final OutputStream out2 = outStream;
-            UserGroupInformation ticket = getUGI();
+            UserGroupInformation ticket = provider.getRealUser(remoteId.ticket);
             boolean continueSasl;
             if (ticket == null) {
               throw new FatalConnectionException("ticket/user is null");
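
The retry decision introduced in `handleSaslConnectionFailure` above can be read in isolation as the following sketch. It is a simplification under stated assumptions: `RetrySketch`, `Provider`, `Attempt`, and `connect` are hypothetical names, the random backoff sleep is omitted, and the special handling of `RemoteException` and `SaslException` is collapsed into rethrowing the original exception.

```java
import java.io.IOException;

class RetrySketch {
  /** Hypothetical slice of the client provider: only the retry-related methods. */
  interface Provider {
    boolean canRetry();
    void relogin() throws IOException;
  }

  /** Hypothetical stand-in for one SASL connection attempt. */
  interface Attempt {
    void run() throws IOException;
  }

  /** Returns the number of retries that were needed before success. */
  static int connect(Provider provider, Attempt attempt, int maxRetries) throws IOException {
    int retries = 0;
    while (true) {
      try {
        attempt.run();
        return retries;
      } catch (IOException ex) {
        if (!provider.canRetry()) {
          throw ex; // fail fast: new credentials cannot be obtained
        }
        if (retries >= maxRetries) {
          throw new IOException("Failed to initiate connection after " + retries
              + " retries", ex);
        }
        provider.relogin(); // e.g. request a new Kerberos ticket from a keytab
        retries++;
      }
    }
  }

  public static void main(String[] args) throws IOException {
    final int[] failuresLeft = { 2 };
    Provider kerberosLike = new Provider() {
      public boolean canRetry() { return true; }
      public void relogin() { /* a real provider would refresh credentials here */ }
    };
    int retries = connect(kerberosLike, () -> {
      if (failuresLeft[0]-- > 0) {
        throw new IOException("simulated authentication failure");
      }
    }, 5);
    System.out.println("connected after " + retries + " retries");
  }
}
```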

hbase-client/src/main/java/org/apache/hadoop/hbase/ipc/NettyRpcConnection.java

Lines changed: 5 additions & 5 deletions
@@ -159,8 +159,8 @@ private void scheduleRelogin(Throwable error) {
       @Override
       public void run() {
         try {
-          if (shouldAuthenticateOverKrb()) {
-            relogin();
+          if (provider.canRetry()) {
+            provider.relogin();
           }
         } catch (IOException e) {
           LOG.warn("Relogin failed", e);
@@ -183,16 +183,16 @@ private void failInit(Channel ch, IOException e) {
   }
 
   private void saslNegotiate(final Channel ch) {
-    UserGroupInformation ticket = getUGI();
+    UserGroupInformation ticket = provider.getRealUser(remoteId.getTicket());
     if (ticket == null) {
       failInit(ch, new FatalConnectionException("ticket/user is null"));
       return;
     }
     Promise<Boolean> saslPromise = ch.eventLoop().newPromise();
     final NettyHBaseSaslRpcClientHandler saslHandler;
     try {
-      saslHandler = new NettyHBaseSaslRpcClientHandler(saslPromise, ticket, authMethod, token,
-        serverPrincipal, rpcClient.fallbackAllowed, this.rpcClient.conf);
+      saslHandler = new NettyHBaseSaslRpcClientHandler(saslPromise, ticket, provider, token,
+        serverAddress, securityInfo, rpcClient.fallbackAllowed, this.rpcClient.conf);
     } catch (IOException e) {
       failInit(ch, e);
       return;
