Skip to content

Sslbranch#3675

Closed
fredxfred wants to merge 7 commits intoapache:masterfrom
fredxfred:sslbranch
Closed

Sslbranch#3675
fredxfred wants to merge 7 commits intoapache:masterfrom
fredxfred:sslbranch

Conversation

@fredxfred
Copy link
Contributor

I have been working on identifying components in the Pinot code base where data (aside from metadata e.g. segment name) can be transmitted over the network. This PR is to enable SSL configurations to be passed into controllers and brokers, and then to configure grizzly to support https. I can provide examples of the full configs of an example https controller/broker + a script to generate self-signed certs for testing if desired.

Please let me know if there is anything missing that needs to be added. I am still testing these changes (and others, see below). I wasn't sure how to perform in-code tests of whether the data is being properly encrypted but I am definitely open to suggestions.

Currently, this code does not allow a component to be configured with both http and https, which is a feature that has also been requested. Also, this does not include code I have written to configure Netty Servers, only Grizzly. Fortunately, when we URI push Azure data, that data is encrypted by default, so currently we have no plans to work on encryption during the segmentfetcher/upload process. I am putting this code up now to get a head start on review, and I do plan on adding Netty encryption and the ability to allow both http and https.

Copy link
Contributor

@mcvsubbu mcvsubbu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My apologies for a late review.
Could we make it so that we have two configs: useHttps (false by default)and useHttp (true by default). So, the default controller will retain the current behavior. New applications can unset useHttp and set useHttps to start off with https. This will also allow migration of old installations from http to https by turning on http as well as https and then turning off http at a later date when all callers have migrated to https.

thanks

@fredxfred
Copy link
Contributor Author

Just added support for using both http and https. Hopefully this works.

We ended up not going forward with encrypting the netty servers since currently ours aren't exposed to the public internet. Unlike with controllers and brokers, we can't put the servers' apis behind load balancers as easily, so it makes certs too much of a hastle.

To keep the controller query portal working after the brokers' apis were encrypted, I added a configuration that allows the user to change the way query portal queries are executed by passing all queries directly to a load balancer.

@kishoreg
Copy link
Member

@mcvsubbu Can you take a look at this PR and see if there is anything pending? @fredxfred, can you please rebase this. I think the conflicts are mostly because of reformatting the code when we moved to Apache.

@mcvsubbu
Copy link
Contributor

@fredxfred I missed this again. I will resolve this today.
cc: @kishoreg

}

public void setSSLConfigs(String keyStoreFile, String keyStorePass, String trustStoreFile, String trustStorePass) {
_useHTTPS = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please fix indentation

_baseUri = URI.create("http://0.0.0.0:" + httpPort + "/");
_httpServer = GrizzlyHttpServerFactory.createHttpServer(_baseUri, this);

if (_useHTTPS) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I am reading this right, we start one server that serves either https or http. How will existing installations migrate to the new broker? Should we not be starting both http and https servers, so that the PQL clients can first move to https, and then the brokers can be configured to serve https only? So we need an http as well as https port.

The start() method must be maintained backward compatibile, so one way to do this is to introduce a new method that provides two ports. We may deprecate the existing method or just retain it, just not starting https if it is called.

Correct me if I am wrong.

beanConfig.setSchemes(new String[]{"http"});
if (_useHTTPS) {
beanConfig.setSchemes(new String[]{"https"});
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please move up the else to this line, as per coding conventions. See https://pinot.readthedocs.io/en/latest/dev_guide.html on how to set up your dev guide so that coding conventions are automatic. thanks.

final String instanceId = "localhost_helixController";
_pinotResourceManager =
new PinotHelixResourceManager(ZkStarter.DEFAULT_ZK_STR, HELIX_CLUSTER_NAME, instanceId, null, 10000L, true, /*isUpdateStateModel=*/false, true);
new PinotHelixResourceManager(ZkStarter.DEFAULT_ZK_STR, HELIX_CLUSTER_NAME, instanceId, null, 10000L, true, /*isUpdateStateModel=*/false, true, null);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this has been re-factored, so you won't need this I think

if (!containsKey(USE_HTTP)) {
return true;
} else {
return (boolean) getBoolean(USE_HTTP);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this cast?

});
_encryptedAdminApp.start(httpsJerseyPort);
LOGGER.info("Started Jersey API for https server on port {}", httpsJerseyPort);
LOGGER.info("Pinot controller ready and listening on port {} for API requests", httpsJerseyPort);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/port/https port/
thanks

}
String brokerLBurl = _pinotHelixResourceManager.getBrokerLoadBalancerAddress();
if (brokerLBurl != null) {
LOGGER.info("using load balancer address: {}", brokerLBurl);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This log in info level is too verbose.

return sendPQLRaw(url, pqlQuery, traceEnabled);
} else {
// Validate data access
AccessControl accessControl = _accessControlFactory.create();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to validate access control irrespective of https or http, right?

setProperty(KEYSTORE_FILE, keyStoreFile);
}

public String getBrokerLoadbalancerAddress() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There can be multiple brokers in the cluster, each serving some subset of the tables. Not sure what the admin is supposed to set for the broker load balancer here

if (!accessControl.hasDataAccess(httpHeaders, tableName)) {
return QueryException.ACCESS_DENIED_ERROR.toString();
}
String brokerLBurl = _pinotHelixResourceManager.getBrokerLoadBalancerAddress();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should get the list of broker instances for the table from pinot helix resource manager, and then use https on them if configured to do so. So, perhaps that calls for another config to use https when querying brokers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternative is for brokers to set their own configs in Helix instance config. We have gone that route with server http ports for example.

@mcvsubbu
Copy link
Contributor

Can you split this into two PRs? The broker one should be relatively simple, so let us get that out first.

@fredxfred
Copy link
Contributor Author

fredxfred commented May 29, 2019

Hi, yes I will work on splitting the PRs and addressing the code review soon (sometime in the next few days)

@npawar npawar closed this May 22, 2020
@npawar
Copy link
Contributor

npawar commented May 22, 2020

Closed this PR due to 6 months inactivity. Reopen if needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants