Conversation

@thelabdude (Contributor) commented Nov 6, 2020

Description

For collections with many shards (or aliases with many collections and some shards), CloudSolrStream will end up creating a new HttpSolrClient for every SolrStream it opens because the cache key is the full core URL, such as: http://127.0.0.1:63460/solr/collection4_shard4_replica_n6/

In addition, CloudSolrStream#getSlices was calling clusterState.getCollectionsMap(), which unnecessarily pre-loads every LazyCollectionRef from ZK. On clusters with many collections, this can noticeably slow down streaming expression execution.
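The cost difference can be modeled with a toy lazy-reference map (all names below are invented for illustration; the real classes are ClusterState and LazyCollectionRef, and real resolution hits ZK rather than bumping a counter):

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of why clusterState.getCollectionsMap() is expensive: each
// collection is held as a lazy reference that is only fetched (from ZK in
// real Solr) when resolved. Resolving the whole map touches every
// collection; looking one up by name touches only that one.
public class LazyStateSketch {
  static int zkFetches = 0; // stands in for actual ZK round trips

  static class LazyRef {
    private Object state;
    Object get() { // stands in for LazyCollectionRef.get()
      if (state == null) { zkFetches++; state = new Object(); }
      return state;
    }
  }

  private final Map<String, LazyRef> refs = new HashMap<>();

  void add(String name) { refs.put(name, new LazyRef()); }

  // Old path: resolve everything, like getCollectionsMap().
  void resolveAll() { refs.values().forEach(LazyRef::get); }

  // New path: resolve only the collection we actually need.
  Object resolveOne(String name) { return refs.get(name).get(); }

  public static void main(String[] args) {
    LazyStateSketch state = new LazyStateSketch();
    for (int i = 0; i < 1000; i++) state.add("collection" + i);
    state.resolveOne("collection4");
    System.out.println("fetches after single lookup: " + zkFetches); // 1, not 1000
  }
}
```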

Solution

In this PR, I've introduced a new ctor in SolrStream that takes the Replica's baseUrl and core as separate parameters. This leads to reusing the same HttpSolrClient for the same node, because the cache key is now http://127.0.0.1:63460/solr/. I chose the new-ctor approach because CloudSolrStream is not the only consumer of SolrStream; CloudSolrStream knows how the list of URLs was constructed from cluster state, so it can safely decide to pass the core and reuse clients.
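The effect of the cache-key change can be sketched with a toy client cache (an illustrative model only, not the actual SolrClientCache API; ClientCacheSketch and its methods are invented for the example):

```java
import java.util.HashMap;
import java.util.Map;

// Keying the client cache by the node's base URL instead of the full core
// URL means all streams pointing at the same node share one client, instead
// of one client per core.
public class ClientCacheSketch {
  private final Map<String, Object> cache = new HashMap<>();

  // One client per distinct key; the value stands in for an HttpSolrClient.
  public Object getHttpSolrClient(String key) {
    return cache.computeIfAbsent(key, k -> new Object());
  }

  public int size() { return cache.size(); }

  public static void main(String[] args) {
    ClientCacheSketch perCore = new ClientCacheSketch();
    ClientCacheSketch perNode = new ClientCacheSketch();
    String base = "http://127.0.0.1:63460/solr/";
    for (int shard = 1; shard <= 4; shard++) {
      String coreUrl = base + "collection4_shard" + shard + "_replica_n6/";
      perCore.getHttpSolrClient(coreUrl); // old behavior: key = full core URL
      perNode.getHttpSolrClient(base);    // new behavior: key = base URL only
    }
    System.out.println(perCore.size() + " vs " + perNode.size()); // 4 vs 1
  }
}
```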

When the request is sent to the remote core, we need to add the core name to the path, which happens in SolrStream#constructParser. This method was public and took a SolrClient (even though SolrStream already has an HttpSolrClient created in the open method); I've changed the signature to be private and use the client opened in open.
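A minimal sketch of the path construction (CorePathSketch and corePath are hypothetical names invented for this example; the real logic lives inside SolrStream#constructParser):

```java
// Given the node's base URL and a core name, build the full path the
// request is sent to, guarding against a doubled or missing slash.
public class CorePathSketch {
  public static String corePath(String baseUrl, String core) {
    // Normalize the trailing slash so we never produce "solr//core".
    String normalized = baseUrl.endsWith("/") ? baseUrl : baseUrl + "/";
    return normalized + core;
  }

  public static void main(String[] args) {
    System.out.println(
        corePath("http://127.0.0.1:63460/solr/", "collection4_shard4_replica_n6"));
  }
}
```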

Tests

Added a new test, testCloudStreamClientCache, in StreamingTest to verify the SolrStreams created by CloudSolrStream have the correct baseUrl (without the core).

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
  • I have developed this patch against the master branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.
  • I have added documentation for the Ref Guide (for Solr changes only).

@thelabdude thelabdude marked this pull request as draft November 6, 2020 22:08
@thelabdude (Author) commented:

I don't think the current solution solves the problem for clusters with many nodes; you'll still get a bunch of client objects. Still investigating what can be done about that.

@@ -334,11 +334,6 @@ private StreamComparator parseComp(String sort, String fl) throws IOException {
public static Slice[] getSlices(String collectionName, ZkStateReader zkStateReader, boolean checkAlias) throws IOException {
ClusterState clusterState = zkStateReader.getClusterState();

Map<String, DocCollection> collectionsMap = clusterState.getCollectionsMap();
Contributor commented:
Related: can we update the javadoc on clusterState.getCollectionsMap to be more explicit that it will make a call to ZK, instead of the current "may"?

@thelabdude (Author) replied:
good idea


//TODO we should probably split collection by comma to query more than one
// which is something already supported in other parts of Solr

// check for alias or collection
Contributor commented:
Should we cache the value of zkStateReader.getAliases below to avoid volatile reads?

@thelabdude (Author) replied:
Moved the call to getAliases out of the for loop
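The hoisting pattern can be sketched with a stand-in for the volatile-backed aliases field (HoistSketch and its members are invented for illustration; in Solr the read is zkStateReader.getAliases()):

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Calling a getter backed by a volatile field inside a loop re-reads that
// field on every iteration. Reading it once before the loop is cheaper and
// also gives the whole loop one consistent snapshot.
public class HoistSketch {
  static final AtomicReference<List<String>> aliases =
      new AtomicReference<>(List.of("a1", "a2")); // stands in for the volatile

  static int countMatches(List<String> names) {
    List<String> snapshot = aliases.get(); // hoisted: a single volatile read
    int matches = 0;
    for (String name : names) {
      if (snapshot.contains(name)) matches++;
    }
    return matches;
  }

  public static void main(String[] args) {
    System.out.println(countMatches(List.of("a1", "x", "a2"))); // 2
  }
}
```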

@@ -126,6 +135,17 @@ public void open() throws IOException {
}
}

private String getNodeUrl() {
Contributor commented:
Can we precompute this in the constructor?

@thelabdude thelabdude marked this pull request as ready for review November 18, 2020 17:54
@thelabdude (Author) commented:
Hi @joel-bernstein wasn't able to assign you as a reviewer on this, but would love for you to take a look when convenient.

}

// Check collection case insensitive
for(Entry<String, DocCollection> entry : collectionsMap.entrySet()) {
@thelabdude (Author) commented:
I removed this because I don't think we should try to accommodate improperly cased collection names. No tests broke, but let me know if we need this for some reason I don't understand.

} else {
// stream of replicas to reuse the same HttpSolrClient per baseUrl
// avoids re-parsing data we already have in the replicas
streamOfSolrStream = getReplicas(this.zkHost, this.collection, this.streamContext, mParams).stream()
@thelabdude (Author) commented:
Here we're keeping the Replica so we have direct access to its baseUrl and core name instead of parsing those out of the shardUrl
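A sketch of the contrast (ReplicaSketch and its nested record are illustrative stand-ins, not the actual org.apache.solr.common.cloud.Replica):

```java
// With only a shard URL string, the base URL has to be parsed back out of
// it; keeping the Replica-like record means baseUrl and core name are
// available directly, with no string surgery.
public class ReplicaSketch {
  record Replica(String baseUrl, String coreName) {
    String coreUrl() { return baseUrl + coreName + "/"; }
  }

  // Old approach: recover the base URL by chopping the core off the URL.
  static String parseBaseUrl(String coreUrl) {
    int end = coreUrl.lastIndexOf('/', coreUrl.length() - 2);
    return coreUrl.substring(0, end + 1);
  }

  public static void main(String[] args) {
    Replica r = new Replica("http://127.0.0.1:63460/solr/",
        "collection4_shard4_replica_n6");
    // Same answer either way, but the parse is fragile and avoidable.
    System.out.println(r.baseUrl().equals(parseBaseUrl(r.coreUrl()))); // true
  }
}
```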

@@ -268,8 +275,7 @@ private Map mapFields(Map fields, Map<String,String> mappings) {
return fields;
}

// temporary...
public TupleStreamParser constructParser(SolrClient server, SolrParams requestParams) throws IOException, SolrServerException {
private TupleStreamParser constructParser(SolrParams requestParams) throws IOException, SolrServerException {
@thelabdude (Author) commented:
This method didn't seem to need to be public, and we already get a SolrClient in the open method, so there's no need to pass one in. However, since this changes a public method's signature, it's only for Solr 9.x and shouldn't be back-ported to 8.x.

@thelabdude (Author) commented:

Plan to merge this into master later today.

@thelabdude thelabdude merged commit 30e5e38 into apache:master Dec 7, 2020
thelabdude added a commit to thelabdude/lucene-solr that referenced this pull request Dec 7, 2020
ctargett pushed a commit to ctargett/lucene-solr that referenced this pull request Dec 16, 2020
epugh pushed a commit to epugh/lucene-solr-1 that referenced this pull request Jan 15, 2021