Implement Multinode connection pool #139
Conversation
* This commit adds a connection pool that takes a static list of nodes and distributes the load.
* Adds a trait for setting connection distribution. Defaults to RoundRobin.
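As a rough illustration of round-robin distribution over a static node list (a minimal sketch with hypothetical names, not the PR's actual code):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical sketch: a pool over a fixed node list that hands out
// connections in round-robin order. `Connection` stands in for the
// client's connection type.
#[derive(Debug, Clone)]
pub struct Connection {
    uri: String,
}

#[derive(Debug)]
pub struct StaticConnectionPool {
    connections: Vec<Connection>,
    index: AtomicUsize,
}

impl StaticConnectionPool {
    pub fn new(connections: Vec<Connection>) -> Self {
        assert!(!connections.is_empty(), "pool needs at least one node");
        Self {
            connections,
            index: AtomicUsize::new(0),
        }
    }

    /// Round-robin selection: an atomic counter modulo the list length,
    /// so no lock is needed for a list that never changes.
    pub fn next(&self) -> &Connection {
        let i = self.index.fetch_add(1, Ordering::Relaxed);
        &self.connections[i % self.connections.len()]
    }
}
```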
The implementation looks good. Would a sniffing connection pool be implemented on top of this? I think sniffing is a trickier implementation, but it is probably more valuable - the current connection pool implementation may need to change to accommodate interior mutability of a collection of `Connection` types.
I believe your assessment is correct as far as changing the `ConnectionPool` interface to account for interior mutability. It seems the only way to do that across threads is a mutex type. I think there's overhead in requiring every `ConnectionPool` type to hold a `Mutex` even if it's not dynamic, but it's the only way I see to do this. Any thoughts?
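A minimal sketch of that tension, with hypothetical names (not code from this PR): once the node list sits behind a `Mutex`, `next()` can no longer return a plain `&Connection`, because the reference cannot outlive the lock guard.

```rust
use std::sync::Mutex;

#[derive(Debug, Clone)]
struct Connection {
    uri: String,
}

struct DynamicPool {
    connections: Mutex<Vec<Connection>>,
}

impl DynamicPool {
    // Does not compile: the guard is dropped when the function returns,
    // so the reference into the Vec would dangle.
    //
    // fn next(&self) -> &Connection {
    //     let guard = self.connections.lock().unwrap();
    //     &guard[0]
    // }

    // One workaround: return an owned clone instead of a reference,
    // which changes the ConnectionPool interface for every pool type.
    fn next(&self) -> Connection {
        self.connections.lock().unwrap()[0].clone()
    }
}
```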
Thinking about sniffing further and discussing with folks on the clients team: the connection pool is responsible for managing one or more connections (connections being details about nodes in the cluster, such as the Uri) and data related to connections, such as whether the pool supports sniffing. With this in mind, a sniffing connection pool would likely have a trait along these lines:

```rust
pub trait ConnectionPool: Debug + dyn_clone::DynClone + Sync + Send {
    /// Gets a reference to the next [Connection]
    fn next(&self) -> &Connection;

    fn reseedable(&self) -> bool {
        false
    }

    fn reseed(&self, connections: Vec<Connection>) {}
}
```

Would need to play around to get the right type and function design for this.
* The connection should be owned by the current user of said connection
Force-pushed 6609113 to 1e8c4df
elasticsearch/src/http/transport.rs (Outdated)

```rust
// NodesInfo::new(&self, NodesInfoParts::None)
//     .send()
//     .await
//     .expect("Could not retrieve nodes for refresh");
```
This is technically a recursive call, which requires returning a `futures::future::BoxFuture`; still playing around with making that work without breaking the current `send` implementation.
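For context, a minimal sketch of the problem (hypothetical names, not this PR's code): an `async fn` that awaits itself has an infinitely-sized future type, and boxing the returned future is the standard way out.

```rust
use futures::future::BoxFuture;
use futures::FutureExt;

// Hypothetical recursive request: on failure it reseeds and retries by
// calling itself. Written as a plain `async fn` this will not compile;
// returning a BoxFuture erases the future's size and breaks the cycle.
fn send_with_retry(attempts: u32) -> BoxFuture<'static, Result<(), String>> {
    async move {
        if attempts == 0 {
            return Err("out of attempts".to_string());
        }
        // ... perform the request; on error, recurse ...
        send_with_retry(attempts - 1).await
    }
    .boxed()
}
```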
I think we can cheat here and use the reqwest client directly - the .NET/C# client does something similar.
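A hedged sketch of that idea (assuming the standard `/_nodes/http` endpoint; the function name is illustrative): the sniff request goes through a bare `reqwest::Client`, sidestepping the recursive call back through the transport's own `send`.

```rust
use reqwest::Client;

// Hypothetical: fetch node metadata directly, bypassing the transport,
// so reseeding never re-enters the client's own request path.
async fn sniff_nodes(base_url: &str) -> Result<String, reqwest::Error> {
    Client::new()
        .get(format!("{}/_nodes/http", base_url))
        .send()
        .await?
        .text()
        .await
}
```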
I understand reseeding the pool would be synchronous with the request? We may have a race condition if two threads have `reseedable() == true` and start reseeding. It would be better for reseeding to be a background periodic operation; `send()` can then use the current list of nodes.
My current understanding (concurrency is hard, so it may be wrong :D) is that if two threads see `reseedable() == true`, it'll kick off two serial reseeds, as the reseed routine acquires a write lock. I don't predict this to be a likely event, but it definitely needs to be accounted for; I thought this trade-off would be OK given the complexity of adding a subroutine for triggering reseeds. The problem I see is that reseeding would block the current `ConnectionPool` requests until the write lock is dropped. If you think kicking off multiple serial requests may be problematic, I am more than willing to change the implementation to a background operation. :)
I envisage the implementation doing something like the following (see the sketch after this list):

- An API request is initiated
- The connection pool indicates if it is reseedable and needs reseeding
- If reseeding is needed, a thread-safe signal is set to indicate reseeding is in progress
- A reseed request is started, either in the synchronous flow of the request, or on a separate thread. A separate thread is probably preferable, as the request that initiates reseeding is not then impacted by the reseeding process. It does make the implementation more complex though.
- Upon receiving the reseed response, a write lock is acquired to update the connection pool. Failure in the reseeding process should be logged (I think we can add a TODO for this for now and skip it, as the client does not currently have logging in place, but should have in future).
- The thread-safe signal is reset to indicate reseeding is no longer in progress
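A minimal sketch of that flow under assumed names (not this PR's code): an `AtomicBool` serves as the thread-safe signal, so exactly one thread wins the right to reseed, and the write lock is held only for the final swap.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::RwLock;

#[derive(Debug, Clone)]
struct Connection {
    uri: String,
}

struct SniffingPool {
    connections: RwLock<Vec<Connection>>,
    reseeding: AtomicBool,
}

impl SniffingPool {
    /// Try to become the single reseeding thread. `compare_exchange`
    /// flips the flag from `false` to `true` atomically, so only one
    /// caller wins; everyone else carries on with the current nodes.
    fn try_start_reseed(&self) -> bool {
        self.reseeding
            .compare_exchange(false, true, Ordering::SeqCst, Ordering::SeqCst)
            .is_ok()
    }

    /// Swap in the freshly sniffed nodes (write lock held only for the
    /// assignment), then clear the in-progress signal.
    fn finish_reseed(&self, nodes: Vec<Connection>) {
        *self.connections.write().unwrap() = nodes;
        self.reseeding.store(false, Ordering::SeqCst);
    }
}
```

The sniff request itself could then run between these two calls on a separate task (e.g. with `tokio::spawn`, assuming a Tokio runtime), keeping the initiating request unblocked.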
WIP as I still need to implement the …
Really appreciate your effort, @srleyva! I've left some comments.
Force-pushed 1e8c4df to 4de791d
Since this is a community-submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?
Force-pushed 0ccc20d to fedd490
First pass at the node sniff request following the Elastic docs. I thought even if we aren't sure of the final design, having the request parts up would make things easier. Manual tests for round-robin load balancing on a local docker-compose ES cluster look quite promising when seeded with …
Force-pushed 7556a70 to dbfb7c5
I'll play around with using a background thread to reseed over the next few days. Thanks for your patience and guidance on this, as I am still very new to Rust. 😄
* Style changes per code review
Force-pushed ff48166 to 50f8084
Hey @russcam, is there any update on this PR? Apologies for the lack of communication. It's feature complete according to the original requirements. If there's anything else that's needed, please let me know 😄
What is holding this PR back? We would like to use the official client, but without at least a way to target multiple Elasticsearch nodes, this is not feasible for us. Is there any way I could help out to get this over the finish line?
I'm also interested in getting this merged. The branch looks stale and based on old dependencies - I'll rebase it on master, update to newer dependencies, and open it as a new PR.
Great! @tommilligan, ping me if you need another hand or pair of eyes. :)
Superseded by PR #189