Connection Pooling #32

Open

@hamiltop

Poolboy

Poolboy works by providing exclusive access to a given resource in a "transaction".
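For reference, a minimal sketch of what the poolboy approach looks like (the pool name, worker setup, and an Exrethinkdb.run/2 taking a query and a connection pid are assumptions for illustration):

```elixir
# Sketch only: assumes a poolboy pool named :exrethinkdb_pool whose workers are
# Exrethinkdb.Connection processes, and an Exrethinkdb.run/2 taking (query, conn).
defmodule PoolboyExample do
  @pool :exrethinkdb_pool
  @timeout 5_000

  def run(query) do
    # The worker is held for the entire call, so nothing else can pipeline
    # queries over this connection until it is checked back in.
    :poolboy.transaction(@pool, fn conn -> Exrethinkdb.run(query, conn) end, @timeout)
  end
end
```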

Pros:

  • Dynamic connection allocation is taken care of for you.

Cons:

  • Connection affinity is required for cursors, which means a transaction may be required for the full length of the cursor. For long-running changefeeds, this will require one connection per changefeed.
  • Pipelining is not utilized, as each connection stays checked out until a response is received.

While my assumption was that we would use Poolboy, it seems there are better approaches available. Poolboy requires too many tradeoffs for not a whole lot of benefit.

Hand-rolled

An ideal connection pool will do the following:

  1. Connect to N servers in a cluster.
  2. Load balance queries between connections.
  3. Route queries to the master replica for the target table.
  4. If a connection dies, remove that connection from the rotation.
  5. If a query fails because of a server connectivity issue, retry the query on another host.

Initial design:

A supervision tree will sit over N connections and a coordinator.
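Roughly, the tree might look like this (module names, child specs, and the Exrethinkdb.Connection.start_link argument shape are assumptions; how the coordinator finds the connection pids is left out):

```elixir
# Sketch only: one Exrethinkdb.Connection per host plus a coordinator,
# all under a single supervisor.
defmodule Exrethinkdb.ConnectionPool.Supervisor do
  use Supervisor

  def start_link(hosts) do
    Supervisor.start_link(__MODULE__, hosts, name: __MODULE__)
  end

  def init(hosts) do
    connections =
      for {host, port} <- hosts do
        %{
          id: {:connection, host, port},
          start: {Exrethinkdb.Connection, :start_link, [[host: host, port: port]]}
        }
      end

    coordinator = %{
      id: :coordinator,
      # The coordinator would resolve these hosts to live connection pids at
      # runtime (e.g. via Supervisor.which_children/1); that wiring is elided.
      start: {Exrethinkdb.ConnectionPool.Coordinator, :start_link, [hosts]}
    }

    Supervisor.init(connections ++ [coordinator], strategy: :one_for_one)
  end
end
```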

Client processes send queries directly to the connection, just like we currently do. In order to know which connection to contact, the client requests a connection from the coordinator. When the client is finished with the query, it informs the coordinator so that the coordinator can properly balance connections.
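A minimal coordinator along those lines might look like this (the message names and the load-balancing heuristic are assumptions; how the connection pids get to the coordinator is left out):

```elixir
# Sketch only: the coordinator hands out the least-loaded connection and keeps a
# count of in-flight queries per connection.
defmodule Exrethinkdb.ConnectionPool.Coordinator do
  use GenServer

  def start_link(connections) do
    GenServer.start_link(__MODULE__, connections, name: __MODULE__)
  end

  # Ask for a connection pid; the caller then talks to that pid directly.
  def checkout, do: GenServer.call(__MODULE__, :checkout)

  # Tell the coordinator the query is finished so it can rebalance.
  def checkin(conn), do: GenServer.cast(__MODULE__, {:checkin, conn})

  def init(connections) do
    {:ok, Map.new(connections, &{&1, 0})}
  end

  def handle_call(:checkout, _from, counts) do
    {conn, n} = Enum.min_by(counts, fn {_conn, n} -> n end)
    {:reply, conn, Map.put(counts, conn, n + 1)}
  end

  def handle_cast({:checkin, conn}, counts) do
    {:noreply, Map.update(counts, conn, 0, &max(&1 - 1, 0))}
  end
end
```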

This would require zero changes to Exrethinkdb.Connection. The connection would be unaware of pooling. We could provide an Exrethinkdb.ConnectionPool module with run/next/close that wrap the logic for interacting with the coordinator. The end API for the user would be the same. We will also provide a use Exrethinkdb.ConnectionPool macro. Ideally you would replace use Exrethinkdb.Connection with use Exrethinkdb.ConnectionPool and your application would still work perfectly.
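The user-facing wrapper could then be quite thin (assuming the coordinator sketched above and Exrethinkdb run/next/close functions of these shapes):

```elixir
# Sketch only: run checks a connection out, runs the query against it, and checks
# it back in. Cursors keep their connection affinity, so next/close just delegate.
defmodule Exrethinkdb.ConnectionPool do
  alias Exrethinkdb.ConnectionPool.Coordinator

  def run(query) do
    conn = Coordinator.checkout()

    try do
      Exrethinkdb.run(query, conn)
    after
      Coordinator.checkin(conn)
    end
  end

  # For changefeeds the checkin above would need to happen at close instead,
  # since the cursor holds the connection; that bookkeeping is elided here.
  def next(cursor), do: Exrethinkdb.next(cursor)
  def close(cursor), do: Exrethinkdb.close(cursor)
end
```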

Routing queries to the proper replicas can be done fairly well in the coordinator by looking at table("foo_table") |> config |> run. The coordinator can do this on a timer (once every minute is probably sufficient) and then use that information to route queries to the right connection. In the event of stale_reads, the coordinator can load balance between non-master connections. We'll have to add the table name to the Query struct, but that's pretty straightforward.
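A sketch of that timer-based refresh, shown as its own process for brevity even though it could live inside the coordinator (the Exrethinkdb.Query imports and the shape of the config response are assumptions based on RethinkDB's table config document):

```elixir
# Sketch only: once a minute, fetch each table's config and remember its primary
# replica so queries for that table can be routed to the master connection.
defmodule Exrethinkdb.ConnectionPool.ReplicaMap do
  use GenServer

  import Exrethinkdb.Query, only: [table: 1, config: 1]
  alias Exrethinkdb.ConnectionPool.Coordinator

  @refresh_interval 60_000

  def start_link(tables) do
    GenServer.start_link(__MODULE__, tables, name: __MODULE__)
  end

  def primary_for(table_name), do: GenServer.call(__MODULE__, {:primary_for, table_name})

  def init(tables) do
    send(self(), :refresh)
    {:ok, %{tables: tables, primaries: %{}}}
  end

  def handle_call({:primary_for, table_name}, _from, state) do
    {:reply, Map.get(state.primaries, table_name), state}
  end

  def handle_info(:refresh, state) do
    conn = Coordinator.checkout()

    primaries =
      Map.new(state.tables, fn name ->
        result = table(name) |> config() |> Exrethinkdb.run(conn)
        {name, primary_replica(result)}
      end)

    Coordinator.checkin(conn)
    Process.send_after(self(), :refresh, @refresh_interval)
    {:noreply, %{state | primaries: primaries}}
  end

  # RethinkDB's config document lists shards with a primary_replica field; the
  # exact response shape from the driver is assumed here.
  defp primary_replica(%{data: %{"shards" => [shard | _]}}), do: shard["primary_replica"]
  defp primary_replica(_), do: nil
end
```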

Exrethinkdb.ConnectionPool.run will report the success or failure of a query back to the coordinator. The coordinator will also monitor the Connection processes. There will also be logic to retry on a different server if a failure occurs (in the event of stale_reads; otherwise, retrying won't be useful until a new master is selected).
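Extending the run wrapper with that reporting and retry behavior might look like this (the error shape and the report_success/report_failure coordinator calls are assumed additions to the sketches above):

```elixir
# Sketch only: report query outcomes to the coordinator and retry on another
# connection when the failure looks like a connectivity problem (stale_reads case).
defmodule Exrethinkdb.ConnectionPool.Run do
  alias Exrethinkdb.ConnectionPool.Coordinator

  def run(query, attempts \\ 2)

  def run(_query, 0), do: {:error, :no_healthy_connections}

  def run(query, attempts) do
    conn = Coordinator.checkout()

    result =
      try do
        Exrethinkdb.run(query, conn)
      after
        Coordinator.checkin(conn)
      end

    case result do
      {:error, :connection_closed} ->
        # Assumed error shape for a dropped server connection; retry elsewhere.
        Coordinator.report_failure(conn)
        run(query, attempts - 1)

      other ->
        Coordinator.report_success(conn)
        other
    end
  end
end
```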

I think this custom approach will be fairly simple. It will be designed so that, in the worst case, a failure results in requests being routed to a less optimal replica.
