Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shard MySQL for connect four context #119

Closed
marein opened this issue Apr 19, 2022 · 0 comments · Fixed by #164 or #167
Closed

Shard MySQL for connect four context #119

marein opened this issue Apr 19, 2022 · 0 comments · Fixed by #164 or #167

Comments

@marein
Copy link
Owner

marein commented Apr 19, 2022

The documentation already describes it, see here. When doing load tests, see #118, MySQL is the next bottleneck. There're a few sharding strategies for MySQL, such as explicitly considering shards in the application code or using different storage engines, e.g. MySQL Cluster or using a proxy in front of MySQL, e.g. ProxySQL, Vitess or ShardingSphere. However, perhaps a switch to a different database can be considered too, e.g. TiDB or YugabyteDB.

To be sure this is the next bottleneck, I've implemented it in the application code with mod sharding and two separate MySQL instances. The test server had a throughput of ~9k inserts/updates per second for one MySQL instance. With the sharding implementation and two MySQL instances, the throughput was doubled to ~18k inserts/updates per seconds. The test run was made with a script that plays infinite games without delay against itself. To generate enough load to utilize both MySQL servers this was started 200-300 times which yielded about ~15k req/s. The remaining inserts/updates came from background processes (e.g. the referee). During the test run, the application still responded very quickly with ~50-75ms (full internet roundtrip).

@marein marein linked a pull request Mar 8, 2023 that will close this issue
@marein marein self-assigned this Mar 8, 2023
@marein marein linked a pull request Mar 18, 2023 that will close this issue
marein added a commit that referenced this issue Apr 3, 2023
Apply schema-based sharding for the connect for database #119.
The application itself only knows about logical shards and relies
on a proxy, such as ProxySQL, to forward queries to physical shards.

* The env variable APP_CONNECT_FOUR_DOCTRINE_DBAL_SHARDS
defines a comma-separated list of active shards. As games can
be sharded across multiple physical databases, the transactional scope
(technical-wise) is moved to the repository layer, and removed from the
application layer. The changes to the repository, which is now very
persistence oriented, aim to make this explicit. The shard selection
happens based on the game id. 
* The env variable APP_CONNECT_FOUR_DOCTRINE_DBAL_DATABASE
is introduced to allow selecting a shard for database creation and
migration, as doctrine commands currently don't allow passing
the database name via cli arguments.
* Besides the repository, the event store access must also be
aware of sharding, since this is read when the query model is not
yet populated. For that matter, to query the events from the
write model, the repository is used, which encapsulates the
EventStore.
* Remove query model's GameNotFoundException
This is an aside refactoring, not related. This indirection
wasn't needed in the past. Additionally, an instance of
GameId is required to find a game, not a loose string.
* The current implementation doesn't take resharding into
consideration.
marein added a commit that referenced this issue Apr 3, 2023
Applies schema-based sharding for the connect four database #119.
The application itself only knows about logical shards and relies
on a proxy, such as ProxySQL, to forward queries to physical shards.

* The env variable APP_CONNECT_FOUR_DOCTRINE_DBAL_SHARDS
defines a comma-separated list of active shards. As games can
be sharded across multiple physical databases, the transactional scope
(technical-wise) is moved to the repository layer, and removed from the
application layer. The changes to the repository, which is now very
persistence oriented, aim to make this explicit. The shard selection
happens based on the game id.
* The env variable APP_CONNECT_FOUR_DOCTRINE_DBAL_DATABASE
is introduced to allow selecting a shard for database creation and
migration, as doctrine commands currently don't allow passing
the database name via cli arguments.
* Besides the repository, the event store access must also be
aware of sharding, since this is read when the query model is not
yet populated. For that matter, to query the events from the
write model, the repository is used, which encapsulates the
EventStore.
* Remove query model's GameNotFoundException
This is an aside refactoring, not related. This indirection
wasn't needed in the past. Additionally, an instance of
GameId is required to find a game, not a loose string.
* The current implementation doesn't take resharding into
consideration.
marein added a commit that referenced this issue Apr 5, 2023
Adds ProxySQL to the stack, which acts as a connection pool
and provides the ability to route queries to different physical
MySQL machines, e.g. for schema-based sharding #119.

Applications communicate with ProxySQL via unix domain sockets,
which means that it will be deployed as a sidecar.
There are no query rules defined for the environments yet.
ProxySQL will pay of during load tests #118.

ProxySQL is optionally configurable via the PROXYSQL_CONFIG
environment variable, which is required for the Play with Docker
environment because the stack file must be self-contained.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
1 participant