Set latest optimization feature for YugabyteDB #5

Merged: 2 commits into MaibornWolff:main on Jul 11, 2022

FranckPachot
Contributor

Many improvements have been made in the latest version of YugabyteDB. Some features should always be on (they are not enabled by default in order to allow rolling upgrades: they must stay off until the whole cluster is on the new version). Others allow optimizations that are relevant in an IoT ingest context.

Always good to set

  • ysql_enable_packed_row, set when starting the cluster, enables the new way of storing rows, with all columns in the same document. Always good to set as soon as all nodes of the cluster are on version 2.15.
  • yb_enable_expression_pushdown, set for PostgreSQL sessions, allows some filtering to be done at the storage level. Always good to set when all nodes are on a version supporting pushdowns.
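
The "all nodes on the new version" condition can be checked before turning these on. The following is only a minimal sketch: `parse_version`, `all_nodes_at_least`, and the sample version strings are hypothetical, and it assumes you already have each node's version string from your own tooling.

```python
def parse_version(version: str) -> tuple:
    """Turn '2.15.0.1-b4' into (2, 15, 0, 1), dropping any '-b...' build suffix."""
    core = version.split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def all_nodes_at_least(node_versions: list, minimum: str) -> bool:
    """True only if the list is non-empty and every node is at or above `minimum`."""
    floor = parse_version(minimum)
    return bool(node_versions) and all(parse_version(v) >= floor for v in node_versions)


# Hypothetical inventory of node versions (illustrative values, not real data).
versions = ["2.15.0.1-b4", "2.15.0.0-b11", "2.13.2.0-b135"]

# One node is still on 2.13, so the packed-row flag should stay off for now.
packed_row_ok = all_nodes_at_least(versions, "2.15")
print(packed_row_ok)
```

Treating an empty inventory as "not ready" is a deliberate choice here: a missing answer should not enable a feature that is unsafe during a rolling upgrade.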

Specific to the IoT workload

  • yb_disable_transactional_writes relaxes the atomicity of transactions. This is acceptable for a bulk load into a table with no secondary index, where the visibility of the batch insert does not need to be all-or-nothing.
  • yb_enable_upsert_mode saves the read that must be done before the write in order to detect duplicate keys. The behavior is to update the previous row if there is already one, which cannot happen here as the primary key is generated.
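
As an illustration of how the session-level parameters could be applied, here is a minimal sketch that renders them as `SET` statements to run at the start of an ingest session. The function names and the two boolean arguments are hypothetical, and the conditions only encode what is described above: no secondary index for non-transactional writes, generated primary keys for upsert mode.

```python
def set_statements(settings: dict) -> list:
    """Render each parameter as a PostgreSQL-style SET statement."""
    return [f"SET {name} = {value};" for name, value in settings.items()]


def ingest_session_sql(has_secondary_index: bool, keys_are_generated: bool) -> list:
    """Pick the session parameters that are safe for this table, then render them."""
    settings = {"yb_enable_expression_pushdown": "on"}
    if not has_secondary_index:
        # Batch visibility no longer has to be all-or-nothing.
        settings["yb_disable_transactional_writes"] = "on"
    if keys_are_generated:
        # No duplicate keys are possible, so the pre-write read can be skipped.
        settings["yb_enable_upsert_mode"] = "on"
    return set_statements(settings)


for statement in ingest_session_sql(has_secondary_index=False, keys_are_generated=True):
    print(statement)
```

The resulting strings can be executed through any PostgreSQL-compatible driver before the bulk insert starts. ysql_enable_packed_row is not included because it is set when starting the cluster rather than per session.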

With those, the performance of the YSQL API should be closer to the performance of the YCQL API

Note that you mention some timeouts during the queries, do not hesitate to contact me to check this. The default timeouts are suited for OLTP and can be increased for analytics

yb_enable_upsert_mode=on
yb_disable_transactional_writes=on
ysql_enable_packed_row=true
@swoehrl-mw
Collaborator

Hi @FranckPachot. Thanks for taking the time and contributing these improvements and also for explaining what they do. I'll try to redo the tests for YugabyteDB in the next few days and update the results.

> Note that you mention some timeouts during the queries, do not hesitate to contact me to check this.

Should the timeouts also happen with the newest version I'll take you up on this, thanks.

@swoehrl-mw swoehrl-mw merged commit fbda3c4 into MaibornWolff:main Jul 11, 2022
@FranckPachot
Contributor Author

> Should the timeouts also happen with the newest version I'll take you up on this, thanks.

The best would be to see the error and when it happens. I had a look at the queries and submitted a PR with an optimisation (which should be good for all PostgreSQL-compatible databases as well).

There's one thing to know about the latest version of YugabyteDB: auto-splitting runs in the background. This may help queries but may slow the ingest (depending on how long it runs and on the final size of the table). It is possible to add a parameter to pre-split at table creation (for example ysql_num_shards_per_tserver: 8, which is the first target of auto-split).
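
To make the pre-split suggestion concrete, here is a small sketch that renders cluster-level flags as command-line arguments. It assumes the flags are passed in the usual gflags `--name=value` form when starting the tablet servers. `render_gflags` is a hypothetical helper, and your deployment tooling may expect another format, such as the `name: value` form quoted above.

```python
def render_gflags(flags: dict) -> list:
    """Render {'name': value} as ['--name=value', ...]; booleans become true/false."""
    rendered = []
    for name, value in flags.items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        rendered.append(f"--{name}={value}")
    return rendered


# Flags mentioned in this PR: packed rows, plus pre-splitting at table creation.
tserver_flags = {
    "ysql_enable_packed_row": True,
    "ysql_num_shards_per_tserver": 8,
}

tserver_args = render_gflags(tserver_flags)
print(" ".join(tserver_args))
```

Pre-splitting this way fixes the tablet count at table creation, so the background auto-split has less work to do while the ingest is running.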
