Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[improve][io] JDBC sinks: implement JDBC Batch API #18017

Merged
merged 6 commits into from
Oct 20, 2022

Conversation

nicoloboschi
Copy link
Contributor

@nicoloboschi nicoloboschi commented Oct 12, 2022

Motivation

Usually in high load systems the JDBC drivers are meant to be used with the JDBC Batch API. Currently it's not implemented in the jdbc sink

Modifications

  • New flag useJdbcBatch to turn on this feature (default to false)
  • If useJdbcBatch is on, then the sink will use ´addBatch()´ and executeBatch() methods
  • In case in the same messages batch group (defined by batchSize or timeoutMs) there are different kind of operations, the batches are split in sub-batches by operation type but they're still in the same transaction. This means that if you use transactions, the behavior is semantically the same. If transactions are disabled (autocommit=true) the behavior could be a little different since more statements could be executed than without JDBC batch enabled. For example, if in the same batch you have two messages and first statement fails:
    • A) no jdbc batch -> the first fails, the second fails.
    • B) with jdbc batch -> the first goes in, the second fails. this is due to the JDBC batch nature. However it's recommended to turn on transactions if the driver has support for them.
  • Added more specific tests to verify error scenario behaviors and ensure that commits and msg ack are executed correctly for each message.
  • Improved doc regarding how to configure batch properly.

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

@github-actions github-actions bot added the doc Your PR contains doc changes, no matter whether the changes are in markdown or code files. label Oct 12, 2022
@nicoloboschi
Copy link
Contributor Author

@eolivelli please review it again

Copy link
Contributor

@dlg99 dlg99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall LGTM. I left couple of suggestions.

@codecov-commenter
Copy link

codecov-commenter commented Oct 18, 2022

Codecov Report

Merging #18017 (9a9e72b) into master (6c65ca0) will increase coverage by 9.52%.
The diff coverage is 37.91%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #18017      +/-   ##
============================================
+ Coverage     34.91%   44.44%   +9.52%     
- Complexity     5707    16777   +11070     
============================================
  Files           607     1548     +941     
  Lines         53396   126139   +72743     
  Branches       5712    13891    +8179     
============================================
+ Hits          18644    56059   +37415     
- Misses        32119    64222   +32103     
- Partials       2633     5858    +3225     
Flag Coverage Δ
unittests 44.44% <37.91%> (+9.52%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...org/apache/pulsar/broker/ServiceConfiguration.java 52.07% <ø> (ø)
.../service/SystemTopicBasedTopicPoliciesService.java 59.49% <0.00%> (+7.90%) ⬆️
.../pulsar/broker/stats/BrokerOperabilityMetrics.java 94.64% <ø> (+1.99%) ⬆️
...g/apache/pulsar/compaction/CompactedTopicImpl.java 69.28% <0.00%> (+58.57%) ⬆️
...va/org/apache/pulsar/client/impl/ConsumerBase.java 21.95% <0.00%> (ø)
...ache/pulsar/functions/utils/io/ConnectorUtils.java 0.00% <0.00%> (ø)
...va/org/apache/pulsar/io/jdbc/JdbcAbstractSink.java 3.22% <0.00%> (-0.97%) ⬇️
...java/org/apache/pulsar/io/jdbc/JdbcSinkConfig.java 20.00% <0.00%> (-5.00%) ⬇️
...main/java/org/apache/pulsar/io/jdbc/JdbcUtils.java 0.00% <0.00%> (ø)
...nator/impl/MLTransactionMetadataStoreProvider.java 0.00% <0.00%> (ø)
... and 1168 more

Copy link
Contributor

@dlg99 dlg99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nicoloboschi nicoloboschi merged commit 7b52a92 into apache:master Oct 20, 2022
@nicoloboschi nicoloboschi added this to the 2.12.0 milestone Oct 20, 2022
@nicoloboschi nicoloboschi deleted the jdbc-batches branch October 20, 2022 15:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/connector doc Your PR contains doc changes, no matter whether the changes are in markdown or code files. ready-to-test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants