[core] Support MULTI_WRITE mode for KeyValueFileStoreWrite when multiple jobs are writing to one table simultaneously… #5301

Open
wants to merge 1 commit into
base: master

xiangyuf
Contributor

Purpose

Linked issue: close #5214

Tests

MergeTreeCompactManagerTest#testSyncOrphanFiles()
KeyValueFileStoreWriteTest#testMultiWriteModeEnabled()
KeyValueFileStoreWriteTest#testMultiWriteModeDisabled()

API and Format

Documentation

<td><h5>write.mode</h5></td>
<td style="word-wrap: break-word;">SINGLE_WRITE</td>
<td><p>Enum</p></td>
<td>Specify the write mode for table. When multiple jobs are writing to one table, the MULTI_WRITE mode should be used for the write-and-compact job. In this case, the CompactManager can scan files generated by other users. Notice: This option does not work for dedicated compaction job.<br /><br />Possible values:<ul><li>"SINGLE_WRITE"</li><li>"MULTI_WRITE"</li></ul></td>
Contributor

Based on my understanding of this PR:
1. "Specify the write mode for table." -> "Specifies the write mode for the current write job."
2. The MULTI_WRITE mode should be used for the write-and-compact job, which has write-only=false; the other write jobs should set write-only=true.
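
To make the proposed split concrete, here is a minimal sketch of the per-job option maps. It assumes `write.mode` is set as a plain table option as proposed in this PR (it is not part of a released Paimon version), and that write-only jobs leave it at its default. The class and method names are hypothetical and only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: table options for each job when several jobs write to one table. */
class WriteModeOptionsSketch {

    // Option key proposed in this PR; assumed here, not a released Paimon option.
    static final String WRITE_MODE = "write.mode";
    // Existing Paimon table option: when true, the writer skips compaction.
    static final String WRITE_ONLY = "write-only";

    /** Options for the single job that both writes and compacts. */
    static Map<String, String> writeAndCompactJob() {
        Map<String, String> options = new LinkedHashMap<>();
        options.put(WRITE_MODE, "MULTI_WRITE");
        options.put(WRITE_ONLY, "false");
        return options;
    }

    /** Options for every other job, which only adds new files. */
    static Map<String, String> writeOnlyJob() {
        Map<String, String> options = new LinkedHashMap<>();
        options.put(WRITE_ONLY, "true");
        return options;
    }

    public static void main(String[] args) {
        System.out.println("write-and-compact job: " + writeAndCompactJob());
        System.out.println("write-only job: " + writeOnlyJob());
    }
}
```

Only one job carries MULTI_WRITE in this sketch, which matches the description that the option does not apply to a dedicated compaction job.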


/** Multiple writes are creating new files for current table. */
MULTI_WRITE
}
Contributor

Suggestion: SINGLE_WRITER / MULTI_WRITER would be better names.


KeyValueFileStoreWrite write1 = (KeyValueFileStoreWrite) store1.newWrite(user1);
KeyValueFileStoreWrite write2 = (KeyValueFileStoreWrite) store2.newWrite(user2);

Contributor

LinMingQiang, Mar 28, 2025

I tried adding a writer-3 to this test, and the result is not as expected: the file written by writer-1 is not compacted. Please correct me if I have misunderstood.

Contributor Author

writer-1 is the write-and-compact job; writer-3 should be write-only. The file added by writer-3 should be compacted by writer-1.

Contributor

Sorry, I misstated that. The file written by writer-2 is not compacted; the files from writer-1 and writer-3 are compacted.

Contributor Author

@LinMingQiang There was a bug in the previous implementation. Please check the latest version. I've also added writer-3 to the test.
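
The behaviour expected in this thread (one write-and-compact writer picks up level-0 files added by write-only writers) can be modelled with a toy sketch. Everything below is hypothetical: `SharedTable` and `Writer` are stand-ins invented for illustration and are not Paimon classes, and the real `KeyValueFileStoreWrite` and `MergeTreeCompactManager` work very differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the three-writer scenario; none of these types exist in Paimon. */
class MultiWriteSketch {

    /** Stand-in for the level-0 files visible in the table's latest snapshot. */
    static class SharedTable {
        final Set<String> level0Files = new LinkedHashSet<>();
    }

    /** Hypothetical writer: adds files and, unless write-only, compacts them. */
    static class Writer {
        final String user;
        final boolean writeOnly;
        final boolean multiWrite;
        final SharedTable table;
        final Set<String> ownFiles = new LinkedHashSet<>();
        int nextFileId = 0;

        Writer(String user, boolean writeOnly, boolean multiWrite, SharedTable table) {
            this.user = user;
            this.writeOnly = writeOnly;
            this.multiWrite = multiWrite;
            this.table = table;
        }

        /** Adds one new level-0 file to the table and remembers it as our own. */
        String write() {
            String file = user + "-file-" + (nextFileId++);
            table.level0Files.add(file);
            ownFiles.add(file);
            return file;
        }

        /** Returns the files merged by this compaction; write-only writers merge nothing. */
        List<String> compact() {
            if (writeOnly) {
                return new ArrayList<>();
            }
            Set<String> candidates = new LinkedHashSet<>(ownFiles);
            if (multiWrite) {
                // MULTI_WRITE: also scan files that other users added to the table.
                candidates.addAll(table.level0Files);
            }
            // Keep only files that are still present at level 0.
            candidates.retainAll(table.level0Files);
            table.level0Files.removeAll(candidates);
            ownFiles.clear();
            return new ArrayList<>(candidates);
        }
    }

    public static void main(String[] args) {
        SharedTable table = new SharedTable();
        Writer writer1 = new Writer("writer-1", false, true, table);
        Writer writer2 = new Writer("writer-2", true, false, table);
        Writer writer3 = new Writer("writer-3", true, false, table);
        writer1.write();
        writer2.write();
        writer3.write();
        System.out.println("compacted by writer-1: " + writer1.compact());
        System.out.println("left at level 0: " + table.level0Files);
    }
}
```

In this model, constructing writer-1 with `multiWrite = false` leaves the files from writer-2 and writer-3 at level 0, which is the situation the new mode is meant to address.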

…ple jobs are writing to one table simultaneously
@xiangyuf
Contributor Author

xiangyuf commented Apr 2, 2025

@LinMingQiang I've discussed this feature with @JingsongLi. We need to avoid continuous file scans in the writer, so this PR will remain a PoC for now.
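
For illustration only, one way to bound how often a writer scans is to gate the scan behind a minimum interval, in the spirit of the linked issue's "periodically refresh" wording. This is a sketch under that assumption, not the approach agreed in the discussion; `PeriodicRefreshSketch` and its parameters are hypothetical names.

```java
import java.util.function.LongSupplier;

/** Sketch: run a file scan at most once per interval instead of on every call. */
class PeriodicRefreshSketch {

    private final long intervalMillis;
    private final LongSupplier clock; // injected so the behaviour can be tested
    private long lastRefreshMillis;

    PeriodicRefreshSketch(long intervalMillis, LongSupplier clock) {
        this.intervalMillis = intervalMillis;
        this.clock = clock;
        this.lastRefreshMillis = clock.getAsLong();
    }

    /** Runs the scan only if the interval has elapsed; returns whether it ran. */
    boolean maybeRefresh(Runnable scan) {
        long now = clock.getAsLong();
        if (now - lastRefreshMillis < intervalMillis) {
            return false;
        }
        scan.run();
        lastRefreshMillis = now;
        return true;
    }

    public static void main(String[] args) {
        PeriodicRefreshSketch gate =
                new PeriodicRefreshSketch(60_000L, System::currentTimeMillis);
        boolean ran = gate.maybeRefresh(() -> System.out.println("scanning files"));
        System.out.println("scan ran immediately: " + ran);
    }
}
```

The trade-off in such a design is latency: files from other writers become visible to the compacting writer only after the next interval elapses.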

Development

Successfully merging this pull request may close these issues.

[Feature] Support periodically refresh current levels in MergeTreeCompactManager
2 participants