Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

curvebs/chunkserver: optimize sync chunk when write in bulk #1912

Merged
merged 1 commit into from
Dec 7, 2022

Conversation

fansehep
Copy link
Member

@fansehep fansehep commented Sep 10, 2022

Signed-off-by: fan yfan3763@gmail.com

What problem does this PR solve?

Issue Number: #1887

Problem Summary: see issue.

What is changed and how it works?

What's Changed:
I use SyncChunkThread instread of the synctimer, Let each write determine on its own whether to notify. and the sumWrite is shared. Default 30s or sum of write >= 4 MB(maybe 4MB is not better) will force sync.
We need some tests or benchmarks for this changes to verify this design is better than old synctimer design.


And i think 4 MB as the default threshold is not accurate, Maybe we can use binary search to find the best thredhold for most machines.

@fansehep
Copy link
Member Author

recheck

1 similar comment
@fansehep
Copy link
Member Author

recheck

@fansehep fansehep force-pushed the opm_flush_log branch 3 times, most recently from d37de94 to 0a60829 Compare September 13, 2022 16:47
@fansehep
Copy link
Member Author

recheck

1 similar comment
@fansehep
Copy link
Member Author

recheck

@fansehep fansehep force-pushed the opm_flush_log branch 2 times, most recently from 262c718 to 8e7ac71 Compare September 16, 2022 13:03
conf/chunkserver.conf Outdated Show resolved Hide resolved
@fansehep fansehep force-pushed the opm_flush_log branch 2 times, most recently from d57cffc to 724fdc4 Compare September 29, 2022 05:00
@fansehep fansehep changed the base branch from master1 to master October 19, 2022 03:52
@fansehep
Copy link
Member Author

recheck

@fansehep fansehep force-pushed the opm_flush_log branch 2 times, most recently from 34dd3e2 to a1b842f Compare November 24, 2022 08:44
@fansehep
Copy link
Member Author

recheck

1 similar comment
@wu-hanqing
Copy link
Contributor

recheck

@ilixiaocui
Copy link
Contributor

recheck

@cw123
Copy link
Contributor

cw123 commented Nov 28, 2022

recheck

@ilixiaocui
Copy link
Contributor

recheck

@ilixiaocui
Copy link
Contributor

image

combined into one commit.

@fansehep
Copy link
Member Author

image

combined into one commit.

should review it firstly?

@@ -88,6 +88,8 @@ copyset.raft_snapshot_uri=curve://./0/copysets # __CURVEADM_TEMPLATE__ curve://
copyset.recycler_uri=local://./0/recycler # __CURVEADM_TEMPLATE__ local://${prefix}/data/recycler __CURVEADM_TEMPLATE__
# chunkserver启动时,copyset并发加载的阈值,为0则表示不做限制
copyset.load_concurrency=10
# chunkserver 使用多少个线程来完成copysetsync, 不能为0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use english comment.

@@ -111,6 +111,10 @@ struct CopysetNodeOptions {

// 限制chunkserver启动时copyset并发恢复加载的数量,为0表示不限制
uint32_t loadConcurrency = 0;
// chunkserver sync_thread_pool 的线程数量
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

// syncChunkLimit default limit
uint64_t syncChunkLimit = 2 * 1024 * 1024;
// syncHighChunkLimit default limit = 64k
uint64_t syncThreshold = 65536;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be better to use 64 * 1024 instead of 65536.

@@ -54,6 +56,8 @@ using curve::fs::FileSystemInfo;

const char *kCurveConfEpochFilename = "conf.epoch";

uint32_t CopysetNode::syncTriggerSeconds_ = 25;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use global variables?

/**
* each copyset use copyset_pool to sync
*/
void setSharedSyncPool(const std::shared_ptr<TaskThreadPool<>>& threadpool);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not initialize syncTriggerSeconds_ in CopysetNode::Init

@@ -223,6 +223,8 @@ class CopysetNodeManager : public curve::common::Uncopyable {
CopysetNodeOptions copysetNodeOptions_;
// 控制copyset并发启动的数量
std::shared_ptr<TaskThreadPool<>> copysetLoader_;
// copysetmap 的sync操作全部通过一个线程池来完成
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use English comments for new code.

*chunkrate_ += length;
uint64_t res = *chunkrate_;
auto actualSyncChunkLimits = syncChunkLimits_;
// if single write size > syncThreshold, for cache friend to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is best to change the rate at the entry of CSChunkFile::Write. And update of syncChunkLimits_ can be put into a function like MayUpdateLimits(uint64_t writelen)

private:
uint64_t syncChunkLimits_;
uint64_t syncThreshold_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these two variables syncChunkLimits_ and syncThreshold_ contained in three classes CSMetaCacheCSChunkFile and CopysetNode?

@fansehep fansehep force-pushed the opm_flush_log branch 2 times, most recently from bb58d5c to e2c3cb8 Compare December 1, 2022 02:48
@fansehep
Copy link
Member Author

fansehep commented Dec 1, 2022

recheck

@fansehep fansehep force-pushed the opm_flush_log branch 2 times, most recently from a0b4982 to 7a015d4 Compare December 1, 2022 07:10
@wu-hanqing
Copy link
Contributor

@fansehep you can find whether https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=428b1d6b29ca599c5700d4bc4f4ce4c5880369bf is useful in our case, according to the description the logical is almost same with ours.

@fansehep
Copy link
Member Author

fansehep commented Dec 2, 2022

@fansehep you can find whether https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=428b1d6b29ca599c5700d4bc4f4ce4c5880369bf is useful in our case, according to the description the logical is almost same with ours.

Ok. thanks

Signed-off-by: fan <yfan3763@gmail.com>
@ilixiaocui
Copy link
Contributor

recheck

1 similar comment
@fansehep
Copy link
Member Author

fansehep commented Dec 7, 2022

recheck

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants