
txn: use tokio_threadpool for txn scheduler #4628

Conversation

fredchenbj (Member) commented May 5, 2019

Signed-off-by: fredchenbj <cfworking@163.com>

What have you changed? (mandatory)

Use tokio-threadpool to replace the threadpool in tikv_util used by the txn scheduler.

The currently defined types are listed below, please pick one of the types for this PR by removing the others:

  • Improvement (change which is an improvement to an existing feature)

How has this PR been tested? (mandatory)

manual tests

Signed-off-by: fredchenbj <cfworking@163.com>
sre-bot (Contributor) commented May 5, 2019

Hi contributor, thanks for your PR.

This patch needs to be approved by one of the admins; they should reply with "/ok-to-test" to accept this PR for automatic testing.

siddontang (Contributor)

Great, PTAL @breeswish @hicqu

siddontang (Contributor)

@fredchenbj Can you show us the benchmark results?

zhangjinpeng87 (Member)

@fredchenbj Did you test the binary with Grafana and check that these metrics work well?

fredchenbj (Member, Author) commented May 6, 2019

I used the go-ycsb tool with the command: ./bin/go-ycsb load/run tikv -P workloads/workloada -p tikv.pd="10.136.16.2:2379" -p tikv.type="txn" -p tikv.conncount=32 --threads=32 --target=3000. The test results are as follows:

load
master: Avg(us): 6779, Min(us): 4925, Max(us): 393967, 95th(us): 9000, 99th(us): 13000

tokio:  Avg(us): 6608, Min(us): 4743, Max(us): 363260, 95th(us): 8000, 99th(us): 13000

run
master: Avg(us): 2748, Min(us): 1898, Max(us): 229763, 95th(us): 4000, 99th(us): 5000

tokio:  Avg(us): 2841, Min(us): 1929, Max(us): 210304, 95th(us): 4000, 99th(us): 5000

From the above, latency under this load did not decrease noticeably, while context switches decreased by about 5%.
@siddontang

fredchenbj (Member, Author)

@fredchenbj Do you test the binary with grafana and check if these metrics work well?

I have tested the binary; I will check the metrics more carefully and comment here later.

@fredchenbj fredchenbj changed the title txn: use tokio_thread for txn scheduler txn: use tokio_threadpool for txn scheduler May 6, 2019
zhangjinpeng87 (Member)

and the context switch decreased about 5%

@fredchenbj I wonder why the context switches decreased; do you have any idea?

@breezewish added the component/performance (Component: Performance) and contribution (This PR is from a community contributor.) labels May 6, 2019
.build()
}

#[inline]
Member:

Please remove all of the inline hints; let the compiler decide how to optimize the code.

Member:

@ngaut The inline hint still lets the compiler decide whether or not to inline (note that it is not #[inline(always)]).

Member:

I know; we don't need to do that. Usually the compiler is smart enough to make the decision.

@@ -0,0 +1,106 @@
// Copyright 2019 TiKV Project Authors. Licensed under Apache-2.0.
Member:

No need to call it xxx_impl?


m.borrow_mut()
.command_keyread_duration
.with_label_values(&[cmd])
.observe(count);
Member:

Why is this duration called count?

m.borrow_mut()
.processing_read_duration
.with_label_values(&[cmd])
.start_coarse_timer();
Member:

This timer's guard is not bound to a variable, so it is dropped immediately and does not record any time.

fredchenbj (Member, Author)

@fredchenbj Do you test the binary with grafana and check if these metrics work well?

I have checked that these metrics work as before.

and the context switch decreased about 5%

@fredchenbj I wonder why the context switch decreased, do you have any idea?

I don't have any idea about this.

}
}

pub fn build_sched_pool(pool_size: usize, name_prefix: &str) -> FuturePool {
Member:

The name is quite confusing: build_sched_pool does not build a SchedPool but a FuturePool.

@@ -5,6 +5,9 @@ use prometheus_static_metric::*;

make_static_metric! {
pub label_enum CommandKind {
get,
Member:

Looks like this change is not necessary?

Member (Author):

Yes, I tested the wrong thing here before; I will remove it.

Executor {
scheduler: Some(scheduler),
pool: Some(pool),
schedpool: Some(pool),
Member:

prefer sched_pool

breezewish (Member)

@zhangjinpeng1987 A work-stealing pool has local task queues, which may help with reducing context switches.

Signed-off-by: fredchenbj <cfworking@163.com>
breezewish (Member)

/run-integration-tests

fredchenbj (Member, Author)

friendly ping @siddontang @breeswish

.processing_read_duration
.with_label_values(&[tag])
.start_coarse_timer();
let read_duration = tikv_util::time::Instant::now_coarse();
Contributor:

can we use Instant::now_coarse() here?

Member (Author):

Fixed it.

Signed-off-by: fredchenbj <cfworking@163.com>
breezewish (Member) previously approved these changes May 13, 2019 and left a comment:

Mostly fine. @zhangjinpeng1987 PTAL

siddontang (Contributor)

PTAL @hicqu

Please fix the conflict @fredchenbj

@hicqu force-pushed the fredchenbj/use-tokio-threadpool-for-txn-scheduler-and-process branch from e0113ce to 3b84fae May 17, 2019 06:12
hicqu (Contributor) commented May 17, 2019

I merged master and resolved some conflicts. PTAL again @breeswish, thanks!

@hicqu force-pushed the fredchenbj/use-tokio-threadpool-for-txn-scheduler-and-process branch from 3b84fae to 9274fd6 May 17, 2019 06:14
hicqu (Contributor) commented May 17, 2019

LGTM

hicqu previously approved these changes May 17, 2019
Signed-off-by: qupeng <qupeng@pingcap.com>
@breezewish breezewish merged commit 93146c2 into tikv:master May 17, 2019
jswh pushed a commit to jswh/tikv that referenced this pull request May 27, 2019
Signed-off-by: fredchenbj <cfworking@163.com>
sticnarf pushed a commit to sticnarf/tikv that referenced this pull request Oct 27, 2019
Signed-off-by: fredchenbj <cfworking@163.com>
Labels: component/performance (Component: Performance), contribution (This PR is from a community contributor.)
8 participants