This repository was archived by the owner on Jun 20, 2025. It is now read-only.

Fixed utilizing all cores for runForAsync #82

Merged
merged 4 commits into develop from feature/issue-57-runForAsync
Aug 30, 2018

Conversation

segabriel
Member

fixed #57

@segabriel segabriel added the ready for review label Aug 25, 2018
ronenhamias
ronenhamias previously approved these changes Aug 25, 2018
Member

@ronenhamias ronenhamias left a comment


+1

@artem-v artem-v self-requested a review August 26, 2018 09:28
@artem-v
Contributor

artem-v commented Aug 26, 2018

@segabriel - please provide evidence that this PR fixes issue #57. Screenshots, statistics, anything.

Contributor

@artem-v artem-v left a comment


Please explain what was wrong and how it was improved.

@artem-v
Contributor

artem-v commented Aug 27, 2018

I ran them locally. Latency looks like it skyrockets, but calls/sec do indeed stay the same.

RequestOneBenchmarks
Before changes:

-- Timers ----------------------------------------------------------------------
i.s.s.b.s.RequestOneBenchmarks-timer
             count = 2531157
         mean rate = 48016.27 calls/second
     1-minute rate = 32050.53 calls/second
     5-minute rate = 14846.40 calls/second
    15-minute rate = 10908.45 calls/second
               min = 0.07 milliseconds
               max = 84.53 milliseconds
              mean = 4.69 milliseconds
            stddev = 4.82 milliseconds
            median = 4.35 milliseconds
              75% <= 5.95 milliseconds
              95% <= 9.56 milliseconds
              98% <= 12.09 milliseconds
              99% <= 17.46 milliseconds
            99.9% <= 84.05 milliseconds

After changes:

i.s.s.b.s.RequestOneBenchmarks-timer
             count = 2602931
         mean rate = 43357.48 calls/second
     1-minute rate = 31523.79 calls/second
     5-minute rate = 14471.27 calls/second
    15-minute rate = 10274.76 calls/second
               min = 8.36 milliseconds
               max = 279.26 milliseconds
              mean = 22.77 milliseconds
            stddev = 14.71 milliseconds
            median = 23.07 milliseconds
              75% <= 26.19 milliseconds
              95% <= 38.81 milliseconds
              98% <= 44.84 milliseconds
              99% <= 50.48 milliseconds
            99.9% <= 278.05 milliseconds

RequestManyLatencyBenchmarks
Before changes:

i.s.s.b.s.RequestManyLatencyBenchmarks-timer
             count = 10261
         mean rate = 170.99 calls/second
     1-minute rate = 131.04 calls/second
     5-minute rate = 73.33 calls/second
    15-minute rate = 59.14 calls/second
               min = 244.06 milliseconds
               max = 4173.48 milliseconds
              mean = 1471.23 milliseconds
            stddev = 623.58 milliseconds
            median = 1355.94 milliseconds
              75% <= 1658.75 milliseconds
              95% <= 2618.99 milliseconds
              98% <= 3202.16 milliseconds
              99% <= 4131.71 milliseconds
            99.9% <= 4173.36 milliseconds

After changes:

i.s.s.b.s.RequestManyLatencyBenchmarks-timer
             count = 10240
         mean rate = 170.40 calls/second
     1-minute rate = 101.52 calls/second
     5-minute rate = 28.15 calls/second
    15-minute rate = 9.94 calls/second
               min = 3296.01 milliseconds
               max = 10954.49 milliseconds
              mean = 5705.71 milliseconds
            stddev = 1298.11 milliseconds
            median = 5513.07 milliseconds
              75% <= 6015.06 milliseconds
              95% <= 9966.12 milliseconds
              98% <= 10380.58 milliseconds
              99% <= 10591.16 milliseconds
            99.9% <= 10954.48 milliseconds

@segabriel
Member Author

First of all, we have a test inside this repo, ExampleBenchmarksRunner, and after these changes all threads are utilized:
(screenshot from 2018-08-28 13-30-10)

and here are results before:

i.s.b.e.ExampleBenchmarksRunner-timer
             count = 39752000
         mean rate = 660009.97 calls/second
     1-minute rate = 634213.54 calls/second
     5-minute rate = 598581.66 calls/second
    15-minute rate = 589695.60 calls/second
               min = 874.00 nanoseconds
               max = 19053.00 nanoseconds
              mean = 922.57 nanoseconds
            stddev = 436.17 nanoseconds
            median = 895.00 nanoseconds
              75% <= 907.00 nanoseconds
              95% <= 999.00 nanoseconds
              98% <= 1217.00 nanoseconds
              99% <= 1252.00 nanoseconds
            99.9% <= 1447.00 nanoseconds

and after

i.s.b.e.ExampleBenchmarksRunner-timer
             count = 82648922
         mean rate = 1372473.15 calls/second
     1-minute rate = 1284752.33 calls/second
     5-minute rate = 1174771.28 calls/second
    15-minute rate = 1146330.90 calls/second
               min = 904.00 nanoseconds
               max = 17540.00 nanoseconds
              mean = 1376.60 nanoseconds
            stddev = 417.21 nanoseconds
            median = 1308.00 nanoseconds
              75% <= 1376.00 nanoseconds
              95% <= 1657.00 nanoseconds
              98% <= 1729.00 nanoseconds
              99% <= 1758.00 nanoseconds
            99.9% <= 3771.00 nanoseconds

@segabriel
Member Author

The next point is the benchmarks from scalecube-services. Before these changes I achieved the following results:

i.s.s.b.s.RequestOneBenchmarks-timer
             count = 3318549
         mean rate = 55300.53 calls/second
     1-minute rate = 41550.65 calls/second
     5-minute rate = 24186.04 calls/second
    15-minute rate = 19724.22 calls/second
               min = 0.06 milliseconds
               max = 39.82 milliseconds
              mean = 4.04 milliseconds
            stddev = 3.26 milliseconds
            median = 3.76 milliseconds
              75% <= 5.32 milliseconds
              95% <= 8.73 milliseconds
              98% <= 11.74 milliseconds
              99% <= 13.74 milliseconds
            99.9% <= 39.70 milliseconds

I had added a magic number as the concurrency, so I began varying it and achieved the following results:

for concurrency=16

i.s.s.b.s.RequestOneBenchmarks-timer
             count = 3143932
         mean rate = 52379.40 calls/second
     1-minute rate = 39373.92 calls/second
     5-minute rate = 21218.18 calls/second
    15-minute rate = 16703.84 calls/second
               min = 0.31 milliseconds
               max = 16.50 milliseconds
              mean = 1.20 milliseconds
            stddev = 1.53 milliseconds
            median = 0.76 milliseconds
              75% <= 1.24 milliseconds
              95% <= 3.47 milliseconds
              98% <= 6.25 milliseconds
              99% <= 9.36 milliseconds
            99.9% <= 16.47 milliseconds

for concurrency=32

i.s.s.b.s.RequestOneBenchmarks-timer
             count = 3298677
         mean rate = 54952.61 calls/second
     1-minute rate = 40712.38 calls/second
     5-minute rate = 21750.25 calls/second
    15-minute rate = 16966.28 calls/second
               min = 0.64 milliseconds
               max = 18.31 milliseconds
              mean = 2.30 milliseconds
            stddev = 2.19 milliseconds
            median = 1.64 milliseconds
              75% <= 2.54 milliseconds
              95% <= 6.11 milliseconds
              98% <= 10.40 milliseconds
              99% <= 13.15 milliseconds
            99.9% <= 18.29 milliseconds

for concurrency=64

i.s.s.b.s.RequestOneBenchmarks-timer
             count = 5448622
         mean rate = 56760.70 calls/second
     1-minute rate = 48935.09 calls/second
     5-minute rate = 24088.93 calls/second
    15-minute rate = 16249.14 calls/second
               min = 1.30 milliseconds
               max = 30.04 milliseconds
              mean = 4.34 milliseconds
            stddev = 2.80 milliseconds
            median = 3.98 milliseconds
              75% <= 5.10 milliseconds
              95% <= 9.57 milliseconds
              98% <= 12.76 milliseconds
              99% <= 14.45 milliseconds
            99.9% <= 29.85 milliseconds

for concurrency=128

i.s.s.b.s.RequestOneBenchmarks-timer
             count = 6468648
         mean rate = 53895.13 calls/second
     1-minute rate = 48372.55 calls/second
     5-minute rate = 25672.21 calls/second
    15-minute rate = 17003.80 calls/second
               min = 3.72 milliseconds
               max = 46.32 milliseconds
              mean = 9.40 milliseconds
            stddev = 4.63 milliseconds
            median = 9.31 milliseconds
              75% <= 11.40 milliseconds
              95% <= 16.98 milliseconds
              98% <= 22.20 milliseconds
              99% <= 26.16 milliseconds
            99.9% <= 46.10 milliseconds

@artem-v please suggest your opinion/solution for it.

() -> {
  long start = i * countPerThread;
  Flux.fromStream(LongStream.range(start, start + countPerThread).boxed())
      .flatMap(unitOfWork::apply, 64, Integer.MAX_VALUE)
Contributor


Cool finding!
But why not just Runtime.getRuntime().availableProcessors()?

Member Author


By default, this value is 256. As I understand it, it specifies how many inner publishers the flatMap operator processes simultaneously (in other words, how many publishers flatMap subscribes to at once).
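The effect of flatMap's concurrency argument can be illustrated without Reactor by a plain-Java sketch that bounds in-flight work with a semaphore; the class and method names here (ConcurrencyBound, maxInFlight) are illustrative, not part of any scalecube API:

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class ConcurrencyBound {

  // Runs 'total' tasks while allowing at most 'concurrency' of them in
  // flight at once (mimicking flatMap(fn, concurrency, ...)), and reports
  // the maximum number of tasks observed running simultaneously.
  static int maxInFlight(int total, int concurrency) throws InterruptedException {
    Semaphore permits = new Semaphore(concurrency);
    AtomicInteger inFlight = new AtomicInteger();
    AtomicInteger maxSeen = new AtomicInteger();
    CountDownLatch done = new CountDownLatch(total);
    ExecutorService pool = Executors.newCachedThreadPool();
    for (int i = 0; i < total; i++) {
      pool.execute(() -> {
        try {
          permits.acquire(); // wait for a concurrency slot
          int now = inFlight.incrementAndGet();
          maxSeen.accumulateAndGet(now, Math::max);
          Thread.sleep(5); // simulated unit of work
          inFlight.decrementAndGet();
          permits.release();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        } finally {
          done.countDown();
        }
      });
    }
    done.await();
    pool.shutdown();
    return maxSeen.get();
  }

  public static void main(String[] args) throws InterruptedException {
    // In-flight work never exceeds the configured bound of 16.
    System.out.println(maxInFlight(100, 16));
  }
}
```

With a bound of 16 the printed maximum never exceeds 16, which is why a too-small (or too-large) concurrency value shifts the throughput/latency trade-off seen in the tables above.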

Member Author


Also, I think we should add the ability to specify it as a benchmark settings parameter.

@artem-v
Contributor

artem-v commented Aug 29, 2018

Let's add a parameter concurrency to the benchmark settings and make it default to 16.
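A settings parameter like the one proposed could look roughly like the following sketch; BenchmarkSettings and its builder here are assumptions for illustration, not the actual scalecube-benchmarks API:

```java
// Minimal sketch of a settings object carrying a 'concurrency' parameter
// with the proposed default of 16.
public class BenchmarkSettings {

  private static final int DEFAULT_CONCURRENCY = 16;

  private final int concurrency;

  private BenchmarkSettings(Builder builder) {
    this.concurrency = builder.concurrency;
  }

  public int concurrency() {
    return concurrency;
  }

  public static Builder builder() {
    return new Builder();
  }

  public static class Builder {
    private int concurrency = DEFAULT_CONCURRENCY;

    public Builder concurrency(int concurrency) {
      this.concurrency = concurrency;
      return this;
    }

    public BenchmarkSettings build() {
      return new BenchmarkSettings(this);
    }
  }
}
```

A caller would then pass settings.concurrency() as the second argument to flatMap instead of the hard-coded 64.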

@segabriel
Member Author

Ok, but adding the new param will be in a new PR.

@artem-v
Contributor

artem-v commented Aug 29, 2018

> Ok, but adding the new param will be in a new PR.

Let's add it in this PR.

@artem-v artem-v merged commit 5fb1ef6 into develop Aug 30, 2018
@artem-v artem-v deleted the feature/issue-57-runForAsync branch August 30, 2018 10:47

Successfully merging this pull request may close these issues.

BenchmarksState.runForAsync() is not utilizing all cores
3 participants