Enable HW clock by default#3103
Conversation
Signed-off-by: Daniil Kashapov <daniil.kashapov.ykt@gmail.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## unstable #3103 +/- ##
============================================
+ Coverage 74.30% 74.86% +0.55%
============================================
Files 129 129
Lines 71013 71185 +172
============================================
+ Hits 52767 53290 +523
+ Misses 18246 17895 -351
🚀 New features to boost your workflow:
|
zuiderkwast
left a comment
There was a problem hiding this comment.
LGTM. Small and simple.
|
Benchmark ran on this commit: Benchmark Comparison: unstable vs e47cc4a (averaged) - rps metricsRun Summary:
Statistical Notes:
Note: Values with (n=X, σ=Y, CV=Z%, CI99%=±W%, PI99%=±V%) indicate averages from X runs with standard deviation Y, coefficient of variation Z%, 99% confidence interval margin of error ±W% of the mean, and 99% prediction interval margin of error ±V% of the mean. CI bounds [A, B] and PI bounds [C, D] show the actual interval ranges. Configuration:
Configuration:
|
|
I think awesome results for such change, definitely worth it! |
Yes, it's very impressive.
I have a guess: We "calibrate" the TSC clock by counting clock ticks and comparing it with a sleep and the monotonic clock, like this: /* Calibrate TSC against CLOCK_MONOTONIC */
struct timespec start, end;
uint64_t tsc_start, tsc_end;
clock_gettime(CLOCK_MONOTONIC, &start);
tsc_start = __rdtsc();
usleep(10000); /* Sleep for 10ms */
tsc_end = __rdtsc();
clock_gettime(CLOCK_MONOTONIC, &end);
uint64_t elapsed_us = (end.tv_sec - start.tv_sec) * 1000000ULL + (end.tv_nsec - start.tv_nsec) / 1000;
uint64_t tsc_elapsed = tsc_end - tsc_start;
long sample_ticksPerMicrosecond = tsc_elapsed / elapsed_us;
/* Use the maximum out of TSC_CALIBRATION_ITERATIONS iterations for accuracy */
if (sample_ticksPerMicrosecond > mono_ticksPerMicrosecond) {
mono_ticksPerMicrosecond = sample_ticksPerMicrosecond;
}This calibration might be inexact, so we get a too high value for The calibration code above compares it to clock_gettime, but clock_gettime takes some time itself to run. Perhaps we should instead:
To compare it over a very long time, we could do asyncronously after the server has started, for exaple in the serverCron. We can start up the server using posix clock and after a long calibration of one minute or so, compare to clock_gettime and then switch to TSC clock... |
|
It may also the that /proc/cpuinfo reported a model name with a frequency but the precision of this frequency is not great. (We use this if it contains a frequency in GHz, such as "Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz"). Maybe using the calibration code would gives us a more exact value than the reported frequency in the model name. (?) |
Because x86 runner was most likely Intel, somehow I think this is closer to the truth, I will investigate how both methods compare against each other for AMD/Intel |
This is problematic. If someone has started a timer, and is currently timing something, we can't just change the clock asynchronously. |
Signed-off-by: Daniil Kashapov <daniil.kashapov.ykt@gmail.com>
|
On some Intel systems it was observed that TSC |
After #3103 time sensitive `test-ubuntu-reclaim-cache` started to fail because now startup always includes 30ms of calibration of HW clock, that's why we get this output: ``` Run echo "test SAVE doesn't increase cache" test SAVE doesn't increase cache 2460491776 Could not connect to Valkey at 127.0.0.1:8080: Connection refused ``` Added waits for server to start, locally run, it helps --------- Signed-off-by: Daniil Kashapov <daniil.kashapov.ykt@gmail.com>
JimB123
left a comment
There was a problem hiding this comment.
LGTM. It was interesting to find some of the results of this research.
After valkey-io#3103 time sensitive `test-ubuntu-reclaim-cache` started to fail because now startup always includes 30ms of calibration of HW clock, that's why we get this output: ``` Run echo "test SAVE doesn't increase cache" test SAVE doesn't increase cache 2460491776 Could not connect to Valkey at 127.0.0.1:8080: Connection refused ``` Added waits for server to start, locally run, it helps --------- Signed-off-by: Daniil Kashapov <daniil.kashapov.ykt@gmail.com>
After valkey-io#3103 time sensitive `test-ubuntu-reclaim-cache` started to fail because now startup always includes 30ms of calibration of HW clock, that's why we get this output: ``` Run echo "test SAVE doesn't increase cache" test SAVE doesn't increase cache 2460491776 Could not connect to Valkey at 127.0.0.1:8080: Connection refused ``` Added waits for server to start, locally run, it helps --------- Signed-off-by: Daniil Kashapov <daniil.kashapov.ykt@gmail.com> Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
After valkey-io#3103 time sensitive `test-ubuntu-reclaim-cache` started to fail because now startup always includes 30ms of calibration of HW clock, that's why we get this output: ``` Run echo "test SAVE doesn't increase cache" test SAVE doesn't increase cache 2460491776 Could not connect to Valkey at 127.0.0.1:8080: Connection refused ``` Added waits for server to start, locally run, it helps --------- Signed-off-by: Daniil Kashapov <daniil.kashapov.ykt@gmail.com> Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
After valkey-io#3103 time sensitive `test-ubuntu-reclaim-cache` started to fail because now startup always includes 30ms of calibration of HW clock, that's why we get this output: ``` Run echo "test SAVE doesn't increase cache" test SAVE doesn't increase cache 2460491776 Could not connect to Valkey at 127.0.0.1:8080: Connection refused ``` Added waits for server to start, locally run, it helps --------- Signed-off-by: Daniil Kashapov <daniil.kashapov.ykt@gmail.com> Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
After valkey-io#3103 time sensitive `test-ubuntu-reclaim-cache` started to fail because now startup always includes 30ms of calibration of HW clock, that's why we get this output: ``` Run echo "test SAVE doesn't increase cache" test SAVE doesn't increase cache 2460491776 Could not connect to Valkey at 127.0.0.1:8080: Connection refused ``` Added waits for server to start, locally run, it helps --------- Signed-off-by: Daniil Kashapov <daniil.kashapov.ykt@gmail.com> Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
After valkey-io#3103 time sensitive `test-ubuntu-reclaim-cache` started to fail because now startup always includes 30ms of calibration of HW clock, that's why we get this output: ``` Run echo "test SAVE doesn't increase cache" test SAVE doesn't increase cache 2460491776 Could not connect to Valkey at 127.0.0.1:8080: Connection refused ``` Added waits for server to start, locally run, it helps --------- Signed-off-by: Daniil Kashapov <daniil.kashapov.ykt@gmail.com> Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
After valkey-io#3103 time sensitive `test-ubuntu-reclaim-cache` started to fail because now startup always includes 30ms of calibration of HW clock, that's why we get this output: ``` Run echo "test SAVE doesn't increase cache" test SAVE doesn't increase cache 2460491776 Could not connect to Valkey at 127.0.0.1:8080: Connection refused ``` Added waits for server to start, locally run, it helps --------- Signed-off-by: Daniil Kashapov <daniil.kashapov.ykt@gmail.com> Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
After valkey-io#3103 time sensitive `test-ubuntu-reclaim-cache` started to fail because now startup always includes 30ms of calibration of HW clock, that's why we get this output: ``` Run echo "test SAVE doesn't increase cache" test SAVE doesn't increase cache 2460491776 Could not connect to Valkey at 127.0.0.1:8080: Connection refused ``` Added waits for server to start, locally run, it helps --------- Signed-off-by: Daniil Kashapov <daniil.kashapov.ykt@gmail.com> Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
On some Intel systems it was observed that TSC `ticksPerMicrosecond` derived from `/proc/cpuinfo` was different from calibrated `ticksPerMicrosecond` or kernel derived one, so it was decided to always calibrate and look for `constant_tsc` flag. Resolves valkey-io#2597 --------- Signed-off-by: Daniil Kashapov <daniil.kashapov.ykt@gmail.com>
After valkey-io#3103 time sensitive `test-ubuntu-reclaim-cache` started to fail because now startup always includes 30ms of calibration of HW clock, that's why we get this output: ``` Run echo "test SAVE doesn't increase cache" test SAVE doesn't increase cache 2460491776 Could not connect to Valkey at 127.0.0.1:8080: Connection refused ``` Added waits for server to start, locally run, it helps --------- Signed-off-by: Daniil Kashapov <daniil.kashapov.ykt@gmail.com>
After valkey-io#3103 time sensitive `test-ubuntu-reclaim-cache` started to fail because now startup always includes 30ms of calibration of HW clock, that's why we get this output: ``` Run echo "test SAVE doesn't increase cache" test SAVE doesn't increase cache 2460491776 Could not connect to Valkey at 127.0.0.1:8080: Connection refused ``` Added waits for server to start, locally run, it helps --------- Signed-off-by: Daniil Kashapov <daniil.kashapov.ykt@gmail.com> Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
On some Intel systems it was observed that TSC
ticksPerMicrosecondderived from/proc/cpuinfowas different from calibratedticksPerMicrosecondor kernel derived one, so it was decided to always calibrate and look forconstant_tscflag.Resolves #2597