Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SpinLock Improvements #415

Merged
merged 7 commits into from
Dec 7, 2020

Conversation

jsuereth
Copy link
Contributor

@jsuereth jsuereth commented Dec 2, 2020

Level-up the spinlock implementation.

  • Use std::atomic instead of std::atomic_flag for the core
  • Use an incremental back-off strategy instead of just a busy loop
    • First we busy-loop
    • Next we use std::current_thread::yield() in a loop (single-core cpu friendly)
    • Next we legitimately sleep.
  • Adds a try_lock method (as previously requested) that does NOT do any spinning.

A few caveats in the current implementation:

  • The constants for # of iterations or sleep ms are NOT optimised, and likely could use tuning.
  • We do NOT issue a PAUSE/YIELD instruction during the busy loop, as we should. I stuck to as many std:: functions as I could to keep this simple to evaluate
  • I'm sure someone better in C++ arcana could do loop unrolling or something to make this even better. Please do! This is just an incremental improvement over the initial version.

@jsuereth jsuereth requested a review from a team December 2, 2020 19:42
@codecov
Copy link

codecov bot commented Dec 2, 2020

Codecov Report

Merging #415 (e07751c) into master (daab565) will decrease coverage by 0.10%.
The diff coverage is 23.07%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #415      +/-   ##
==========================================
- Coverage   94.70%   94.59%   -0.11%     
==========================================
  Files         179      179              
  Lines        7666     7677      +11     
==========================================
+ Hits         7260     7262       +2     
- Misses        406      415       +9     
Impacted Files Coverage Δ
api/include/opentelemetry/common/spin_lock_mutex.h 33.33% <23.07%> (-66.67%) ⬇️
sdk/test/metrics/counter_aggregator_test.cc 100.00% <0.00%> (+1.78%) ⬆️

}
std::this_thread::yield();
}
// Sleep and then start the whole process again.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there is an unconditional loop anyway, wondering why not do that in the last Spin-Yield instead of retrying Spin-Fast? Is there some benefit to do Spin-Fast after Spin-Yield timeout?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is that (theoretically) spin fast is on the order of 10s of ns, while yield is 100s of NS and the full sleep is 1ms, i.e. each is an order of magnitude slower. The goal is to "catch" the lock when you're scheduled and no need to give up to another thread if you only need to wait a few ns, not a big deal.

This theory could be wrong and we should validate with a benchmark the behavior we want.

Copy link
Member

@lalitb lalitb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
Thanks for coming up with these changes in improving coherent cache misses, cpu contention and better performance on single core. We may need benchmarking to come up with right values for the loop constants, and/or backoff logic ( exponential or incremental ) in subsequent iteration.

@jsuereth
Copy link
Contributor Author

jsuereth commented Dec 3, 2020

LGTM.
Thanks for coming up with these changes in improving coherent cache misses, cpu contention and better performance on single core. We may need benchmarking to come up with right values for the loop constants, and/or backoff logic ( exponential or incremental ) in subsequent iteration.

Yes, I'd like to get a set of benchmarks going. Have you seen the recent update to the specification around benchmarks?

@lalitb
Copy link
Member

lalitb commented Dec 4, 2020

Yes, I'd like to get a set of benchmarks going. Have you seen the recent update to the specification around benchmarks?

Not really, the only spec document I know around benchmark is the one written by @ThomsonTan
(https://github.com/open-telemetry/opentelemetry-specification/blob/0a2c727e2e6c7e20aecd107db42e60bd9fae79b1/specification/performance-benchmark.md), but not sure if any SIG team has come up with actual performance data around it.

@lalitb lalitb merged commit 11cc6cd into open-telemetry:master Dec 7, 2020
@jsuereth jsuereth deleted the wip-spinlock-improvements branch December 12, 2020 18:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants