-
Notifications
You must be signed in to change notification settings - Fork 418
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lock-free executor #77
Comments
I'm going to look at abstracting out |
Here is a naive lock-free executor implementation with "n" threads that uses a lock-free queue: Start one writer thread and n - 1 reader threads. Writer thread:
Reader thread:
This implementation relies on two fundamental assumptions:
We think that 2 is a reasonable assumption, but it might not be true, given the difficulty we've had with MultiThreadedExecutor in Opensplice. |
Before considering a dependency on boost we should make sure that it doesn't affect using the real time approach on a embedded devices. What are the effects on the library/executable size and/or memory requirements when adding it? |
I would ask @codebot to weigh in here about Boost and embedded devices. From what I know, for the class of 32-bit microcontrollers that Morgan is focusing on, you generally have one core and probably don't even have an operating system, so multithreading is both impractical and unavailable. Therefore you wouldn't need a lock-free executor and should just use the SingleThreadedExecutor instead. The lock-free executor is for deterministic synchronization in a multi-threaded setting. As for the case of multithreading on an embedded device with multiple cores, I don't know what the numbers are like off the top of my head. From referencing Morgan's 2015 ROSCon talk, the next larger class of microcontrollers from 32-bit MCUs, which includes small multicore systems, probably has at least 1GB of RAM. From briefly looking at Stack Overflow and other sources, compiling a program with Boost can have an overhead on the order of 1 MB. http://stackoverflow.com/questions/2839172/why-my-c-output-executable-is-so-big Adding 1MB executable size when you have 1GB of RAM sounds safe to me, but then again I don't have any embedded systems experience. This also means that the lock-free executor using Boost must be in a separate library from other stuff that would be interesting to an embedded developer, since the STM32-class of microcontrollers will have RAM in the order of KB. The Boost lockfree queue is designed to be allocation free. From http://www.boost.org/doc/libs/1_59_0/doc/html/lockfree.html: |
As a matter of fact, there's a problem with using the Boost lockfree queue here that points out a fundamental problem with this outline. boost::lockfree::queue requires that T be trivially constructible/destructible, i.e. a Plain Old Datatype. executor::AnyExecutable does not fit this bill, since it contains shared pointers as member variables and its destructor compares and exchanges an atomic boolean (thus it is not trivial). I asked about this on Stack Overflow and first suggestion was to use Intel TBB concurrent_queue: http://stackoverflow.com/questions/34009485/non-pod-types-in-lock-free-data-structures which is not a lock free, but is very fast. A discussion on the TBB forums leads me to believe that most lock-free data structures will be restricted in the size of the types because you need to use a DCAS (double word compare and swap) instruction to remove a node from a singly linked list. I won't pretend to understand everything that's going on here, but it sounds like to use a truly lock-free queue I would have to store raw pointers instead of shared pointers, which would require a bit more refactoring and forethought. |
There maybe other issues that you cannot work around but for the trivially constructable issue you could try swapping those to weak pointers. |
It appears that weak_ptr does not have a trivial destructor. |
repost from closed PR: Ideas discussed today (12/1/15): We can only put POD structures into lock-free queues. That's how we got to using raw pointers here. To make a decision on how to proceed here, we need to get a better idea of the requirements. @jacquelinekay, please write up a proposal for what is and what is not supported in a real-time-safe / lock-free manner, e.g.:
|
For what it's worth, the RTM executor works in the same way as Orocos's. A real fine component is limited to periodic invocation. Being able to have an event-driven node that is real-time safe would be great, but I suspect that the majority use case for real-time is periodically executing control loops. Even a (relatively linear) data processing pipeline executing in such an environment seems likely to be executed at a fixed rate with each step executed in turn in a fixed sequence. Some data on the commonality of real-time event-based systems would be useful here; RTM doesn't have any amongst its users but that's because it doesn't support it. |
How should we expose synchronization primitives to implement a lock-free executor for real-time execution?
The text was updated successfully, but these errors were encountered: